Supported in part by the IIS program from NSF, awards #0916448 (while at FSU) and #1212310 (after transferring to Utah), NSF link 1, NSF link 2.
[Overview] [Papers and Talks] [Source Code]
[Dataset] [Contacts]

1. Finding Frequent Items in Probabilistic Data

Important Notice
If you use this library in your work, please cite our paper. Thanks! If you find any bugs or have any suggestions or comments, we would be very happy to hear from you.

Library Description
The library is developed in GNU C++ and comes with a data generator written in Matlab. To compile, go to each folder and type make. We also provide an efficient C++ implementation of the space saving algorithm (An Integrated Efficient Solution for Computing Frequent and Top-k Elements in Data Streams, by Metwally et al., ACM TODS, 2006).

Download PHitter Library [tar.gz]

Quick Install
The subfolder names are self-explanatory. Each subfolder contains a Makefile for easy compilation. Every main test program has verbose help output that explains the parameters it expects.

We have generated and experimented with the datasets described in the paper. The source code released above also contains the generator for the synthetic data sets. For the real data sets, please follow the description in our paper.
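For readers unfamiliar with the space saving algorithm mentioned in the library description, the following is a minimal sketch of its core update rule in C++. It is not the code shipped in the PHitter library: the class name SpaceSavingSketch and its methods are hypothetical, and for brevity it finds the minimum counter with a linear scan instead of the Stream-Summary structure used by Metwally et al.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>

// One monitored item: its estimated count and the maximum overestimation.
struct Counter {
    std::uint64_t count;
    std::uint64_t error;
};

// Hypothetical illustration class; not part of the released library.
class SpaceSavingSketch {
public:
    explicit SpaceSavingSketch(std::size_t capacity) : capacity_(capacity) {
        assert(capacity_ > 0);  // at least one counter is required
    }

    // Process one stream element.
    void offer(const std::string& item) {
        auto it = counters_.find(item);
        if (it != counters_.end()) {          // already monitored: increment
            ++it->second.count;
            return;
        }
        if (counters_.size() < capacity_) {   // free slot: start monitoring
            counters_.emplace(item, Counter{1, 0});
            return;
        }
        // All slots are in use: replace the item with the smallest count.
        // Linear scan for simplicity (an assumption of this sketch).
        auto victim = counters_.begin();
        for (auto cur = counters_.begin(); cur != counters_.end(); ++cur) {
            if (cur->second.count < victim->second.count) victim = cur;
        }
        const std::uint64_t min_count = victim->second.count;
        counters_.erase(victim);
        // The newcomer inherits min_count as its possible overestimation.
        counters_.emplace(item, Counter{min_count + 1, min_count});
    }

    // Estimated frequency; 0 if the item is not currently monitored.
    std::uint64_t estimate(const std::string& item) const {
        auto it = counters_.find(item);
        return it == counters_.end() ? 0 : it->second.count;
    }

    // Upper bound on how much estimate(item) may exceed the true count.
    std::uint64_t max_error(const std::string& item) const {
        auto it = counters_.find(item);
        return it == counters_.end() ? 0 : it->second.error;
    }

private:
    std::size_t capacity_;
    std::unordered_map<std::string, Counter> counters_;
};
```

With a capacity of 2 and the stream a, b, a, c, a, item c evicts b and inherits its count as error, so the sketch ends with a monitored at count 3 (error 0) and c at count 2 (error 1). The true count of any monitored item lies between estimate minus max_error and estimate.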