Supported in part by the IIS program from NSF, awards #0916448 (while at FSU) and #1212310 (after transferring to Utah), NSF link 1, NSF link 2.
[Overview] [Papers and Talks] [Source Code]
[Dataset] [Contacts]

1. Finding Frequent Items in Probabilistic Data,

Important Notice
If you use this library in your work, please cite our paper. Thanks! If you find any bugs or have any suggestions or comments, we would be very happy to hear from you!

Library Description
The library is developed in GNU C++. It also comes with a data generator written in Matlab. To compile, go to each folder and type make. We also include an efficient C++ implementation of the space saving algorithm (An Integrated Efficient Solution for Computing Frequent and Top-k Elements in Data Streams, by Metwally et al., ACM TODS, 2006).

Download PHitter Library [tar.gz]

Quick Install
The subfolder names are self-explanatory. Each subfolder contains a Makefile for easy compilation. Each main test program has verbose help output that explains which parameters it expects.

We have generated and experimented with the datasets described in the paper. The source code released above also contains the generator for the synthetic data sets. For the real data sets, please follow the description in our paper.
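
For readers unfamiliar with the space saving algorithm mentioned in the library description, the following is a minimal C++ sketch of its core idea: keep at most a fixed number of counters, and when a new item arrives while all counters are in use, replace the item with the smallest count. This is only an illustration under stated assumptions and is not the implementation shipped in the library. The class name SpaceSavingSketch and its methods are hypothetical, and the minimum counter is found by a simple linear scan rather than the constant-time stream-summary structure described by Metwally et al.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <utility>

// Illustrative Space-Saving summary. The class name and interface are
// hypothetical; this is NOT the code from the PHitter library.
class SpaceSavingSketch {
public:
    struct Counter {
        std::uint64_t count;  // estimated frequency (never below the true count)
        std::uint64_t error;  // upper bound on how much count may overestimate
    };

    explicit SpaceSavingSketch(std::size_t capacity) : capacity_(capacity) {}

    // Process one stream item.
    void add(const std::string& item) {
        if (capacity_ == 0) return;

        Map::iterator it = counters_.find(item);
        if (it != counters_.end()) {
            ++it->second.count;  // item already monitored: just increment
            return;
        }
        if (counters_.size() < capacity_) {
            counters_.emplace(item, Counter{1, 0});  // free slot available
            return;
        }
        // All slots in use: evict the item with the smallest count.
        // Linear scan for simplicity; the original paper avoids this cost.
        Map::iterator victim = std::min_element(
            counters_.begin(), counters_.end(),
            [](const Entry& a, const Entry& b) {
                return a.second.count < b.second.count;
            });
        const std::uint64_t min_count = victim->second.count;
        counters_.erase(victim);
        // The newcomer inherits the evicted count plus one; the inherited
        // part is recorded as its possible overestimation.
        counters_.emplace(item, Counter{min_count + 1, min_count});
    }

    // Estimated count of a monitored item; returns 0 for items that are
    // not currently monitored (their true count is at most the minimum
    // monitored count).
    std::uint64_t estimate(const std::string& item) const {
        Map::const_iterator it = counters_.find(item);
        return it == counters_.end() ? 0 : it->second.count;
    }

    // Number of items currently monitored.
    std::size_t size() const { return counters_.size(); }

private:
    typedef std::unordered_map<std::string, Counter> Map;
    typedef std::pair<const std::string, Counter> Entry;

    std::size_t capacity_;
    Map counters_;
};
```

With two counters and the stream a, a, a, b, c, the sketch ends up monitoring a with count 3 and c with count 2 (error 1), since c takes over the slot that b held with count 1.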