Journal Publications:
- A Near-Optimal
Algorithm for Computing the Entropy of a Stream
To appear: ACM Transactions on Algorithms (with A. Chakrabarti and G. Cormode). - On the Hardness of
Approximating Stopping and Trapping Sets in LDPC
Codes
To appear: IEEE Transactions on Information Theory (with O. Milenkovic) - Sub-linear
Estimation of Entropy and Information Distances
ACM Transactions on Algorithms, 5 (2009), no. 4, pg. 1-16. (with S. Guha and S. Venkatasubramanian) - Stream Order and
Order Statistics: Quantile Estimation in Random-Order
Streams
SIAM Journal of Computing, 38 (2009), no. 5, 2044-2059 (with S. Guha) - Graph
Distances in the Data Stream Model
SIAM Journal of Computing, 38 (2008), no. 5, pg. 1709-1727 (with J. Feigenbaum, S. Kannan, S. Suri, and J. Zhang) - Estimating
Statistical Aggregates on Probabilistic Data Streams
ACM Trans. Database Syst, 33 (2008), no. 4, pg. 1-30 (with T. S. Jayram, S. Muthukrishnan, and E. Vee) - Sketching
Information Divergences
Journal of Machine Learning, 72 (2008), no. 1-2, pg. 5-19 (with S. Guha and P. Indyk) - Distance
distribution of binary codes and the error probability of
decoding
IEEE Transactions on Information Theory, 51 (2005), no. 12, pg. 4237-4246. (with A. Barg) - On Graph Problems in a
Semi-Streaming Model
Theoretical Computer Science, 348 (2005), no. 2-3, pg. 207-216. (with J. Feigenbaum, S. Kannan, S. Suri and J. Zhang)
Conference Publications (N.B. Links point to journal/extended versions if appropriate):
- Space-Efficient
Estimation of Robust Statistics and Distribution
Testing
ICS 2010 (with S. Chien and K. Ligett) - Probabilistic
Histograms for Probabilistic Data
VLDB 2009 (with G. Cormode, A. Deligiannakis, M. Garofalakis) - The Oil Searching
Problem
ESA 2009 (with K. Onak and R. Panigrahy) - Annotations in Data
Streams
ICALP 2009 (with A. Chakrabarti and G. Cormode). - Sampling to
Estimate Conditional Functional Dependencies
SIGMOD 2009 (with G. Cormode, L. Golab, F. Korn, D. Srivastava, and X. Zhang) - Finding Metric
Structure in Information Theoretic Clustering
COLT 2008 (with K. Chaudhuri) - Tight Lower Bounds for
Multi-Pass Stream Computation via Pass Elimination
ICALP 2008 (with S. Guha) - Approximation
Algorithms for Clustering Uncertain Data
PODS 2008 (with G. Cormode) - Robust Lower Bounds
for Communication and Stream Computation
STOC 2008 (with A. Chakrabarti and G. Cormode) Slides - Sorting and
Selection with Random Costs
LATIN 2008 (with S. Angelov and K. Kunal) - Declaring
Independence via the Sketching of Sketches
SODA 2008 (with P. Indyk) - On the Hardness of
Approximating Stopping and Trapping Sets in LDPC
Codes
ITW 2007 (with O. Milenkovic) - Lower Bounds for
Quantile Estimation in Random-Order and Multi-Pass
Streaming
ICALP 2007 (with S. Guha) Slides - Checking and
Spot-Checking of Heaps
ICALP 2007 (with M. Chu and S. Kannan) Slides - Sketching
Information Divergences
COLT 2007 (with S. Guha and P. Indyk)
Invited to Special Issue of Machine Learning Journal - Estimating
Statistical Aggregates on Probabilistic Data Streams
PODS 2007 (with T. S. Jayram, S. Muthukrishnan, and E. Vee)
Invited to Special Issue of ACM Transactions on Database Systems - Space-Efficient
Sampling
AISTATS 2007 (with S. Guha) - Island Hopping and
Path Colouring with Applications to WDM Network
Design
SODA 2007 (with F.B. Shepherd) Slides - A Near-Optimal
Algorithm for Computing the Entropy of a Stream
SODA 2007 (with A. Chakrabarti and G. Cormode) Slides - Spatial Scan
Statistics: Approximations and Performance Study
KDD 2006 (with D. Agarwal, J. Phillips, S. Venkatasubramanian and Z. Zhu) - Approximate
Quantiles and the Order of the Stream
PODS 2006 (with S. Guha) - Streaming and
Sublinear Approximation of Entropy and Information
Distances
SODA 2006 (with S. Guha and S. Venkatasubramanian) - More on
Reconstructing Strings from Random Traces: Insertions and
Deletions
ISIT 2005 (with S. Kannan) Slides - Approximating the
Best-Fit Tree Under L_p Norms
Approx 2005 (with B. Harb and S. Kannan) Slides - Finding Matchings
in the Streaming Model
Approx 2005 Slides - Graph
Distances in the Streaming Model: The Value of Space
SODA 2005. (with J. Feigenbaum, S. Kannan, S. Suri, and J. Zhang) Slides - A Problem in
Scheduling: Your Time Starts Now...
FUN with Algorithms 2004 - List decoding of
concatenated codes: improved performance estimates
ISIT 2004 (with A. Barg) Slides - On Graph Problems in a
Semi-Streaming Model
ICALP 2004 (with J. Feigenbaum, S. Kannan, S. Suri and J. Zhang)
Invited to Special Issue of Theoretical Computer Science - Reconstructing
Strings from Random Traces
SODA 2004 (with T. Batu, S. Kannan and S. Khanna) - More on the
Reliability Function of the Binary Symmetric Channel
ISIT 2003 (with A. Barg) Slides - Distance
distribution of binary codes and the error probability of
decoding
WCC 2003 International Workshop on Coding and Cryptography (with A. Barg) - Physical Modeling
of Musical Instruments using Bond Graphs
Brazilian Symposium on Computer Music 1999 (with E. Miranda and P. Gawthrop)
- Better
Bounds for Frequency Moments in Random-Order Streams
Manuscript (with A. Andoni, K. Onak and R. Panigrahy) - Graph Mining
in Streams
Entry for the Encyclopedia of Database Systems - Processing Data
Streams
Ph.D. Thesis (Advisor: Sampath Kannan) - Open
Questions In Data Streams and Related Topics
Questions arising at the IITK Workshop on Algorithms for Data Streams 2006.
Co-authors: Deepak Agarwal, Alexandr Andoni, Stan Angelov, Sasha Barg, Tugkan Batu, Amit Chakrabarti, Kamalika Chaudhuri, Steve Chien, Matthew Chu, Graham Cormode, Antonios Deligiannakis, Joan Feigenbaum, Minos Garofalakis, Peter Gawtrop, Lukasz Golab, Sudipto Guha, Boulos Harb, Piotr Indyk, T. S. Jayram, Sampath Kannan, Sanjeev Khanna, Flip Korn, Keshav Kunal, Katrina Ligett, Olgica Milenkovic, Eduardo Miranda, S. Muthukrishnan, Krzysztof Onak, Rina Panigrahy, Jeff Phillips, Bruce Shepherd, Divesh Srivastava, Sid Suri, Erik Vee, Suresh Venkatasubramanian, Jian Zhang, Xi Zhang, and Zhengyuan Zhu.
NB: Since most of the papers above are published, the copyright has been transferred to the respective publishers. Therefore, the papers cannot be duplicated for commercial purposes. See below for ACM's copyright notice.
Copyright © 200x by the Association for Computing
Machinery, Inc. Permission to make digital or hard copies
of part or all of this work for personal or classroom use
is granted without fee provided that copies are not made or
distributed for profit or commercial advantage and that new
copies bear this notice and the full citation on the first
page. Copyrights for components of this work owned by
others than ACM must be honored. Abstracting with credit is
permitted.