Group Publications:

  • Cleaning Search Results using Term Distance Features. With J. Attenberg. 4th Workshop on Adversarial Information Retrieval on the Web (in conjunction with WWW), April 2008. PDF
  • Geographic Web Usage Estimation by Monitoring DNS Caches. With H. Akcan and H. Broennimann. 1st International Workshop on Location and the Web (in conjunction with WWW), April 2008. PDF
  • Analysis of Geographic Queries in a Search Engine Log. With Q. Gan, J. Attenberg, and A. Markowetz. 1st International Workshop on Location and the Web (in conjunction with WWW), April 2008. PDF
  • Using Graphics Processors for High-Performance IR Query Processing. With S. Ding, J. He, and H. Yan. 17th International World Wide Web Conference (WWW), Poster Session, April 2008. PDF
  • Performance of Compressed Inverted List Caching in Search Engines. With J. Zhang and X. Long. 17th International World Wide Web Conference (WWW), April 2008. PDF
  • Algorithms for Low-Latency Remote File Synchronization. With H. Yan and U. Irmak. IEEE Infocom Conference, April 2008. PDF
  • Improving Web Spam Classifiers Using Link Structure. With Q. Gan. 3rd Workshop on Adversarial Information Retrieval on the Web (held in conjunction with WWW), May 2007. PDF
  • Efficient Search in Large Textual Collections with Redundancy. With J. Zhang. 16th International World Wide Web Conference (WWW), May 2007. PDF
  • Optimized Inverted List Assignment in Distributed Search Engine Architectures. With J. Zhang. 21st IEEE International Parallel & Distributed Processing Symposium (IPDPS'07), March 2007. PDF
  • Efficient Query Subscription Processing for Prospective Search Engines. With U. Irmak, S. Mihaylov, S. Ganguly, and R. Izmailov. USENIX Annual Technical Conference, May 2006. PDF
  • Efficient Query Processing in Geographic Web Search Engines. With Y. Chen and A. Markowetz. ACM International Conference on Management of Data (SIGMOD), June 2006. PDF
  • Approximate Maximum Weighted Branchings. With A. Bagchi and A. Bhargava. Information Processing Letters, accepted for publication, February 2006. PDF (preliminary version)
  • Interactive Wrapper Generation with Minimal User Effort. With U. Irmak. 15th International World Wide Web Conference (WWW), May 2006. PDF
  • Efficient Query Evaluation on Large Textual Collections in a Peer-to-Peer Environment. With J. Zhang. 5th IEEE International Conference on Peer-to-Peer Computing, August 2005. PDF
  • Design and Implementation of a Geographic Search Engine. With A. Markowetz, Y. Chen, X. Long, and B. Seeger. 8th International Workshop on the Web and Databases (WebDB), June 2005. PDF (Note: an extended version is available as Technical Report TR-CIS-2005-03, Polytechnic University, February 2005. PDF)
  • Hierarchical Substring Caching for Efficient Content Distribution to Low-Bandwidth Clients. With U. Irmak. 14th International World Wide Web Conference (WWW), May 2005. PDF
  • Three-Level Caching for Efficient Query Processing in Large Web Search Engines. With X. Long. 14th International World Wide Web Conference (WWW), May 2005. PDF
  • Improved Single-Round Protocols for Remote File Synchronization. With U. Irmak and S. Mihaylov. IEEE Infocom Conference, March 2005. PDF (Note: an earlier version with some of the results appeared at the 4th New York Metro Area Networking Workshop (NYMAN), September 2004.)
  • Optimal Peer Selection for P2P Downloading and Streaming. With M. Adler, R. Kumar, K. Ross, D. Rubenstein, and D. Yao. IEEE Infocom Conference, March 2005. PDF
  • The Perron-Frobenius Theorem and Some of its Applications. With U. Pillai and S. Cha. IEEE Signal Processing Magazine 2, 2005, pp. 62-75.
  • Approximation Algorithms for Array Partitioning Problems. With S. Muthukrishnan. Journal of Algorithms 54, 2005, pp. 85-104. PDF
  • Local Methods for Estimating PageRank Values. With Y. Chen and Q. Gan. 13th Conference on Information and Knowledge Management (CIKM), November 2004. PDF (Note: an earlier version appeared at the 3rd Workshop on Web Dynamics in conjunction with WWW 2004.)
  • Compressing File Collections with a TSP-Based Approach. With D. Trendafilov and N. Memon. Technical Report TR-CIS-2004-02, Polytechnic University, April 2004. PDF
  • Improved File Synchronization Techniques for Maintaining Large Replicated Collections over Slow Networks. With P. Noel and D. Trendafilov. IEEE International Conference on Data Engineering (ICDE), March 2004. PDF (Talk: PPT PDF, 6 per page)
  • Server-Friendly Delta Compression for Efficient Web Access. With A. Savant. Eighth International Workshop on Web Content Caching and Distribution (WCW), September 2003. PDF (Talk: PPT PDF, 6 per page)
  • Optimized Query Execution in Large Search Engines with Global Page Ordering. With X. Long. International Conference on Very Large Data Bases (VLDB), September 2003. PDF (Talk: PPT PDF, 6 per page)
  • ODISSEA: A Peer-to-Peer Architecture for Scalable Web Search and Information Retrieval. With C. Mathur, J. Wu, J. Zhang, A. Delis, M. Kharrazi, X. Long, and K. Shanmugasundaram. 6th International Workshop on the Web and Databases (WebDB), June 2003. PDF (Talk: PPT PDF, 6 per page)
    Technical Report (23 pages): PDF
    WWW 2003 Poster Version (2 pages): PDF
  • On the Scalability of an Image Transcoding Proxy Server. With A. Savant and N. Memon. International Conference on Image Processing, September 2003. PDF
  • Inferring Tree Topologies Using Flow Tests. With S. Muthukrishnan and R. Vingralek. SIAM Symposium on Discrete Algorithms (SODA), January 2003 (short paper). PDF
  • Cluster-Based Delta Compression of Collections of Files. With Z. Ouyang, N. Memon, and D. Trendafilov. International Conference on Web Information Systems Engineering (WISE), December 2002. PDF (Talk: PPT PDF, 6 per page)
  • I/O-Efficient Techniques for Computing Pagerank. With Y. Chen and Q. Gan. ACM Conference on Information and Knowledge Management (CIKM), November 2002. PDF
  • An Efficient Distributed Algorithm for Constructing Small Dominating Sets. With L. Jia and R. Rajaraman. Distributed Computing 14, 2002, pp. 193-205. (Note: an earlier version appeared at the 20th ACM Symposium on Principles of Distributed Computing (PODC), August 2001. PDF)
  • zdelta: An Efficient Delta Compression Tool. With D. Trendafilov and N. Memon. Technical Report TR-CIS-2002-02, Polytechnic University, June 2002. PDF
  • Algorithms for Delta Compression and Remote File Synchronization. With N. Memon. Invited chapter in Handbook of Lossless Compression, edited by K. Sayood, Academic Press, August 2002. PDF (preliminary version)
  • Design and Implementation of a High-Performance Distributed Web Crawler. With V. Shkapenyuk. IEEE International Conference on Data Engineering (ICDE), February 2002. Postscript PDF (Talk: PPT PDF, 6 per page)