Torsten Suel
Professor
Computer Science & Engineering
Polytechnic School of Engineering
New York University

I am a faculty member in the Department of Computer Science and Engineering at the School of Engineering of New York University. I received a Diplom degree from the Technical University of Braunschweig in Germany, and a Ph.D. from the University of Texas at Austin. Before joining the faculty in the Fall of 1998, I held postdoctoral positions at the NEC Research Institute and Bell Labs. I was on leave during 2008, working at Yahoo! Research in Santa Clara, CA.


COURSE OFFERINGS IN SPRING 2014:
CS308 - Introduction to Database Systems (course syllabus: PDF)
CS6913 - Web Search Engines (course syllabus: PDF)


RESEARCH:

My main research areas are web search technology, algorithms, databases, data compression, and distributed computation. I currently spend most of my time working with my graduate and undergraduate students in the Web Exploration and Search Technology Lab, who build web search engines and other search tools and experiment with new techniques and architectures. I also continue to do some work in algorithms and databases, and I am interested in data compression, networking, and distributed systems. For more information, visit the homepage of my research group or look at my complete list of papers.

One of my current main interests is search engine architecture, and particularly how to improve the efficiency of query processing in large engines, which have to process billions of queries per day over trillions of documents. This work is funded by the National Science Foundation under the grant "III-1117829: Efficient Query Processing in Large Search Engines" -- see here for the project homepage.

Student Research Opportunities: I usually have a few research topics available for highly motivated students who want to do research under my supervision, e.g., as part of a senior project or MS thesis. But please read this first before contacting me about research opportunities.


TEACHING:

I will be teaching CS6083 (Database Systems) in the Fall 2011 semester. Here are some of the courses I have taught at Poly in recent years:


CURRENT PHD STUDENTS:


PHD GRADUATES:
  • Josh Attenberg "Novel Techniques for Improving Classification Systems by Incorporating Experts", 2013. (Employment: Etsy)
  • Yen-Yu Chen "Geographic Search Engines", 2006. (Employment: Shanda)
  • Shuai Ding "Index Compression and Efficient Query Processing in Large Web Search Engines", 2013. (Employment: Google)
  • Qingqing Gan "Mining the Web to Improve Search Engine Performance", 2008. (Employment: Microsoft)
  • Jinru He "Indexing and Querying over Versioned Text", 2013. (Employment: Facebook)
  • Utku Irmak "Algorithms for Information Extraction and Dissemination on the World-Wide Web", 2006. (Employment: LinkedIn)
  • Xiaohui Long "Efficient Query Processing in Large Web Search Engines", 2006. (Employment: MSN Search)
  • Hao Yan "Index Compression and Redundancy Elimination in Large Textual Collections", 2010. (Employment: LinkedIn)
  • Jiangong Zhang "Indexing and Query Processing in Distributed Search Engines", 2008. (Employment: Amazon)


RECENT PAPERS:

  • Automated Decision Support for Human Tasks in a Collaborative System: The Case of Deletion in Wikipedia. B. Gelley and T. Suel. Proceedings of WikiSym, August 2013. (available soon) PDF

  • A Candidate Filtering Mechanism for Fast Top-K Query Processing on Modern CPUs. C. Dimopoulos, S. Nepomnyachiy, and T. Suel. 36th Annual ACM SIGIR Conference, July 2013. PDF

  • Optimizing Top-k Document Retrieval Strategies for Block-Max Indexes. C. Dimopoulos, S. Nepomnyachiy, and T. Suel. 6th ACM Conference on Web Search and Data Mining, February 2013. PDF

  • Optimizing Positional Index Structures for Versioned Document Collections. J. He and T. Suel. 35th Annual ACM SIGIR Conference, July 2012. PDF

  • To Index or not to Index: Time-Space Trade-Offs in Search Engines with Positional Ranking Functions. D. Arroyuelo, S. Gonzalez, M. Marin, M. Oyarzun, and T. Suel. 35th Annual ACM SIGIR Conference, July 2012. PDF

  • Text vs. Space: Efficient Geo-Search Query Processing. M. Christoforaki, J. He, C. Dimopoulos, A. Markowetz, and T. Suel. 20th ACM Conference on Information and Knowledge Management, October 2011. PDF

  • Scalable Manipulation of Archival Web Graphs. Y. Avcular and T. Suel. Workshop on Large-Scale and Distributed Systems for Information Retrieval, October 2011. PDF

  • Faster Temporal Range Queries over Versioned Text. J. He and T. Suel. 34th Annual ACM SIGIR Conference, July 2011. PDF

  • Faster Top-k Document Retrieval Using Block-Max Indexes. S. Ding and T. Suel. 34th Annual ACM SIGIR Conference, July 2011. PDF

  • Batch Query Processing for Web Search Engines. S. Ding, J. Attenberg, R. Baeza-Yates, and T. Suel. 4th ACM Conference on Web Search and Data Mining, February 2011. PDF

  • Improved Index Compression Techniques for Versioned Document Collections. With J. He and J. Zeng. 19th ACM Conference on Information and Knowledge Management, October 2010. PDF

  • Efficient Term Proximity Search with Term-Pair Indexes. With H. Yan, S. Shi, F. Zhang, and J. Wen. 19th ACM Conference on Information and Knowledge Management, October 2010. PDF

  • Scalable Techniques for Document Identifier Assignment in Inverted Indexes. With S. Ding and J. Attenberg. 19th International World Wide Web Conference (WWW), April 2010. PDF

  • Compact Full-Text Indexing of Versioned Document Collections. With J. He and H. Yan. 18th ACM Conference on Information and Knowledge Management, November 2009. PDF

  • Modeling and Predicting User Behavior in Sponsored Search. With J. Attenberg and S. Pandey. 15th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), June 2009. PDF

  • Compressing Term Positions in Web Indexes. With H. Yan and S. Ding. 32nd Annual ACM SIGIR Conference, June 2009. PDF

  • Using Graphics Processors for High-Performance IR Query Processing. With S. Ding, J. He, and H. Yan. 18th International World Wide Web Conference (WWW), April 2009. PDF [An earlier shorter version appeared as a poster at the 17th WWW, April 2008]

  • Inverted Index Compression and Query Processing with Optimized Document Ordering. With H. Yan and S. Ding. 18th International World Wide Web Conference (WWW), April 2009. PDF

  • Improved Techniques for Result Caching in Web Search Engines. With Q. Gan. 18th International World Wide Web Conference (WWW), April 2009. PDF

  • Top-k Aggregation Using Intersection of Ranked Inputs. With R. Kumar, K. Punera, and S. Vassilvitskii. Second ACM International Conference on Web Search and Data Mining (WSDM), February 2009. PDF

  • Cleaning Search Results using Term Distance Features. With J. Attenberg. 4th Workshop on Adversarial Information Retrieval on the Web (in conjunction with WWW), April 2008. PDF

  • Geographic Web Usage Estimation by Monitoring DNS Caches. With H. Akcan and H. Broennimann. 1st International Workshop on Location and the Web (in conjunction with WWW), April 2008. PDF

  • Analysis of Geographic Queries in a Search Engine Log. With Q. Gan, J. Attenberg, and A. Markowetz. 1st International Workshop on Location and the Web (in conjunction with WWW), April 2008. PDF

  • Performance of Compressed Inverted List Caching in Search Engines. With J. Zhang and X. Long. 17th International World Wide Web Conference (WWW), April 2008. PDF

  • Algorithms for Low-Latency Remote File Synchronization. With H. Yan and U. Irmak. IEEE Infocom Conference, April 2008. PDF

  • Improving Web Spam Classifiers Using Link Structure. With Q. Gan. 3rd Workshop on Adversarial Information Retrieval on the Web (held in conjunction with WWW), May 2007. PDF

  • Efficient Search in Large Textual Collections with Redundancy. With J. Zhang. 16th International World Wide Web Conference (WWW), May 2007. PDF

  • Optimized Inverted List Assignment in Distributed Search Engine Architectures. With J. Zhang. 21st IEEE International Parallel & Distributed Processing Symposium (IPDPS'07), March 2007. PDF

  • Efficient Query Subscription Processing for Prospective Search Engines. With U. Irmak, S. Mihaylov, S. Ganguly, and R. Izmailov. USENIX Annual Technical Conference, May 2006. PDF

  • Efficient Query Processing in Geographic Web Search Engines. With Y. Chen and A. Markowetz. ACM International Conference on Management of Data (SIGMOD), June 2006. PDF

  • Approximate Maximum Weighted Branchings. With A. Bagchi and A. Bhargava. Information Processing Letters, 99(2), 2006. PDF (preliminary version)

  • Interactive Wrapper Generation with Minimal User Effort. With U. Irmak. 15th International World Wide Web Conference (WWW), May 2006. PDF

  • Efficient Query Evaluation on Large Textual Collections in a Peer-to-Peer Environment. With J. Zhang. 5th IEEE International Conference on Peer-to-Peer Computing, August 2005. PDF

  • Design and Implementation of a Geographic Search Engine. With A. Markowetz, Y. Chen, X. Long, and B. Seeger. 8th International Workshop on the Web and Databases (WebDB), June 2005. PDF (Note: an extended version is available as Technical Report TR-CIS-2005-03, Polytechnic University, February 2005. PDF)

  • Hierarchical Substring Caching for Efficient Content Distribution to Low-Bandwidth Clients. With U. Irmak. 14th International World Wide Web Conference (WWW), May 2005. PDF

  • Three-Level Caching for Efficient Query Processing in Large Web Search Engines. With X. Long. 14th International World Wide Web Conference (WWW), May 2005. PDF

  • Improved Single-Round Protocols for Remote File Synchronization. With U. Irmak and S. Mihaylov. IEEE Infocom Conference, March 2005. PDF (Note: an earlier version with some of the results appeared at the 4th New York Metro Area Networking Workshop (NYMAN), September 2004.)

  • Optimal Peer Selection for P2P Downloading and Streaming. With M. Adler, R. Kumar, K. Ross, D. Rubenstein, and D. Yao. IEEE Infocom Conference, March 2005. PDF

  • The Perron-Frobenius Theorem and Some of its Applications. With U. Pillai and S. Cha. IEEE Signal Processing Magazine 2, 2005, pp. 62-75.

  • Approximation Algorithms for Array Partitioning Problems. With S. Muthukrishnan. Journal of Algorithms 54, 2005, pp. 85-104. PDF

  • Local Methods for Estimating PageRank Values. With Y. Chen and Q. Gan. 13th Conference on Information and Knowledge Management (CIKM), November 2004. PDF (Note: an earlier version appeared at the 3rd Workshop on Web Dynamics in conjunction with WWW 2004.)

  • Compressing File Collections with a TSP-Based Approach. With D. Trendafilov and N. Memon. Technical Report TR-CIS-2004-02, Polytechnic University, April 2004. PDF

  • Improved File Synchronization Techniques for Maintaining Large Replicated Collections over Slow Networks. With P. Noel and D. Trendafilov. IEEE International Conference on Data Engineering (ICDE), March 2004. PDF (Talk: PPT PDF, 6 per page)


CONTACT INFORMATION:

Office: 10.046 (2 MetroTech Center)
Phone: (718) 260 3354
Fax: (718) 260 3609
Email: suel (at) poly.edu
US Mail: CSE Department
Polytechnic Institute of NYU
6 MetroTech Center
Brooklyn, NY 11201