Databases and Information Retrieval
(Profs. Delis, Hellerstein, Memon, Suel)
The fourth major research concentration in the department is concerned
with the management, querying and analysis of large data sets, and
includes the areas of database systems, data mining, information
retrieval, and web search and exploration. Work is performed in
several labs and research groups, with emphasis on algorithmic and
architectural issues.
Client-Server Databases: Prof. Delis and his students in the
Database Systems Lab are working on architectural and performance
issues for client-server databases. Most modern databases are
organized as variants of the client-server model, where a
number of clients (e.g., PCs) interact with one or more servers that
use database engines to retrieve data and serve it to the clients.
Prof. Delis' work, supported by an NSF CAREER Award, focuses on
performance issues in such architectures, where a naive implementation
quickly leads to a performance bottleneck at the server. He has
studied the scalability of the standard two-tier client-server model,
and has proposed a three-tier model that employs a number of
optimization techniques, including caching, prefetching, and client
clustering, to scale to larger numbers of clients.
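As a rough illustration of why caching in a middle tier relieves the server, the following minimal sketch places a least-recently-used (LRU) cache between clients and a server. It is not Prof. Delis' design; the class name CachingMiddleTier, the fetch_from_server callback, and the server_requests counter are hypothetical names chosen for this example.

```python
from collections import OrderedDict


class CachingMiddleTier:
    """Hypothetical middle tier that answers repeated requests from an LRU cache."""

    def __init__(self, fetch_from_server, capacity):
        self._fetch = fetch_from_server  # assumed callback: key -> value
        self._capacity = capacity        # maximum number of cached items
        self._cache = OrderedDict()      # keys ordered from least to most recently used
        self.server_requests = 0         # how many requests actually reached the server

    def get(self, key):
        if key in self._cache:
            # Cache hit: mark the item as most recently used and skip the server.
            self._cache.move_to_end(key)
            return self._cache[key]
        # Cache miss: forward the request to the server and remember the answer.
        self.server_requests += 1
        value = self._fetch(key)
        self._cache[key] = value
        if len(self._cache) > self._capacity:
            # Evict the least recently used item to stay within capacity.
            self._cache.popitem(last=False)
        return value
```

Every request answered from the cache is one the server never sees, which is the effect such a tier relies on as the number of clients grows. Prefetching and client clustering are not modeled here.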
Query Processing and Optimization: Database systems must
efficiently process complex queries over large amounts of data.
To achieve this, systems use a variety of techniques,
such as highly optimized index structures for accessing the data and
query optimization for finding the best way to execute a query.
Several faculty members are working on new techniques in this
area, including index structures, cost estimation techniques,
approximate query answers, and efficient operations in spatial
databases. In particular, Prof. Delis has studied the performance
characteristics of common index structures for disk- and
memory-resident data, and the efficient implementation of temporal
query operations. Prof. Hellerstein is working on new techniques
for selectivity and cost estimation of database queries, and has
worked on efficient coding schemes for parallel disk architectures
(RAID) and methods for generating random range queries. Prof. Suel is
working on problems in selectivity estimation, data partitioning,
sampling and approximation techniques for query results, and
query processing in spatial databases.
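To make the notion of selectivity estimation concrete, here is a minimal sketch of one textbook approach: an equal-width histogram used to estimate the fraction of rows matched by a range predicate. It is a generic illustration, not the method developed by the faculty named above; the function names and the assumption that values are uniformly distributed within each bucket are choices made for this example.

```python
def build_histogram(values, num_buckets, lo, hi):
    """Count values per bucket, using num_buckets equal-width buckets over [lo, hi)."""
    width = (hi - lo) / num_buckets
    counts = [0] * num_buckets
    for v in values:
        # Clamp so that rounding at the upper edge cannot index past the last bucket.
        idx = min(int((v - lo) / width), num_buckets - 1)
        counts[idx] += 1
    return counts, width


def estimate_selectivity(counts, width, lo, a, b):
    """Estimate the fraction of rows with a <= value < b.

    Assumes values are spread uniformly inside each bucket, so a bucket
    contributes in proportion to how much of it the query range covers.
    """
    total = sum(counts)
    if total == 0:
        return 0.0
    estimated_rows = 0.0
    for i, count in enumerate(counts):
        bucket_lo = lo + i * width
        bucket_hi = bucket_lo + width
        overlap = max(0.0, min(b, bucket_hi) - max(a, bucket_lo))
        estimated_rows += count * overlap / width
    return estimated_rows / total
```

A query optimizer can use such an estimate to compare candidate plans without scanning the data. The estimate is only as good as the uniformity assumption, which is one reason more refined estimation techniques remain a research topic.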
Intelligent Information Retrieval and Text Mining:
Prof. Hellerstein is working on problems in intelligent information
retrieval, such as learning to automatically categorize documents by
topic, and learning to extract information from documents. Her work
in this area, supported by a grant from the National Science
Foundation, focuses on learning-based approaches grounded in
fundamental results from Computational Learning Theory. Prof.
Hellerstein is also leading a reading group of faculty and students
focusing on current developments in information retrieval and
machine learning.
Web Search and Analysis: One of the most fundamental problems
facing the World Wide Web is how to efficiently find the desired
information among the more than one billion currently accessible web
pages. A large amount of industrial and academic work over the last
few years has focused on this problem, and powerful search engines
(such as AltaVista and Google) have been built using massive amounts
of hardware. However, the basic search problem is far from resolved,
and new challenges arise constantly as the web evolves.
Prof. Suel and his students are working on techniques for improving
the efficiency of web search. Besides improving the quality of the
search results, it is also important to improve computing and
storage efficiency in order to keep up with the growth of the web
and allow deployment on more modest hardware. Closely related
problems of interest include exploring and analyzing the structure
and properties of the web and efficiently storing and archiving
its content. Search engines need
to be able to store massive amounts of encountered pages, and many
techniques used by current search engines to rank results are based
on analyzing and exploiting the hyperlink structure of the web.
Prof. Suel's research in this area, performed in the recently opened
Web Exploration and Search Technology Lab (WestLab), looks at a number of
problems in this context, ranging from system building to formal
problems in this context, ranging from system building to formal
algorithm design and analysis, and includes the storage and
compression of large web page collections, efficient data
acquisition (crawling), analysis of the web graph and structure,
and support for powerful query operations on web archives. Parts
of this work are also performed in collaboration with Profs.
Delis, Hellerstein, and Memon.
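As an illustration of ranking based on hyperlink structure, the sketch below runs a simplified PageRank-style power iteration over a tiny link graph. It is a generic example rather than a description of WestLab's techniques; the function name, the damping value of 0.85, the fixed iteration count, and the requirement that every link target also appears as a key in the input are assumptions made for this example.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank-style scores for a graph given as {page: [linked pages]}.

    Assumes every link target is itself a key of `links`.
    """
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page receives a small baseline score regardless of its in-links.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page, outgoing in links.items():
            if outgoing:
                # A page splits its damped score evenly among the pages it links to.
                share = damping * rank[page] / len(outgoing)
                for target in outgoing:
                    new_rank[target] += share
            else:
                # A page without out-links spreads its damped score over all pages.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank
```

For example, with links = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}, page "c" ends up with the highest score because three pages link to it, while "d", which has no incoming links, ends up with the lowest.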