TR-CIS-2003-01 (06/20/2003)
Torsten Suel, Chandan Mathur, Jo-Wen Wu, Jiangong Zhang,
Alex Delis, Mehdi Kharrazi, Xiaohui Long, Kulesh Shanmugasundaram
Abstract
We consider the problem of building a P2P-based search engine for massive
document collections. We describe a prototype system called ODISSEA
(Open DIStributed Search Engine Architecture) that is currently under
development in our group. ODISSEA provides a highly distributed global indexing
and query execution service that can be used for content residing inside
or outside of a P2P network. ODISSEA is different from many other approaches
to P2P search in that it assumes a two-tier search engine architecture and
a global index structure distributed over the nodes of the system.
We give an overview of the proposed system and discuss the basic design
choices. Our main focus is on efficient query execution, and we discuss
how recent work on top-k queries in the database community can be applied
in a highly distributed environment. We also give preliminary simulation
results, based on a real search engine log and a terabyte-size web page
collection, that indicate good scalability for our approach.
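
To make the idea of top-k query execution concrete, the sketch below shows one well-known technique from the database literature, Fagin's threshold algorithm, applied to per-term posting lists sorted by score. The abstract does not state which top-k method ODISSEA uses, so this is an illustration under assumptions rather than the system's implementation: the function name `threshold_top_k`, the in-memory list layout, and the use of a plain sum as the aggregate score are all hypothetical. In a distributed setting each list would reside on a different node, and stopping early would reduce the amount of posting data sent over the network.

```python
import heapq


def threshold_top_k(lists, k):
    """Return the k documents with the highest summed score.

    lists maps each query term to its postings, a list of (doc_id, score)
    pairs sorted by score in descending order. A document missing from a
    list is assumed to score 0 for that term (an assumption of this sketch).
    """
    # Random-access tables: term -> {doc_id: score}.
    lookup = {term: dict(postings) for term, postings in lists.items()}
    seen = {}  # doc_id -> full aggregate score
    max_depth = max((len(p) for p in lists.values()), default=0)

    for depth in range(max_depth):
        threshold = 0.0
        for term, postings in lists.items():
            if depth >= len(postings):
                continue  # exhausted list: unseen documents score 0 here
            doc_id, score = postings[depth]
            threshold += score
            if doc_id not in seen:
                # Random access into every list to get the full score.
                seen[doc_id] = sum(lookup[t].get(doc_id, 0.0) for t in lists)

        # No unseen document can score above the threshold, so once k seen
        # documents reach it, the remaining postings need not be read.
        top = heapq.nlargest(k, seen.items(), key=lambda item: item[1])
        if len(top) >= k and top[-1][1] >= threshold:
            return top

    return heapq.nlargest(k, seen.items(), key=lambda item: item[1])


if __name__ == "__main__":
    # Made-up scores for two query terms, purely for illustration.
    postings = {
        "p2p": [("d1", 0.75), ("d2", 0.5), ("d3", 0.125)],
        "search": [("d2", 0.75), ("d3", 0.375), ("d1", 0.25)],
    }
    print(threshold_top_k(postings, 2))
```

In this toy input the scan stops after two of the three positions in each list, because the two best documents seen so far already score at least as much as the threshold.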