CATT Short Course on Web Search Engines and Web Data Mining


Below are electronic versions of the slides used in the one-day CATT Short Course on Web Search Engines and Web Data Mining, held on June 11, 2004, at Polytechnic University.

Presentation Slides:

Part I
(introduction, web and search basics, intro to information retrieval, historical developments)
Part II
(search engine architecture, crawling basics, indexing, query processing, ranking, basic link-ranking)
Part III
(search tools and their architectures, software tools, database support for IR and search, web mining application scenarios, search engine manipulation)
Part IV - PRELIMINARY VERSION
(advanced crawling, refresh and focused crawling, optimized query processing, parallel architectures, advanced link ranking)

A somewhat outdated list of web resources and pointers to the literature can be found here.

Please send mail to suel@poly.edu for inquiries and feedback. If you would like to use some of the material in one of your classes, please drop me a line letting me know about the course and topic.