Note: there is no required textbook, but you might consider getting the first three books below, or at least one of them. They are all good books.
Strongly Recommended Books:
Introduction to Information Retrieval, by C. Manning, P. Raghavan, and H.
Schütze. A free online version of the book is available.
This is a new book on information retrieval that gives a good introduction
to parsing, indexing, ranking, and querying. Reading at least the first few
chapters for background is highly recommended. This course will not focus
much on advanced IR, but you still need to know at least the basics.
Mining the Web, by S. Chakrabarti, Morgan Kaufmann 2002.
The best book specializing in web search. Covers many topics in this course.
The course will not follow the book's organization, but reading it is
recommended for a more complete perspective. Good coverage of
link-based techniques, web exploration, and machine learning techniques, but
less emphasis on systems issues, web search architecture, query execution, etc.
Managing Gigabytes: Compressing and Indexing Documents and Images,
by I. Witten, A. Moffat, and T. Bell. Morgan Kaufmann 1999.
Very good book on compression, index construction, and query execution.
Contains about a third of the material covered in the course.
Python Essential Reference (4th Edition), by D. Beazley. Addison-Wesley, 2009.
There are many other books on Python, and you can also get a lot of
documentation at the official Python web site, but this is the most
concise book. Get the latest edition, read the introduction, and use the
rest as a reference.
Other Good Books Related to Course Topic:
Modern Information Retrieval, by R. Baeza-Yates and B. Ribeiro-Neto.
Addison-Wesley 1999.
This is a good book on "classical" information retrieval, with a few sections
on web-related problems towards the end. In the course, I will introduce
information retrieval techniques as needed, but we will not have time for
a broad introduction.
Search Engines: Information Retrieval in Practice, by B. Croft, D.
Metzler, and T. Strohman. Addison-Wesley 2009.
This is another very good information retrieval book, which just came out.
Foundations of Statistical Natural Language Processing, by C. Manning
and H. Schütze. MIT Press 1999.
Very good book on the related field of natural language processing,
with emphasis on statistical methods closely related to the perspective
taken in IR.