Introduction- Basics of Information Retrieval and Introduction to Search Engines. Boolean Retrieval- Boolean queries, Building simple indexes, Processing Boolean queries. Term Vocabulary and Posting Lists- Choosing document units, Selection of terms, Skip lists, Positional postings and Phrase queries.
Data structures for dictionaries, Wildcard queries, Permuterm and K-gram indexes, Spelling correction, Phonetic correction. Index Construction- Single-pass scheme, Distributed indexing, MapReduce, Dynamic indexing. Index Compression- Statistical properties of terms,
Dictionary compression, Postings file compression, Variable byte codes, Gamma codes. Vector Space Model- Parametric and zone indexes, Learning weights, Term frequency and weighting, Tf-Idf weighting, Vector space model for scoring, Variant tf-idf functions.
Computing Scores in a Complete Search System- Efficient scoring and ranking, Inexact retrieval, Champion lists, Impact ordering, Cluster pruning, Tiered indexes, Query term proximity. Evaluation in Information Retrieval- Standard test collections, Unranked retrieval sets, Ranked retrieval results, Assessing relevance, Relevance feedback.
Probabilistic Information Retrieval- Review of basic probability theory, Probability ranking principle, Binary independence model, Probability estimates. Text Classification- Rocchio classifier, K-Nearest neighbor classifier, Linear and nonlinear classifiers. Text Clustering- Clustering in information retrieval, Evaluation of clustering.
Reference Book:
R. Baeza-Yates and B. Ribeiro-Neto, Modern Information Retrieval, Pearson Education, 1999.
Text Book:
C. D. Manning, P. Raghavan, and H. Schütze, An Introduction to Information Retrieval, Cambridge University Press, 2009.