Syllabus

UNIT 1: Basics of IR

Introduction: Basics of Information Retrieval and Introduction to Search Engines. Boolean Retrieval: Boolean queries, Building simple indexes, Processing Boolean queries. Term Vocabulary and Postings Lists: Choosing document units, Selection of terms, Skip lists, Positional postings and Phrase queries.

UNIT 2: Dictionaries and Tolerant Retrieval

Data structures for dictionaries, Wildcard queries, Permuterm and k-gram indexes, Spelling correction, Phonetic correction. Index Construction: Single-pass scheme, Distributed indexing, MapReduce, Dynamic indexing. Index Compression: Statistical properties of terms.

UNIT 3: Compression

Dictionary compression, Postings file compression, Variable byte codes, Gamma codes. Vector Space Model: Parametric and zone indexes, Learning weights, Term frequency and weighting, tf-idf weighting, Vector space model for scoring, Variant tf-idf functions.

UNIT 4: Computing Scores in a Complete Search System

Computing Scores in a Complete Search System: Efficient scoring and ranking, Inexact retrieval, Champion lists, Impact ordering, Cluster pruning, Tiered indexes, Query term proximity. Evaluation in Information Retrieval: Standard test collections, Unranked retrieval sets, Ranked retrieval results, Assessing relevance, Relevance feedback.

UNIT 5: Probabilistic Information Retrieval

Probabilistic Information Retrieval: Review of basic probability theory, Probability ranking principle, Binary independence model, Probability estimates. Text Classification: Rocchio classifier, k-nearest neighbor classifier, Linear and nonlinear classifiers. Text Clustering: Clustering in information retrieval, Evaluation of clustering.

Reference Book:

R. Baeza-Yates and B. Ribeiro-Neto, Modern Information Retrieval, Pearson Education, 1999.

Text Book:

C. D. Manning, P. Raghavan, and H. Schütze, An Introduction to Information Retrieval, Cambridge University Press, 2009.

 
