UNIT 1:
Origins and challenges of NLP
Language Modelling: Grammar-based LM, Statistical LM
Word and Sentence Segmentation
Detecting and Correcting Spelling Errors
Edit Distance – Weighted Edit Distance
Dynamic Programming Edit Distance
UNIT 2:
N-grams – Computing unigram, bigram, trigram probabilities
Interpolation and Backoff
Word Classes, Part-of-Speech Tagging
Rule-based, Stochastic and Transformation-based tagging
Hidden Markov and Maximum Entropy models
Conditional random fields (CRF)
UNIT 3:
Grammar rules for English, Treebanks
Syntactic Parsing, Ambiguity
Dynamic Programming parsing
Shallow parsing – Probabilistic CFG
Probabilistic CYK, Probabilistic Lexicalized CFGs
Feature structures, Unification of feature structures
UNIT 4:
Introduction to lexical semantics
Syntax-Driven Semantic analysis
Word Senses, Relations between Senses