CMSC723/LING723 Computational Linguistics I
Instructor: Saif Mohammad
Co-instructor: Nitin Madnani
Course co-ordinator: Bonnie Dorr
Class:
Wednesdays, 4 to 6:30pm, Computer Science Instructional Center (CSIC) Room 3120
Text:
Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition, second edition (published in 2008), by Daniel Jurafsky and James H. Martin.
Guest lectures:
by Bonnie Dorr, Philip Resnik, and Doug Oard.
Overview:
The lectures in this course will cover topics in four broad areas of Computational Linguistics (words and morphology; syntax; semantics; and pragmatics) and some specific applications.
September 3: INTRODUCTIONS and OVERVIEW by Saif and Bonnie
Reading: Chapter 1 of J&M; Sections 1.3 and 1.4 from this chapter in Foundations of Statistical Natural Language Processing.
- administrivia
- semester plan
- overview of NLP, by Bonnie
- introduction to statistical NLP, by Saif
Lecture notes: Course details and Introduction to statistical NLP, Overview of Computational Linguistics
Assignment 0 posted (not for credit)
September 10: NUTS and BOLTS by Nitin
Reading: NLTK Book (Chapters 1 and 2); Python Beginners' Guide (includes resources for both programmers and non-programmers); ACM article on Getting Started on NLP with Python.
- Introduction to Python and NLTK
Lecture notes: Introduction to Python and NLTK
Assignment 0: finish Q1 before class.
September 17: WORDS by Saif
Reading: Chapter 2, Section 2.2 onwards; Chapter 3.
Lecture notes: FSA, FST, morphology
- regular expressions
- finite state automata
- morphology
- finite state transducers
Assignment 0: turn in solutions to both Q1 and Q2
Assignment 1 (on finite state automata) posted
September 24: WORDS by Saif and Nitin
Reading: Chapter 5.1-5.4 and 5.6
Lecture notes: POS tagging
- finish morphology and FSTs
- part-of-speech tagging
- introduction to HMMs
October 1: WORDS by Nitin
Reading: Chapter 5.5, 6.1-6.4
- hidden Markov models (HMMs)
- expectation-maximization and HMM training
Lecture notes: HMM and EM
Assignment 1 is due; Assignment 2 (on POS-tagging and HMM) is posted
October 8: SYNTAX by Bonnie
Reading: This week's and next week's lectures will draw from subsets of the slides at these two URLs:
http://www.cs.colorado.edu/~martin/SLP/Slides/slp12.pdf
http://www.cs.colorado.edu/~martin/SLP/Slides/slp13.pdf
- context-free grammars (CFG)
- linguistic phenomena
Lecture notes: Syntax Ia, Syntax Ib
October 15: SYNTAX by Bonnie and Nitin; Midterm Review by Saif and Nitin
Reading: This week's and last week's lectures draw from subsets of the slides at these two URLs:
http://www.cs.colorado.edu/~martin/SLP/Slides/slp12.pdf
http://www.cs.colorado.edu/~martin/SLP/Slides/slp13.pdf
TAG: Pages 1-13 and 27-33 (Section 8) of Aravind Joshi and Yves Schabes, Tree-Adjoining Grammars, in Handbook of Formal Languages, G. Rozenberg and A. Salomaa (eds.), Vol. 3, Springer, Berlin, New York, 1997, 69-124.
CCG: New Ch 12 (Section 12.7); Mark Steedman, Categorial Grammar (tutorial overview), Lingua, 90:221--258, 1993; and Shieber et al. (1995) section 4.
- context-free parsing: CYK, Earley by Bonnie
- tree adjoining grammars by Nitin
- midterm review by Saif and Nitin
Lecture notes: Syntax II
Lecture notes: CCG and TAG
Assignment 2 is due
October 22: MIDTERM
Midterm review: These questions are to help you focus your preparation for the midterm. This is not a substitute for the readings for all the classes so far.
This is a two-hour in-class exam.
October 29: WORDS by Nitin
Reading: Chapter 4 (Sections 4.1--4.7; 4.9.1)
- N-gram language models
Lecture notes: N-grams
Assignment 3 (on n-grams and smoothing) is posted
November 5: SEMANTICS by Saif
Reading: Chapter 19 (Sections 19.1--19.3); Chapter 20 (Sections 20.1--20.5)
- representing meaning
- word senses
- word sense disambiguation
  - supervised
  - unsupervised
  - semi-supervised
Lecture notes: WSD
November 12: SEMANTICS by Saif
Reading: Chapter 20 (Sections 20.6--20.9)
- lexical semantic relations
- semantic distance
  - WordNet-based measures
  - distributional measures
  - hybrid measures
- semantic role labeling
Assignment 3 is due; Assignment 4 (on WordNet and semantic distance) is posted
November 19: APPLICATIONS
- machine translation
November 26: APPLICATIONS by Saif
- clustering
- first- and second-order co-occurrences
- singular value decomposition
Assignment 4 is due
December 3: APPLICATIONS
- information retrieval
- speech
December 10: APPLICATIONS and WRAP-UP by Saif
- text summarization
- exam overview
- wrap-up
FINAL EXAM: December 17, 2008, 4 to 6 pm
Last updated: October 2008