Schedule
jasonbaldridge edited this page Mar 20, 2013 · 39 revisions
      
- Unless otherwise noted, assignment submissions are due by 1pm on the Friday of the week indicated. (Due dates are also stated in each homework description.)
 - The schedule is subject to change as the semester progresses; however, any changes will be made at least one week in advance of the dates affected.
 
- Topics
- NLP Overview
 - Scala Overview
 
 - Readings (1/16)
 
NOTE: No class on Monday, January 21 (Martin Luther King, Jr. Day)
- Topics
- Regular expressions
 - Scala: functional programming
 
 - Readings (1/23)
 - Due. Homework 1: Scala
 
- Topics
- Authorship attribution
 - Spelling correction
 - Vector-space models and computing similarity
 - Scala: object-oriented programming
 - Build systems
 
 - Readings (1/28)
 - Readings (1/30)
 - Due. Project Phase One
 
- Topics
- Clustering
 - Data formats (CSV, XML, JSON)
 
 - Readings (2/4)
- Basic XML processing with Scala
 - Manning et al., Chapter 16, Sections 16.1-16.4.
 
 - Readings (2/6)
- Processing JSON in Scala using Jerkson
 - Manning et al., Chapter 17, Sections 17.1-17.4.
 
 - Due. Homework 2: Regular expressions
 
- Topics
- Classification
 
 - Readings (2/11)
- Manning et al., Chapter 13, Sections 13.1, 13.2, 13.6.
 
 - Readings (2/13)
- Manning et al., Chapter 14, Sections 14.0, 14.1, 14.3-14.5.
 
 
- Topics
- Evaluation
 - Sentiment analysis
 
 - Due (2/18). Project Phase Two
 
- Topics
- Topic models
 
 - Due (2/25). Homework 3: Clustering
 
- Topics
- In-class exercise: Topic modeling
 
 - Due (3/8). Project Phase Three
 
Spring Break. March 11-16
- Topics
- Part-of-speech Tagging
 - Named entity recognition
 - Label propagation
 - In-class exercise: Chalk tutorial for sentence detection, tagging, and NER
 
 - Due. Homework 4: Classification
 
- Topics
- Deduplication
 - PageRank
 
 - Due. Project: step 4
 
- Topics
- Text analysis pipelines
 - Hadoop and Spark
 - Amazon Web Services
 
 - Due. Homework 5: Sentiment analysis
 
- Topics
- Geolocation
 - Spark
 
 - Due. Project: step 5
 
- Topics
- Streaming data
 
 - Due. Homework 6: Scaling, visualization
 
- Topics
- Parsing
 
 - Due. Project: step 6
 
- Topics
- Project demos
 - Wrap-up
 
 
- Due. Project code and write-up