Ling 317 Introduction to Computational Linguistics

Dr. Nick Danis, nsdanis@wustl.edu

Description

This course introduces the computational tools, both practical and theoretical, to describe and analyze natural language. We will touch upon most of the major aspects of the field of computational linguistics and natural language processing, such as analysis of raw corpora; modeling phonological, morphological, & syntactic patterns; and learnability. We will use Python 3 throughout the course. This course is an introduction to these topics, and therefore no programming experience is required. Students, however, are encouraged to complete optional exercises and readings on basic programming techniques to aid in the completion of assignments and discussions. Prerequisite: L44 Ling 170D and either L44 Ling 258 or CSE 131.

Course Info

Course Number L44 Ling 317
Semester Spring 2024
Time MW 1:00P-2:20P
Location Eads 014
Office January 206
Office Hours TBD
Homepage https://wustl.instructure.com/courses/127686

Goals

  1. Learn practical, computational tools (e.g. Python NLTK) to manipulate natural language data
  2. Understand the computational properties of natural language, independent of whether it is being computed by machines or humans (i.e. treating grammars as purely mathematical objects)
  3. Apply these techniques to solve problems in phonology, morphology, and syntax

Required Materials

There are no required purchases for this course. Any readings will be made available online. Additionally, all software necessary is free as well. This course assumes you have (or have access to) a working computer with a desktop operating system (macOS, Windows, Linux, etc.). For those students working primarily off a Chromebook or iOS/Android device, there are cloud options available. More details are given on Canvas.

Readings

A majority of the readings will come from the following source:

  • Steven Bird et al. (2014). Natural Language Processing with Python. 2nd ed. URL: http://www.nltk.org/book/

The entire second edition is available online at the link above; do not purchase a physical copy as the printed edition is the outdated first edition.

Software

All course material that is not a PDF is distributed in a Jupyter notebook for Python 3. If you are unsure what to install, or what this means, we will go over it the first week of class. There is detailed instructions on the Canvas site as well. If you have some experience, and have Python 3 installed, you can get a headstart with pip install jupyter.

Grade

The grade breakdown is shown below.

Category Weight
Participation & Discussion 15%
Weekly Assignments* 60%
Final Project 25%

*The lowest grade is dropped.

Participation entails completing & uploading the in-class notebooks to Canvas once completed, and regular engagement on the course Piazza & in-class discussions.

Letter grades

Letter grades are assigned based off the following scale. Numerical grades are not rounded.

  • 100 ≥ A+ ≥ 98
  • 98 > A ≥ 93
  • 93 > A- ≥ 90
  • 90 > B+ ≥ 87
  • 87 > B ≥ 83
  • 83 > B- ≥ 80
  • 80 > C+ ≥ 77
  • 77 > C ≥ 73
  • 73 > C- ≥ 70
  • 70 > D+ ≥ 67
  • 67 > D ≥ 63
  • 63 > D- ≥ 60

If you are taking this class pass/fail, you must receive at least a C- (70%) to pass.

If you believe there has been an error in grading, I am happy to discuss it with you. However, you must bring it up to me within one week of the graded assignment being returned to you. After this, the grade is considered final.

Schedule

The exact schedule is likely to change as the semester progresses. Please see Canvas for all up-to-date readings and assignment due dates.

Week Topic
1 introduction
3 regex
4 corpora
5 text processing
6 tf-idf
7 n-grams
8 language models
9 spring break
10 finite state machines
11 finite state transducers
12 pushdown automata
13 context-free grammars
13 machine learning/ai
14 presentations
15 presentations

Academic Integrity

This course adheres to the university’s Academic Integrity Policy, and takes cheating and plagiarism very seriously. All work completed online must be done alone unless instructed otherwise, and no resources not approved by the instructor may be used during exams.

ADA Compliance

Washington University is committed to providing accommodations and/or services to students with documented disabilities. Students who are seeking support for a disability or a suspected disability should contact Disability Resources at 935-4153. Disability Resources is responsible for approving all disability-related accommodations for WU students, and students are responsible for providing faculty members with formal documentation of their approved accommodations at least two weeks prior to using those accommodations. I will accept Disability Resources VISA forms by email and personal delivery. If you have already been approved for accommodations, I request that you provide me with a copy of your VISA within the first two weeks of the semester. Please see more information at http://cornerstone.wustl.edu.

Sexual Assault Resources

The University is committed to offering reasonable academic accommodations (e.g., no contact order, course changes) to students who are victims of relationship or sexual violence, regardless of whether they seek criminal or disciplinary action. If you need to request such accommodations, please contact the Relationship and Sexual Violence Prevention Center (RSVP) at rsvpcenter@wustl.edu or 314-935-3445 to schedule an appointment with an RSVP confidential, licensed counselor. Information shared with counselors is confidential. However, requests for accommodations will be coordinated with the appropriate University administrators and faculty. Please see more information at https://students.wustl.edu/relationship-sexual-violence-prevention-center.