
Natural Language Processing

Academic Year: 2023/2024
Language of instruction: English
ECTS credits: 4
Delivered at: School of Fundamental and Applied Linguistics
Course type: Compulsory course
When: 3rd year, modules 3 and 4

Instructor


Садов Михаил Александрович

Course Syllabus

Abstract

The course is aimed at mastering the basics of natural language processing (NLP), a vibrant interdisciplinary field. It covers the methods and approaches used in many real-world NLP applications, such as language modeling, text classification, sentiment analysis, summarization, and machine translation. Students taking the course will not only use some of the existing NLP libraries and software packages, but also learn about the principles behind their design and about the mathematical models underlying modern computational linguistics. The course also involves completing practical programming assignments in Python and conducting experiments on texts written in English and Russian. Prerequisites are Python programming skills and general knowledge of linguistics.
Learning Objectives

  • As a result of mastering the discipline, the student will know the structural features of natural language texts and the principles of processing them computationally to obtain linguistic (morphological, syntactic, semantic) information;
  • As a result of mastering the discipline, the student will be familiar with the methods used to solve complex practical problems of natural language processing, in particular information retrieval, summarization, sentiment analysis, and machine translation;
  • The student will understand the limitations of existing computational models of natural language processing.
Expected Learning Outcomes

  • Is familiar with classical and modern approaches to machine translation
  • Is familiar with computational semantics, knows the basic approaches, and is able to calculate semantic similarity
  • Is familiar with different types of summarization and ways to assess summarization quality
  • Is familiar with the classification problem and approaches to it, and understands the naive Bayes classifier algorithm
  • Is familiar with the tagging problem, and knows the principle of hidden Markov models and the basic algorithm for implementing them
  • Knows how to preprocess text, knows the syntax of regular expressions, and is familiar with edit distance
  • Knows why language models are needed and how to create language models using n-grams
  • Understands the difference between basic classification metrics
  • Is familiar with dependency and constituency trees and context-free grammars, and knows the basic algorithms of syntactic parsing
  • Understands the difficulties in natural language processing and is familiar with approaches to solving these problems
Course Contents

  • Introduction to natural language processing
  • Basic text processing and edit distance
  • Language models
  • Tagging problems and hidden Markov models
  • Text classification and sentiment analysis
  • Evaluation
  • Parsing
  • Machine translation
  • Computational semantics
  • Text summarization
Assessment Elements

  • non-blocking Classwork activity
  • non-blocking Tests
  • non-blocking Homework
  • non-blocking Exam
  • non-blocking Project
    Solving a practical NLP problem on real data. Required to earn a grade of 9 or 10.
Interim Assessment

  • 2023/2024 4th module
    0.025 * Classwork activity + 0.025 * Classwork activity + 0.3 * Exam + 0.15 * Homework + 0.15 * Homework + 0.2 * Project + 0.075 * Tests + 0.075 * Tests
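
The weighted sum above can be checked with a short Python sketch. It is only an illustration. It assumes each component is graded on a 10-point scale and that a student has a single score per component, so the repeated terms in the formula (possibly one per module) use the same score. The names `WEIGHTED_TERMS`, `interim_grade`, and `example_scores` are hypothetical, and no rounding rule is implied.

```python
# Weights copied term by term from the interim assessment formula.
# Repeated terms are kept exactly as listed; all weights sum to 1.0.
WEIGHTED_TERMS = [
    (0.025, "Classwork activity"),
    (0.025, "Classwork activity"),
    (0.3, "Exam"),
    (0.15, "Homework"),
    (0.15, "Homework"),
    (0.2, "Project"),
    (0.075, "Tests"),
    (0.075, "Tests"),
]


def interim_grade(scores):
    """Return the weighted sum of component scores.

    Assumption: `scores` maps each component name to one score,
    which is reused for every term carrying that name.
    """
    return sum(weight * scores[name] for weight, name in WEIGHTED_TERMS)


# Hypothetical scores, for illustration only.
example_scores = {
    "Classwork activity": 8,
    "Exam": 7,
    "Homework": 9,
    "Project": 6,
    "Tests": 8,
}

print(round(interim_grade(example_scores), 2))
```

If the repeated terms do correspond to separate modules with separate scores, the dictionary lookup would need one key per module instead.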
Bibliography

Recommended Core Bibliography

  • Perkins, J. Python Text Processing with NLTK 2.0 Cookbook: Use Python NLTK Suite of Libraries to Maximize Your Natural Language Processing Capabilities [Electronic resource] / Jacob Perkins; DB ebrary. – Birmingham: Packt Publishing Ltd, 2010. – 336 p.

Recommended Additional Bibliography

  • The Handbook of Natural Language Processing [Electronic resource] / edited by Robert Dale, Hermann Moisl, Harold Somers; DB ebrary. – New York: Marcel Dekker, Inc., 2010. – XIX, 996 p. – Available at: https://ebookcentral.proquest.com/lib/hselibrary-ebooks/reader.action?docID=216282&query=natural+language+processing+with

Authors

  • ENIKEEVA EKATERINA VLADIMIROVNA
  • MALAFEEV ALEKSEY YUREVICH
  • SAFARYAN ANNA KARENOVNA
  • LEPIGINA ANASTASIYA ANATOLEVNA
  • KHOMENKO ANNA YUREVNA