
# Data Mining and Elements of Machine Learning

• Academic year: 2023/2024
• Language of instruction: English
• ECTS credits: 3
• Delivered at: Department of Applied Mathematics and Informatics (Faculty of Informatics, Mathematics, and Computer Science (HSE Nizhny Novgorod))
• Course type: Elective course
• When: 3rd year, modules 3 and 4

#### Instructors

Kazakov, Maxim

Kozlova, Anastasia Vladimirovna

### Course Syllabus

#### Abstract

The course introduces students to the basic approaches and principles of data mining, the main machine learning methods and their limitations, and the main methods of quality evaluation.

#### Learning Objectives

• The purpose of the course is to familiarize students with the basic principles and methods of data analysis and machine learning.

#### Expected Learning Outcomes

• Trains logistic regression and KNN models, understands quality metrics
• Trains classification models based on decision trees and ensembles
• Trains SVM-based classification models with various parameters
• Trains clustering models, understands clustering evaluation
• Performs a spectrum of machine learning tasks
• Reduces the dimensionality with various methods
• Trains polynomial regression, understands its quality metrics, identifies overfitting and underfitting, and estimates quality with cross-validation
• Trains linear regression, understands its quality metrics
• Prepares data for machine learning algorithms
• Independently conducts a reproducible experiment through a full pipeline: 1) formulating the problem and analyzing previous work and scientific papers on the subject; 2) performing preliminary dataset analysis, data preprocessing, feature engineering and selection; 3) selecting machine learning methods, then training, evaluating and comparing models; 4) visualizing and explaining the results.

#### Course Contents

• Introduction. Examples of practical tasks
• Exploratory data analysis
• Linear regression
• Polynomial regression. The concept of overfitting and regularization
• Classification problem. Logistic regression. The KNN algorithm. Naïve Bayes Classifier.
• Classification algorithms: decision trees and ensembles.
• Support vector machines
• Machine learning approaches to Named Entity Recognition.
• Unsupervised machine learning tasks. Dimensionality reduction.
• Topic Modelling

#### Assessment Elements

• Exam
The exam is conducted orally (a survey based on course materials).
• Class work
• Homework
• Practical project
• Tests

#### Interim Assessment

• 2023/2024 4th module
0.1 * Class work + 0.2 * Exam + 0.3 * Homework + 0.2 * Practical project + 0.2 * Tests
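
As a quick illustration of how the weighted formula above combines the five assessment elements, here is a minimal Python sketch. The weights are copied from the formula; the function name `interim_grade`, the example scores, and the 10-point scale are assumptions made only for this example, and any rounding rule applied to the final grade is not specified here.

```python
# Weights copied from the interim assessment formula.
WEIGHTS = {
    "Class work": 0.1,
    "Exam": 0.2,
    "Homework": 0.3,
    "Practical project": 0.2,
    "Tests": 0.2,
}


def interim_grade(scores: dict[str, float]) -> float:
    """Return the weighted sum of the component scores (hypothetical helper)."""
    missing = WEIGHTS.keys() - scores.keys()
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical component scores on an assumed 10-point scale.
example_scores = {
    "Class work": 8,
    "Exam": 7,
    "Homework": 9,
    "Practical project": 6,
    "Tests": 8,
}

print(round(interim_grade(example_scores), 2))
```

Because the weights sum to 1, the result stays on the same scale as the component scores; homework carries the largest weight (0.3), so it moves the final grade the most.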

#### Recommended Core Bibliography

• Sarkar, D., Bali, R., & Sharma, T. (2018). Practical Machine Learning with Python: A Problem-Solver’s Guide to Building Real-World Intelligent Systems. [United States]: Apress. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsebk&AN=1667293