Data Mining and Elements of Machine Learning

Academic year: 2024/2025
Language of instruction: English
Credits: 3

Instructors

Course Syllabus

Abstract

The course introduces students to the basic approaches and principles of data mining, the main machine learning methods and their limitations, and the main methods of model quality evaluation.

Learning Objectives

  • The purpose of the course is to familiarize students with the basic principles and methods of data analysis and machine learning.

Expected Learning Outcomes

  • Trains logistic regression and KNN models; understands quality metrics.
  • Trains classifiers based on decision trees and ensemble models
  • Trains SVM-based classification models with various parameters
  • Trains clustering models, understands clustering evaluation
  • Performs a range of machine learning tasks
  • Reduces the dimensionality with various methods
  • Trains polynomial regression and understands its quality metrics, identifies overfitting and underfitting, estimates quality with cross-validation
  • Trains linear regression, understands its quality metrics
  • Prepares data for machine learning algorithms
  • Independently conducts a reproducible experiment through a full pipeline: 1) formulates the problem and reviews previous work and scientific papers on the subject; 2) performs preliminary dataset analysis, data preprocessing, feature engineering and feature selection; 3) selects machine learning methods, then trains, evaluates and compares models; 4) visualizes and explains the results.
  • Works with text data: preprocesses and encodes it.
  • Solves a topic modeling task. Is familiar with non-negative matrix factorization (NMF) and latent Dirichlet allocation (LDA).

Course Contents

  • Introduction. Examples of practical tasks
  • Exploratory data analysis
  • Linear regression
  • Polynomial regression. The concept of overfitting and regularization
  • Classification problem. Logistic regression. The KNN algorithm. Naïve Bayes Classifier.
  • Classification algorithms: decision trees and ensembles.
  • Support vector machines
  • Unsupervised machine learning tasks. Dimensionality reduction.
  • Unsupervised machine learning tasks. The task of clustering
  • Topic Modelling
  • Introduction to NLP. Text data preprocessing and encoding

Assessment Elements

  • non-blocking Homework
  • non-blocking Class work
  • non-blocking Practical project
    Students independently choose a task for the final group project. The solution should demonstrate a broad range of the skills acquired during the course.
  • non-blocking Exam
    The exam is conducted orally (a survey based on course materials).
  • non-blocking Tests

Interim Assessment

  • 2024/2025 4th module
    Cumulative mark = 0.3*Homework + 0.2*Tests + 0.1*Classwork + 0.2*Practical project
    Final mark:
    1. Cumulative*10/8 (if all the current marks > 6)
    2. Cumulative + 0.2*Exam (if at least one current mark < 6)
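
The grading rule above can be illustrated with a short Python sketch. It is not an official calculator. The function and parameter names are hypothetical, and "current marks" is assumed to mean the four cumulative components (Homework, Tests, Classwork, Practical project). The syllabus does not say what happens when a mark equals exactly 6; this sketch assumes that case also falls under the exam rule.

```python
def cumulative_mark(homework, tests, classwork, project):
    """Weighted sum of the current marks; the weights add up to 0.8."""
    return 0.3 * homework + 0.2 * tests + 0.1 * classwork + 0.2 * project


def final_mark(homework, tests, classwork, project, exam=None):
    """Final mark under the two rules stated in the syllabus.

    Assumption (not stated in the syllabus): a current mark equal to
    exactly 6 is treated like a mark below 6, so the exam rule applies.
    """
    marks = (homework, tests, classwork, project)
    cumulative = cumulative_mark(*marks)
    if all(m > 6 for m in marks):
        # Rule 1: rescale the 0.8-weight cumulative mark to the full scale.
        return cumulative * 10 / 8
    if exam is None:
        raise ValueError("an exam mark is required when a current mark is not above 6")
    # Rule 2: the exam contributes the remaining 0.2 of the weight.
    return cumulative + 0.2 * exam


if __name__ == "__main__":
    print(round(final_mark(8, 8, 8, 8), 2))          # all marks above 6
    print(round(final_mark(8, 5, 8, 8, exam=7), 2))  # one mark below 6
```

Because the four weights sum to 0.8, multiplying by 10/8 in the first rule rescales the cumulative mark to the full grading scale without an exam, while the second rule fills the remaining 0.2 of the weight with the exam mark.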

Bibliography

Recommended Core Bibliography

  • Sarkar, D., Bali, R., & Sharma, T. (2018). Practical Machine Learning with Python : A Problem-Solver’s Guide to Building Real-World Intelligent Systems. [United States]: Apress. Retrieved from http://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsebk&AN=1667293

Recommended Additional Bibliography

  • Müller, A. C., & Guido, S. (2017). Introduction to Machine Learning with Python : A Guide for Data Scientists (1st ed.). O’Reilly Media.

Authors

  • Klimova Margarita Andreevna