Student Conference on Computational Linguistics Held at HSE University in Nizhny Novgorod

ConCort 2023, a forum dedicated to research in corpus technology and computer science in the humanities, brought together experts and students from all over Russia. The participants discussed the latest developments in corpus linguistics, including the rapidly developing field of digital humanities.

Formal approaches to the humanities have a rich history in the 20th century, including the study of poetry, historical databases, and stylometry. In recent years, however, new possibilities have emerged (the availability of electronic texts, methods of automatic text analysis, greater storage and processing capacity, and new tools for working with data), and the field is experiencing a rebirth.

Digital humanities is a field at the interface between computer science and the humanities. Its main focus is the classical humanities disciplines of philology, history, philosophy, and cultural studies, but they are studied in a new way that takes into account the fact that the world is going digital.

The ConCort conference has been held at HSE University since 2013. Over these ten years, the field of digital humanities has made a great leap forward: specialists have moved from a simple representation of texts on the internet to full-fledged systems that make it possible to trace the plot lines of a work, the semantic fields relevant to its analysis, and the syntactic features of the text. ‘Digital humanists’ build connections between characters in War and Peace, use postcards to analyse the development of cultural diplomacy in the early 20th century, and construct maps of Gulag camps.

HSE University in Nizhny Novgorod has traditionally been the venue for the ConCort conference, and this year’s participants included not only students from HSE University in Moscow and Nizhny Novgorod, but also those from Moscow State University, Voronezh State University, the RAS Institute of Oriental Studies, and other universities.

The conference’s main aim is to attract young researchers, and the organisers noted the unprecedented number of student papers this year. The work was intensive: each paper presented at the conference was actively discussed, and it was clear that in the year since the previous ConCort, specialists had accumulated many interesting developments that they wanted to share with their colleagues.

Tatyana Romanova, Doctor of Sciences in Russian Language, Professor, Head of School of Fundamental and Applied Linguistics, HSE University in Nizhny Novgorod

‘It is important that students who are involved in computational linguistics get a chance to interact with their peers from other Russian universities and to share their research experience at this conference. Eleven students from our Bachelor’s in Fundamental and Applied Linguistics are participating as speakers, and the rest are attending as listeners.’

‘This conference is a great way for students engaged in research in such a specific field as digital humanities to interact,’ agrees Veronika Zykova, fourth-year student of the Bachelor’s in Fundamental and Computer Linguistics (HSE University in Moscow). ‘This year’s ConCort was a whole day longer than last year’s, so there were considerably more presentations. This resulted in a kind of immersion.’

Veronika Zykova

Out of the many excellent papers, I would personally single out Danila Fedorov’s paper on Matematicon, a corpus of oral mathematical texts in Russian. This is a big project that is very interesting to follow: a corpus that collects excerpts from mathematics lectures together with the corresponding annotation. After all, for a foreigner learning Russian, some mathematical language can sound like an incomprehensible set of syllables, like some kind of chant. It is great that we are developing such specialised corpora.

And purely subjectively, I was interested in the report by my Nizhny Novgorod colleagues Karina Zakirova and Maxim Shestakov on the characteristics of anthroponyms in works of the fantasy genre, because their research is close to mine. Classical literature used conventional names, and only relatively recently, when science fiction and fantasy appeared, did the need to invent appropriate names arise. It is very interesting, for example, how much Tolkien influenced us all: now the names of all elves sound similar to those of the characters in his works.

My report was about a way of identifying anaphoric proper names in fiction. In our work, we tried to automatically bring together the different names that refer to the same character. The algorithm we developed made it possible to gather them all into one group.

‘The importance of developing information technologies, including linguistic research tools, for Russian society in the current political and economic situation cannot be overstated,’ stressed Tatyana Romanova. ‘Corpus and computational linguistics solve a diverse range of tasks, such as creating electronic dictionaries and textbooks, automatically processing and analysing collections of texts in various formats, and modelling the speech functions of the brain.’

New corpus resources and software tools for language analysis are constantly being created, and the Corpus Technologies and Computer Science in Humanities conference is a major tool to help professionals in this field navigate the rapidly changing scientific landscape.