‘It Doesn’t Test, It Teaches’: Hackathon for DANO Data Analysis Olympiad Held in Nizhny Novgorod
A hackathon was held in Nizhny Novgorod for students in grades 9–11 as part of the Data Analysis National Olympiad (DANO). More than 90 school students from 15 Russian regions took part, including Moscow, Nizhny Novgorod and the surrounding region, St Petersburg, Samara, Cheboksary, and Ufa.
‘An increasing part of the market is occupied by positions that require working with big data, so even at school you need to learn how to process it, analyse it, and draw the right conclusions,’ said Dmitry Pokrovsky, Co-Chair of the DANO methodological committee.
The teams received a research dataset with information about 50,000 participants in the Moscow Longevity leisure programme for senior citizens. Based on data analysis, the teams needed to find a solution to attract more participants or increase class attendance in the programme.
The students had to put forward hypotheses, think through the mechanisms of logical relationships, and identify and illustrate dependencies between variables. It was important not only to verify the hypothesis, but also to test the stability of the conclusions and show how the results obtained could be used to develop the programme. Each team was supported by a mentor with experience of research project work.
The hackathon teams were formed in such a way that each team included both beginners and experienced participants.
Igor Privalov
Member of the hackathon jury, senior lecturer at the Faculty of Management, HSE University in Nizhny Novgorod
Each team included participants with different levels of training. Some are taking their first steps—sometimes they lack the observation experience, mathematical tools, and product vision to look at the problem with a higher level of abstraction. Others worked very skilfully with the criteria; their research experience is immediately visible. It is excellent that they participate together in solving the problem; this approach allows them to gain the necessary experience.
A group of experts assessed the research methodology and its practical usefulness based on the specified criteria for the analytical and presentation blocks:
- Preliminary analysis and data structure analysis (the analysis of the data structure has only minor or no flaws; the main indicators are examined, including average values and several other statistics or distributions; relationships and/or correlations between sample variables are established; outliers are handled)
- Hypothesis and mechanism (the hypothesis is clearly formulated, corresponds to the question posed in the hackathon task and is not trivial; the mechanism has minor or no flaws in logic)
- Verification of the hypothesis using mathematical methods (means are compared, or an alternative relevant method is applied, to verify the hypothesis; subsamples are analysed, relationships between variables are examined, possibly graphically, or variable distributions are analysed in more depth, taking into account variability, shape, multimodality, etc.)
- Presentation block (teamwork, presentation logic, the visualisation of results, and research prospects and applicability)
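As an illustration of the 'comparison of means' criterion, here is a minimal Python sketch. It is not taken from any team's work: the two groups and their monthly class counts are invented, and Welch's t statistic is only one of several methods that could be used for such a comparison.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return Welch's t statistic for the difference between two sample means."""
    # Standard error of the difference, allowing unequal variances.
    se = sqrt(variance(sample_a) / len(sample_a)
              + variance(sample_b) / len(sample_b))
    return (mean(sample_a) - mean(sample_b)) / se


# Hypothetical data: classes attended per month in two subsamples.
group_a = [6, 8, 7, 9, 5, 8, 7, 10]
group_b = [4, 5, 6, 3, 5, 4, 6, 5]

t_stat = welch_t(group_a, group_b)
print(f"mean A = {mean(group_a):.2f}, mean B = {mean(group_b):.2f}, t = {t_stat:.2f}")
```

A large absolute value of the statistic suggests that the difference in means is unlikely to be noise, although with real data the result should still be checked on subsamples, as the criteria require.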
Irina Zoroastrova
Member of the hackathon jury, senior lecturer at the Faculty of Economics, HSE University
It is important that the hypothesis is non-trivial, and that the criteria it is based on are measurable. Sometimes, teams state one hypothesis and test another. They lack the experience and time to look critically at the results obtained and revise their initial position. During the discussion of the presentations, we tried to show the students how their research could be developed, and we hope that this will help improve their data-analysis abilities.
Team No. 20, consisting of students from Moscow and Cheboksary, worked with the hypothesis that attendance in the first two months is positively correlated with attendance in subsequent months: ‘if participants attend classes in the first two months, later on, they maintain or increase this trend.’
To test the hypothesis, the team created graphs in Python and Excel. Some of the ideas were implemented using ChatGPT, while Figma was used for the presentation.
The team confirmed their hypothesis, but believe it would have been useful to have data covering a longer period. In addition, the ratio of men to women in the dataset turned out to be too skewed (1:10) to draw definitive conclusions.
‘We used the seaborn and pandas libraries, the simplest statistical methods—averages, variances, median, and mode. Our “freelance programmer” ChatGPT wrote part of the code and corrected technical errors. It was a real team effort in collecting and processing data, plotting graphs, preparing documentation and the presentation in general. Victoria, as the only humanities student among the engineers, kept an eye on the ideas, their structure, and the final wording,’ the participants shared.
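A hypothesis of this kind can be checked with a correlation coefficient. The sketch below is illustrative only: it is not the team's code, the visit counts are invented, and it uses only the Python standard library rather than pandas or seaborn.

```python
from math import sqrt
from statistics import mean


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length samples."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mx) ** 2 for x in xs))
    spread_y = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Hypothetical data: visits per participant in the first two months
# and the average monthly visits of the same participants afterwards.
early_visits = [2, 4, 6, 8, 10, 12]
later_visits = [1, 5, 5, 9, 9, 13]

r = pearson_r(early_visits, later_visits)
print(f"Pearson r = {r:.3f}")
```

A value close to 1 is consistent with the hypothesis, but it does not by itself show that early attendance causes later attendance.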
Danil Fedorovykh
Deputy Vice Rector and Head of HSE University’s Office for the Development of Intellectual Competitions
Pooling skills is an important part of teamwork for successful projects. The hardest part of data analysis is formulating the right research question and hypothesis verification mechanism. For this, economic skills are required in addition to a knowledge of mathematics and programming languages. That is why we hold DANO competitions at the intersection of social, exact, and computer sciences and look forward to seeing school students who are interested in this.
A combined team from St Petersburg and Naberezhnye Chelny took first place in the hackathon. Their approach was to overlay the addresses of training centres with the participants’ addresses downloaded from the Yandex Maps API in order to visualise a solution.
According to the team members, the task dataset did not provide a wide range of combinatorial possibilities. The use of data from the API made it possible to perform a unique study, visualise it, and obtain conclusions with recommendations.
During the research, the team members noticed an interesting fact—some of the centres were built in places where there is no target audience at all. Clusters of dots were marked on the map to highlight buildings where opening centres would be useful. The team members reflected this recommendation in the results of their research.
The organisers of the competition believe that good data analysts need more than just a knowledge of mathematics and programming. It is important to train observation skills, creativity, and most importantly, the ability to relate numbers to what they represent in the real world. ‘The numbers themselves won’t tell you anything. You need to understand the patterns behind them,’ says Vladislav Pikinevich, Head of Analytics at Tinkoff Benefit, Chair of the DANO Jury.
Anton Lykov
Organiser of the Hackathon for the Data Analysis National Olympiad in Nizhny Novgorod
We agree with the experts that this hackathon doesn’t test the participants—it teaches them. It’s great that the students saw the testing of their research and participated in the discussion. The participants receive not just points, but qualitative feedback on what can be corrected and how the study can be developed. They made a great investment in their education and their future profession this weekend.