# Publications

In this paper we consider the problem of vehicle assignment in heterogeneous-fleet site-dependent Vehicle Routing Problems (VRP) with split deliveries. In such problems, vehicles can have different capacities, fixed costs, and travel costs, and site-dependency constraints restrict, for every customer, the set of vehicles that can serve it. The Vehicle Assignment Problem (VAP) arises in heuristic and exact algorithms when vehicles are assigned to all customers or one customer is added to the current vehicle route. The VAP objective is to minimize the total assignment cost, where the cost of assigning a vehicle to a customer is computed heuristically. Without split deliveries, i.e., when a delivery to a customer cannot be split between two vehicles, the VAP is modeled in the literature as the Generalized Assignment Problem. We demonstrate that allowing split deliveries reduces the VAP to the Hitchcock Transportation Problem, which can be solved efficiently with the Transportation Simplex Method. We also consider a special case, not rare in practice, in which all customers are partitioned into classes such that customers of one class have the same set of vehicles able to serve them, and the vehicle sets for these classes form a sequence of nested sets. We show that in this case, if the cost per demand unit of assigning a vehicle to a customer depends only on the vehicle, then the VAP can be solved by a linear-time algorithm.
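The reduction above means an instance can be solved with any minimum-cost-flow routine. A minimal sketch (not the paper's algorithm, and the supplies, demands, and costs below are an illustrative toy instance): a Hitchcock transportation problem solved by successive shortest paths on a residual network.

```python
def transportation_cost(supplies, demands, costs):
    """Minimum total cost of shipping supplies to consumers
    (balanced instance: total supply equals total demand)."""
    m, n = len(supplies), len(demands)
    # Nodes: 0 = source, 1..m = suppliers, m+1..m+n = consumers, m+n+1 = sink.
    size = m + n + 2
    src, snk = 0, m + n + 1
    cap = [[0] * size for _ in range(size)]   # residual capacities
    cst = [[0] * size for _ in range(size)]   # arc costs
    for i, s in enumerate(supplies):
        cap[src][1 + i] = s
    for j, d in enumerate(demands):
        cap[m + 1 + j][snk] = d
    for i in range(m):
        for j in range(n):
            cap[1 + i][m + 1 + j] = supplies[i]
            cst[1 + i][m + 1 + j] = costs[i][j]
            cst[m + 1 + j][1 + i] = -costs[i][j]  # residual reverse arc
    total = 0
    while True:
        # Bellman-Ford shortest path from src in the residual network.
        dist = [float("inf")] * size
        prev = [-1] * size
        dist[src] = 0
        for _ in range(size - 1):
            for u in range(size):
                for v in range(size):
                    if cap[u][v] > 0 and dist[u] + cst[u][v] < dist[v]:
                        dist[v] = dist[u] + cst[u][v]
                        prev[v] = u
        if dist[snk] == float("inf"):
            return total  # no augmenting path left: all demand served
        # Bottleneck along the path, then push flow.
        f, v = float("inf"), snk
        while v != src:
            f = min(f, cap[prev[v]][v])
            v = prev[v]
        v = snk
        while v != src:
            cap[prev[v]][v] -= f
            cap[v][prev[v]] += f
            v = prev[v]
        total += f * dist[snk]
```

The Transportation Simplex Method mentioned in the abstract exploits the same structure more efficiently; the flow formulation is just the most compact way to state the problem.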

In this paper, a deep learning study is conducted on a new multiclass text classification problem: identifying user interests from text messages. We used an original dataset of almost 90 thousand forum text messages labeled for ten interests. We experimented with different modern neural network architectures, recurrent and convolutional, as well as simpler feedforward networks. Classification accuracy was evaluated for different architectures, text representations, and sets of miscellaneous parameters.

We introduce graph clustering quality measures based on comparisons of global, intra- and inter-cluster densities, an accompanying statistical significance test, and a step-by-step routine for clustering quality assessment. Our work is centred on the idea that well-clustered graphs will display a mean intra-cluster density that is higher than the global density and the mean inter-cluster density. We do not rely on any generative model for the null model graph. Our measures are shown to meet the axioms of a good clustering quality function. They have an intuitive graph-theoretic interpretation, a formal statistical interpretation, and can be tested for significance. Empirical tests also show they are more responsive to graph structure, less likely to break down during numerical implementation, and less sensitive to uncertainty in connectivity than the commonly used measures.
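A minimal sketch of the density comparison underlying the proposed measures (illustrative only; the significance test and axioms are not reproduced here): a well-clustered graph should have mean intra-cluster density above both the global density and the mean inter-cluster density.

```python
from itertools import combinations

def density_summary(n, edges, clusters):
    """n: number of vertices; edges: iterable of vertex pairs;
    clusters: list of disjoint vertex sets covering the graph.
    Returns (global density, mean intra-cluster, mean inter-cluster)."""
    edges = {frozenset(e) for e in edges}
    glob = len(edges) / (n * (n - 1) / 2)
    intra = []
    for c in clusters:
        possible = len(c) * (len(c) - 1) / 2
        inside = sum(1 for e in edges if e <= c)  # both endpoints in c
        intra.append(inside / possible if possible else 0.0)
    inter = []
    for a, b in combinations(clusters, 2):
        between = sum(1 for e in edges if len(e & a) == 1 and len(e & b) == 1)
        inter.append(between / (len(a) * len(b)))
    return glob, sum(intra) / len(intra), sum(inter) / len(inter)
```

For two triangles joined by a single bridge edge, the intra-cluster densities are 1.0, the global density is 7/15, and the inter-cluster density is 1/9, matching the intuition in the abstract.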

This book constitutes the proceedings of the 8th International Conference on Analysis of Images, Social Networks and Texts, AIST 2019, held in Kazan, Russia, in July 2019.

The 24 full papers and 10 short papers were carefully reviewed and selected from 134 submissions (of which 21 papers were rejected without being reviewed). The papers are organized in topical sections on general topics of data analysis; natural language processing; social network analysis; analysis of images and video; optimization problems on graphs and network structures; analysis of dynamic behaviour through event data.

The article continues our previous work [V. V. Savchenko and A. V. Savchenko, Izmeritel’naya Tekhnika, No. 12, 46–52 (2019)]. We study the problem of automatic control of the quality of voice samples recorded and stored in the Unified Biometric System. We propose a solution to the problem of timely updates of the collected samples, which are required because their consumer qualities degrade over time. A new indicator of the acoustic quality of voice samples based on the Kullback–Leibler information metric is investigated, and a new procedure is proposed for measuring this indicator at the moments when users contact the system with service requests. An example of a practical real-time implementation of the proposed method is demonstrated. Using software specially developed by the authors, we performed a full-scale experiment and obtained quantitative estimates of the period for updating voice samples. Some recommendations concerning their practical application are given. The accumulated results can be used for developing new systems and technologies (and upgrading existing ones) for automatic quality control and adaptive renewal of biometric personal data samples.

This volume contains the refereed proceedings of the 8th International Conference on Analysis of Images, Social Networks, and Texts (AIST 2019). The previous conferences during 2012–2018 attracted a significant number of data scientists – students, researchers, academics, and engineers working on interdisciplinary data analysis of images, texts, and social networks.

In this research we introduce a new labelled SportLogo dataset that contains images from two kinds of sports: hockey (NHL) and basketball (NBA). The dataset presents several challenges typical for logo detection tasks. Frequent occlusions and changes of logo viewpoint during games make a straightforward detection approach unreliable. Another issue is logo style changes due to seasonal kit updates. In this paper we propose a two-stage approach in which, first, an input image is processed by a specially trained scene recognition convolutional neural network; second, conventional object detectors are applied only to sports scenes. The experimental study contains results for different combinations of backbone and detector convolutional neural networks. It is shown that the MobileNet + YOLO v3 solution provides the best results on the designed dataset (mAP = 0.74, Recall = 0.87).

In Chirkov et al. (2019), the classes of *conic* and *discrete conic* functions were introduced. In this paper we use the term *convic* instead of *conic*. The class of convic functions properly includes the classes of convex functions, strictly quasiconvex functions, and quasiconvex polynomials. On the other hand, the class of convic functions is properly included in the class of quasiconvex functions. A discrete convic function is a discrete analogue of a convic function. In Chirkov et al. (2019), the lower bound 3^{n-1} log(R) was obtained for the number of calls to the comparison oracle needed to find the minimum of a discrete convic function defined on the integer points of an n-dimensional ball with radius R. However, the question of the existence of a polynomial (in log(R) for fixed n) algorithm for minimizing such functions has remained open. In this paper, we answer this question positively. Namely, we propose an algorithm for minimizing discrete convic functions that uses 2^{n^2 log(n)} log(R) calls to the comparison oracle and has 2^{n^2 log(n)} poly(log(R)) bit complexity.

For a graph *G* and a positive integer *k*, a subset *C* of vertices of *G* is called a *k*-path vertex cover if *C* intersects all paths of *k* vertices in *G*. The cardinality of a minimum *k*-path vertex cover is denoted by *β*_{*P_k*}(*G*). For a graph *G* and a positive integer *k*, a subset *M* of pairwise vertex-disjoint paths of *k* vertices in *G* is called a *k*-path packing. The cardinality of a maximum *k*-path packing is denoted by *μ*_{*P_k*}(*G*). In this paper, we describe some graphs having equal values of *β*_{*P_k*} and *μ*_{*P_k*} for *k* ≥ 5, and present polynomial-time algorithms for finding a minimum *k*-path vertex cover and a maximum *k*-path packing in such graphs.
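A tiny brute-force illustration of the two quantities (not the paper's polynomial-time algorithms): on the path graph P9 with k = 5, every path of 5 consecutive vertices contains the middle vertex, so a minimum 5-path vertex cover has size 1, and two vertex-disjoint 5-vertex paths cannot fit into 9 vertices, so a maximum 5-path packing also has size 1.

```python
from itertools import combinations

def k_paths(adj, k):
    """All simple paths on k vertices, up to reversal."""
    paths = set()
    def extend(path):
        if len(path) == k:
            paths.add(min(tuple(path), tuple(reversed(path))))
            return
        for v in adj[path[-1]]:
            if v not in path:
                extend(path + [v])
    for v in adj:
        extend([v])
    return [set(p) for p in paths]

def beta_and_mu(adj, k):
    """Brute-force beta_{P_k} and mu_{P_k} for a small graph."""
    paths = k_paths(adj, k)
    vertices = list(adj)
    # Smallest vertex subset hitting every k-vertex path.
    beta = next(size for size in range(len(vertices) + 1)
                for cover in combinations(vertices, size)
                if all(set(cover) & p for p in paths))
    # Largest family of pairwise vertex-disjoint k-vertex paths.
    mu = max(size for size in range(len(paths) + 1)
             for pack in combinations(range(len(paths)), size)
             if all(paths[i].isdisjoint(paths[j])
                    for i, j in combinations(pack, 2)))
    return beta, mu
```

Since the packing gives a lower bound on the cover (each packed path needs its own cover vertex), equality of the two values certifies optimality of both, which is what makes the graph classes in the paper algorithmically convenient.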

The goal of this study is to increase the computational efficiency of face recognition methods that use feature vectors to describe facial images in photos and videos. These high-dimensional feature vectors are nowadays produced by convolutional neural networks. Methods that aggregate the features generated for each video frame are used to process video sequences. A novel hierarchical recognition algorithm is proposed. In contrast to known approaches, our algorithm seeks the nearest neighbors only among the reference images of the most reliable classes selected at the preceding stage, in order to carry out the sequential analysis of a more detailed description level (with a greater dimensionality of the feature vector). At each stage, principal components are compared, the number of components being chosen according to a given portion of explained variance. Datasets such as Labeled Faces in the Wild, YouTube Faces, IARPA Janus Benchmark-C and different neural-net face descriptors are used to compare the algorithm with other methods. In contrast with the conventional nearest-neighbor method, the proposed approach is shown to reduce the classifier running time by a factor of 2 to 16.
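An illustrative sketch of the hierarchical idea (the reference vectors, dimensions, and plain Euclidean distance here are hypothetical stand-ins for the PCA-based descriptors in the paper): stage 1 ranks classes by a cheap comparison on a coarse low-dimensional prefix of the feature vector, and stage 2 runs the full-dimensional comparison only for the surviving candidates.

```python
def hierarchical_nn(query, references, coarse_dim, top_m):
    """references: {class_label: feature vector}. Returns the predicted label
    after a coarse prefilter followed by a full-dimensional comparison."""
    def dist(x, y):
        return sum((a - b) ** 2 for a, b in zip(x, y))
    # Stage 1: cheap comparison on the first coarse_dim components only.
    ranked = sorted(references,
                    key=lambda c: dist(query[:coarse_dim],
                                       references[c][:coarse_dim]))
    candidates = ranked[:top_m]
    # Stage 2: full comparison restricted to the most reliable classes.
    return min(candidates, key=lambda c: dist(query, references[c]))
```

The speed-up comes from the fact that most classes are eliminated by the cheap stage, so the expensive full-dimensional distances are computed only a few times per query.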

Food analysis is one of the most important parts of user preference prediction engines for recommendation systems in the travel domain. In this paper, we describe and study a neural network method for recognizing food in a gallery of photos taken with mobile devices. The described method consists of three main stages: scene classification, food detection, and subsequent classification. An essential feature of the developed method is the use of lightweight neural network models, which allows it to run on mobile devices. The method was developed using both well-known open data and a proprietary dataset.

It is proved that a sign network with an elliptical distribution and a known shift parameter is equivalent to a sign network with an elliptical distribution whose unknown shift parameter is estimated by the sample mean. This result is proved both for the case of independent identically distributed observations and for the case of a sample from a matrix elliptically contoured distribution with arbitrary dependence between observations.

In this paper, we consider the problem of event recognition in single images. In contrast to the conventional fine-tuning of convolutional neural networks (CNNs), we propose to use image captioning, i.e., a generative model that converts images to textual descriptions. The motivation here is the possibility of combining conventional CNNs with a completely different approach in an ensemble with high diversity. As the event recognition task has no sequential or temporal structure, the obtained captions are one-hot encoded and summarized into a sparse feature vector suitable for learning an arbitrary classifier. We provide an experimental study of several feature extractors for the Photo Event Collection, the Web Image Dataset for Event Recognition, and the Multi-Label Curation of Flickr Events Dataset. It is shown that the image captions trained on the Conceptual Captions dataset can be classified more accurately than the features from an object detector, though both are obviously not as rich as the CNN-based features. However, an ensemble of CNNs and our approach provides state-of-the-art results for several event datasets.
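A minimal sketch of the caption-encoding step (the captions below are hypothetical strings; in the paper they are produced by a captioning model trained on Conceptual Captions): each caption is one-hot encoded over the joint vocabulary, yielding a sparse vector that any classifier can consume.

```python
def caption_features(captions):
    """One-hot encode each caption over the vocabulary of all captions."""
    vocab = sorted({w for c in captions for w in c.lower().split()})
    index = {w: i for i, w in enumerate(vocab)}
    vectors = []
    for c in captions:
        v = [0] * len(vocab)
        for w in c.lower().split():
            v[index[w]] = 1          # presence, not count: one-hot summary
        vectors.append(v)
    return vocab, vectors
```

In practice the vocabulary is fixed on the training captions and unseen words are dropped; the point is that word order is discarded, consistent with the observation that event recognition has no sequential structure.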

In this paper a new formulation of the event recognition task is examined: it is required to predict event categories given a gallery of images for which albums (groups of photos corresponding to a single event) are unknown. A novel two-stage approach is proposed. First, features are extracted from each photo using a pre-trained convolutional neural network (CNN). These features are classified individually, and the normalized scores of the classifier are used to group sequential photos into several clusters. Finally, the features of the photos in each group are aggregated into a single descriptor using a neural attention mechanism. The algorithm is implemented in an Android mobile application. An experimental study with features extracted by contemporary convolutional neural networks, including EfficientNets, for the Photo Event Collection and the Multi-Label Curation of Flickr Events Dataset demonstrates that the proposed approach is 9–23% more accurate than conventional event recognition on single photos. Moreover, the proposed method has a 13–16% lower error rate compared to the classification of groups of photos obtained by hierarchical clustering of CNN-based embeddings.
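A sketch of the attention-style aggregation step (the scoring vector `w` here is a hypothetical stand-in; in the paper the attention weights are produced by a learned neural module): each photo's feature vector gets a softmax weight, and the group descriptor is the weighted sum.

```python
import math

def attention_aggregate(features, w):
    """Aggregate per-photo feature vectors into one group descriptor.
    Weights are the softmax of each feature's dot product with w."""
    scores = [sum(a * b for a, b in zip(f, w)) for f in features]
    mx = max(scores)                          # subtract max for stability
    exp = [math.exp(s - mx) for s in scores]
    total = sum(exp)
    alphas = [e / total for e in exp]
    dim = len(features[0])
    return [sum(alphas[i] * features[i][d] for i in range(len(features)))
            for d in range(dim)]
```

With a zero scoring vector this degenerates to plain averaging; a trained `w` instead emphasizes the photos most indicative of the event.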

In this paper we propose an efficient heuristic for the Vehicle Routing Problem on Trees (TVRP). An initial solution is constructed with a greedy algorithm based on the Depth-First Search (DFS) approach. To optimize the initial solutions obtained by our DFS heuristic, the Ruin-and-Recreate (RR) method is then applied. For diversification purposes, a randomization mechanism is added to the construction of initial solutions, and the DFS+RR algorithm is executed multiple times until the best found solution stops changing. The results of our approach are compared with the solutions obtained by the exact model of Chandran & Raghavan (2008). The computational experiments show that the suggested heuristic is fast and finds solutions that differ from the optimal ones by less than 1% on average.
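A generic Ruin-and-Recreate skeleton on a toy objective (visiting points on a line in some order; the DFS construction for trees from the paper is not reproduced): ruin removes a few random customers, recreate reinserts each at the cheapest position, and improvements are accepted.

```python
import random

def route_cost(route):
    """Toy objective: total travel along a line between consecutive stops."""
    return sum(abs(a - b) for a, b in zip(route, route[1:]))

def ruin_and_recreate(route, iters=200, ruin_size=2, seed=0):
    rng = random.Random(seed)
    best = list(route)
    for _ in range(iters):
        cand = list(best)
        # Ruin: remove a few random customers from the route.
        removed = [cand.pop(rng.randrange(len(cand)))
                   for _ in range(ruin_size)]
        # Recreate: greedy cheapest insertion for each removed customer.
        for c in removed:
            pos = min(range(len(cand) + 1),
                      key=lambda i: route_cost(cand[:i] + [c] + cand[i:]))
            cand.insert(pos, c)
        if route_cost(cand) < route_cost(best):
            best = cand
    return best
```

The loop never worsens the incumbent, so the returned route costs no more than the input; restarting it from randomized initial solutions is what the abstract's diversification mechanism adds.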

In this paper, we address the problem of detecting small objects in high-quality X-ray images using deep neural networks. We propose a two-stage approach in which, first, the input image is split into partially overlapping blocks to make small objects more discriminative for detection. Second, the small blocks are fed into conventional single-shot detectors. These detectors are trained using blocks of the training images extracted by the same procedure. Two datasets of X-ray images from a customs inspection complex are examined in the experimental study. It is shown that the proposed algorithm with data augmentation leads to more precise results compared to the conventional technique: our method outperforms the traditional approach by 5.4–25.7% depending on the type of backbone convolutional neural network used.
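A sketch of the block-splitting step (block size and stride are illustrative parameters, not values from the paper): compute the offsets of partially overlapping blocks that cover the image, padding the final row and column so the borders are not missed; detections found inside a block would then be shifted back by its offset into image coordinates.

```python
def split_into_blocks(width, height, block, stride):
    """Return (x, y) offsets of partially overlapping square blocks
    of side `block` that together cover a width x height image."""
    def starts(size):
        s = list(range(0, max(size - block, 0) + 1, stride))
        if s[-1] + block < size:      # make sure the far border is covered
            s.append(size - block)
        return s
    return [(x, y) for y in starts(height) for x in starts(width)]
```

Because `stride < block`, adjacent blocks overlap, so an object near a block boundary still appears whole in at least one block; duplicate detections in the overlaps are typically merged afterwards with non-maximum suppression.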

Independent domination is one of the rare problems for which the complexity of the weighted and unweighted versions is known to be different in some classes of graphs. Trying to better understand the gap between the two versions of the problem, in the present paper we prove two complexity results. First, we extend NP-hardness of the weighted version in a certain class to the unweighted case. Second, we strengthen polynomial-time solvability of the unweighted version in the class of P2+P3-free graphs to the weighted case. This result is tight in the sense that both versions are NP-hard in the class of P3+P3-free graphs, i.e., P3+P3 is a minimal graph the forbidding of which produces an NP-hard case for both versions of the problem.

This book constitutes the proceedings of the 19th International Conference on Mathematical Optimization Theory and Operations Research, MOTOR 2020, held in Novosibirsk, Russia, in July 2020. The 31 full papers presented in this volume were carefully reviewed and selected from 102 submissions. The papers are grouped in these topical sections: discrete optimization; mathematical programming; game theory; scheduling problem; heuristics and metaheuristics; and operational research applications.

Network interdiction problems by deleting critical edges have wide applications. However, in some practical applications, the goal of deleting edges is difficult to achieve. We consider the maximum shortest path interdiction problem by upgrading edges on trees (MSPIT) under the unit/weighted l₁ norm. We aim to maximize the length of the shortest path from the root to all the leaves by increasing the weights of some edges such that the upgrade cost under the unit/weighted l₁ norm is upper-bounded by a given value. We construct mathematical models of these problems and prove some of their properties. We propose a revised algorithm for the problem (MSPIT) under the unit l₁ norm with time complexity *O*(*n*), where *n* is the number of vertices in the tree. We put forward a primal-dual algorithm that solves the problem (MSPIT) under the weighted l₁ norm in *O*(*n*²) time, in which a minimum cost cut is found in each iteration. We also solve the problem of minimizing the cost to upgrade edges such that the length of the shortest path is lower-bounded by a given value, and present an *O*(*n*²) algorithm for it. Finally, we perform some numerical experiments to compare the results obtained by these algorithms.

The problem of automated quality control of audio recordings containing voice samples of individuals is considered. It is shown that the most acute impediment in solving this problem is the problem of small observation samples. To overcome it, a new high-speed method of acoustic measurements is proposed, based on the principle of relative stability of the fundamental tone frequency of a speech signal within a voice sample of short duration. An example of a practical implementation of the developed method according to a scheme with inter-period signal accumulation is considered. Using proprietary software, a full-scale experiment was conducted in which statistical estimates of the effectiveness of the method under noisy conditions were obtained. It is shown that, when the proposed method is used, an audio recording with a signal-to-noise ratio below 15 dB is rejected with a probability of 0.95 or more as unsuitable for biometric identification of a person. The results obtained are intended for use in the development of new systems and the modernization of existing systems and technologies for the collection and automated quality control of biometric personal data. The article is intended for a wide range of specialists in the field of acoustic measurements and digital processing of speech signals, as well as for practitioners who organize the work of authorized organizations in preparing biometric personal data samples for registration with the Unified Biometric System (UBS).