# Publications

Recurrent neural networks have proved to be an effective method for statistical language modeling. However, in practice their memory and run-time complexity are usually too large to be implemented in real-time offline mobile applications. In this paper we consider several compression techniques for recurrent neural networks including Long–Short Term Memory models. We make particular attention to the high-dimensional output problem caused by the very large vocabulary size. We focus on effective compression methods in the context of their exploitation on devices: pruning, quantization, and matrix decomposition approaches (low-rank factorization and tensor train decomposition, in particular). For each model we investigate the trade-off between its size, suitability for fast inference and perplexity. We propose a general pipeline for applying the most suitable methods to compress recurrent neural networks for language modeling. It has been shown in the experimental study with the Penn Treebank (PTB) dataset that the most efficient results in terms of speed and compression–perplexity balance are obtained by matrix decomposition techniques.

This paper is focused on the automatic extraction of persons and their attributes (gender, year of born) from album of photos and videos. A two-stage approach is proposed in which, firstly, the convolutional neural network simultaneously predicts age/gender from all photos and additionally extracts facial representations suitable for face identification. Here the MobileNet is modified and is preliminarily trained to perform face recognition in order to additionally recognize age and gender. The age is estimated as the expected value of top predictions in the neural network. In the second stage of the proposed approach, extracted faces are grouped using hierarchical agglomerative clustering techniques. The birth year and gender of a person in each cluster are estimated using aggregation of predictions for individual photos. The proposed approach is implemented in an Android mobile application. It is experimentally demonstrated that the quality of facial clustering for the developed network is competitive with the state-of-the-art results achieved by deep neural networks, though implementation of the proposed approach is much computationally cheaper. Moreover, this approach is characterized by more accurate age/gender recognition when compared to the publicly available models.

The problem of computing the width of simplices generated by the convex hull of their integer vertices is considered. An FPT algorithm, in which the parameter is the maximum absolute value of the rank minors of the matrix consisting from the simplex vertices, is presented.

In this paper, we studied the phonetic approach for voice processing. A method for automatic recognition of speech signals, in which each quasistationary segment is associated with a fuzzy set of phonemes, was developed. We proposed the operation of the probabilistic triangular norm for fuzzy sets corresponding to the input frame and the nearest reference phoneme. The developed method was experimentally shown to allow a 1.5–5% reduction in the probability of erroneous recognition in comparison with known analogues.

Let f:R^n→R be a conic function and x_0∈R^n. In this note, we show that the shallow separation oracle for the set K={x∈R^n:f(x)≤f(x_0)} can be polynomially reduced to the comparison oracle of the function *f*. Combining these results with known results of D. Dadush et al., we give an algorithm with (O(n))^n*logR calls to the comparison oracle for checking the non-emptiness of the set K∩Z^n, where *K* is included to the Euclidean ball of a radius *R*. Additionally, we give a randomized algorithm with the expected oracle complexity (O(n))^n*logR for the problem to find an integral vector that minimizes values of *f* on an Euclidean ball of a radius *R*. It is known that the classes of convex, strictly quasiconvex functions, and quasiconvex polynomials are included into the class of conic functions. Since any system of conic functions can be represented by a single conic function, the last facts give us an opportunity to check the feasibility of any system of convex, strictly quasiconvex functions, and quasiconvex polynomials by an algorithm with (O(n))^n*logR calls to the comparison oracle of the functions. It is also possible to solve a constraint minimization problem with the considered classes of functions by a randomized algorithm with (O(n))^n*logR expected oracle calls.Let f:R^n→R be a conic function and x_0∈R^n. In this note, we show that the shallow separation oracle for the set K={x∈R^n:f(x)≤f(x_0)} can be polynomially reduced to the comparison oracle of the function *f*. Combining these results with known results of D. Dadush et al., we give an algorithm with (O(n))^n*logR calls to the comparison oracle for checking the non-emptiness of the set K∩Z^n, where *K* is included to the Euclidean ball of a radius *R*. Additionally, we give a randomized algorithm with the expected oracle complexity (O(n))^n*logR for the problem to find an integral vector that minimizes values of *f* on an Euclidean ball of a radius *R*. It is known that the classes of convex, strictly quasiconvex functions, and quasiconvex polynomials are included into the class of conic functions. Since any system of conic functions can be represented by a single conic function, the last facts give us an opportunity to check the feasibility of any system of convex, strictly quasiconvex functions, and quasiconvex polynomials by an algorithm with (O(n))^n*logR calls to the comparison oracle of the functions. It is also possible to solve a constraint minimization problem with the considered classes of functions by a randomized algorithm with (O(n))^n*logR expected oracle calls.

We describe the class of graphs whose each subgraph has the next property: The maximal number of disjoint 4-paths is equal to the minimal cardinality of sets of vertices such that every 4-path in the subgraph contains at least one of these vertices. We completely describe the set of minimal forbidden subgraphs for this class. Moreover, we present an alternative description of the class based on the operations of edge subdivision applied to bipartite multigraphs and the addition of so-called pendant subgraphs, isomorphic to triangles and stars.

A Gaussian graphical model is a graphical representation of the dependence structure for

a Gaussian random vector. Gaussian graphical model selection is a statistical problem that

identifies the Gaussian graphical model from observations. There are several statistical

approaches for Gaussian graphical model identification. Their properties, such as unbiasedeness

and optimality, are not established. In this paper we study these properties.

We consider the graphical model selection problem in the framework of multiple decision

theory and suggest assessing these procedures using an additive loss function. Associated

risk function in this case is a linear combination of the expected numbers of the two types

of error (False Positive and False Negative). We combine the tests of a Neyman structure for

individual hypotheses with simultaneous inference and prove that the obtained multiple

decision procedure is optimal in the class of unbiased multiple decision procedures.

This survey paper attempts to cover a broad range of topics related to computational biomedicine. The field has been attracting great attention due to a number of benefits it can provide the society with. New technological and theoretical advances have made it possible to progress considerably. Traditionally, problems emerging in this field are challenging from many perspectives. In this paper,we considered the influence of big data on the field, problems associated with massive datasets in biomedicine and ways to address these problems. We analyzed the most commonly used machine learning and feature mining tools and several new trends and tendencies such as deep learning and biological networks for computational biomedicine.

We consider the class $G$ of gradient-like orientation preserving diffeomorphisms of the 2-sphere with saddles of negative orientation type. We show that the for every diffeomorphism $f\in G$ every saddle point is fixed. We show that there are exactly three equivalence classes (up to topological conjugacy) $G=G_1\cup G_2\cup G_3$ where a diffeomorphism $f_1\in G_1$ has exactly one saddle and three nodes (one fixed source and two periodic sinks); a diffeomorphism $f_2\in G_2$ has exactly two saddles and four nodes (two periodic sources and two periodic sinks) and a diffeomorphism $f_3\in G_3$ is topologically conjugate to a diffeomorphism $f_1^{-1}$. The main result is the proof that every diffeomorphism $f\in G$ can be connected to the ``source-sink'' diffeomorphism by a stable arc and this arc contains at most finitely many points of period-doubling bifurcations.

In this paper we find all possible periodic data for orientation preserving gradient-like diffeomorphisms of orientable surfaces with one saddle orbit. We also construct a system of this class for every admissible collection of periodic data.

In this paper, we consider the class of quasiconvex functions and its proper subclass of conic functions. The integer minimization problem of these functions is considered, assuming that the optimized function is defined by the comparison oracle. We will show that there is no a polynomial algorithm on log R to optimize quasiconvex functions in the ball of radius R using only the comparison oracle. On the other hand, if the optimized function is conic, then we show that there is a polynomial on log R algorithm (the dimension is fixed). We also present an exponential on the dimension lower bound for the oracle complexity of the conic function integer optimization problem. Additionally, we give examples of known problems that can be polynomially reduced to the minimization problem of functions in our classes.

Gaussian graphical model selection is a statistical problem

that identifies the Gaussian graphical model from observations. Existing

Gaussian graphical model selection methods focus on the error rate

for incorrect edge inclusion. However, when comparing statistical procedures,

it is also important to take into account the error rate for

incorrect edge exclusion. To handle this issue we consider the graphical

model selection problem in the framework of multiple decision theory.We

show that the statistical procedure based on simultaneous inference with

UMPU individual tests is optimal in the class of unbiased procedures.

The paper addresses the issue of insufficient speed of image recognition methods if the number of classes is rather large. We propose the novel algorithm based on sequential three-way decisions and a formal description of granular computing. Each image is associated with principal component scores of the high-dimensional features extracted by deep convolution neural network. Low number of principal components stand for the coarse-grained granules, while fine-grained granules include all components. Initially, first principal components of an observed image and all training instances are matched at the coarsest granularity level. Next, negative decisions are defined by using the multiple comparisons theory and asymptotic distribution of the Kullback-Leibler divergence. Namely, the distance factors (ratios of the minimum distance and all other distances) are evaluated. The set of negative decisions is populated by the instances, for which the distance factors exceed a certain threshold. The images from this set are not examined at the next levels with finer granularity. In the experiments unconstrained face recognition and image categorisation are considered using the state-of-the-art deep learning-based feature extractors. We demonstrate that the proposed approach decreases the running time in 1.5–10 times when compared to conventional classifiers and the known multi-class decision-theoretic rough sets.

In this paper we consider the problem of vehicle assignment in heterogeneous fleet site-dependent Vehicle Routing Problems (VRP) with split deliveries. In such VRP problems vehicles can have different capacities, fixed and travel costs, and site-dependency constraints limit for every customer a set of vehicles, which can serve it. The Vehicle Assignment Problem (VAP) arises in heuristic and exact algorithms, when vehicles are assigned to all customers or one customer is added to the current vehicle route. The VAP objective is to minimize the total assignment cost while the cost of assigning a vehicle to a customer is computed in some heuristic way. Without split deliveries, when a delivery to a customer cannot be split between two vehicles, the VAP problem is modeled in literature as the Generalized Assignment Problem. We demonstrate that allowing split deliveries reduces the VAP to the Hitchcock Transportation Problem, which can be efficiently solved with Transportation Simplex Methods. We also consider a special case, which is not rare in practice, when all customers are partitioned into classes, where customers have the same set of vehicles able to serve them, and the vehicle sets for these classes form a sequence of nested sets. We show that in this case if the cost per demand unit of assigning a vehicle to a customer depends only on the vehicle, then the VAP problem can be solved by a linear algorithm.

In this paper we focus on the problem of user prediction in visual product recommender systems based on the given set of photos of products purchased by the user previously. We studied neural aggregation methods for image features extracted by the deep neural networks. We propose the novel two-stage algorithm. At first, the image features are learned by fine-tuning the convolutional neural network. At the second stage, we sequentially combine the known learnable pooling techniques (neural aggregation network and context gating) in order to compute a single descriptor for particular user as a weighted average of image features. It is experimentally shown for the Amazon product dataset that F1-measure for our approach is more than 20% higher when compared to conventional averaging of the feature vector.

The Cell Formation Problem (CFP) is an important optimisation problem in manufacturing. It has been introduced in the Group Technology (GT) and its goal is to group machines and parts processed on them into production cells minimising the movement of parts to other cells for processing and maximising for each cell the loading of its machines with operations on its parts. We consider one of the computationally hardest formulations of this problem – the CFP with a variable number of cells and the grouping efficacy objective, which is a fractional function. The CFP literature contains many heuristic algorithms, but only a small number of exact approaches especially for this formulation. In the current paper, we present an exact branch-and-bound algorithm for the same hard CFP formulation. To linearise the fractional objective function, we apply the Dinkelbach approach. We have been able to solve 24 of the 35 instances from the well known GT benchmark. For the remaining 11 instances, the difference in the grouping efficacy with the best known solutions is less than 2.6%.