# Publications

This book constitutes the proceedings of the 8th International Conference on Analysis of Images, Social Networks and Texts, AIST 2019, held in Kazan, Russia, in July 2019.

The 24 full papers and 10 short papers were carefully reviewed and selected from 134 submissions (of which 21 papers were rejected without being reviewed). The papers are organized in topical sections on general topics of data analysis; natural language processing; social network analysis; analysis of images and video; optimization problems on graphs and network structures; analysis of dynamic behaviour through event data.

This volume contains the refereed proceedings of the 8th International Conference on Analysis of Images, Social Networks, and Texts (AIST 2019). The previous conferences during 2012–2018 attracted a significant number of data scientists – students, researchers, academics, and engineers working on interdisciplinary data analysis of images, texts, and social networks.

For a graph *G* and a positive integer *k*, a subset *C* of vertices of *G* is called a *k*-path vertex cover if *C* intersects all paths of *k* vertices in *G*. The cardinality of a minimum *k*-path vertex cover is denoted by *β_{**P**_k*}(*G*). For a graph *G* and a positive integer *k*, a subset *M* of pairwise vertex-disjoint paths of *k* vertices in *G* is called a *k*-path packing. The cardinality of a maximum *k*-path packing is denoted by *μ**_{P**_k}*(*G*). In this paper, we describe some graphs, having equal values of *β_{**P**_k*} and *μ_{**P**_k*}, for *k*≥5, and present polynomial-time algorithms of finding a minimum *k*-path vertex cover and a maximum *k*-path packing in such graphs.

The goal of the study is to increase the computation efficiency of the face recognition that uses feature vectors to describe facial images on photos and videos. These high-dimensional feature vectors are nowadays produced by convolutional neural networks. The methods to aggregate the features generated for each video frame are used to process the video sequences. A novel hierarchical recognition algorithm is proposed. In contrast to known approaches our algorithm seeks the nearest neighbors only among reference images of most reliable classes selected at the preceding stage to carry out the sequential analysis of a more detailed description level (with a greater dimensionality of the feature vector). At each stage principal components are compared, the number of the components being chosen according to a given portion of explained variations. Datasets like Labeled Faces in the Wild, YouTubeFaces, IAPRA, Jenus Benchmark C and different neural-net face descriptors are used to compare the algorithm with other methods. In contrast with the conventional nearest-neighbor method, the proposed approach is shown to allow a 2- to 16-times reduction of the classifier running time.

Independent domination is one of the rare problems for which the complexity of weighted and unweighted versions is known to be different in some classes of graphs. Trying to better understand the gap between the two versions of the problem, in the present paper we prove two complexity results. First, we extend NP-hardness of the weighted version in a certain class to the unweighted case. Second, we strengthen polynomial-time solvability of the unweighted version in the class of P2+P3-free graphs to the weighted case. This result is tight in the sense that both versions are NP-hard in the class of P3+P3-free graphs, i.e. P3+P3 is a minimal graph forbidding of which produces an NP-hard case for both versions of the problem.

Under consideration is the Successive Minima Problem for the 2-dimensional lattice with respect to the order given by some conic function f.We propose an algorithm with complexity of 3.32*log_2(R)+O(1) calls to the comparison oracle of f, where R is the radius of the circular searching area, while the best known lower oracle complexity bound is 3*log_2(R) + O(1).We give an efficient criterion for checking that given vectors of a 2-dimensional lattice are successive minima and form a basis for the lattice.Moreover, we show that the similar Successive Minima Problem for dimension n can be solved by an algorithm with at most (O(n))^{2n}*log_2(R) calls to the comparison oracle. The results of the article can be applied to searching successive minima with respect to arbitrary convex functions defined by the comparison oracle.

This proceedings presents the result of the 8th International Conference in Network Analysis, held at the Higher School of Economics, Moscow, in May 2018. The conference brought together scientists, engineers, and researchers from academia, industry, and government. Contributions in this book focus on the development of network algorithms for data mining and its applications. Researchers and students in mathematics, economics, statistics, computer science, and engineering find this collection a valuable resource filled with the latest research in network analysis. Computational aspects and applications of large-scale networks in market models, neural networks, social networks, power transmission grids, maximum clique problem, telecommunication networks, and complexity graphs are included with new tools for efficient network analysis of large-scale networks. Machine learning techniques in network settings including community detection, clustering, and biclustering algorithms are presented with applications to social network analysis.

In the current paper we provide a proof of NP-completeness for the Cell Formation Problem (CFP) with the fractional grouping efficacy objective function. First the CFP with a linear objective function is considered. Following the ideas of Pinheiro et al. (2016) we show that it is equivalent to the Bicluster Graph Editing Problem (BGEP), which is known to be NP-complete due to the reduction from the 3-Exact 3-Cover Problem – 3E3CP (Amit, 2004). Then we suggest a polynomial reduction of the CFP with the linear objective to the CFP with the grouping efficacy objective. It proves the NP-completeness of this fractional CFP formulation. Along with the NP-status our paper presents important connections of the CFP with the BGEP and 3E3CP. Such connections could be used for ”transferring” of known theoretical properties, efficient algorithms, polynomial cases, and other features of well-studied graph editing and exact covering problems to the CFP.

In the present paper we prove that frames of one-dimensional separatrices in basins of sources of structurally stable 3-diffeomorhisms with two-dimensional expanding attractor are trivially embedded. This result plays an important part in the classification of such systems. The classification was given by V. Grines and E. Zhuzhoma with assumption that all one-dimensional separatrices are trivially embedded into the ambient manifold but the proof of the assumption was never given. Thus, the present paper is a necessary and nontrivial element of the classification of structurally stable diffeomorphisms with codimension one expanding attractors.

We consider a class of sequential network interdiction problem settings where the interdictor has incomplete initial information about the network while the evader has complete knowledge of the network including its structure and arc costs. In each decision epoch, the interdictor can block (for the duration of the epoch) at most k arcs known to him. By observing the evader's actions, the interdictor learns about the network structure and costs and thus, can adjust his actions in subsequent decision epochs. It is known from the literature that if the evader is greedy (i.e., she uses the shortest available path in each decision epochs), then under some assumptions the greedy interdiction policies that block k-most vital arcs in each epoch are efficient and have a finite regret. In this paper, we consider the evader's perspective and explore strategic evasion policies under the assumption that the interdictor is greedy. We first study the theoretical computational complexity of the evader's problem. Then we derive basic constructive properties of optimal evasion policies for two decision epochs when the interdictor has no initial information about the network structure. These properties are then exploited for the design of a heuristic algorithm for a strategic evader in a general setting with an arbitrary time horizon and any initial information available to the interdictor. Our computational experiments demonstrate that the proposed heuristic consistently outperforms the greedy evasion policy on several classes of synthetic network instances under perfect and noisy information feedbacks. Finally, some interesting insights from our theoretical and computational results conclude the paper.

If the training data set in image recognition task is not very large, the feature extraction with a convolutional neural network is usually applied. Here, we focus on the nonparametric classification of extracted feature vectors using the probabilistic neural network (PNN). The latter is characterized by the high runtime and memory space complexity. We propose to overcome these drawbacks by replacing the exponential activation function in the Gaussian kernel to the complex exponential functions. Such complex nonlinearities make it possible to accurately approximate the unknown density function using the network with the number of neurons proportional to only cubic root of the database size. As a result, the proposed approach decreases the runtime and memory complexities of the PNN without losing its main advantages, namely, fast training and convergence to the Bayesian decision. In the experimental study, we describe a protocol for comparing recognition methods using the well-known visual object category data sets in the context of the small sample size problem. It has been experimentally shown that our approach rapidly obtains accurate decisions when compared to the known classifiers including the baseline PNN.

For any *n*, in the set of *n*-vertex trees such that any two leaves have no common adjacent vertex, we describe the trees with the smallest number of maximal independent sets.

In this paper we consider several efficient techniques for multi-label classification of a set of images. We propose the two-stage approach. First, we perform transfer learning on a pretrained convolutional neural network in order to use it as a feature extractor. Next, the feature vectors of each image from a given input set are combined into a single vector using the modification of the attention-based neural aggregation module. In the experimental study we examine the classification of the user's interests based on the photos of products purchased by this user using the Amazon Product Home and Kitchen dataset. It was shown that one of the most highest F1-measure (0.87 for 15 recommendations) is achieved for one-layer attention block with squeezed visual features. It is emphasized that the resulting model including the MobileNet feature extractor has 16 Mb size only.

We developed a new method for measuring the pitch frequency of speech signals with elevated noise immunity. The problem of protection against intense background noise is solved in this method by the frequency selection of vocalized segments of speech signals according to a scheme with comb filter of interperiodic accumulation. The efficiency of the method is analyzed both theoretically and experimentally with the help of a multichannel frequency meter intended for the acoustic speech analysis. It is shown that, for a signal-to-noise ratio of 10 dB and higher, the error of the method does not exceed 2%.

This book constitutes the post-conference proceedings of the 8th International Conference on Analysis of Images, Social Networks and Texts, AIST 2019, held in Kazan, Russia, in July 2019.

The 27 full and 8 short papers were carefully reviewed and selected from 134 submissions (of which 21 papers were automatically rejected without being reviewed). The papers are organized in topical sections on general topics of data analysis; natural language processing; social network analysis; analysis of images and video; optimization problems on graphs and network structures; and analysis of dynamic behavior through event data.

We consider reformulations of a class of bilevel linear integer programs as equivalent linear mixed-integer programs (linear MIPs). Themost common technique to reformulate such programs as a single-level problem is to replace the lower-level linear optimization problem by Karush–Kuhn–Tucker (KKT) optimality conditions. Employing the strong duality (SD) property of linear programs is an alternative method to perform such transformations. In this note, we describe two SD-based reformulations where the key idea is to exploit the binary expansion of upper-level integer variables. We compare the performance of an offthe- shelfMIP solver with the SD-based reformulations against the KKT-based one and show that the SD-based approaches can lead to orders of magnitude reduction in computational times for certain classes of instances