All issues
- 2024 Vol. 16
- 2023 Vol. 15
- 2022 Vol. 14
- 2021 Vol. 13
- 2020 Vol. 12
- 2019 Vol. 11
- 2018 Vol. 10
- 2017 Vol. 9
- 2016 Vol. 8
- 2015 Vol. 7
- 2014 Vol. 6
- 2013 Vol. 5
- 2012 Vol. 4
- 2011 Vol. 3
- 2010 Vol. 2
- 2009 Vol. 1
-
On the stochastic gradient descent matrix factorization in application to the supervised classification of microarrays
Computer Research and Modeling, 2013, v. 5, no. 2, pp. 131-140Citations: 4 (RSCI).Microarray datasets are highly dimensional, with a small number of collected samples in comparison to thousands of features. This poses a significant challenge that affects the interpretation, applicability and validation of the analytical results. Matrix factorizations have proven to be a useful method for describing data in terms of a small number of meta-features, which reduces noise, while still capturing the essential features of the data. Three novel and mutually relevant methods are presented in this paper: 1) gradient-based matrix factorization with two adaptive learning rates (in accordance with the number of factor matrices) and their automatic updates; 2) nonparametric criterion for the selection of the number of factors; and 3) nonnegative version of the gradient-based matrix factorization which doesn't require any extra computational costs in difference to the existing methods. We demonstrate effectiveness of the proposed methods to the supervised classification of gene expression data.
-
The error accumulation in the conjugate gradient method for degenerate problem
Computer Research and Modeling, 2021, v. 13, no. 3, pp. 459-472In this paper, we consider the conjugate gradient method for solving the problem of minimizing a quadratic function with additive noise in the gradient. Three concepts of noise were considered: antagonistic noise in the linear term, stochastic noise in the linear term and noise in the quadratic term, as well as combinations of the first and second with the last. It was experimentally obtained that error accumulation is absent for any of the considered concepts, which differs from the folklore opinion that, as in accelerated methods, error accumulation must take place. The paper gives motivation for why the error may not accumulate. The dependence of the solution error both on the magnitude (scale) of the noise and on the size of the solution using the conjugate gradient method was also experimentally investigated. Hypotheses about the dependence of the error in the solution on the noise scale and the size (2-norm) of the solution are proposed and tested for all the concepts considered. It turned out that the error in the solution (by function) linearly depends on the noise scale. The work contains graphs illustrating each individual study, as well as a detailed description of numerical experiments, which includes an account of the methods of noise of both the vector and the matrix.
-
Cluster method of mathematical modeling of interval-stochastic thermal processes in electronic systems
Computer Research and Modeling, 2020, v. 12, no. 5, pp. 1023-1038A cluster method of mathematical modeling of interval-stochastic thermal processes in complex electronic systems (ES), is developed. In the cluster method, the construction of a complex ES is represented in the form of a thermal model, which is a system of clusters, each of which contains a core that combines the heat-generating elements falling into a given cluster, the cluster shell and a medium flow through the cluster. The state of the thermal process in each cluster and every moment of time is characterized by three interval-stochastic state variables, namely, the temperatures of the core, shell, and medium flow. The elements of each cluster, namely, the core, shell, and medium flow, are in thermal interaction between themselves and elements of neighboring clusters. In contrast to existing methods, the cluster method allows you to simulate thermal processes in complex ESs, taking into account the uneven distribution of temperature in the medium flow pumped into the ES, the conjugate nature of heat exchange between the medium flow in the ES, core and shells of clusters, and the intervalstochastic nature of thermal processes in the ES, caused by statistical technological variation in the manufacture and installation of electronic elements in ES and random fluctuations in the thermal parameters of the environment. The mathematical model describing the state of thermal processes in a cluster thermal model is a system of interval-stochastic matrix-block equations with matrix and vector blocks corresponding to the clusters of the thermal model. The solution to the interval-stochastic equations are statistical measures of the state variables of thermal processes in clusters - mathematical expectations, covariances between state variables and variance. The methodology for applying the cluster method is shown on the example of a real ES.
-
Experimental comparison of PageRank vector calculation algorithms
Computer Research and Modeling, 2023, v. 15, no. 2, pp. 369-379Finding PageRank vector is of great scientific and practical interest due to its applicability to modern search engines. Despite the fact that this problem is reduced to finding the eigenvector of the stochastic matrix $P$, the need for new algorithms is justified by a large size of the input data. To achieve no more than linear execution time, various randomized methods have been proposed, returning the expected result only with some probability close enough to one. We will consider two of them by reducing the problem of calculating the PageRank vector to the problem of finding equilibrium in an antagonistic matrix game, which is then solved using the Grigoriadis – Khachiyan algorithm. This implementation works effectively under the assumption of sparsity of the input matrix. As far as we know, there are no successful implementations of neither the Grigoriadis – Khachiyan algorithm nor its application to the task of calculating the PageRank vector. The purpose of this paper is to fill this gap. The article describes an algorithm giving pseudocode and some details of the implementation. In addition, it discusses another randomized method of calculating the PageRank vector, namely, Markov chain Monte Carlo (MCMC), in order to compare the results of these algorithms on matrices with different values of the spectral gap. The latter is of particular interest, since the magnitude of the spectral gap strongly affects the convergence rate of MCMC and does not affect the other two approaches at all. The comparison was carried out on two types of generated graphs: chains and $d$-dimensional cubes. The experiments, as predicted by the theory, demonstrated the effectiveness of the Grigoriadis – Khachiyan algorithm in comparison with MCMC for sparse graphs with a small spectral gap value. The written code is publicly available, so everyone can reproduce the results themselves or use this implementation for their own needs. The work has a purely practical orientation, no theoretical results were obtained.
-
Statistical analysis of bigrams of specialized texts
Computer Research and Modeling, 2020, v. 12, no. 1, pp. 243-254The method of the stochastic matrix spectrum analysis is used to build an indicator that allows to determine the subject of scientific texts without keywords usage. This matrix is a matrix of conditional probabilities of bigrams, built on the statistics of the alphabet characters in the text without spaces, numbers and punctuation marks. Scientific texts are classified according to the mutual arrangement of invariant subspaces of the matrix of conditional probabilities of pairs of letter combinations. The separation indicator is the value of the cosine of the angle between the right and left eigenvectors corresponding to the maximum and minimum eigenvalues. The computational algorithm uses a special representation of the dichotomy parameter, which is the integral of the square norm of the resolvent of the stochastic matrix of bigrams along the circumference of a given radius in the complex plane. The tendency of the integral to infinity testifies to the approximation of the integration circuit to the eigenvalue of the matrix. The paper presents the typical distribution of the indicator of identification of specialties. For statistical analysis were analyzed dissertations on the main 19 specialties without taking into account the classification within the specialty, 20 texts for the specialty. It was found that the empirical distributions of the cosine of the angle for the mathematical and Humanities specialties do not have a common domain, so they can be formally divided by the value of this indicator without errors. Although the body of texts was not particularly large, nevertheless, in the case of arbitrary selection of dissertations, the identification error at the level of 2 % seems to be a very good result compared to the methods based on semantic analysis. It was also found that it is possible to make a text pattern for each of the specialties in the form of a reference matrix of bigrams, in the vicinity of which in the norm of summable functions it is possible to accurately identify the theme of the written scientific work, without using keywords. The proposed method can be used as a comparative indicator of greater or lesser severity of the scientific text or as an indicator of compliance of the text to a certain scientific level.
-
Nonsmooth Distributed Min-Max Optimization Using the Smoothing Technique
Computer Research and Modeling, 2023, v. 15, no. 2, pp. 469-480Distributed saddle point problems (SPPs) have numerous applications in optimization, matrix games and machine learning. For example, the training of generated adversarial networks is represented as a min-max optimization problem, and training regularized linear models can be reformulated as an SPP as well. This paper studies distributed nonsmooth SPPs with Lipschitz-continuous objective functions. The objective function is represented as a sum of several components that are distributed between groups of computational nodes. The nodes, or agents, exchange information through some communication network that may be centralized or decentralized. A centralized network has a universal information aggregator (a server, or master node) that directly communicates to each of the agents and therefore can coordinate the optimization process. In a decentralized network, all the nodes are equal, the server node is not present, and each agent only communicates to its immediate neighbors.
We assume that each of the nodes locally holds its objective and can compute its value at given points, i. e. has access to zero-order oracle. Zero-order information is used when the gradient of the function is costly, not possible to compute or when the function is not differentiable. For example, in reinforcement learning one needs to generate a trajectory to evaluate the current policy. This policy evaluation process can be interpreted as the computation of the function value. We propose an approach that uses a smoothing technique, i. e., applies a first-order method to the smoothed version of the initial function. It can be shown that the stochastic gradient of the smoothed function can be viewed as a random two-point gradient approximation of the initial function. Smoothing approaches have been studied for distributed zero-order minimization, and our paper generalizes the smoothing technique on SPPs.
Keywords: convex optimization, distributed optimization.
Indexed in Scopus
Full-text version of the journal is also available on the web site of the scientific electronic library eLIBRARY.RU
The journal is included in the Russian Science Citation Index
The journal is included in the RSCI
International Interdisciplinary Conference "Mathematics. Computing. Education"