-
On the stochastic gradient descent matrix factorization in application to the supervised classification of microarrays
Computer Research and Modeling, 2013, v. 5, no. 2, pp. 131-140. Citations: 4 (RSCI).
Microarray datasets are highly dimensional, with a small number of collected samples compared to thousands of features. This poses a significant challenge affecting the interpretation, applicability, and validation of analytical results. Matrix factorizations have proven to be a useful method for describing data in terms of a small number of meta-features, which reduces noise while still capturing the essential features of the data. Three novel and mutually relevant methods are presented in this paper: 1) gradient-based matrix factorization with two adaptive learning rates (one per factor matrix) and their automatic updates; 2) a nonparametric criterion for selecting the number of factors; and 3) a nonnegative version of the gradient-based matrix factorization that requires no extra computational cost, in contrast to existing methods. We demonstrate the effectiveness of the proposed methods on the supervised classification of gene expression data.
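The two-rate gradient update described above can be sketched as follows; the bold-driver adaptation of the learning rates and all constants here are illustrative assumptions, not the paper's exact rule:

```python
import numpy as np

def sgd_factorize(X, rank, iters=200, seed=0):
    """Gradient-based factorization X ~ A @ B with a separate adaptive
    learning rate for each factor matrix (bold-driver style updates;
    a hypothetical stand-in for the paper's update rule)."""
    rng = np.random.default_rng(seed)
    n, m = X.shape
    A = 0.1 * rng.standard_normal((n, rank))
    B = 0.1 * rng.standard_normal((rank, m))
    lr_a = lr_b = 0.01                 # one rate per factor matrix
    prev = np.linalg.norm(X - A @ B)
    for _ in range(iters):
        E = X - A @ B                  # residual
        gA, gB = -E @ B.T, -A.T @ E    # gradients of 0.5 * ||E||_F^2
        A_new, B_new = A - lr_a * gA, B - lr_b * gB
        err = np.linalg.norm(X - A_new @ B_new)
        if err < prev:                 # accept the step, grow both rates
            A, B, prev = A_new, B_new, err
            lr_a *= 1.05
            lr_b *= 1.05
        else:                          # reject the step, shrink both rates
            lr_a *= 0.5
            lr_b *= 0.5
    return A, B
```

A nonnegative variant would additionally project A and B onto the nonnegative orthant after each accepted step.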
-
Direct multiplicative methods for sparse matrices. Linear programming
Computer Research and Modeling, 2017, v. 9, no. 2, pp. 143-165. Views (last year): 10. Citations: 2 (RSCI).
Multiplicative methods for sparse matrices are best suited to reducing the complexity of the linear-system solves performed at each iteration of the simplex method. The constraint matrix in these problems is sparsely populated with nonzero elements, which makes it possible to obtain multipliers whose main columns are also sparse, so that multiplying a vector by a multiplier has complexity proportional to the number of nonzero elements of that multiplier. Moreover, the multiplicative representation is easily corrected on the transition to an adjacent basis. Improving the efficiency of such methods requires reducing the number of nonzero elements in the multiplicative representation. However, at each iteration of the algorithm another multiplier is appended to the sequence, so the complexity of multiplication grows linearly with the length of the sequence. One therefore has to recompute the inverse matrix from the identity from time to time, yet this does not solve the problem in general. In addition, the set of multipliers is a sequence of structures whose size is inconveniently large and not known precisely. Multiplicative methods do not take into account the high degree of sparseness of the original matrices and the equality constraints, and they require an initial basic feasible solution of the problem; consequently, they allow neither a reduction of the dimensionality of the linear programming problem nor a regular compression procedure: reducing the dimensionality of the multipliers and eliminating nonzero elements from all the main columns of multipliers obtained at previous iterations.
Thus, the development of numerical methods for solving linear programming problems that overcome, or substantially reduce, these shortcomings of simplex-method implementations is a current problem of computational mathematics.
This paper presents an approach to constructing numerically stable direct multiplicative methods for solving linear programming problems that take into account the sparseness of matrices stored in packed form. The advantage of the approach is that it reduces dimensionality and minimizes the fill-in of the main rows of multipliers without compromising the accuracy of the results, and no changes are made to the position of the next processed row of the matrix, which allows static data storage formats to be used.
As a direct continuation of this work, it is proposed that the basis for constructing a direct multiplicative algorithm for setting the descent direction in Newton methods for unconstrained optimization be a modification of the direct multiplicative method of linear programming, obtained by integrating one of the existing techniques for constructing an essentially positive definite matrix of second derivatives.
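The core idea of a multiplicative (product-form) representation can be illustrated with eta matrices as used in textbook simplex implementations; this generic sketch is not the direct multiplicative method of the paper:

```python
import numpy as np

def eta_from_entering(d, pivot):
    """Build the multiplier column for a basis change in the simplex
    method: d = B_inv @ a_enter is the entering column expressed in the
    current basis. The new inverse is E @ B_inv, where E is the identity
    with column `pivot` replaced by the returned vector."""
    e = -d / d[pivot]
    e[pivot] = 1.0 / d[pivot]
    return e

def apply_eta(e, pivot, x):
    """Multiply x by the eta matrix (identity with column `pivot`
    replaced by e); in a sparse implementation the cost is proportional
    to the number of nonzeros of e."""
    xp = x[pivot]
    x = x + e * xp          # adds e_i * x_pivot to every component
    x[pivot] = e[pivot] * xp
    return x
```

Each simplex iteration appends one such multiplier, which is exactly why the representation grows and periodic refactorization becomes necessary.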
-
Direct multiplicative methods for sparse matrices. Quadratic programming
Computer Research and Modeling, 2018, v. 10, no. 4, pp. 407-420. Views (last year): 32.
A numerically stable direct multiplicative method is considered for solving systems of linear equations that takes into account the sparseness of matrices stored in packed form. The advantage of the method is that the Cholesky factors of the positive definite matrix of the system are computed, and the system is solved, within a single procedure. It also makes it possible to minimize the fill-in of the main rows of multipliers without losing accuracy, and no changes are made to the position of the next processed row of the matrix, which allows static data storage formats to be used. Solving a system of linear equations by the direct multiplicative algorithm is, like the solution via LU decomposition, simply another scheme for implementing Gaussian elimination.
The computation of the Cholesky factors of the positive definite system matrix and the solution of the system underlie a new mathematical formulation of the unconstrained quadratic programming problem and a new, quite simple form of the necessary and sufficient optimality conditions. These are used in this paper to construct a new mathematical formulation of the quadratic programming problem on a polyhedral set of constraints: the problem of finding the minimum distance between the origin of coordinates and the polyhedral boundary defined by the set of constraints, by means of linear algebra and multidimensional geometry.
To determine the distance, it is proposed to apply a known exact method based on solving systems of linear equations whose dimension does not exceed the number of variables of the objective function. The distances are determined by constructing perpendiculars to the faces of the polyhedron of various dimensions. To reduce the number of faces examined, the method uses a special order of enumerating them: only the faces containing the vertex closest to the point of the unconstrained extremum, and visible from that point, are investigated. If there are several equidistant nearest vertices, the face containing all of them is investigated, together with the faces of smaller dimension that share at least two nearest vertices with it.
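A dense sketch of the Cholesky-based solve that the method builds on (the paper works with sparse packed storage, which this illustration does not attempt):

```python
import numpy as np

def cholesky_solve(A, b):
    """Solve A x = b for a symmetric positive definite A via the
    factorization A = L L^T, followed by forward and back substitution.
    A dense illustration only."""
    n = len(b)
    L = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1):
            s = A[i, j] - L[i, :j] @ L[j, :j]
            L[i, j] = np.sqrt(s) if i == j else s / L[j, j]
    # forward substitution: L y = b
    y = np.zeros(n)
    for i in range(n):
        y[i] = (b[i] - L[i, :i] @ y[:i]) / L[i, i]
    # back substitution: L^T x = y
    x = np.zeros(n)
    for i in reversed(range(n)):
        x[i] = (y[i] - L[i + 1:, i] @ x[i + 1:]) / L[i, i]
    return x
```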
-
Four-factor computing experiment for the random walk on a two-dimensional square field
Computer Research and Modeling, 2017, v. 9, no. 6, pp. 905-918. Views (last year): 21.
Nowadays random search has become a widespread and effective tool for solving various complex optimization and adaptation problems. In this work, the average duration of a random search for one object by another on a square field is studied as a function of several factors. The problem was solved by conducting a full experiment with 4 factors and an orthogonal plan with 54 rows. For each row, the initial conditions and the cellular automaton transition rules were simulated, and the duration of the search for one object by the other was measured. As a result, a regression model of the average duration of the random search was constructed as a function of the four factors considered, which specify the initial positions of the two objects and the conditions of their movement and detection. The most significant of the considered factors determining the average search time were identified, and the constructed model was interpreted in terms of the random search problem. An important result of the work is that the model reveals, both qualitatively and quantitatively, the influence of the initial positions of the objects, the lattice size, and the transition rules on the average search duration. It is shown that initial proximity of the objects on the lattice does not guarantee a quick search if each of them moves. In addition, it is quantitatively estimated how many times the average search time can increase or decrease when the speed of the searching object increases by 1 unit, and also when the field size increases by 1 unit, for different initial positions of the two objects. An exponential growth of the number of search steps with increasing lattice size, with the other factors fixed, is revealed.
The conditions for the greatest increase in the average search duration are found: the maximum distance between the objects combined with the immobility of one of them. In this case, changing the field size by 1 unit (for example, from $4 \times 4$ to $5 \times 5$) can increase the average search duration by a factor of $e^{1.69} \approx 5.42$. The problem presented in the work may be relevant for applications both in ensuring state security and, for example, in queueing theory.
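The search experiment can be imitated with a simple Monte Carlo model; the movement rules below (uniformly random unit moves with clipping at the field boundary) are an illustrative assumption, not the paper's cellular automaton transition table:

```python
import random

def mean_search_time(n, speed_a=1, speed_b=0, trials=500,
                     max_steps=10000, seed=1):
    """Monte Carlo estimate of the mean number of steps until a searching
    object meets a target on an n x n field. The searcher starts in one
    corner, the target in the opposite one (maximally distant start)."""
    random.seed(seed)
    moves = [(0, 1), (0, -1), (1, 0), (-1, 0), (0, 0)]
    total = 0
    for _ in range(trials):
        ax, ay = 0, 0
        bx, by = n - 1, n - 1
        for step in range(1, max_steps + 1):
            for _ in range(speed_a):               # searcher moves
                dx, dy = random.choice(moves)
                ax = min(n - 1, max(0, ax + dx))
                ay = min(n - 1, max(0, ay + dy))
            for _ in range(speed_b):               # target moves
                dx, dy = random.choice(moves)
                bx = min(n - 1, max(0, bx + dx))
                by = min(n - 1, max(0, by + dy))
            if (ax, ay) == (bx, by):               # detection: same cell
                break
        total += step
    return total / trials
```

Averaging such runs over a designed set of factor combinations is what yields the regression model described above.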
-
Difference scheme for solving problems of hydrodynamics for large grid Peclet numbers
Computer Research and Modeling, 2019, v. 11, no. 5, pp. 833-848.
The paper discusses the development and application of a method that accounts for the fullness of rectangular cells with material substance, in particular a liquid, in order to increase the smoothness and accuracy of finite-difference solutions of hydrodynamic problems with a complex boundary-surface shape. Two problems of computational hydrodynamics are considered to study the capabilities of the proposed difference schemes: the spatially two-dimensional flow of a viscous fluid between two coaxial semi-cylinders, and the transfer of substances between coaxial semi-cylinders. The diffusion and convection operators were discretized on the basis of the integro-interpolation method, both with and without taking cell fullness into account. For the diffusion–convection problem at large grid Peclet numbers, it is proposed to use a difference scheme that takes the cell fullness function into account, together with a scheme based on a linear combination of the Upwind and Standard Leapfrog difference schemes with weight coefficients obtained by minimizing the approximation error at small Courant numbers. An analytical solution describing the Couette–Taylor flow is used as a reference to estimate the accuracy of the numerical solution. The relative error of the calculations reaches 70% when rectangular grids are used directly (stepwise approximation of the boundaries); under the same conditions the proposed method reduces the error to 6%. It is shown that refining a rectangular grid by a factor of 2–8 in each spatial direction does not yield the accuracy achieved by numerical solutions obtained with cell fullness taken into account.
The proposed difference schemes based on a linear combination of the Upwind and Standard Leapfrog schemes, with weight coefficients 2/3 and 1/3 respectively obtained by minimizing the approximation error, have lower grid viscosity for the diffusion–convection problem and consequently describe the solution behavior more precisely at large grid Peclet numbers.
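For a one-dimensional model advection equation, the weighted combination of the two schemes can be sketched as follows (periodic boundaries; an illustration of the 2/3 Upwind + 1/3 Standard Leapfrog weighting only, not the authors' full diffusion–convection solver):

```python
import numpy as np

def hybrid_advect(u0, c, dx, dt, steps):
    """Advance u_t + c u_x = 0 (c > 0) with the scheme formed as the
    combination 2/3 * Upwind + 1/3 * Standard Leapfrog on a periodic
    grid. The Courant number C = c*dt/dx is assumed small."""
    C = c * dt / dx
    u_prev = u0.copy()
    # first step: pure upwind, since the three-level leapfrog
    # needs two previous time layers
    u = u0 - C * (u0 - np.roll(u0, 1))
    for _ in range(steps - 1):
        upwind = u - C * (u - np.roll(u, 1))
        leapfrog = u_prev - C * (np.roll(u, -1) - np.roll(u, 1))
        u_prev, u = u, (2.0 * upwind + leapfrog) / 3.0
    return u
```

The upwind part contributes numerical dissipation and the leapfrog part reduces it, which is the mechanism behind the lower grid viscosity mentioned above.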
-
Application of a hybrid large-particle method to the computation of the interaction of a shock wave with a gas suspension layer
Computer Research and Modeling, 2020, v. 12, no. 6, pp. 1323-1338.
For a non-homogeneous model transport equation with source terms, a stability analysis of a linear hybrid scheme (a combination of upwind and central approximations) is performed. Stability conditions are obtained that depend on the hybridity parameter, the source intensity factor (the product of the intensity and the time step), and the weight coefficient of the linear combination of the source power on the lower and upper time layers. In the nonlinear case, for the velocity- and temperature-non-equilibrium equations of gas suspension motion, the linear stability analysis was confirmed by computation. It is established that the maximum permissible Courant number of the hybrid large-particle method of second-order accuracy in space and time, with implicit treatment of friction and heat exchange between gas and particles, does not depend on the intensity factor of interphase interactions, the grid spacing, or the phase relaxation times (K-stability). In the traditional case of an explicit treatment of the source terms, when the dimensionless intensity factor exceeds 10, there is a catastrophic (by several orders of magnitude) decrease in the maximum permissible Courant number, at which the computational time step becomes unacceptably small.
Based on the basic relations of the Riemann problem in an equilibrium heterogeneous medium, we obtained an asymptotically exact self-similar solution of the problem of the interaction of a shock wave with a gas suspension layer, to which the numerical solutions of the two-velocity, two-temperature dynamics of the gas suspension converge as the size of the dispersed particles decreases.
The dynamics of a shock wave in gas and its interaction with a bounded gas suspension layer were studied for different sizes of dispersed particles: 0.1, 2, and 20 µm. The problem is characterized by the decay of two discontinuities: reflected and refracted shock waves at the left boundary of the layer, and a reflected rarefaction wave and a transmitted shock wave at the right contact boundary. The influence of relaxation processes (dimensionless phase relaxation times) on the gas suspension flow is discussed. For small particles, the velocity- and temperature-equalization times of the phases are small and the relaxation zones are sub-grid. The numerical solution at characteristic points converges to the self-similar solutions with relative accuracy $O\,(10^{-4})$.
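The difference between explicit and implicit treatment of a stiff relaxation source can be seen on the scalar model du/dt = -(u - v)/tau; this toy example (not the paper's scheme) illustrates why the implicit variant keeps the permissible time step independent of the intensity factor dt/tau:

```python
def relax_explicit(u, v, dt, tau):
    """Explicit treatment of the interphase friction/heat-exchange
    source: stable only while the intensity factor dt/tau is small."""
    return u + dt * (v - u) / tau

def relax_implicit(u, v, dt, tau):
    """Implicit treatment: solving u_new = u + dt*(v - u_new)/tau gives
    a result bounded between u and v for any dt, mirroring K-stability."""
    r = dt / tau
    return (u + r * v) / (1.0 + r)
```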
-
The invariance principle of La-Salle and mathematical models for the evolution of microbial populations
Computer Research and Modeling, 2011, v. 3, no. 2, pp. 177-190. Views (last year): 8. Citations: 3 (RSCI).
A mathematical model of the evolution of microbial populations during prolonged cultivation in a chemostat is constructed. This model generalizes a sequence of well-known mathematical models of evolution that take into account such factors of genetic variability as chromosomal mutations, mutations in plasmid genes, horizontal gene transfer, plasmid loss due to cell division, and others. A Lyapunov function for the generic evolution model is constructed. By applying LaSalle's theorem, the existence of a bounded, positively invariant, and globally attracting set in the state space of the generic mathematical model of evolution is proved, and an analytic description of this set is given. Numerical methods for estimating the number of limit sets, locating them, and investigating them further in mathematical models of evolution are discussed.
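As a minimal concrete instance of such chemostat dynamics, a basic Monod-kinetics model can be integrated numerically; the parameter values below are hypothetical, and the model omits the genetic-variability terms of the paper:

```python
def chemostat(S0=10.0, D=0.3, mu_max=1.0, K=1.0, Y=0.5,
              S=5.0, x=0.1, dt=0.01, steps=20000):
    """Euler integration of a basic chemostat: substrate S and biomass x
    with Monod growth rate mu(S) = mu_max * S / (K + S), dilution D,
    inflow concentration S0, and yield Y. Trajectories approach the
    bounded attracting equilibrium mu(S*) = D, x* = Y * (S0 - S*)."""
    for _ in range(steps):
        mu = mu_max * S / (K + S)
        dS = D * (S0 - S) - mu * x / Y   # inflow/outflow minus consumption
        dx = (mu - D) * x                # growth minus washout
        S += dt * dS
        x += dt * dx
    return S, x
```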
-
Stoichiometric synthesis of metabolic pathways
Computer Research and Modeling, 2015, v. 7, no. 6, pp. 1241-1267. Views (last year): 6. Citations: 3 (RSCI).
A vector-matrix approach to the theoretical design of metabolic pathways converting chemical compounds, viz. preset substrates, into desirable products is described. It is a mathematical basis for computer-aided generation of alternative sets of biochemical reactions executing a given substrate–product conversion. The pathways are retrieved from a database of biochemical reactions and rely on the reaction stoichiometry and on restrictions based on the irreversibility of some of the reactions. Particular attention is paid to the analysis of the interrelations of these restrictions. It is shown that the number of restrictions can be notably reduced owing to the existence of families of parallel restricting planes in the space of reaction flows. Coinciding planes of counter-directional restrictions result in fixed reaction flow values. The problem of excluding so-called futile cycles is also considered. Exploiting these factors substantially lowers the complexity of the problem and the necessary computational resources. An example of computing alternative biochemical pathways for the conversion of glucose and glycerol into succinic acid is given. It is found that for a preset "substrate–product" pair many pathways have the same high-energy bond balance.
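A toy version of the stoichiometric pathway search can be written as enumeration of nonnegative flux vectors satisfying S v = b, where irreversibility enters as v >= 0; real networks require linear programming rather than brute force, and the network below is purely illustrative:

```python
import itertools
import numpy as np

def find_pathways(S, target, max_flux=3):
    """Enumerate small nonnegative integer flux vectors v with
    S @ v == target. Rows of S are metabolites, columns are reactions;
    each solution is a candidate pathway consistent with the
    stoichiometry and the irreversibility constraints."""
    n_rxn = S.shape[1]
    hits = []
    for v in itertools.product(range(max_flux + 1), repeat=n_rxn):
        if any(v) and np.array_equal(S @ np.array(v), target):
            hits.append(v)
    return hits
```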
-
Machine learning interpretation of inter-well radiowave survey data
Computer Research and Modeling, 2019, v. 11, no. 4, pp. 675-684. Views (last year): 3.
Traditional geological prospecting methods are becoming ineffective. The exploration depth of kimberlite bodies and ore deposits has increased significantly, and the only direct exploration method is to drill a system of wells to depths that provide access to the enclosing rocks. Because drilling is expensive, the role of inter-well survey methods has grown: they allow the mean well spacing to be increased without significantly raising the probability of missing a kimberlite or ore body. The inter-well radio wave survey method is effective for finding objects of high conductivity contrast. The physics of the method is based on the dependence of electromagnetic wave propagation on the conductivity of the propagation medium. The source and the receiver of the electromagnetic radiation are electric dipoles placed in adjacent wells. Since the distance between the source and the receiver is known, the absorption coefficient of the medium can be estimated from the rate at which the radio wave amplitude decreases. Rocks of low electrical resistance correspond to high absorption of radio waves. The inter-well measurement data thus allow an effective electrical resistance (or conductivity) of the rock to be estimated. Typically, the source and the receiver are lowered into adjacent wells synchronously. The electric field amplitude measured at the receiver allows the average attenuation coefficient along the line connecting the source and the receiver to be estimated. Measurements are taken during stops, approximately every 5 m; the distance between stops is much smaller than the distance between adjacent wells, which leads to significant spatial anisotropy in the distribution of the measured data. The drill grid covers a large area, and our aim is to build a three-dimensional model of the distribution of the electrical properties of the inter-well space over the whole area.
The anisotropy of the spatial distribution complicates the use of standard geostatistical approaches. To build a three-dimensional model of the attenuation coefficient, we used a method from machine learning, the method of nearest neighbors, in which the value of the absorption coefficient at a given point is computed from the $k$ nearest measurements; the number $k$ must be determined from additional considerations. The effect of spatial anisotropy can be reduced by changing the spatial scale in the horizontal direction; the scale factor $\lambda$ is yet another external parameter of the problem. The values of the parameters $k$ and $\lambda$ were selected using the coefficient of determination. To demonstrate the construction of a three-dimensional image of the absorption coefficient, we apply the procedure to inter-well radio wave survey data obtained at one of the sites in Yakutia.
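The nearest-neighbor estimate with a horizontal scale factor can be sketched as follows; the coordinate layout (x, y horizontal, z along the well) and plain averaging of the k neighbors are illustrative assumptions:

```python
import numpy as np

def knn_attenuation(points, values, query, k=5, lam=1.0):
    """k-nearest-neighbors estimate of the attenuation coefficient at
    `query`. Horizontal coordinates are multiplied by the scale factor
    `lam` to compensate the anisotropy of the measurement distribution
    (dense along wells, sparse between them)."""
    scale = np.array([lam, lam, 1.0])        # (x, y, z) samples
    d = np.linalg.norm(points * scale - query * scale, axis=1)
    idx = np.argsort(d)[:k]                  # indices of the k nearest
    return values[idx].mean()
```

In practice k and lam would be tuned jointly, e.g. by maximizing the coefficient of determination on held-out measurements, as the abstract describes.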
-
Evaluation of the scalability property of the program for the simulation of atmospheric chemical transport by means of the simulator gem5
Computer Research and Modeling, 2020, v. 12, no. 4, pp. 773-794.
In this work we have developed a new efficient program for the numerical simulation of 3D global chemical transport on an adaptive finite-difference grid, which allows us to concentrate grid points in regions where the flow variables change sharply and to coarsen the grid in regions of smooth behavior, significantly reducing the grid size. We represent the adaptive grid with a combination of several dynamic (tree, linked list) and static (array) data structures. The dynamic data structures are used for grid reconstruction, while the calculations of the flow variables are based on the static data structures. The introduction of the static data structures speeds the program up by a factor of 2 in comparison with the conventional approach that represents the grid with dynamic data structures only.
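The combination of dynamic and static structures can be illustrated by flattening a tree-based grid into an array before the numerical kernel runs; this minimal sketch uses a generic quadtree-like node, not the program's actual data layout:

```python
class Cell:
    """Adaptive-grid cell: a tree node that is either a leaf carrying a
    value or an interior node with children (the dynamic layer)."""
    def __init__(self, value=0.0):
        self.value = value
        self.children = []

def flatten(root):
    """Collect leaf values into a flat list so the numerical kernel can
    iterate over static storage instead of chasing pointers. The tree is
    only traversed again when the grid is reconstructed."""
    if not root.children:
        return [root.value]
    out = []
    for child in root.children:
        out.extend(flatten(child))
    return out
```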
We wrote and tested our program on a computer with 6 CPU cores. Using the computer microarchitecture simulator gem5, we estimated the scalability of the program on a significantly greater number of cores (up to 32), using several models of a computer system with the design "computational cores – cache – main memory". It is shown that the microarchitecture of a computer system has a significant impact on scalability: the same program demonstrates different efficiency on different computer microarchitectures. For example, we obtain a speedup of 14.2 on a processor with 32 cores and 2 cache levels, but a speedup of 22.2 on a processor with 32 cores and 3 cache levels. The execution time of a program on a computer model in gem5 is $10^4$–$10^5$ times greater than the execution time of the same program on a real computer, and equals 1.5 hours for the most complex model.
We also describe how to configure gem5 and how to perform simulations with it most efficiently.
Indexed in Scopus
Full-text version of the journal is also available on the web site of the scientific electronic library eLIBRARY.RU
The journal is included in the Russian Science Citation Index