Результаты поиска по 'gradient method':
Найдено статей: 54
  1. Editor’s note
    Computer Research and Modeling, 2024, v. 16, no. 4, pp. 821-823
  2. Microarray datasets are highly dimensional, with a small number of collected samples in comparison to thousands of features. This poses a significant challenge that affects the interpretation, applicability and validation of the analytical results. Matrix factorizations have proven to be a useful method for describing data in terms of a small number of meta-features, which reduces noise, while still capturing the essential features of the data. Three novel and mutually relevant methods are presented in this paper: 1) gradient-based matrix factorization with two adaptive learning rates (in accordance with the number of factor matrices) and their automatic updates; 2) nonparametric criterion for the selection of the number of factors; and 3) nonnegative version of the gradient-based matrix factorization which doesn't require any extra computational costs in difference to the existing methods. We demonstrate effectiveness of the proposed methods to the supervised classification of gene expression data.

    Citations: 4 (RSCI).
  3. Gasnikov A.V., Gorbunov E.A., Kovalev D.A., Mohammed A.A., Chernousova E.O.
    The global rate of convergence for optimal tensor methods in smooth convex optimization
    Computer Research and Modeling, 2018, v. 10, no. 6, pp. 737-753

    In this work we consider Monteiro – Svaiter accelerated hybrid proximal extragradient (A-HPE) framework and accelerated Newton proximal extragradient (A-NPE) framework. The last framework contains an optimal method for rather smooth convex optimization problems with second-order oracle. We generalize A-NPE framework for higher order derivative oracle (schemes). We replace Newton’s type step in A-NPE that was used for auxiliary problem by Newton’s regularized (tensor) type step (Yu. Nesterov, 2018). Moreover we generalize large step A-HPE/A-NPE framework by replacing Monteiro – Svaiter’s large step condition so that this framework could work for high-order schemes. The main contribution of the paper is as follows: we propose optimal highorder methods for convex optimization problems. As far as we know for that moment there exist only zero, first and second order optimal methods that work according to the lower bounds. For higher order schemes there exists a gap between the lower bounds (Arjevani, Shamir, Shiff, 2017) and existing high-order (tensor) methods (Nesterov – Polyak, 2006; Yu.Nesterov, 2008; M. Baes, 2009; Yu.Nesterov, 2018). Asymptotically the ratio of the rates of convergences for the best existing methods and lower bounds is about 1.5. In this work we eliminate this gap and show that lower bounds are tight. We also consider rather smooth strongly convex optimization problems and show how to generalize the proposed methods to this case. The basic idea is to use restart technique until iteration sequence reach the region of quadratic convergence of Newton method and then use Newton method. One can show that the considered method converges with optimal rates up to a logarithmic factor. Note, that proposed in this work technique can be generalized in the case when we can’t solve auxiliary problem exactly, moreover we can’t even calculate the derivatives of the functional exactly. Moreover, the proposed technique can be generalized to the composite optimization problems and in particular to the constraint convex optimization problems. We also formulate a list of open questions that arise around the main result of this paper (optimal universal method of high order e.t.c.).

    Views (last year): 75.
  4. Alkousa M.S.
    On some stochastic mirror descent methods for constrained online optimization problems
    Computer Research and Modeling, 2019, v. 11, no. 2, pp. 205-217

    The problem of online convex optimization naturally occurs in cases when there is an update of statistical information. The mirror descent method is well known for non-smooth optimization problems. Mirror descent is an extension of the subgradient method for solving non-smooth convex optimization problems in the case of a non-Euclidean distance. This paper is devoted to a stochastic variant of recently proposed Mirror Descent methods for convex online optimization problems with convex Lipschitz (generally, non-smooth) functional constraints. This means that we can still use the value of the functional constraint, but instead of (sub)gradient of the objective functional and the functional constraint, we use their stochastic (sub)gradients. More precisely, assume that on a closed subset of $n$-dimensional vector space, $N$ convex Lipschitz non-smooth functionals are given. The problem is to minimize the arithmetic mean of these functionals with a convex Lipschitz constraint. Two methods are proposed, for solving this problem, using stochastic (sub)gradients: adaptive method (does not require knowledge of Lipschitz constant neither for the objective functional, nor for the functional of constraint) and non-adaptivemethod (requires knowledge of Lipschitz constant for the objective functional and the functional of constraint). Note that it is allowed to calculate the stochastic (sub)gradient of each functional only once. In the case of non-negative regret, we find that the number of non-productive steps is $O$($N$), which indicates the optimality of the proposed methods. We consider an arbitrary proximal structure, which is essential for decisionmaking problems. The results of numerical experiments are presented, allowing to compare the work of adaptive and non-adaptive methods for some examples. It is shown that the adaptive method can significantly improve the number of the found solutions.

    Views (last year): 42.
  5. Tyurin A.I.
    Primal-dual fast gradient method with a model
    Computer Research and Modeling, 2020, v. 12, no. 2, pp. 263-274

    In this work we consider a possibility to use the conception of $(\delta, L)$-model of a function for optimization tasks, whereby solving a primal problem there is a necessity to recover a solution of a dual problem. The conception of $(\delta, L)$-model is based on the conception of $(\delta, L)$-oracle which was proposed by Devolder–Glineur–Nesterov, herewith the authors proposed approximate a function with an upper bound using a convex quadratic function with some additive noise $\delta$. They managed to get convex quadratic upper bounds with noise even for nonsmooth functions. The conception of $(\delta, L)$-model continues this idea by using instead of a convex quadratic function a more complex convex function in an upper bound. Possibility to recover the solution of a dual problem gives great benefits in different problems, for instance, in some cases, it is faster to find a solution in a primal problem than in a dual problem. Note that primal-dual methods are well studied, but usually each class of optimization problems has its own primal-dual method. Our goal is to develop a method which can find solutions in different classes of optimization problems. This is realized through the use of the conception of $(\delta, L)$-model and adaptive structure of our methods. Thereby, we developed primal-dual adaptive gradient method and fast gradient method with $(\delta, L)$-model and proved convergence rates of the methods, moreover, for some classes of optimization problems the rates are optimal. The main idea is the following: we find a dual solution to an approximation of a primal problem using the conception of $(\delta, L)$-model. It is much easier to find a solution to an approximated problem, however, we have to do it in each step of our method, thereby the principle of “divide and conquer” is realized.

  6. Ryabtsev A.B.
    The error accumulation in the conjugate gradient method for degenerate problem
    Computer Research and Modeling, 2021, v. 13, no. 3, pp. 459-472

    In this paper, we consider the conjugate gradient method for solving the problem of minimizing a quadratic function with additive noise in the gradient. Three concepts of noise were considered: antagonistic noise in the linear term, stochastic noise in the linear term and noise in the quadratic term, as well as combinations of the first and second with the last. It was experimentally obtained that error accumulation is absent for any of the considered concepts, which differs from the folklore opinion that, as in accelerated methods, error accumulation must take place. The paper gives motivation for why the error may not accumulate. The dependence of the solution error both on the magnitude (scale) of the noise and on the size of the solution using the conjugate gradient method was also experimentally investigated. Hypotheses about the dependence of the error in the solution on the noise scale and the size (2-norm) of the solution are proposed and tested for all the concepts considered. It turned out that the error in the solution (by function) linearly depends on the noise scale. The work contains graphs illustrating each individual study, as well as a detailed description of numerical experiments, which includes an account of the methods of noise of both the vector and the matrix.

  7. Agafonov A.D.
    Lower bounds for conditional gradient type methods for minimizing smooth strongly convex functions
    Computer Research and Modeling, 2022, v. 14, no. 2, pp. 213-223

    In this paper, we consider conditional gradient methods for optimizing strongly convex functions. These are methods that use a linear minimization oracle, which, for a given vector $p \in \mathbb{R}^n$, computes the solution of the subproblem

    \[ \text{Argmin}_{x\in X}{\langle p,\,x \rangle}. \]There are a variety of conditional gradient methods that have a linear convergence rate in a strongly convex case. However, in all these methods, the dimension of the problem is included in the rate of convergence, which in modern applications can be very large. In this paper, we prove that in the strongly convex case, the convergence rate of the conditional gradient methods in the best case depends on the dimension of the problem $ n $ as $ \widetilde {\Omega} \left(\!\sqrt {n}\right) $. Thus, the conditional gradient methods may turn out to be ineffective for solving strongly convex optimization problems of large dimensions.

    Also, the application of conditional gradient methods to minimization problems of a quadratic form is considered. The effectiveness of the Frank – Wolfe method for solving the quadratic optimization problem in the convex case on a simplex (PageRank) has already been proved. This work shows that the use of conditional gradient methods to solve the minimization problem of a quadratic form in a strongly convex case is ineffective due to the presence of dimension in the convergence rate of these methods. Therefore, the Shrinking Conditional Gradient method is considered. Its difference from the conditional gradient methods is that it uses a modified linear minimization oracle. It's an oracle, which, for a given vector $p \in \mathbb{R}^n$, computes the solution of the subproblem \[ \text{Argmin}\{\langle p, \,x \rangle\colon x\in X, \;\|x-x_0^{}\| \leqslant R \}. \] The convergence rate of such an algorithm does not depend on dimension. Using the Shrinking Conditional Gradient method the complexity (the total number of arithmetic operations) of solving the minimization problem of quadratic form on a $ \infty $-ball is obtained. The resulting evaluation of the method is comparable to the complexity of the gradient method.

  8. Antonov I.V., Bruttan I.V.
    Synthesis of the structure of organised systems as central problem of evolutionary cybernetics
    Computer Research and Modeling, 2023, v. 15, no. 5, pp. 1103-1124

    The article provides approaches to evolutionary modelling of synthesis of organised systems and analyses methodological problems of evolutionary computations of this kind. Based on the analysis of works on evolutionary cybernetics, evolutionary theory, systems theory and synergetics, we conclude that there are open problems in formalising the synthesis of organised systems and modelling their evolution. The article emphasises that the theoretical basis for the practice of evolutionary modelling is the principles of the modern synthetic theory of evolution. Our software project uses a virtual computing environment for machine synthesis of problem solving algorithms. In the process of modelling, we obtained the results on the basis of which we conclude that there are a number of conditions that fundamentally limit the applicability of genetic programming methods in the tasks of synthesis of functional structures. The main limitations are the need for the fitness function to track the step-by-step approach to the solution of the problem and the inapplicability of this approach to the problems of synthesis of hierarchically organised systems. We note that the results obtained in the practice of evolutionary modelling in general for the whole time of its existence, confirm the conclusion the possibilities of genetic programming are fundamentally limited in solving problems of synthesizing the structure of organized systems. As sources of fundamental difficulties for machine synthesis of system structures the article points out the absence of directions for gradient descent in structural synthesis and the absence of regularity of random appearance of new organised structures. The considered problems are relevant for the theory of biological evolution. The article substantiates the statement about the biological specificity of practically possible ways of synthesis of the structure of organised systems. As a theoretical interpretation of the discussed problem, we propose to consider the system-evolutionary concept of P.K.Anokhin. The process of synthesis of functional structures in this context is an adaptive response of organisms to external conditions based on their ability to integrative synthesis of memory, needs and information about current conditions. The results of actual studies are in favour of this interpretation. We note that the physical basis of biological integrativity may be related to the phenomena of non-locality and non-separability characteristic of quantum systems. The problems considered in this paper are closely related to the problem of creating strong artificial intelligence.

  9. Pasechnyuk D.A., Stonyakin F.S.
    One method for minimization a convex Lipschitz-continuous function of two variables on a fixed square
    Computer Research and Modeling, 2019, v. 11, no. 3, pp. 379-395

    In the article we have obtained some estimates of the rate of convergence for the recently proposed by Yu. E.Nesterov method of minimization of a convex Lipschitz-continuous function of two variables on a square with a fixed side. The idea of the method is to divide the square into smaller parts and gradually remove them so that in the remaining sufficiently small part. The method consists in solving auxiliary problems of one-dimensional minimization along the separating segments and does not imply the calculation of the exact value of the gradient of the objective functional. The main result of the paper is proved in the class of smooth convex functions having a Lipschitz-continuous gradient. Moreover, it is noted that the property of Lipschitzcontinuity for gradient is sufficient to require not on the whole square, but only on some segments. It is shown that the method can work in the presence of errors in solving auxiliary one-dimensional problems, as well as in calculating the direction of gradients. Also we describe the situation when it is possible to neglect or reduce the time spent on solving auxiliary one-dimensional problems. For some examples, experiments have demonstrated that the method can work effectively on some classes of non-smooth functions. In this case, an example of a simple non-smooth function is constructed, for which, if the subgradient is chosen incorrectly, even if the auxiliary one-dimensional problem is exactly solved, the convergence property of the method may not hold. Experiments have shown that the method under consideration can achieve the desired accuracy of solving the problem in less time than the other methods (gradient descent and ellipsoid method) considered. Partially, it is noted that with an increase in the accuracy of the desired solution, the operating time for the Yu. E. Nesterov’s method can grow slower than the time of the ellipsoid method.

    Views (last year): 34.
  10. Alkousa M.S., Gasnikov A.V., Dvurechensky P.E., Sadiev A.A., Razouk L.Ya.
    An approach for the nonconvex uniformly concave structured saddle point problem
    Computer Research and Modeling, 2022, v. 14, no. 2, pp. 225-237

    Recently, saddle point problems have received much attention due to their powerful modeling capability for a lot of problems from diverse domains. Applications of these problems occur in many applied areas, such as robust optimization, distributed optimization, game theory, and many applications in machine learning such as empirical risk minimization and generative adversarial networks training. Therefore, many researchers have actively worked on developing numerical methods for solving saddle point problems in many different settings. This paper is devoted to developing a numerical method for solving saddle point problems in the nonconvex uniformly-concave setting. We study a general class of saddle point problems with composite structure and H\"older-continuous higher-order derivatives. To solve the problem under consideration, we propose an approach in which we reduce the problem to a combination of two auxiliary optimization problems separately for each group of variables, the outer minimization problem w.r.t. primal variables, and the inner maximization problem w.r.t the dual variables. For solving the outer minimization problem, we use the Adaptive Gradient Method, which is applicable for nonconvex problems and also works with an inexact oracle that is generated by approximately solving the inner problem. For solving the inner maximization problem, we use the Restarted Unified Acceleration Framework, which is a framework that unifies the high-order acceleration methods for minimizing a convex function that has H\"older-continuous higher-order derivatives. Separate complexity bounds are provided for the number of calls to the first-order oracles for the outer minimization problem and higher-order oracles for the inner maximization problem. Moreover, the complexity of the whole proposed approach is then estimated.

Pages: next last »

Indexed in Scopus

Full-text version of the journal is also available on the web site of the scientific electronic library eLIBRARY.RU

The journal is included in the Russian Science Citation Index

The journal is included in the RSCI

International Interdisciplinary Conference "Mathematics. Computing. Education"