Результаты поиска по 'differential games':
Найдено статей: 8
  1. Editor’s note
    Computer Research and Modeling, 2024, v. 16, no. 7, pp. 1533-1538
  2. Reshitko M.A., Usov A.B.
    Neural network methods for optimal control problems
    Computer Research and Modeling, 2022, v. 14, no. 3, pp. 539-557

    In this study we discuss methods to solve optimal control problems based on neural network techniques. We study hierarchical dynamical two-level system for surface water quality control. The system consists of a supervisor (government) and a few agents (enterprises). We consider this problem from the point of agents. In this case we solve optimal control problem with constraints. To solve this problem, we use Pontryagin’s maximum principle, with which we obtain optimality conditions. To solve emerging ODEs, we use feedforward neural network. We provide a review of existing techniques to study such problems and a review of neural network’s training methods. To estimate the error of numerical solution, we propose to use defect analysis method, adapted for neural networks. This allows one to get quantitative error estimations of numerical solution. We provide examples of our method’s usage for solving synthetic problem and a surface water quality control model. We compare the results of this examples with known solution (when provided) and the results of shooting method. In all cases the errors, estimated by our method are of the same order as the errors compared with known solution. Moreover, we study surface water quality control problem when no solutions is provided by other methods. This happens because of relatively large time interval and/or the case of several agents. In the latter case we seek Nash equilibrium between agents. Thus, in this study we show the ability of neural networks to solve various problems including optimal control problems and differential games and we show the ability of quantitative estimation of an error. From the numerical results we conclude that the presence of the supervisor is necessary for achieving the sustainable development.

  3. Ougolnitsky G.A., Usov A.B.
    Game-theoretic model of coordinations of interests at innovative development of corporations
    Computer Research and Modeling, 2016, v. 8, no. 4, pp. 673-684

    Dynamic game theoretic models of the corporative innovative development are investigated. The proposed models are based on concordance of private and public interests of agents. It is supposed that the structure of interests of each agent includes both private (personal interests) and public (interests of the whole company connected with its innovative development first) components. The agents allocate their personal resources between these two directions. The system dynamics is described by a difference (not differential) equation. The proposed model of innovative development is studied by simulation and the method of enumeration of the domains of feasible controls with a constant step. The main contribution of the paper consists in comparative analysis of efficiency of the methods of hierarchical control (compulsion or impulsion) for information structures of Stackelberg or Germeier (four structures) by means of the indices of system compatibility. The proposed model is a universal one and can be used for a scientifically grounded support of the programs of innovative development of any economic firm. The features of a specific company are considered in the process of model identification (a determination of the specific classes of model functions and numerical values of its parameters) which forms a separate complex problem and requires an analysis of the statistical data and expert estimations. The following assumptions about information rules of the hierarchical game are accepted: all players use open-loop strategies; the leader chooses and reports to the followers some values of administrative (compulsion) or economic (impulsion) control variables which can be only functions of time (Stackelberg games) or depend also on the followers’ controls (Germeier games); given the leader’s strategies all followers simultaneously and independently choose their strategies that gives a Nash equilibrium in the followers’ game. For a finite number of iterations the proposed algorithm of simulation modeling allows to build an approximate solution of the model or to conclude that it doesn’t exist. A reliability and efficiency of the proposed algorithm follow from the properties of the scenario method and the method of a direct ordered enumeration with a constant step. Some comprehensive conclusions about the comparative efficiency of methods of hierarchical control of innovations are received.

    Views (last year): 9. Citations: 6 (RSCI).
  4. Gasnikov A.V., Kubentayeva M.B.
    Searching stochastic equilibria in transport networks by universal primal-dual gradient method
    Computer Research and Modeling, 2018, v. 10, no. 3, pp. 335-345

    We consider one of the problems of transport modelling — searching the equilibrium distribution of traffic flows in the network. We use the classic Beckman’s model to describe time costs and flow distribution in the network represented by directed graph. Meanwhile agents’ behavior is not completely rational, what is described by the introduction of Markov logit dynamics: any driver selects a route randomly according to the Gibbs’ distribution taking into account current time costs on the edges of the graph. Thus, the problem is reduced to searching of the stationary distribution for this dynamics which is a stochastic Nash – Wardrope equilibrium in the corresponding population congestion game in the transport network. Since the game is potential, this problem is equivalent to the problem of minimization of some functional over flows distribution. The stochasticity is reflected in the appearance of the entropy regularization, in contrast to non-stochastic case. The dual problem is constructed to obtain a solution of the optimization problem. The universal primal-dual gradient method is applied. A major specificity of this method lies in an adaptive adjustment to the local smoothness of the problem, what is most important in case of the complex structure of the objective function and an inability to obtain a prior smoothness bound with acceptable accuracy. Such a situation occurs in the considered problem since the properties of the function strongly depend on the transport graph, on which we do not impose strong restrictions. The article describes the algorithm including the numerical differentiation for calculation of the objective function value and gradient. In addition, the paper represents a theoretical estimate of time complexity of the algorithm and the results of numerical experiments conducted on a small American town.

    Views (last year): 28.
  5. Malygina N.V., Surkov P.G.
    On the modeling of water obstacles overcoming by Rangifer tarandus L
    Computer Research and Modeling, 2019, v. 11, no. 5, pp. 895-910

    Seasonal migrations and herd instinct are traditionally recognized as wild reindeer (Rangifer tarandus L.) species-specific behavioral signs. These animals are forced to overcome water obstacles during the migrations. Behaviour peculiarities are considered as the result of the selection process, which has chosen among the sets of strategies, as the only evolutionarily stable one, determining the reproduction and biological survival of wild reindeer as a species. Natural processes in the Taimyr population wild reindeer are currently occurring against the background of an increase in the influence of negative factors due to the escalation of the industrial development of the Arctic. That is why the need to identify the ethological features of these animals completely arose. This paper presents the results of applying the classical methods of the theory of optimal control and differential games to the wild reindeer study of the migration patterns in overcoming water barriers, including major rivers. Based on these animals’ ethological features and behavior forms, the herd is presented as a controlled dynamic system, which presents also two classes of individuals: the leader and the rest of the herd, for which their models, describing the trajectories of their movement, are constructed. The models are based on hypotheses, which are the mathematical formalization of some animal behavior patterns. This approach made it possible to find the trajectory of the important one using the methods of the optimal control theory, and in constructing the trajectories of other individuals, apply the principle of control with a guide. Approbation of the obtained results, which can be used in the formation of a common “platform” for the adaptive behavior models systematic construction and as a reserve for the cognitive evolution models fundamental development, is numerically carried out using a model example with observational data on the Werchnyaya Taimyra River.

  6. Reshitko M.A., Ougolnitsky G.A., Usov A.B.
    Numerical method for finding Nash and Shtakelberg equilibria in river water quality control models
    Computer Research and Modeling, 2020, v. 12, no. 3, pp. 653-667

    In this paper we consider mathematical model to control water quality. We study a system with two-level hierarchy: one environmental organization (supervisor) at the top level and a few industrial enterprises (agents) at the lower level. The main goal of the supervisor is to keep water pollution level below certain value, while enterprises pollute water, as a side effect of the manufacturing process. Supervisor achieves its goal by charging a penalty for enterprises. On the other hand, enterprises choose how much to purify their wastewater to maximize their income.The fee increases the budget of the supervisor. Moreover, effulent fees are charged for the quantity and/or quality of the discharged pollution. Unfortunately, in practice, such charges are ineffective due to the insufficient tax size. The article solves the problem of determining the optimal size of the charge for pollution discharge, which allows maintaining the quality of river water in the rear range.

    We describe system members goals with target functionals, and describe water pollution level and enterprises state as system of ordinary differential equations. We consider the problem from both supervisor and enterprises sides. From agents’ point a normal-form game arises, where we search for Nash equilibrium and for the supervisor, we search for Stackelberg equilibrium. We propose numerical algorithms for finding both Nash and Stackelberg equilibrium. When we construct Nash equilibrium, we solve optimal control problem using Pontryagin’s maximum principle. We construct Hamilton’s function and solve corresponding system of partial differential equations with shooting method and finite difference method. Numerical calculations show that the low penalty for enterprises results in increasing pollution level, when relatively high penalty can result in enterprises bankruptcy. This leads to the problem of choosing optimal penalty, which requires considering problem from the supervisor point. In that case we use the method of qualitatively representative scenarios for supervisor and Pontryagin’s maximum principle for agents to find optimal control for the system. At last, we compute system consistency ratio and test algorithms for different data. The results show that a hierarchical control is required to provide system stability.

  7. Varshavsky L.E.
    Iterative decomposition methods in modelling the development of oligopolistic markets
    Computer Research and Modeling, 2025, v. 17, no. 6, pp. 1237-1256

    One of the principles of forming a competitive market environment is to create conditions for economic agents to implement Nash – Cournot optimal strategies. With the standard approach to determining Nash – Cournot optimal market strategies, economic agents must have complete information about the indicators and dynamic characteristics of all market participants. Which is not true.

    In this regard, to find Nash – Cournot optimal solutions in dynamic models, it is necessary to have a coordinator who has complete information about the participants. However, in the case of a large number of game participants, even if the coordinator has the necessary information, computational difficulties arise associated with the need to solve a large number of coupled equations (in the case of linear dynamic games — Riccati matrix equations).

    In this regard, there is a need to decompose the general problem of determining optimal strategies for market participants into private (local) problems. Approaches based on the iterative decomposition of coupled matrix Riccati equations and the solution of local Riccati equations were studied for linear dynamic games with a quadratic criterion. This article considers a simpler approach to the iterative determination of the Nash – Cournot equilibrium in an oligopoly, by decomposition using operational calculus (operator method).

    The proposed approach is based on the following procedure. A virtual coordinator, which has information about the parameters of the inverse demand function, forms prices for the prospective period. Oligopolists, given fixed price dynamics, determine their strategies in accordance with a slightly modified optimality criterion. The optimal volumes of production of the oligopolists are sent to the coordinator, who, based on the iterative algorithm, adjusts the price dynamics at the previous step.

    The proposed procedure is illustrated by the example of a static and dynamic model of rational behavior of oligopoly participants who maximize the net present value (NPV). Using the methods of operational calculus (and in particular, the inverse Z-transformation), conditions are found under which the iterative procedure leads to equilibrium levels of price and production volumes in the case of linear dynamic games with both quadratic and nonlinear (concave) optimization criteria.

    The approach considered is used in relation to examples of duopoly, triopoly, duopoly on the market with a differentiated product, duopoly with interacting oligopolists with a linear inverse demand function. Comparison of the results of calculating the dynamics of price and production volumes of oligopolists for the considered examples based on coupled equations of the matrix Riccati equations in Matlab (in the table — Riccati), as well as in accordance with the proposed iterative method in the widely available Excel system shows their practical identity.

    In addition, the application of the proposed iterative procedure is illustrated by the example of a duopoly with a nonlinear demand function.

  8. Chen J., Lobanov A.V., Rogozin A.V.
    Nonsmooth Distributed Min-Max Optimization Using the Smoothing Technique
    Computer Research and Modeling, 2023, v. 15, no. 2, pp. 469-480

    Distributed saddle point problems (SPPs) have numerous applications in optimization, matrix games and machine learning. For example, the training of generated adversarial networks is represented as a min-max optimization problem, and training regularized linear models can be reformulated as an SPP as well. This paper studies distributed nonsmooth SPPs with Lipschitz-continuous objective functions. The objective function is represented as a sum of several components that are distributed between groups of computational nodes. The nodes, or agents, exchange information through some communication network that may be centralized or decentralized. A centralized network has a universal information aggregator (a server, or master node) that directly communicates to each of the agents and therefore can coordinate the optimization process. In a decentralized network, all the nodes are equal, the server node is not present, and each agent only communicates to its immediate neighbors.

    We assume that each of the nodes locally holds its objective and can compute its value at given points, i. e. has access to zero-order oracle. Zero-order information is used when the gradient of the function is costly, not possible to compute or when the function is not differentiable. For example, in reinforcement learning one needs to generate a trajectory to evaluate the current policy. This policy evaluation process can be interpreted as the computation of the function value. We propose an approach that uses a smoothing technique, i. e., applies a first-order method to the smoothed version of the initial function. It can be shown that the stochastic gradient of the smoothed function can be viewed as a random two-point gradient approximation of the initial function. Smoothing approaches have been studied for distributed zero-order minimization, and our paper generalizes the smoothing technique on SPPs.

Indexed in Scopus

Full-text version of the journal is also available on the web site of the scientific electronic library eLIBRARY.RU

The journal is included in the Russian Science Citation Index

The journal is included in the RSCI

International Interdisciplinary Conference "Mathematics. Computing. Education"