Search results for 'problem of time':
Articles found: 190
  1. Rudenko V.D., Yudin N.E., Vasin A.A.
    Survey of convex optimization of Markov decision processes
    Computer Research and Modeling, 2023, v. 15, no. 2, pp. 329-353

    This article reviews both historical achievements and modern results in the field of Markov decision processes (MDP) and convex optimization. The review is the first attempt to cover the field of reinforcement learning in Russian in the context of convex optimization. The fundamental Bellman equation and the criteria of policy optimality based on it — strategies that make decisions from the currently known state of the environment — are considered, together with the main iterative policy-optimization algorithms built on solving the Bellman equations. An important part of the article is an alternative to the $Q$-learning approach: direct maximization of the agent's average reward, for the chosen strategy, from interaction with the environment. This convex optimization problem can be represented as a linear programming problem. The paper demonstrates how the convex optimization apparatus is used to solve the problem of reinforcement learning (RL). In particular, it is shown how the concept of strong duality allows one to naturally modify the formulation of the RL problem, establishing the equivalence between maximizing the agent's reward and finding its optimal strategy. The paper also discusses the complexity of MDP optimization with respect to the number of state–action–reward triples obtained from interaction with the environment. Optimal bounds on the complexity of solving an MDP are presented for an ergodic process with an infinite horizon, as well as for a non-stationary process with a finite horizon, which can be restarted several times in a row or run in parallel in several threads. The review also covers the latest results on closing the gap between the lower and upper complexity estimates for optimizing average-reward MDPs (AMDP).
In conclusion, real-valued parametrization of the agent's policy and a class of gradient optimization methods based on maximizing the $Q$-value function are considered. In particular, a special class of MDPs with constraints on the policy value (constrained Markov decision process, CMDP) is presented, for which a general primal-dual optimization approach with strong duality is proposed.
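As a minimal illustration of the Bellman machinery discussed above (a toy sketch, not taken from the paper — the two-state MDP and all numbers are invented for illustration), value iteration finds the fixed point of the Bellman optimality operator:

```python
import numpy as np

# Toy 2-state, 2-action MDP; all numbers are illustrative.
P = np.array([                       # P[a, s, s'] = transition probability
    [[0.9, 0.1], [0.2, 0.8]],        # action 0
    [[0.5, 0.5], [0.7, 0.3]],        # action 1
])
R = np.array([[1.0, 0.0],            # R[a, s] = expected reward
              [0.0, 2.0]])
gamma = 0.9                          # discount factor

# Value iteration: repeatedly apply the Bellman optimality operator
# V <- max_a (R[a] + gamma * P[a] V) until it reaches its fixed point.
V = np.zeros(2)
for _ in range(1000):
    Q = R + gamma * P @ V            # Q[a, s], shape (2, 2)
    V_new = Q.max(axis=0)
    if np.abs(V_new - V).max() < 1e-10:
        break
    V = V_new

policy = Q.argmax(axis=0)            # greedy policy at the fixed point
```

The same optimal value function can equivalently be characterized as the solution of a linear program, which is the duality viewpoint the review builds on.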

  2. The second part presents numerical studies of the parameters of the lower ionosphere at altitudes of 40–90 km when it is heated by powerful high-frequency radio waves of various frequencies and powers. The problem statement is given in the first part of the article. The main attention is paid to the interrelation between the energy and kinetic parameters of the disturbed $D$-region of the ionosphere in the processes that determine the absorption and transformation of the radio beam energy flux in space and time. It is shown that the parameters of the disturbed region can behave significantly differently in the daytime and at night, both in magnitude and in space-time distribution. In the absence of sufficiently reliable values of the rate constants for a number of important kinetic processes, numerical studies were carried out in stages, gradually adding individual processes and kinetic blocks, each corresponding to a certain physical content. It is shown that the energy thresholds for inelastic collisions of electrons with air molecules play the main role. This approach made it possible to detect the emergence of a self-oscillating mode of parameter variation when the main channel for energy losses in inelastic processes is the most energy-intensive process, ionization. This effect may play a role in plasma studies using high-frequency inductive and capacitive discharges. The results of calculations of the ionization and optical parameters of the disturbed $D$-region for daytime conditions are presented. The electron temperature and density, and the emission coefficients in the visible and infrared ranges of the spectrum, are obtained for various values of the power and frequency of the radio beam in the lower ionosphere. The height-time distribution of the absorbed radiation power, needed in studies of higher layers of the ionosphere, is calculated.
The influence of electron energy losses to the excitation of vibrational and metastable states of molecules on the electron temperature and on the general behavior of the parameters has been studied in detail. It is shown that under nighttime conditions, when the electron concentration profile begins at altitudes of about 80 km and the concentration of heavy particles is two orders of magnitude lower than on average in the $D$-region, large-scale gas-dynamic motion can develop at sufficient radio emission power. An algorithm based on the MacCormack method was developed, and two-dimensional gas-dynamic calculations of the behavior of the parameters of the perturbed region were performed with some simplifications of the kinetics.

  3. Pivovarova A.S., Steryakov A.A.
    Modeling the behavior preceding a market crash in a hierarchically organized financial market
    Computer Research and Modeling, 2011, v. 3, no. 2, pp. 215-222

    We consider the hierarchical model of financial crashes introduced by A. Johansen and D. Sornette, which reproduces the log-periodic power-law behavior of the price before the critical point. In order to build a generalization of this model, we introduce a dependence of the influence exponent on the ultrametric distance between agents. Much attention is paid to the problem of critical-point universality, which is investigated by comparing the probability density functions of the crash times for systems with various total numbers of agents.

    Views (last year): 1.
  4. Davydov D.V., Shapoval A.B., Yamilov A.I.
    Languages in China provinces: quantitative estimation with incomplete data
    Computer Research and Modeling, 2016, v. 8, no. 4, pp. 707-716

    This paper formulates and solves a practical problem of data recovery concerning the distribution of languages at the regional level in China. The recovery is needed to determine linguistic diversity indices, which, in turn, are used to analyze empirically and to predict sources of social and economic development, as well as to indicate potential conflicts at the regional level. We use the Ethnologue database and the China census as the initial data sources. For every language spoken in China, the data contain (a) an estimate of the number of China residents who claim this language as their mother tongue, and (b) indicators of the presence of such residents in China's provinces. For each language/province pair, we aim to estimate the number of province inhabitants who claim the language as their mother tongue. This base problem reduces to solving an underdetermined system of algebraic equations. An additional complication is that, because of gaps in Ethnologue language surveys and the expense of data collection, the Ethnologue database contains data collected at different time moments; relating those data to a single time moment turns the initial task into an ill-posed system of algebraic equations with an imprecisely determined right-hand side. Therefore, we look for an approximate solution characterized by a minimal discrepancy of the system. Since some languages are much less widespread than others, we minimize a weighted discrepancy, introducing weights inverse to the right-hand-side elements of the equations. This definition of discrepancy allows us to recover the required variables. More than 92% of the recovered variables are robust under a probabilistic modelling procedure for potential errors in the initial data.
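The weighted-discrepancy idea can be sketched on a hypothetical miniature of the problem (the mask, totals and weighting below are invented for illustration; the real system is as described in the abstract):

```python
import numpy as np

# Hypothetical miniature: one unknown per (language, province) pair
# where the presence mask is 1.
mask = np.array([[1, 1, 0],          # language 0 present in provinces 0, 1
                 [0, 1, 1]])         # language 1 present in provinces 1, 2
b = np.array([1000.0, 50.0])         # national totals per language

# One equation per language: its per-province counts sum to b[i].
idx = np.argwhere(mask)
A = np.zeros((len(b), len(idx)))
for k, (i, j) in enumerate(idx):
    A[i, k] = 1.0

# Weighted least squares with weights 1/b[i], so the discrepancy of a
# small language counts as much as that of a large one.
W = np.diag(1.0 / b)
x, *_ = np.linalg.lstsq(W @ A, W @ b, rcond=None)
```

`lstsq` returns the minimum-norm solution of the underdetermined system; in this toy case it splits each national total evenly over the provinces where the language is present.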

    Views (last year): 3.
  5. Kovalenko S.Yu., Yusubalieva G.M.
    Survival task for the mathematical model of glioma therapy with blood-brain barrier
    Computer Research and Modeling, 2018, v. 10, no. 1, pp. 113-123

    The paper proposes a mathematical model for the therapy of glioma that takes into account the blood-brain barrier, radiotherapy and antibody therapy. The parameters were estimated from experimental data, and the effect of parameter values on the effectiveness of treatment and the prognosis of the disease was evaluated. Possible variants of the sequential use of radiotherapy and the action of antibodies have been explored. The combined use of radiotherapy with intravenous administration of mAb Cx43 leads to a potentiation of the therapeutic effect in glioma.

    Radiotherapy must precede chemotherapy, as radio exposure reduces the barrier function of endothelial cells. Endothelial cells of the brain vessels fit tightly to each other: between their walls form so-called tight junctions, whose role in providing the BBB is that they prevent the penetration of various undesirable substances from the bloodstream into the brain tissue. Tight junctions between endothelial cells block intercellular passive transport.

    The mathematical model consists of a continuous part and a discrete one. Experimental data on the volume of glioma show the following interesting dynamics: after the cessation of radio exposure, tumor growth does not resume immediately; there is some time interval during which the glioma does not grow. Glioma cells are divided into two groups. The first group is living cells that divide as fast as possible. The second group is cells affected by radiation. As a measure of the health of the blood-brain barrier system, the ratio of the number of BBB cells at the current moment to their number at rest, that is, in the average healthy state, is chosen.

    The continuous part of the model includes a description of the division of both types of glioma cells, the recovery of BBB cells, and the dynamics of the drug. Reducing the number of well-functioning BBB cells facilitates the penetration of the drug to brain cells, that is, enhances the action of the drug. At the same time, the rate of division of glioma cells does not increase, since it is limited not by the deficiency of nutrients available to cells, but by the internal mechanisms of the cell. The discrete part of the mathematical model includes the operator of radio interaction, which is applied to the indicator of BBB and to glial cells.

    Within the framework of the mathematical model of treatment of a cancer tumor (glioma), the problem of optimal control with phase constraints is solved. The patient's condition is described by two variables: the volume of the tumor and the condition of the BBB. The phase constraints delineate a certain area in the space of these indicators, which we call the survival area. Our task is to find treatment strategies that minimize the time of treatment, maximize the patient's rest time, and at the same time keep the state indicators within the permitted limits. Since the task of survival is to maximize the patient's lifespan, it is precisely such treatment strategies that return the indicators to their original position (and we see periodic trajectories on the graphs). Periodic trajectories indicate that the deadly disease has been turned into a chronic one.
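The survival-area idea can be illustrated with a deliberately simplified toy model (the dynamics, coefficients and thresholds below are invented for illustration and are not the paper's model): therapy shrinks the tumor but degrades the barrier indicator, and a feedback strategy toggles therapy so that both indicators stay inside the permitted box.

```python
import numpy as np

# Toy dynamics, invented for illustration (not the paper's model):
# v = tumor volume, w = barrier health indicator, u = therapy on/off.
dt = 0.01
v_max, w_min = 1.0, 0.3              # the "survival area" box
v, w = 0.5, 1.0
traj = []
for _ in range(20000):
    # feedback strategy: treat while the tumor is large and the
    # barrier indicator keeps a safety margin above its limit
    u = 1.0 if (v > 0.2 and w > w_min + 0.05) else 0.0
    v += dt * (0.3 * v * (1.0 - v) - 1.5 * u * v)   # growth vs. therapy
    w += dt * (0.2 * (1.0 - w) - 0.4 * u)           # recovery vs. damage
    traj.append((v, w))

vs, ws = np.array(traj).T
inside = (vs <= v_max).all() and (ws >= w_min).all()
```

With these (assumed) coefficients the trajectory never leaves the box: therapy switches off before the barrier indicator reaches its limit, and the state settles into a repeating treat/rest pattern.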

    Views (last year): 14.
  6. Varshavsky L.E.
    Uncertainty factor in modeling dynamics of economic systems
    Computer Research and Modeling, 2018, v. 10, no. 2, pp. 261-276

    The analysis and practical aspects of applying robust control methods, developed in control theory, to the study of economic systems are carried out. The main emphasis is placed on results obtained for dynamical systems with structured uncertainty. Practical aspects of implementing such results in the control of economic systems are discussed on the basis of dynamical models with uncertain parameters and perturbations (stabilization of the price on the oil market and of inflation in macroeconomic systems). With the help of a specially constructed aggregate model of oil price dynamics, the problem of finding a control that provides minimal deviation of the price from desired levels over a medium-term period is studied. The second real problem considered in the article is the determination of a stabilizing control providing minimal deviation of inflation from desired levels, on the basis of a constructed aggregate macroeconomic model of the USA over a medium-term period.

    Upper levels of parameter uncertainty, together with control laws guaranteeing stabilizability of the considered real economic systems, have been found using the robust control method with structured uncertainty. At the same time, we conclude that the obtained estimates of the upper levels of parameter uncertainty are conservative. Monte-Carlo experiments carried out for the article made it possible to analyze the dynamics of the oil price and of inflation at the obtained limit levels of model parameter uncertainty, under the found robust control laws, for the worst and the best scenarios. The results of these experiments show that the obtained robust control laws may be successfully used under less stringent uncertainty constraints than is guaranteed by the sufficient conditions of stabilization.
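The Monte-Carlo check described above can be sketched as follows (the scalar model, uncertainty band and feedback gain are illustrative assumptions, not the paper's models): sample the uncertain parameter within its band, run the closed loop under a fixed stabilizing law, and record the worst final deviation.

```python
import numpy as np

# Illustrative scalar model (not the paper's): x_{t+1} = a*x_t + b*u_t + noise,
# with uncertain a in [0.7, 1.2], stabilized by the fixed law u_t = -k*x_t.
rng = np.random.default_rng(1)
b, k = 1.0, 0.6                      # input gain and feedback gain
worst = 0.0
for _ in range(1000):                # Monte-Carlo over the uncertainty band
    a = rng.uniform(0.7, 1.2)        # structured parameter uncertainty
    x = 1.0                          # initial deviation from the target
    for _ in range(100):
        u = -k * x
        x = a * x + b * u + rng.normal(0.0, 0.01)
    worst = max(worst, abs(x))       # worst-case final deviation
```

The closed-loop factor a - b*k stays in (0.1, 0.6) over the whole band, so every sampled scenario is stable; widening the band until `worst` blows up is one way to probe how conservative an analytic uncertainty bound is.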

    Views (last year): 39.
  7. Koroleva M.R., Mishenkova O.V., Raeder T., Tenenev V.A., Chernova A.A.
    Numerical simulation of the process of activation of the safety valve
    Computer Research and Modeling, 2018, v. 10, no. 4, pp. 495-509

    The conjugate problem of disk motion into the gas-filled volume of a spring-type safety valve is solved. The question of determining a physically correct value of the initial disk lift is considered. A review of existing approaches and methods for solving problems of this type is given. The formulation of the problem of valve actuation when the vessel pressure rises and the mathematical model of the actuation process are presented, with special attention paid to the coupling of the physical subtasks. The methods, numerical schemes and algorithms used are described. The mathematical modeling is performed on the basis of the fundamental system of differential equations of viscous gas flow, together with the equation for the displacement of the valve disk. The problem is solved numerically in the axisymmetric formulation using the finite volume method. The results obtained with the viscous and inviscid models are compared: in the inviscid formulation the problem is solved using the Godunov scheme, and in the viscous formulation using the Kurganov–Tadmor method. The dependence of the disk displacement on time was obtained and compared with experimental data. The pressure distribution on the disk surface and the velocity profiles in cross sections of the gap are given for different disk heights. It is shown that the value of the initial disk lift does not affect the gas flow or the dynamics of the moving valve part, while it can significantly reduce the calculation time of the full valve operation cycle. Instantaneous isotachs for various elevations of the disk are presented, and the jet flows over the critical section are compared. The data from the two numerical experiments correlate well with each other, so the inviscid model can be applied to the numerical modeling of safety valve dynamics.

    Views (last year): 34. Citations: 1 (RSCI).
  8. Nevmerzhitskiy Y.V.
    Application of the streamline method for nonlinear filtration problems acceleration
    Computer Research and Modeling, 2018, v. 10, no. 5, pp. 709-728

    The paper contains a numerical simulation of nonisothermal nonlinear flow in a porous medium. A two-dimensional unsteady problem of the flow of heavy oil, water and steam is considered. The oil phase consists of two pseudocomponents, light and heavy fractions, which, like the water component, can vaporize. The oil exhibits viscoplastic rheology, so its filtration does not obey Darcy's classical linear law. The simulation accounts not only for the dependence of fluid density and viscosity on temperature, but also for the improvement of the oil's rheological properties with increasing temperature.

    To solve this problem numerically, we use the streamline method with splitting by physical processes, which consists in separating the convective heat transfer directed along the flow from thermal conductivity and gravity. The article proposes a new approach to the application of streamline methods, which makes it possible to correctly simulate nonlinear flow problems with temperature-dependent rheology. The core of the algorithm is to treat the integration process as a set of quasi-equilibrium states obtained by solving the system on a global grid; between these states the system is solved on a streamline grid. The streamline method not only accelerates the calculations but also yields a physically reliable solution, since integration takes place on a grid aligned with the direction of fluid flow.

    In addition to the streamline method, the paper presents an algorithm for handling the nonsmooth coefficients that arise when simulating viscoplastic oil flow. This algorithm allows one to keep sufficiently large time steps without changing the physical structure of the solution.

    The obtained results are compared with known analytical solutions, as well as with the results of a commercial simulation package. The analysis of convergence tests with respect to the number of streamlines, as well as on different streamline grids, justifies the applicability of the proposed algorithm. In addition, the reduction of calculation time in comparison with traditional methods demonstrates the practical significance of the approach.

    Views (last year): 18.
  9. Aristov V.V., Ilyin O.V.
    Methods and problems in the kinetic approach for simulating biological structures
    Computer Research and Modeling, 2018, v. 10, no. 6, pp. 851-866

    The biological structure is considered as an open nonequilibrium system whose properties can be described on the basis of kinetic equations. New problems with nonequilibrium boundary conditions are introduced. The nonequilibrium distribution tends gradually to an equilibrium state. The region of spatial inhomogeneity has a scale depending on the rate of mass transfer in the open system and the characteristic time of metabolism. In the proposed approximation, the internal energy of the motion of molecules is much less than the energy of translational motion; in other terms, the kinetic energy of the average blood velocity is substantially higher than the energy of chaotic motion of the same particles. We state that the relaxation problem models a living system. The flow of entropy into the system decreases downstream, which corresponds to Schrödinger's general idea that a living system “feeds on” negentropy. We introduce a quantity that determines the complexity of the biosystem: the difference between the nonequilibrium kinetic entropy and the equilibrium entropy at each spatial point, integrated over the entire spatial region. Solutions of the spatial relaxation problems allow us to estimate the sizes of biosystems as regions of nonequilibrium. The results are compared with empirical data; in particular, for mammals we conclude that the larger the animal, the smaller the specific energy of metabolism. This feature is reproduced in our model, since the span of the nonequilibrium region is larger in a system where the reaction rate is lower or, in terms of the kinetic approach, where the relaxation time of the interaction between the molecules is longer. The approach is also used to estimate a part of a living system, namely a green leaf. The problems of aging as degradation of an open nonequilibrium system are considered.
The analogy is related to the structure: for a closed system, the equilibrium of the structure is attained for the same molecules, while in an open system a transition occurs to the equilibrium of different particles, which change due to metabolism. Two essentially different time scales are distinguished, whose ratio is approximately constant across animal species. Under the assumption that these two time scales exist, the kinetic equation splits into two equations describing the metabolic (stationary) and “degradative” (nonstationary) parts of the process.

    Views (last year): 31.
  10. Koganov A.V., Rakcheeva T.A., Prikhodko D.I.
    Experimental identification of the organization of mental calculations of the person on the basis of algebras of different associativity
    Computer Research and Modeling, 2019, v. 11, no. 2, pp. 311-327

    The work continues research on the ability of a person to increase the productivity of information processing by working in parallel or by improving the performance of the analyzers. A person receives a series of tasks whose solution requires processing a certain amount of information. The time and the validity of each decision are recorded. The dependence of the average solution time on the amount of information in the task is determined over the correctly solved tasks. In accordance with the proposed method, the tasks involve computing expressions in two algebras, one of which is associative and the other nonassociative. To facilitate the work of the subjects, figurative graphic images of the elements of the algebras were used in the experiment. Nonassociative calculations were implemented in the form of the game “rock-paper-scissors”: it was necessary to determine the winning symbol in a long line of these figures, considering that they appear sequentially from left to right and each plays against the previous winning symbol. Associative calculations were based on the recognition of drawings from a finite set of simple images: it was necessary to determine which figure from this set is missing from the line, or to state that all the pictures are present; in each task at most one picture was missing. Computation in an associative algebra admits parallel counting, while in the absence of associativity only sequential computation is possible. Therefore, the analysis of the time for solving a series of tasks distinguishes uniform sequential, accelerated sequential and parallel computing strategies. In the experiments it was found that all subjects used a uniform sequential strategy to solve the nonassociative tasks. For the associative task, all subjects used parallel computing, and some of them accelerated the parallel computation as the complexity of the task grew.
A small part of the subjects, judging by the evolution of the solution time, supplemented the parallel count at high complexity with a sequential stage of calculations (possibly to control the solution). We developed a special method for assessing the rate at which a person processes input information; it allowed us to estimate the level of parallelism of the calculation in the associative task. Parallelism of level two to three was registered. The characteristic speed of information processing in the sequential case (about one and a half symbols per second) is half the typical speed of human image recognition; apparently the difference reflects the time actually spent on the calculation process itself. For the associative task with a minimal amount of information, the solution time is close to the nonassociative case or smaller by less than a factor of two. This is probably because, for a small number of symbols, recognition almost exhausts the calculations required by the nonassociative task used.
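The nonassociative operation used in the experiment can be made concrete (a sketch; the encoding below is an assumption about the game logic described in the abstract): the rock-paper-scissors "winner" operation is not associative, so a line of symbols admits only a left-to-right evaluation.

```python
from functools import reduce

# Which pairs (x, y) mean "x beats y"; a draw returns the left symbol.
BEATS = {("rock", "scissors"), ("scissors", "paper"), ("paper", "rock")}

def rps(x, y):
    """Winner of x against y (the current winner plays the next symbol)."""
    return x if (x, y) in BEATS or x == y else y

# Non-associativity: regrouping the same three symbols changes the result.
left = rps(rps("rock", "paper"), "scissors")   # (rock * paper) * scissors
right = rps("rock", rps("paper", "scissors"))  # rock * (paper * scissors)
# left == "scissors", right == "rock", so the operation is not associative
# and a long line must be evaluated strictly left to right:
winner = reduce(rps, ["rock", "paper", "paper", "scissors", "rock"])
```

`reduce` implements exactly the rule from the experiment: each new symbol plays against the previous winner, which is why no parallel regrouping of the line is valid.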

    Views (last year): 16.

Indexed in Scopus

Full-text version of the journal is also available on the web site of the scientific electronic library eLIBRARY.RU

The journal is included in the Russian Science Citation Index

