Результаты поиска по 'combinatorial algorithms':
Найдено статей: 2
  1. Kholodov Y.A., Salloum H., Jnadi A., Khubiev K.Yu., Petrenko A.
    Quantum-inspired episode selection for Monte Carlo reinforcement learning via QUBO optimization
    Computer Research and Modeling, 2026, v. 18, no. 2, pp. 273-288

    Monte Carlo (MC) reinforcement learning suffers from high sample complexity, especially in environments with sparse rewards, large state spaces, and strongly correlated trajectories that reduce the statistical efficiency of return estimation. These well-known limitations often lead to slow convergence and unstable learning dynamics, particularly in settings where only a small fraction of collected trajectories is actually informative for policy improvement. A key challenge is therefore to identify a compact yet diverse subset of episodes that contributes most to the accuracy of value estimates while preserving sufficient exploration of the environment. To address this challenge, we reformulate episode selection as a Quadratic Unconstrained Binary Optimization (QUBO) problem and solve it using quantum-inspired sampling techniques. Our method, MC+ QUBO, inserts a combinatorial filtering step into the standard MC policy-evaluation pipeline: given a batch of trajectories, it selects a subset that maximizes cumulative reward and encourages broad state-space coverage. This selection procedure is expressed as a QUBO model, where linear terms favor high-return episodes, quadratic terms penalize redundancy between trajectories, and additional coupling terms can be used to enforce coverage-related constraints or promote structural diversity. Within this framework, we investigate two black-box QUBO solvers: Simulated Quantum Annealing (SQA), which emulates tunneling-based exploration of the search landscape, and Simulated Bifurcation (SB), a dynamical-systems-based iterative optimization method. Both solvers demonstrate the ability to efficiently navigate the combinatorial structure of the trajectory-selection problem and to handle batch sizes that are otherwise computationally expensive for exhaustive or deterministic search. Experiments in a finite-horizon GridWorld environment show that MC+QUBO consistently outperforms vanilla MC in convergence speed, stability of return estimates, and final policy quality. These results highlight the promise of quantum-inspired optimization as a practical decision-making subroutine within reinforcement-learning algorithms, offering a scalable way to improve sample efficiency without modifying the underlying learning paradigm.

  2. Fedorov A.A., Soshilov I.V., Loginov V.N.
    Augmented data routing algorithms for satellite delay-tolerant networks. Development and validation
    Computer Research and Modeling, 2022, v. 14, no. 4, pp. 983-993

    The problem of centralized planning for data transmission routes in delay tolerant networks is considered. The original problem is extended with additional requirements to nodes storage and communication process. First, it is assumed that the connection between the nodes of the graph is established using antennas. Second, it is assumed that each node has a storage of finite capacity. The existing works do not consider these requirements. It is assumed that we have in advance information about messages to be processed, information about the network configuration at specified time points taken with a certain time periods, information on time delays for the orientation of the antennas for data transmission and restrictions on the amount of data storage on each satellite of the grouping. Two wellknown algorithms — CGR and Earliest Delivery with All Queues are improved to satisfy the extended requirements. The obtained algorithms solve the optimal message routing problem separately for each message. The problem of validation of the algorithms under conditions of lack of test data is considered as well. Possible approaches to the validation based on qualitative conjectures are proposed and tested, and experiment results are described. A performance comparison of the two implementations of the problem solving algorithms is made. Two algorithms named RDTNAS-CG and RDTNAS-AQ have been developed based on the CGR and Earliest Delivery with All Queues algorithms, respectively. The original algorithms have been significantly expanded and an augmented implementation has been developed. Validation experiments were carried to check the minimum «quality» requirements for the correctness of the algorithms. Comparative analysis of the performance of the two algorithms showed that the RDTNAS-AQ algorithm is several orders of magnitude faster than RDTNAS-CG.

Indexed in Scopus

Full-text version of the journal is also available on the web site of the scientific electronic library eLIBRARY.RU

The journal is included in the Russian Science Citation Index

The journal is included in the RSCI

International Interdisciplinary Conference "Mathematics. Computing. Education"