Search results for 'formal automat':
Articles found: 6
  1. Editor’s note
    Computer Research and Modeling, 2024, v. 16, no. 7, pp. 1533-1538
  2. Antipova S.A., Zhurkin A.M.
    Resource-adaptive approach to structured text data annotation using small language models
    Computer Research and Modeling, 2026, v. 18, no. 1, pp. 41-59

    This paper presents an experimental study of the application of automatic annotation of text data in the question – answer format (QA pairs) under conditions of limited computing resources and data protection requirements. Unlike traditional approaches based on rigid rules or the use of external APIs, we propose using small language models with a small number of parameters that can function locally without a GPU on standard CPU systems. Two models were selected for testing — Gemma-3-4b and Qwen-2.5-3b (quantized 4-bit versions) — and a corpus of documents with a clear structure and a formally rigorous style of presentation was used as source material. An automatic annotation system was developed that implements the full cycle of QA dataset generation: automatic division of the source document into logically connected fragments, formation of “question – answer” pairs using the Gemma-3-4b model, preliminary verification of their correctness using Qwen-2.5-3b based on evidence span from the context and expert quality assessment. The results are exported in JSONL format. Performance evaluation covers the entire QA pair generation system, including fragment processing by the local language model, text preprocessing and postprocessing modules. Performance is measured by the time it takes to generate a single QA pair, the total throughput of the system, RAM usage, and CPU load, which allows for an objective assessment of the computational efficiency of the proposed approach when running on a CPU. An experiment on an extended sample of 12 documents showed that automatic annotation demonstrates stable performance when processing different types of documents, while manual annotation is characterized by significantly higher time costs and high variability. Depending on the type of document, the acceleration of annotation compared to the manual process ranges from 8 to 14 times. 
Quality analysis showed that most of the generated QA pairs have high semantic consistency with the original context, with only a limited proportion of data requiring expert correction or exclusion. Although full manual validation of the corpus (the “gold standard”) was not performed as part of this work, the combination of automatic evaluation and selective expert review allows us to consider the resulting quality level acceptable for preliminary automated annotation tasks. Overall, the results confirm the practical applicability of small language models for building autonomous and reproducible automatic text annotation systems under limited computational resources and provide a basis for further research in the field of effective training corpus preparation for natural language processing tasks.
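
The verification and export steps described in this abstract can be pictured with a minimal sketch. It is not the authors' system: the `QAPair` fields, `is_grounded`, and `to_jsonl` are hypothetical names, and a plain substring check stands in for the model-based verification that the paper performs with Qwen-2.5-3b.

```python
import json
from dataclasses import dataclass, asdict
from typing import Iterable


@dataclass
class QAPair:
    """One generated pair; the field names are assumptions, not the paper's schema."""
    question: str
    answer: str
    evidence: str  # span quoted from the source fragment


def is_grounded(pair: QAPair, context: str) -> bool:
    """Accept a pair only if its evidence span occurs verbatim in the context.

    Whitespace is normalized first, so line breaks in the fragment do not matter.
    This is a simple stand-in for a model-based correctness check.
    """
    span = " ".join(pair.evidence.split())
    normalized = " ".join(context.split())
    return bool(span) and span in normalized


def to_jsonl(pairs: Iterable[QAPair]) -> str:
    """Serialize the pairs as JSON Lines: one JSON object per line."""
    return "\n".join(json.dumps(asdict(p), ensure_ascii=False) for p in pairs)


# Illustrative data, invented for this sketch.
context = "The device must be calibrated   every 12 months."
good = QAPair("How often is calibration required?", "Every 12 months",
              "calibrated every 12 months")
bad = QAPair("How often is calibration required?", "Every 6 months",
             "calibrated every 6 months")
kept = [p for p in (good, bad) if is_grounded(p, context)]
jsonl_text = to_jsonl(kept)
```

Filtering before export keeps unsupported pairs out of the JSONL file, so the later expert review only sees pairs whose evidence can be located in the source fragment.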

  3. We construct new tests that make it possible to increase human information-processing capacity through the parallel execution of several logic operations of a prescribed type. To check the causes of the capacity increase, we develop control tests on the same class of logic operations in which parallel organization of the calculations is inefficient. We use the apparatus of universal algebra and automata theory. This article continues a series of works investigating the human capacity for parallel calculations; the main publications on this topic are listed in the references. The tasks in the described tests can be defined as calculating the result of a sequence of same-type operations from some algebra. If the operation is associative, parallel calculation is efficient given a successful grouping of the process. In operations theory, this corresponds to the simultaneous work of several processors. Each processor transforms a certain known number of elements of the input data or of the intermediate results per unit time (the processor productivity). It is currently unknown what kind of data elements the brain uses for logical or mathematical calculation, or how many elements are processed per unit time. Therefore, the test contains a sequence of task presentations with different numbers of logical operations over a fixed alphabet. This number is the measure of task complexity. Analysis of the dependence of solution time on complexity makes it possible to estimate the processor productivity and the form of the calculation organization. For sequential calculations only one processor is working, and the solution time is a linear function of complexity.
If new processors begin to work in parallel as the task complexity increases, then the dependence of solution time on complexity is represented by a curve that is convex at the bottom. To detect the situation in which a person increases the speed of a single processor as complexity increases, we use a task series with similar operations but in a non-associative algebra. In such tasks parallel calculation has low efficiency, in the sense of the gain obtained by increasing the number of processors. This is the control set of tests. In the article we consider one more class of tests, based on calculating the trajectory of the state of a formal automaton for a given input sequence. We investigate a special class of automata (relays) whose construction affects the efficiency of parallel calculation of the final automaton state. For all tests we estimate the efficiency of parallel calculation. This article does not contain experimental results.

    Views (last year): 14. Citations: 1 (RSCI).
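
The contrast that the abstract of item 3 draws between associative and non-associative operations can be illustrated with a small sketch. It is an assumption-laden toy, not the authors' test design: `sequential` and `tree_reduce` are hypothetical helper names, and XOR and subtraction are chosen here only as convenient examples of an associative and a non-associative operation.

```python
import operator
from functools import reduce
from typing import Callable, List, Sequence, Tuple


def sequential(op: Callable[[int, int], int], items: Sequence[int]) -> Tuple[int, int]:
    """Left-to-right evaluation on one processor: len(items) - 1 steps."""
    return reduce(op, items), len(items) - 1


def tree_reduce(op: Callable[[int, int], int], items: Sequence[int]) -> Tuple[int, int]:
    """Pairwise grouping: all operations within one round are independent,
    so each round could be executed by several processors at once.
    Returns the result and the number of rounds."""
    level: List[int] = list(items)
    rounds = 0
    while len(level) > 1:
        nxt = [op(level[i], level[i + 1]) for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:  # an unpaired last element moves up unchanged
            nxt.append(level[-1])
        level = nxt
        rounds += 1
    return level[0], rounds


bits = [1, 0, 1, 1, 0, 1, 0, 1]
seq_xor = sequential(operator.xor, bits)    # (1, 7): seven sequential steps
par_xor = tree_reduce(operator.xor, bits)   # (1, 3): same result in three rounds

nums = [8, 4, 2, 1]
seq_sub = sequential(operator.sub, nums)    # ((8 - 4) - 2) - 1 = 1
par_sub = tree_reduce(operator.sub, nums)   # (8 - 4) - (2 - 1) = 3, regrouping changes the result
```

For the associative operation, regrouping keeps the result while the number of rounds grows only logarithmically with the number of operands. For subtraction, the regrouped evaluation gives a different answer, so the task has to be solved in the prescribed order.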
  4. Pekhterev A.A., Domaschenko D.V., Guseva I.A.
    Modelling of trends in the volume and structure of accumulated credit indebtedness in the banking system
    Computer Research and Modeling, 2019, v. 11, no. 5, pp. 965-978

    The volume and structure of accumulated credit debt to the banking system depend on many factors, the most important of which is the level of interest rates. A correct assessment of borrowers’ reaction to changes in monetary policy makes it possible to develop econometric models representing the structure of the credit portfolio in the banking system by terms of lending. These models help to calculate indicators characterizing the level of interest rate risk in the whole system. In the study, we identified four types of models: a discrete linear model based on transfer functions, a state-space model, the classical econometric ARMAX model, and a nonlinear Hammerstein–Wiener model. To describe them, we employed the formal language of automatic control theory; to identify the models, we used the MATLAB software package. The study revealed that the discrete linear state-space model is the most suitable for short-term forecasting of both the volume and the structure of credit debt, which in turn makes it possible to predict trends in the structure of accumulated credit debt on a forecasting horizon of 1 year. The model based on real data has shown that the structure of credit debt by payback period is highly sensitive to changes in the Central Bank’s monetary policy. Thus, a sharp increase in interest rates in response to external market shocks leads borrowers to shorten credit terms; at the same time, the overall level of debt rises, primarily due to the increasing revaluation of nominal debt. During a stable falling trend of interest rates, the structure shifts toward long-term debts.
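
As a reminder of what a discrete linear state-space model computes, here is a minimal sketch of the recurrence x[k+1] = A x[k] + B u[k], y[k] = C x[k] with a scalar input and a scalar output. It is not the authors' MATLAB model: the function names and all coefficients are illustrative assumptions and carry no economic meaning.

```python
from typing import List, Sequence

Matrix = List[List[float]]


def mat_vec(m: Matrix, v: Sequence[float]) -> List[float]:
    """Multiply a matrix by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def simulate(A: Matrix, B: Sequence[float], C: Sequence[float],
             x0: Sequence[float], inputs: Sequence[float]) -> List[float]:
    """Run x[k+1] = A x[k] + B u[k], y[k] = C x[k] and return the outputs y[k]."""
    x = list(x0)
    outputs: List[float] = []
    for u in inputs:
        outputs.append(sum(c * xi for c, xi in zip(C, x)))  # y[k] = C x[k]
        ax = mat_vec(A, x)
        x = [axi + bi * u for axi, bi in zip(ax, B)]         # x[k+1]
    return outputs


# Hypothetical two-state example: u[k] could stand for an interest-rate signal,
# y[k] for an aggregate built from two debt components.
A = [[0.5, 0.0], [0.0, 0.25]]
B = [1.0, 1.0]
C = [1.0, 1.0]
ys = simulate(A, B, C, x0=[0.0, 0.0], inputs=[1.0, 1.0, 1.0])
```

In an identification setting, the entries of A, B, and C would be estimated from observed input and output series rather than fixed by hand as they are here.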

  5. Musaev A.A., Grigoriev D.A.
    Extracting knowledge from text messages: overview and state-of-the-art
    Computer Research and Modeling, 2021, v. 13, no. 6, pp. 1291-1315

    In general, solving the information explosion problem can be delegated to systems for automatic processing of digital data. These systems are intended for recognizing, sorting, meaningfully processing and presenting data in formats readable and interpretable by humans. The creation of intelligent knowledge extraction systems that handle unstructured data would be a natural solution in this area. At the same time, the evident progress in these tasks for structured data contrasts with the limited success of unstructured data processing, and, in particular, document processing. Currently, this research area is undergoing active development and investigation. The present paper is a systematic survey on both Russian and international publications that are dedicated to the leading trend in automatic text data processing: Text Mining (TM). We cover the main tasks and notions of TM, as well as its place in the current AI landscape. Furthermore, we analyze the complications that arise during the processing of texts written in natural language (NLP) which are weakly structured and often provide ambiguous linguistic information. We describe the stages of text data preparation, cleaning, and selecting features which, alongside the data obtained via morphological, syntactic, and semantic analysis, constitute the input for the TM process. This process can be represented as mapping a set of text documents to «knowledge». Using the case of stock trading, we demonstrate the formalization of the problem of making a trade decision based on a set of analytical recommendations. Examples of such mappings are methods of Information Retrieval (IR), text summarization, sentiment analysis, document classification and clustering, etc. The common point of all tasks and techniques of TM is the selection of word forms and their derivatives used to recognize content in NL symbol sequences. 
Considering IR as an example, we examine classic types of search, such as searching for word forms, phrases, patterns and concepts. Additionally, we consider the augmentation of patterns with syntactic and semantic information. Next, we provide a general description of all NLP instruments: morphological, syntactic, semantic and pragmatic analysis. Finally, we end the paper with a comparative analysis of modern TM tools which can be helpful for selecting a suitable TM platform based on the user’s needs and skills.

  6. Vassilevski Y.V., Simakov S.S., Gamilov T.M., Salamatova V.Yu., Dobroserdova T.K., Kopytov G.V., Bogdanov O.N., Danilov A.A., Dergachev M.A., Dobrovolskii D.D., Kosukhin O.N., Larina E.V., Meleshkina A.V., Mychka E.Yu., Kharin V.Yu., Chesnokova K.V., Shipilov A.A.
    Personalization of mathematical models in cardiology: obstacles and perspectives
    Computer Research and Modeling, 2022, v. 14, no. 4, pp. 911-930

    Most biomechanical tasks of interest to clinicians can be solved only using personalized mathematical models. Such models make it possible to formalize and relate key pathophysiological processes, to evaluate, on the basis of clinically available data, non-measurable parameters that are important for the diagnosis of diseases, and to predict the result of a therapeutic or surgical intervention. The use of models in clinical practice imposes additional restrictions: clinicians require model validation on clinical cases, as well as speed and automation of the entire computational pipeline, from processing input data to obtaining a result. Limitations on the simulation time, determined by the time available for making a medical decision (on the order of several minutes), imply the use of reduction methods that correctly describe the processes under study within the framework of reduced models or machine learning tools.

    Personalization of models requires patient-oriented parameters, personalized geometry of a computational domain and generation of a computational mesh. Model parameters are estimated by direct measurements, or methods of solving inverse problems, or methods of machine learning. The requirement of personalization imposes severe restrictions on the number of fitted parameters that can be measured under standard clinical conditions. In addition to parameters, the model operates with boundary conditions that must take into account the patient’s characteristics. Methods for setting personalized boundary conditions significantly depend on the clinical setting of the problem and clinical data. Building a personalized computational domain through segmentation of medical images and generation of the computational grid, as a rule, takes a lot of time and effort due to manual or semi-automatic operations. Development of automated methods for setting personalized boundary conditions and segmentation of medical images with the subsequent construction of a computational grid is the key to the widespread use of mathematical modeling in clinical practice.

    The aim of this work is to review our solutions for personalization of mathematical models within the framework of three tasks of clinical cardiology: virtual assessment of the hemodynamic significance of coronary artery stenosis, calculation of global blood flow after hemodynamic correction of complex heart defects, and calculation of coaptation characteristics of a reconstructed aortic valve.

Indexed in Scopus

Full-text version of the journal is also available on the web site of the scientific electronic library eLIBRARY.RU

The journal is included in the Russian Science Citation Index

The journal is included in the RSCI
