-
Resource-adaptive approach to structured text data annotation using small language models
Computer Research and Modeling, 2026, v. 18, no. 1, pp. 41-59
This paper presents an experimental study of automatic annotation of text data in question-answer format (QA pairs) under limited computing resources and data protection requirements. Unlike traditional approaches based on rigid rules or external APIs, we propose using small language models that can run locally without a GPU on standard CPU systems. Two models were selected for testing, Gemma-3-4b and Qwen-2.5-3b (4-bit quantized versions), and a corpus of documents with a clear structure and a formally rigorous style of presentation was used as source material. An automatic annotation system was developed that implements the full cycle of QA dataset generation: automatic division of the source document into logically connected fragments, generation of question-answer pairs using the Gemma-3-4b model, preliminary verification of their correctness by Qwen-2.5-3b based on evidence spans from the context, and expert quality assessment. The results are exported in JSONL format. Performance evaluation covers the entire QA pair generation system, including fragment processing by the local language model and the text preprocessing and postprocessing modules. Performance is measured by the time needed to generate a single QA pair, the total throughput of the system, RAM usage, and CPU load, which allows an objective assessment of the computational efficiency of the proposed approach when running on a CPU. An experiment on an extended sample of 12 documents showed that automatic annotation performs stably across different types of documents, while manual annotation is characterized by significantly higher time costs and high variability. Depending on the type of document, automatic annotation is 8 to 14 times faster than the manual process.
Quality analysis showed that most of the generated QA pairs have high semantic consistency with the original context, with only a limited proportion of the data requiring expert correction or exclusion. Although full manual validation of the corpus (the "gold standard") was not performed as part of this work, the combination of automatic evaluation and selective expert review allows us to consider the resulting quality level acceptable for preliminary automated annotation tasks. Overall, the results confirm the practical applicability of small language models for building autonomous and reproducible automatic text annotation systems under limited computational resources and provide a basis for further research on efficient preparation of training corpora for natural language processing tasks.
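The annotation cycle described in this abstract (fragmenting, QA generation, evidence-based verification, JSONL export) can be illustrated with a minimal Python sketch. It is not the authors' implementation. All function names, the `max_chars` parameter, and the record fields (`question`, `answer`, `evidence`, `context`) are assumptions made for illustration. The generation step is passed in as a hypothetical callable `generate_pairs` standing in for a local model call, and the verification step is replaced here by a plain substring check of the evidence span, whereas the paper uses a second language model for that role.

```python
import json
import re


def split_into_fragments(text, max_chars=800):
    """Group blank-line-separated paragraphs into fragments of at most max_chars.

    A single paragraph longer than max_chars is kept whole as its own fragment.
    """
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]
    fragments, current = [], ""
    for p in paragraphs:
        if current and len(current) + len(p) + 1 > max_chars:
            fragments.append(current)
            current = p
        else:
            current = f"{current}\n{p}" if current else p
    if current:
        fragments.append(current)
    return fragments


def normalize(s):
    """Lowercase and collapse whitespace so minor formatting differences are ignored."""
    return " ".join(s.lower().split())


def evidence_supported(pair, context):
    """Assumed stand-in for model-based verification: the evidence span
    must occur verbatim (after normalization) in the fragment."""
    span = normalize(pair.get("evidence", ""))
    return bool(span) and span in normalize(context)


def annotate(text, generate_pairs, max_chars=800):
    """Run the cycle: fragment, generate QA pairs, keep only supported pairs."""
    records = []
    for fragment in split_into_fragments(text, max_chars):
        for pair in generate_pairs(fragment):  # hypothetical model call
            if evidence_supported(pair, fragment):
                records.append({**pair, "context": fragment})
    return records


def to_jsonl(records):
    """Serialize records as JSON Lines: one JSON object per line."""
    return "\n".join(json.dumps(r, ensure_ascii=False) for r in records)
```

In such a sketch, pairs whose evidence span cannot be located in the fragment are dropped before export, which mirrors the idea of preliminary verification ahead of expert review.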
-
The development of an ARM system on chip based processing unit for data stream computing
Computer Research and Modeling, 2015, v. 7, no. 3, pp. 505-509
Modern big science projects are becoming highly data intensive, to the point where offline processing of stored data is infeasible. High data throughput computing, or Data Stream Computing, is required for future projects to deal with terabytes of data per second, which cannot be kept in long-term storage. Conventional data centres based on typical server-grade hardware are expensive and are biased towards processing power. The overall I/O bandwidth can be increased with massive parallelism, usually at the expense of excessive processing power and high energy consumption. An ARM System on Chip (SoC) based processing unit may address the issues of system I/O and CPU balance, affordability and energy efficiency, since ARM SoCs are mass produced and designed to be energy efficient for use in mobile devices. Such a processing unit is currently in development, with a design goal of 20 Gb/s I/O throughput and significant processing power. The I/O capabilities of consumer ARM SoCs are discussed, along with performance and I/O throughput tests to date.
-
A CPU benchmarking characterization of ARM based processors
Computer Research and Modeling, 2015, v. 7, no. 3, pp. 581-586
Big science projects are producing data at ever-increasing rates. Typical techniques involve storing the data to disk, after minor filtering, and then processing it in large computer farms. Data production has reached a point where on-line processing is required in order to filter the data down to manageable sizes. A potential solution involves using low-cost, low-power ARM processors in large arrays to provide massive parallelisation for data stream computing (DSC). The main advantage of using Systems on Chip (SoCs) is inherent in their design philosophy: SoCs are primarily used in mobile devices and hence consume less power while maintaining relatively good performance. A benchmarking characterisation of three different models of ARM processors is presented.
-
Memory benchmarking characterisation of ARM-based SoCs
Computer Research and Modeling, 2015, v. 7, no. 3, pp. 607-613
Computational intensity is traditionally the focus of large-scale computing system designs, generally leaving such designs ill-equipped to handle throughput-oriented workloads efficiently. In addition, cost and energy consumption remain a source of concern for large-scale computing systems in general. A potential solution involves using low-cost, low-power ARM processors in large arrays in a manner that provides massive parallelisation and high rates of data throughput (relative to existing large-scale computing designs). Giving greater priority to both throughput rate and cost considerations increases the relevance of primary memory performance and design optimisations to overall system performance. Using several primary memory performance benchmarks to evaluate various aspects of RAM and cache performance, we characterise the performance of four different models of ARM-based system-on-chip, namely the Cortex-A9, Cortex-A7, Cortex-A15 r3p2 and Cortex-A15 r3p3. We then discuss the relevance of these results to high-volume computing and the potential for ARM processors.