Search results for 'компьютерное зрение' (computer vision):

Articles found: 27

Sereda-Kalinin P.Y., Vlasova A.S.
Explainable artificial intelligence: principles, methods and applications
Computer Research and Modeling, 2026, v. 18, no. 2, pp. 211-241

Explainable Artificial Intelligence (XAI) is a field of artificial intelligence that develops methods and tools for generating interpretable, human-understandable explanations of AI decisions. Model explainability becomes more important as artificial intelligence is deployed in critical domains (healthcare, finance, law), where algorithmic opacity can have serious consequences for users and society. This work presents an analytical review of the current state of XAI, covering theoretical foundations, methodology, and practical applications.

The methods reviewed were selected and systematized using a multi-level classification of XAI methods by problem formulation (goal, target audience, data type), methodology (application stage, model-specificity, methods, scale), and form of result (representation, presentation, evaluation metrics).

A comparative analysis of explainable AI methods is carried out for several application domains. For classical machine learning, SHAP and LIME are examined in detail, including their theoretical foundations, computational characteristics, and limitations. For computer vision, the review systematizes gradient-based methods (SmoothGrad, Integrated Gradients), activation visualization methods (Grad-CAM, Grad-CAM++), perturbation-based methods (RISE, Occlusion), and concept-based explanations (TCAV, Network Dissection). Special attention is paid to applying XAI to natural language processing and large language models (LLMs), including analysis of the faithfulness of Chain-of-Thought reasoning, natural language explanations, and attribution graph methods. Fundamental limitations of existing approaches to LLM explainability are identified, and directions for future research are outlined.

The review shows that XAI methods have reached significant maturity in classical machine learning and computer vision; their application to large language models, however, remains an open research problem that requires new explanation paradigms.

Keywords: explainable artificial intelligence, XAI, interpretability, model transparency, machine learning, deep learning, large language models.
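To make the perturbation-based family mentioned in this abstract concrete, here is a minimal sketch of occlusion sensitivity in plain Python. It is an illustration only, not code from the reviewed article: the function names, the patch size, and the toy scoring function are assumptions. The idea is to mask one block of the input at a time and record how much the model score drops.

```python
from typing import Callable, List

Image = List[List[float]]


def occlusion_map(image: Image,
                  score: Callable[[Image], float],
                  patch: int = 2,
                  fill: float = 0.0) -> Image:
    """Return, per pixel, the score drop caused by masking its patch."""
    height, width = len(image), len(image[0])
    baseline = score(image)
    heat = [[0.0] * width for _ in range(height)]
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            rows = range(top, min(top + patch, height))
            cols = range(left, min(left + patch, width))
            occluded = [row[:] for row in image]  # copy, then mask one block
            for r in rows:
                for c in cols:
                    occluded[r][c] = fill
            drop = baseline - score(occluded)
            for r in rows:
                for c in cols:
                    heat[r][c] = drop
    return heat


def toy_score(img: Image) -> float:
    """Hypothetical stand-in for a model: reacts only to the top-left 2x2 corner."""
    return sum(img[r][c] for r in range(2) for c in range(2))


toy_image = [[1.0] * 4 for _ in range(4)]
heat = occlusion_map(toy_image, toy_score, patch=2)
```

With a real classifier, `score` would return the probability of the class being explained, and large values in `heat` would mark the regions the prediction depends on.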
Nebaba S.G., Markov N.G.
Convolutional neural networks of YOLO family for mobile computer vision systems
Computer Research and Modeling, 2024, v. 16, no. 3, pp. 615-631

The work analyzes known classes of convolutional neural network (CNN) models and studies promising models selected from them for detecting flying objects in images. Object detection here means the detection, spatial localization, and classification of flying objects. The selected models are studied comprehensively to identify the most effective ones for building mobile real-time computer vision systems. Given the requirements formulated for such systems and, accordingly, for the CNN models underlying them, the YOLO family is shown to be the most suitable for detecting flying objects in images, with five models considered the most promising: YOLOv4, YOLOv4-Tiny, YOLOv4-CSP, YOLOv7 and YOLOv7-Tiny. A dataset was developed for training, validating, and studying these models. Each labeled image contains one or more flying objects of four classes: “bird”, “aircraft-type unmanned aerial vehicle”, “helicopter-type unmanned aerial vehicle”, and “unknown object” (objects in airspace not covered by the first three classes). The study showed that all of the models significantly exceed the specified threshold for detection speed (model inference speed), but only YOLOv4-CSP and YOLOv7 satisfy the detection (classification) accuracy requirement, and only partially. The “bird” class is the most difficult to detect. YOLOv7 is the most accurate classifier, with YOLOv4-CSP in second place. Both models are recommended for use in a mobile real-time computer vision system, provided that the number of “bird” images in the dataset is increased and the models are further trained so that they meet the accuracy requirement for each of the four classes.

Keywords: detection of flying objects in images, convolutional neural network, YOLO, mobile computer vision system.
Vrazhnov D.A., Shapovalov A.V., Nikolaev V.V.
Symmetries of differential equations in computer vision applications
Computer Research and Modeling, 2010, v. 2, no. 4, pp. 369-376

This work generalizes an approach to constructing invariant feature vectors of images in pattern recognition problems. The core of the proposed algorithm is to replace the commonly used Gaussian filtering of the source image with a convolution of the image function with the Green's function of an evolution operator, which inherits the symmetries of that operator. Such generalized filtering makes it possible to extract additional characteristics of the invariant feature vectors.

Keywords: computer vision, pattern recognition, filtering, symmetries of differential equations, Green's function.
Views (last year): 8. Citations: 4 (RSCI).
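For orientation, the Gaussian filter that this abstract takes as its starting point is itself a Green's function, namely that of the heat equation. The notation below ($I$ for the image, $u$ for the filtered image, $\mathcal{L}$ for a generic evolution operator) is an assumption chosen for illustration and is not taken from the article.

```latex
% Classical case: heat equation with the image as initial data
\partial_t u = \Delta u, \qquad u(x, 0) = I(x), \qquad x \in \mathbb{R}^2,
\qquad
u(x, t) = \int_{\mathbb{R}^2} G_t(x - y)\, I(y)\, dy,
\qquad
G_t(x) = \frac{1}{4\pi t} \exp\!\left(-\frac{|x|^2}{4t}\right).

% Generalized filtering: replace the Laplacian with an evolution operator \mathcal{L}
\partial_t u = \mathcal{L} u, \qquad
u(x, t) = \int_{\mathbb{R}^2} G_{\mathcal{L}}(x, y, t)\, I(y)\, dy .
```

Here $G_t$ is a Gaussian with variance $\sigma^2 = 2t$ per coordinate, so ordinary Gaussian smoothing corresponds to $\mathcal{L} = \Delta$; choosing another operator yields a kernel $G_{\mathcal{L}}$ that carries that operator's symmetries.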
Vrazhnov D.A., Shapovalov A.V., Nikolaev V.V.
On quality of object tracking algorithms
Computer Research and Modeling, 2012, v. 4, no. 2, pp. 303-313

Object movement in a video is classified as regular (movement along a continuous trajectory) or non-regular (trajectory breaks caused by occlusion of the tracked object by other objects, object jumps, and so on). For regular movement, the tracker is treated as a dynamical system, so the conditions for existence, uniqueness, and stability of the system's solution can serve as a criterion of correct tracker operation. A quantitative criterion for assessing the correctness of the mean-shift tracking algorithm is proposed, based on the Lipschitz condition and other tracker parameters. The result is generalized to an arbitrary tracking algorithm.

Keywords: computer vision, tracking, mean-shift, dynamical systems.
Views (last year): 20. Citations: 9 (RSCI).
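The view of a tracker as a dynamical system can be illustrated with a plain mean-shift iteration, where each update is a map applied to the current position estimate. The sketch below is not the criterion from the article: the Gaussian kernel, the bandwidth, the sample points, and the use of shrinking step lengths as a rough sign of contraction are all assumptions made for illustration.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def mean_shift_step(x: Point, samples: List[Point], bandwidth: float) -> Point:
    """One mean-shift update: Gaussian-weighted mean of the samples around x."""
    wsum = sx = sy = 0.0
    for px, py in samples:
        d2 = (px - x[0]) ** 2 + (py - x[1]) ** 2
        w = math.exp(-d2 / (2.0 * bandwidth ** 2))
        wsum += w
        sx += w * px
        sy += w * py
    return (sx / wsum, sy / wsum)


def iterate(x0: Point, samples: List[Point], bandwidth: float = 1.0,
            tol: float = 1e-6, max_iter: int = 100) -> Tuple[Point, List[float]]:
    """Iterate the map x -> mean_shift_step(x) and record the step lengths."""
    x, steps = x0, []
    for _ in range(max_iter):
        nxt = mean_shift_step(x, samples, bandwidth)
        steps.append(math.hypot(nxt[0] - x[0], nxt[1] - x[1]))
        x = nxt
        if steps[-1] < tol:
            break
    return x, steps


# Hypothetical target: a small cluster of feature points symmetric about (2, 3).
cluster = [(2.0, 3.0), (2.5, 3.0), (1.5, 3.0), (2.0, 3.5), (2.0, 2.5)]
mode, steps = iterate((1.5, 2.5), cluster)
```

When successive entries of `steps` shrink steadily, the iteration behaves like a contraction near the target, which is the kind of regular regime the abstract describes; an occlusion or jump would show up as a break in that pattern.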
Petrov M.N., Zimina S.V., Dyachenko D.L., Dubodelov A.V., Simakov S.S.
Dual-pass Feature-Fused SSD model for detecting multi-scale images of workers on the construction site
Computer Research and Modeling, 2023, v. 15, no. 1, pp. 57-73

When workers are recognized in surveillance-camera images of a construction site, the objects to be detected typically differ greatly in spatial scale, both from each other and from other objects. The accuracy of small-object detection can be improved by using the Feature-Fused modification of the SSD (Single Shot Detector) detector. Combined with overlapping image slicing at inference time, this model copes well with small objects. In practice, however, the approach requires manual tuning of the slicing parameters, and detection accuracy drops for large objects and for scenes that differ from those used in training. In this paper, we propose an algorithm that automatically selects optimal slicing parameters depending on the ratios of the characteristic geometric sizes of the objects in the image. We have developed a two-pass version of the Feature-Fused SSD detector for this purpose. On the first pass, a fast truncated version of the detector estimates the characteristic sizes of the objects of interest. On the second pass, the final detection is performed with the slicing parameters selected after the first pass. A dataset of images of workers on a construction site was collected; it includes large, small, and mixed-scale images of workers. To compare a one-pass algorithm without slicing, a one-pass algorithm with uniform slicing, and the two-pass algorithm with optimal slicing, we considered separate tests on large objects, on very small objects, on scenes with a high density of objects in both the foreground and the background, and on scenes with objects only in the background. Across the cases considered, our approach outperforms the approaches it was compared with, handles the problem of double detections well, and achieves 0.82–0.91 mAP (mean Average Precision).

Keywords: computer vision, construction site, single shot detector.
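The overlapping slicing described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions: the function names are invented, and `tile_size_from_objects` is a hypothetical heuristic for picking a tile size from first-pass object heights, not the selection rule from the article.

```python
from typing import List, Sequence, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom) in pixels


def tile_starts(length: int, tile: int, overlap: float) -> List[int]:
    """Start offsets of tiles of size `tile` that cover [0, length) with the given overlap."""
    if tile >= length:
        return [0]
    stride = max(1, round(tile * (1.0 - overlap)))
    starts = list(range(0, length - tile, stride))
    starts.append(length - tile)  # last tile sits flush with the image border
    return starts


def slice_image(width: int, height: int, tile: int, overlap: float = 0.2) -> List[Box]:
    """Enumerate overlapping square tiles, row by row."""
    boxes = []
    for top in tile_starts(height, tile, overlap):
        for left in tile_starts(width, tile, overlap):
            boxes.append((left, top, min(left + tile, width), min(top + tile, height)))
    return boxes


def tile_size_from_objects(object_heights: Sequence[float],
                           target_fraction: float = 0.25,
                           min_tile: int = 256) -> int:
    """Hypothetical rule: size tiles so the median object spans ~target_fraction of a tile."""
    ordered = sorted(object_heights)
    median = ordered[len(ordered) // 2]
    return max(min_tile, round(median / target_fraction))


# First pass (assumed output): rough object heights in pixels; second pass: slice accordingly.
tile = tile_size_from_objects([150, 200, 250])
tiles = slice_image(1920, 1080, tile=640, overlap=0.2)
```

Detections from the individual tiles would then be mapped back to full-image coordinates and merged, for example with non-maximum suppression, which is where double detections in the overlap zones have to be resolved.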
Semakin A.N.
Evaluation of the scalability property of the program for the simulation of atmospheric chemical transport by means of the simulator gem5
Computer Research and Modeling, 2020, v. 12, no. 4, pp. 773-794

In this work we present a new efficient program for numerically simulating the transcontinental transport of an impurity in the atmosphere from a natural or anthropogenic source. The program uses an adaptive finite-difference grid that concentrates its nodes inside the transported impurity cloud, where the mass fraction changes sharply, and coarsens the grid as much as possible everywhere else, which minimizes the total number of nodes. The adaptive grid is represented by a combination of dynamic (tree, linked list) and static (array) data structures: the dynamic structures are used to rebuild the grid, and the flow variables are computed on the static ones. This representation makes the program twice as fast as the conventional approach, which represents the adaptive grid with dynamic data structures only.

The program was written and tested on a computer with 6 CPU cores. Using the gem5 simulator, which can model a variety of computer systems, we estimated the scalability of the program on a larger number of cores (up to 32) for several models of a computer system of the form “computational cores – cache – main memory”, with different levels of detail. The composition of the computer system has a significant impact on scalability: on 32 cores, the maximum speedup rises from 14.2 with a two-level cache to 22.2 with a three-level cache. Depending on the model, the program runs $10^4$–$10^5$ times slower on a computer model in gem5 than on a real computer, and takes 1.5 hours for the most detailed and complex model.

We also describe in detail how to configure gem5 and how to run simulations in the most time-efficient way: code sections that are of no interest are executed on the physical processor of the host machine running gem5, and only the target code under study is executed inside the simulator.

Keywords: gem5, scalability property of programs, 3D atmospheric chemical transport.
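The two speedup figures quoted in this abstract (14.2 and 22.2 on 32 cores) are easier to compare as parallel efficiencies. The short sketch below does that arithmetic; the Amdahl's-law serial fraction is added only as a rough lens and is an assumption, since cache effects are not really a serial fraction and the article does not state that it uses this model.

```python
def parallel_efficiency(speedup: float, cores: int) -> float:
    """Fraction of ideal linear speedup actually achieved."""
    return speedup / cores


def amdahl_serial_fraction(speedup: float, cores: int) -> float:
    """Serial fraction s implied by Amdahl's law S = 1 / (s + (1 - s) / n)."""
    return (cores / speedup - 1.0) / (cores - 1.0)


CORES = 32
results = {}
for label, speedup in (("2 cache levels", 14.2), ("3 cache levels", 22.2)):
    results[label] = (
        parallel_efficiency(speedup, CORES),
        amdahl_serial_fraction(speedup, CORES),
    )
    print(f"{label}: efficiency {results[label][0]:.3f}, "
          f"implied serial fraction {results[label][1]:.4f}")
```

Under these assumptions the efficiency is about 0.44 with two cache levels and about 0.69 with three, which puts a number on how strongly the memory hierarchy affects scalability for the same program.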
Nazarov V.G., Prokhorov I.V., Yarovenko I.P.
Identification of inhomogeneous matter by pulsed multienergy tomography methods
Computer Research and Modeling, 2025, v. 17, no. 4, pp. 621-639

The article considers mathematical aspects of identifying a multicomponent scattering medium from pulsed multienergy X-ray irradiation data. X-ray diagnostics problems are of considerable theoretical and practical interest, and radiographic methods are indispensable in non-destructive testing of products.

Within a mathematical model based on the non-stationary integro-differential radiation transfer equation, two problems are formulated: the inverse problem of finding the attenuation coefficient from the radiation known at the boundary of the region, and the problem of identifying a substance from the values of the attenuation coefficient found on a discrete set of irradiation energies.

A wide list of substances of interest in computed tomography was preprocessed to determine whether they can be identified from an approximately specified attenuation coefficient characterizing the medium. An analysis of how close the substances are to one another in a certain norm showed that the set of all substances potentially contained in the medium splits into a finite number of non-intersecting clusters. When the probing signal is sufficiently short, the scattered component of the radiation leaving the medium is asymptotically small. This makes it possible to reduce the inverse problem for the radiation transfer equation to inverting the Radon transform of the attenuation coefficient. Numerical modeling on a specially developed digital phantom is used to analyze whether a substance can be identified uniquely or partially as the duration of the probing pulse and the number of irradiation energy levels are varied.

Keywords: pulse tomography, non-stationary radiation transfer equation, inverse problems, attenuation coefficient, substance identification, multi-energy probing.
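The reduction to a Radon transform mentioned in this abstract can be illustrated with the standard attenuation law for the unscattered component. The symbols below ($\mu$ for the attenuation coefficient, $I_0$ and $I$ for the incoming and outgoing intensity along a ray $L$ at energy $E$) are assumptions for illustration; the article's own non-stationary formulation is not reproduced here.

```latex
% Unscattered radiation along a straight ray L at energy E
I(L, E) = I_0(E)\, \exp\!\left( -\int_{L} \mu(r, E)\, d\ell \right)
\quad\Longrightarrow\quad
-\ln \frac{I(L, E)}{I_0(E)} = \int_{L} \mu(r, E)\, d\ell = (\mathcal{R}\mu)(L, E).
```

Once the scattered component is negligible, the measured data give the line integrals $(\mathcal{R}\mu)(L, E)$ directly, and $\mu(\cdot, E)$ can be recovered at each energy by inverting the Radon transform $\mathcal{R}$.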
Антонов И.В., Бруттан Ю.В., Горелов М.А., Яковлев Ю.С.
Гибридная нейронная сеть для прогнозирования характеристик покрытия при газопламенном напылении
Компьютерные исследования и моделирование, 2026, т. 18, № 1, с. 101-116

Представлена модель гибридной искусственной нейронной сети, основанная на архитектуре, включающей сверточный энкодер изображений (Convolutional Neural Network, CNN) и модуль внимания (Attention-based Multiple Instance Learning, Attention MIL), обеспечивающий агрегирование информативных признаков из последовательности кадров процесса газопламенного напыления. Дополнительные технологические параметры — давление воздуха, давление пропана и расстояние от сопла до поверхности — интегрируются в модель через табличный канал, что позволяет учитывать взаимосвязь между визуальными и числовыми характеристиками технологического режима. Программная реализация выполнена на платформе Streamlit с использованием библиотеки PyTorch и включает интерактивный интерфейс для обучения и визуализации результатов, анализ весов внимания по кадрам, а также режим прогнозирования выходных характеристик — шероховатости поверхности ($R_a$) и массы нанесенного слоя ($m$). Проведены экспериментальные исследования на данных реальных технологических процессов, выполнен сравнительный анализ точности различных конфигураций модели. Показано, что гибридная нейронная сеть, объединяющая визуальные и табличные признаки, обеспечивает более высокую точность прогноза по сравнению с моделями, использующими только одну из модальностей. При сравнении вариантов реализации гибридной нейронной сети установлено, что использование механизма внимания при формировании признаков серии изображений процесса газопламенного напыления обеспечивает существенное увеличение точности результатов по сравнению с режимом усреднения признаков без использования механизма внимания. В приложении реализован модуль визуализации внимания, который создает монтаж наиболее значимых кадров и отображает их веса внимания, что позволяет определить, какие кадры оказали наибольшее влияние на прогноз. Реализована возможность экспорта модели в формат ONNX для интеграции в системы технологического контроля. 
Предложенный подход демонстрирует эффективность слияния визуальной и табличной информации для задач мониторинга технологических процессов. Модель может служить основой для создания системы поддержки принятия решений или системы автоматизированного контроля качества покрытия при газопламенном напылении. Рассмотрены ограничения реализованной модели и перспективы ее дальнейшего развития.

Ключевые слова: газопламенное напыление, прогнозирование, гибридная нейронная сеть, Attention MIL, компьютерное зрение, Streamlit, ONNX, контроль качества покрытия.

Antonov I.V., Bruttan I.V., Gorelov M.A., Iakovlev I.S.
Hybrid neural network for predicting coating characteristics in flame spraying
Computer Research and Modeling, 2026, v. 18, no. 1, pp. 101-116

The paper presents a hybrid artificial neural network model based on an architecture that incorporates a convolutional image encoder (CNN) and an attention module (Attention-based Multiple Instance Learning, Attention MIL). This module aggregates informative features from a sequence of frames capturing the flame spraying process. Additional technological parameters—air pressure, propane pressure, and standoff distance — are integrated into the model via a tabular channel, enabling it to account for the relationship between visual data and numerical process regime characteristics. The software implementation was developed using the Streamlit platform and the PyTorch library. It features an interactive interface for model training and result visualization, analysis of attention weights across frames, and a prediction mode for output characteristics: surface roughness ($R_a$) and the mass of the deposited coating ($m$). Experimental studies were conducted on data from real-world technological processes, and a comparative analysis of the accuracy of various model configurations was performed. The results demonstrate that the hybrid neural network, which combines visual and tabular features, achieves higher prediction accuracy compared to models using only a single modality. Furthermore, when comparing different implementations of the hybrid network, it was established that using the attention mechanism to process the series of flame spray images provides a significant increase in accuracy over a simple averaging of features without attention. The application includes an attention visualization module that creates a montage of the most significant frames and displays their attention weights, allowing users to identify which frames had the greatest influence on the prediction. The model’s capability for export to the ONNX format for integration into process control systems is also demonstrated. 
The proposed approach showcases the effectiveness of fusing visual and tabular information for manufacturing process monitoring tasks. The model can serve as a foundation for developing a decision support system or an automated quality control system for coatings produced by flame spraying. The limitations of the implemented model and prospects for its further development are also considered.

Keywords: flame spraying, forecasting, hybrid neural network, Attention MIL, computer vision, Streamlit, ONNX, coating quality control.
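
The attention-based aggregation and the tabular channel described in the abstract above can be illustrated with a small sketch. This is not the authors' PyTorch implementation: it assumes the common Attention MIL formulation, in which each frame feature h_k receives a score w · tanh(V h_k) that is normalized with a softmax across frames, and it assumes fusion by simple concatenation. The matrices `V` and `w`, the feature sizes, and the numeric process parameters are all made-up illustrative values.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def attention_pool(frame_features, V, w):
    """Score each frame with w . tanh(V h), softmax the scores across frames,
    and return (attention weights, attention-weighted sum of frame features)."""
    scores = []
    for h in frame_features:
        hidden = [math.tanh(x) for x in matvec(V, h)]
        scores.append(sum(wi * hi for wi, hi in zip(w, hidden)))
    weights = softmax(scores)
    dim = len(frame_features[0])
    pooled = [sum(a * h[d] for a, h in zip(weights, frame_features))
              for d in range(dim)]
    return weights, pooled


def fuse(pooled, tabular):
    """Assumed fusion step: concatenate the visual embedding with the process parameters."""
    return pooled + list(tabular)


# Illustrative data only: 5 frames with 4-dimensional features
# (in the paper a CNN encoder would produce the per-frame features).
rng = random.Random(0)
frames = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(5)]
V = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(3)]  # hypothetical projection
w = [rng.uniform(-1, 1) for _ in range(3)]                      # hypothetical attention vector
tabular = [0.4, 0.2, 150.0]  # air pressure, propane pressure, standoff distance (made up)

weights, pooled = attention_pool(frames, V, w)
fused = fuse(pooled, tabular)

# With a zero attention vector every frame gets the same weight, which is
# exactly the "simple averaging without attention" baseline.
uniform_weights, mean_pooled = attention_pool(frames, V, [0.0, 0.0, 0.0])
```

The `weights` list is what an attention visualization would display per frame; a regression head predicting $R_a$ and $m$ would take `fused` as input.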
Kurzhanskiy A.A., Kurzhanski A.B.
Intersection in a smart city
Computer Research and Modeling, 2018, v. 10, no. 3, pp. 347-358

The reliability of automated control systems (ACS) and the safety of autonomous vehicles rest on the assumption that if the computer vision system installed on the vehicle can identify the objects in its field of view, and the ACS can reliably assess the intent and predict the behavior of each of these objects, then the vehicle can be operated safely without a driver. But what about objects that are not visible?

In this paper we consider a two-part problem: (1) a static part (on potential blind zones) and (2) a dynamic, real-time part (on identifying objects in blind zones and informing road users about such objects). The problem is considered in the context of urban intersections.

Keywords: autonomous vehicles, connected vehicles, connected intersections, blind zones, I2V, DSRC.

Kurzhanskiy A.A., Kurzhanski A.B.
Intersection in a smart city
Computer Research and Modeling, 2018, v. 10, no. 3, pp. 347-358

Intersections present a very demanding environment for all the parties involved. Challenges arise from complex vehicle trajectories; the occasional absence of lane markings to guide vehicles; split phases that prevent determining who has the right of way; invisible vehicle approaches; illegal movements; and simultaneous interactions among pedestrians, bicycles and vehicles. Unsurprisingly, most demonstrations of autonomous vehicles (AVs) are on freeways, but the full potential of automated vehicles (personalized transit, driverless taxis, delivery vehicles) can only be realized when AVs can sense the intersection environment well enough to maneuver through intersections efficiently and safely.

AVs are equipped with an array of on-board sensors to interpret and suitably engage with their surroundings. Advanced algorithms use the data streams from these sensors to support the movement of autonomous vehicles through a wide range of traffic and climatic conditions. However, there are situations in which additional information about the upcoming traffic environment would better inform the vehicles’ built-in tracking and navigation algorithms. A potential source of such information is in-pavement sensors at an intersection, which can be used to differentiate between motorized and non-motorized modes and to track road user movements and interactions. This type of information, in addition to signal phasing, can be provided to the AV as it approaches an intersection and incorporated into an improved prior for the probabilistic algorithms used to classify and track movement in the AV’s field of vision.
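
The idea of folding infrastructure data into an "improved prior" can be illustrated with a plain Bayes update over road-user classes. This is only a sketch of the principle, not the algorithm used in the paper: the class names, the prior supplied by the intersection sensors, and the on-board classifier likelihoods are all invented numbers.

```python
def posterior(prior, likelihood):
    """Bayes' rule over a discrete set of classes: posterior ∝ prior × likelihood."""
    unnormalized = {c: prior[c] * likelihood[c] for c in prior}
    total = sum(unnormalized.values())
    return {c: p / total for c, p in unnormalized.items()}


# Without infrastructure data the AV has no reason to prefer any class.
uniform_prior = {"vehicle": 1 / 3, "bicycle": 1 / 3, "pedestrian": 1 / 3}

# Hypothetical prior reported by in-pavement sensors for this approach lane.
sensor_prior = {"vehicle": 0.1, "bicycle": 0.7, "pedestrian": 0.2}

# Hypothetical likelihood of the current on-board observation under each class.
likelihood = {"vehicle": 0.5, "bicycle": 0.4, "pedestrian": 0.1}

without_sensor = posterior(uniform_prior, likelihood)
with_sensor = posterior(sensor_prior, likelihood)

best_without = max(without_sensor, key=without_sensor.get)
best_with = max(with_sensor, key=with_sensor.get)
```

With these made-up numbers the ambiguous on-board observation alone favors "vehicle", while the same observation combined with the sensor-informed prior favors "bicycle".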

This paper is concerned with the situation in which there are objects that are not visible to the AV. The driving context is that of an intersection, and the lack of visibility is due to other vehicles that obstruct the AV’s view, leading to the creation of blind zones. Such obstruction is commonplace in intersections.

Our objectives are to:

1) inform a vehicle crossing the intersection about its potential blind zones;

2) inform the vehicle about the presence of agents (other vehicles, bicyclists or pedestrians) in those blind zones.

Keywords: autonomous driving, connected vehicles, connected intersections, blind zones, I2V, DSRC.
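
A simplified geometric reading of the two objectives above: an agent is in a blind zone when the straight line of sight from the AV to the agent is blocked by another vehicle. The sketch below is an illustration under strong assumptions that are not taken from the paper: a flat 2D scene, a point observer, and obstructing vehicles modeled as axis-aligned rectangles `(xmin, ymin, xmax, ymax)`. The function names and coordinates are hypothetical.

```python
def segment_hits_box(p, q, box):
    """Liang-Barsky style test: does the segment p -> q pass through the box?"""
    (x0, y0), (x1, y1) = p, q
    xmin, ymin, xmax, ymax = box
    t_enter, t_exit = 0.0, 1.0
    for delta, low, high, start in ((x1 - x0, xmin, xmax, x0),
                                    (y1 - y0, ymin, ymax, y0)):
        if delta == 0:
            if start < low or start > high:
                return False  # parallel to this slab and outside it
            continue
        t_a, t_b = (low - start) / delta, (high - start) / delta
        t_enter = max(t_enter, min(t_a, t_b))
        t_exit = min(t_exit, max(t_a, t_b))
        if t_enter > t_exit:
            return False  # the segment leaves one slab before entering the other
    return True


def in_blind_zone(observer, target, obstacles):
    """True if any obstacle blocks the line of sight from observer to target."""
    return any(segment_hits_box(observer, target, box) for box in obstacles)


# Made-up scene: the AV at the origin, a truck occupying x in [4, 8], y in [-1, 1].
av = (0.0, 0.0)
truck = (4.0, -1.0, 8.0, 1.0)
cyclist = (12.0, 0.5)     # directly behind the truck
pedestrian = (6.0, 5.0)   # off to the side

cyclist_hidden = in_blind_zone(av, cyclist, [truck])
pedestrian_hidden = in_blind_zone(av, pedestrian, [truck])
```

In this scene the cyclist is reported as hidden and the pedestrian as visible; an infrastructure-side system with its own sensors could run the same check for every approaching vehicle and notify it about agents in its blind zones.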
Minnikhanov R.N., Anikin I.V., Dagaeva M.V., Asliamov T.I., Bolshakov T.E.
Approaches for image processing in the decision support system of the center for automated recording of administrative offenses of the road traffic
Computer Research and Modeling, 2021, v. 13, no. 2, pp. 405-415

The paper proposes several approaches to image processing in the decision support system (DSS) of the Center for Automated Recording of Administrative Offenses of the Road Traffic (CARAO). The main task of this DSS is to help the human operator obtain accurate information about the vehicle registration plate and the vehicle model from images produced by photo and video recording systems. Approaches to recognizing the registration plate and the vehicle brand/model in an image are proposed, based on modern neural network models. The registration plate is recognized with the LPRNet neural network model, supplemented with a Spatial Transformer Layer for image preprocessing. The vehicle brand/model is determined automatically with the ResNeXt-101-32x8d neural network architecture. An approach to building the training set for the plate recognition model is proposed, based on computer vision methods and machine learning algorithms. It uses the SIFT algorithm to find keypoints in the plate image and compute their descriptors, and the DBSCAN algorithm to remove outlier points. Plate recognition accuracy on the test set was 96%. An approach to speeding up fine-tuning and recognition of the vehicle brand/model is proposed, based on a new convolutional neural network architecture with “frozen” weights in the convolutional layers, an additional convolutional layer that parallelizes classification, and a set of binary classifiers at the output. The new architecture reduced the fine-tuning time of the brand/model recognition model by several orders of magnitude, with a final classification accuracy close to 99%. The proposed approaches were tested and deployed in the DSS of the CARAO of the Republic of Tatarstan.

Keywords: decision support system, image, computer vision, neural networks.

Minnikhanov R.N., Anikin I.V., Dagaeva M.V., Asliamov T.I., Bolshakov T.E.
Approaches for image processing in the decision support system of the center for automated recording of administrative offenses of the road traffic
Computer Research and Modeling, 2021, v. 13, no. 2, pp. 405-415

We propose several approaches to image processing tasks in the decision support system (DSS) of the Center for Automated Recording of Administrative Offenses of the Road Traffic (CARAO). The main task of this system is to assist the operator in obtaining accurate information about the vehicle registration plate and the vehicle brand/model from images obtained from photo and video recording systems. We propose an approach to registration plate recognition and brand/model classification based on modern neural network models. The LPRNet neural network model, supplemented with a Spatial Transformer Layer, is used to recognize the vehicle registration plate. The ResNeXt-101-32x8d neural network model is used to classify the vehicle brand/model. We propose an approach to constructing the training set for the plate recognition network, based on computer vision methods and machine learning algorithms. The SIFT algorithm is used to detect and describe local features in images containing the registration plate, and DBSCAN clustering is used to detect and remove outliers among these features. The accuracy of registration plate recognition was 96% on the test set. We also propose an approach to making the ResNeXt-101-32x8d model more efficient at the fine-tuning and classification stages. It is based on a new convolutional neural network architecture with “frozen” weights in the convolutional layers, an additional convolutional layer that parallelizes the classification process, and a set of binary classifiers at the output. This approach significantly reduced the fine-tuning time required when classification of a new vehicle brand/model is needed. The final accuracy of vehicle brand/model classification was 99% on the test set.
The proposed approaches were tested and implemented in the DSS of the CARAO of the Republic of Tatarstan.

Keywords: decision-support system, video image, computer vision, neural networks.
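
The outlier-removal step mentioned in the abstract above (DBSCAN applied to SIFT keypoints) can be sketched in a few lines. This is a minimal, unoptimized DBSCAN written for illustration, not the authors' pipeline. It assumes clustering is done on 2D pixel coordinates of the keypoints, which the abstract does not specify, and the coordinates, `eps`, and `min_pts` values are invented; in practice the keypoints would come from a SIFT detector.

```python
import math


def dbscan(points, eps, min_pts):
    """Minimal DBSCAN. Returns one label per point: 0, 1, ... for clusters, -1 for noise."""
    UNVISITED, NOISE = None, -1
    labels = [UNVISITED] * len(points)

    def neighbors(i):
        # Indices of all points within eps of point i (including i itself).
        return [j for j, p in enumerate(points) if math.dist(points[i], p) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may later be relabeled as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            j_neighbors = neighbors(j)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is a core point: keep expanding
    return labels


# Made-up keypoint coordinates: five points on a plate-like region, two stray points.
keypoints = [(10, 10), (11, 10), (10, 11), (11, 11), (12, 10), (50, 40), (90, 5)]
labels = dbscan(keypoints, eps=2.0, min_pts=3)

# Keep only the keypoints that DBSCAN did not mark as noise.
inliers = [p for p, label in zip(keypoints, labels) if label != -1]
```

Here the five clustered keypoints are kept and the two isolated ones are discarded; the surviving points could then be used to localize the plate region when assembling training samples.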


Indexed in Scopus

Full-text version of the journal is also available on the web site of the scientific electronic library eLIBRARY.RU

The journal is included in the Russian Science Citation Index

The journal is included in the RSCI

International Interdisciplinary Conference "Mathematics. Computing. Education"