Результаты поиска по 'machine learning':
Найдено статей: 38
  1. Umavovskiy A.V.
    Data-driven simulation of a two-phase flow in heterogenous porous media
    Computer Research and Modeling, 2021, v. 13, no. 4, pp. 779-792

    The numerical methods used to simulate the evolution of hydrodynamic systems require the considerable use of computational resources thus limiting the number of possible simulations. The data-driven simulation technique is one promising approach to the development of heuristic models, which may speed up the study of such models. In this approach, machine learning methods are used to tune the weights of an artificial neural network that predicts the state of a physical system at a given point in time based on initial conditions. This article describes an original neural network architecture and a novel multi-stage training procedure which create a heuristic model of a two-phase flow in a heterogeneous porous medium. The neural network-based model predicts the states of the grid cells at an arbitrary timestep (within the known constraints), taking in only the initial conditions: the properties of the heterogeneous permeability of the medium and the location of sources and sinks. The proposed model requires orders of magnitude less processor time in comparison with the classical numerical method, which served as a criterion for evaluating the effectiveness of the trained model. The proposed architecture includes a number of subnets trained in various combinations on several datasets. The techniques of adversarial training and weight transfer are utilized.

  2. Yifter T.T., Razoumny Y.N., Orlovsky A.V., Lobanov V.K.
    Monitoring the spread of Sosnowskyi’s hogweed using a random forest machine learning algorithm in Google Earth Engine
    Computer Research and Modeling, 2022, v. 14, no. 6, pp. 1357-1370

    Examining the spectral response of plants from data collected using remote sensing has a lot of potential for solving real-world problems in different fields of research. In this study, we have used the spectral property to identify the invasive plant Heracleum sosnowskyi Manden from satellite imagery. H. sosnowskyi is an invasive plant that causes many harms to humans, animals and the ecosystem at large. We have used data collected from the years 2018 to 2020 containing sample geolocation data from the Moscow Region where this plant exists and we have used Sentinel-2 imagery for the spectral analysis towards the aim of detecting it from the satellite imagery. We deployed a Random Forest (RF) machine learning model within the framework of Google Earth Engine (GEE). The algorithm learns from the collected data, which is made up of 12 bands of Sentinel-2, and also includes the digital elevation together with some spectral indices, which are used as features in the algorithm. The approach used is to learn the biophysical parameters of H. sosnowskyi from its reflectances by fitting the RF model directly from the data. Our results demonstrate how the combination of remote sensing and machine learning can assist in locating H. sosnowskyi, which aids in controlling its invasive expansion. Our approach provides a high detection accuracy of the plant, which is 96.93%.

  3. Skorik S.N., Pirau V.V., Sedov S.A., Dvinskikh D.M.
    Comparsion of stochastic approximation and sample average approximation for saddle point problem with bilinear coupling term
    Computer Research and Modeling, 2023, v. 15, no. 2, pp. 381-391

    Stochastic optimization is a current area of research due to significant advances in machine learning and their applications to everyday problems. In this paper, we consider two fundamentally different methods for solving the problem of stochastic optimization — online and offline algorithms. The corresponding algorithms have their qualitative advantages over each other. So, for offline algorithms, it is required to solve an auxiliary problem with high accuracy. However, this can be done in a distributed manner, and this opens up fundamental possibilities such as, for example, the construction of a dual problem. Despite this, both online and offline algorithms pursue a common goal — solving the stochastic optimization problem with a given accuracy. This is reflected in the comparison of the computational complexity of the described algorithms, which is demonstrated in this paper.

    The comparison of the described methods is carried out for two types of stochastic problems — convex optimization and saddles. For problems of stochastic convex optimization, the existing solutions make it possible to compare online and offline algorithms in some detail. In particular, for strongly convex problems, the computational complexity of the algorithms is the same, and the condition of strong convexity can be weakened to the condition of $\gamma$-growth of the objective function. From this point of view, saddle point problems are much less studied. Nevertheless, existing solutions allow us to outline the main directions of research. Thus, significant progress has been made for bilinear saddle point problems using online algorithms. Offline algorithms are represented by just one study. In this paper, this example demonstrates the similarity of both algorithms with convex optimization. The issue of the accuracy of solving the auxiliary problem for saddles was also worked out. On the other hand, the saddle point problem of stochastic optimization generalizes the convex one, that is, it is its logical continuation. This is manifested in the fact that existing results from convex optimization can be transferred to saddles. In this paper, such a transfer is carried out for the results of the online algorithm in the convex case, when the objective function satisfies the $\gamma$-growth condition.

  4. Kalitin K.Y., Nevzorov A.A., Spasov A.A., Mukha O.Y.
    Deep learning analysis of intracranial EEG for recognizing drug effects and mechanisms of action
    Computer Research and Modeling, 2024, v. 16, no. 3, pp. 755-772

    Predicting novel drug properties is fundamental to polypharmacology, repositioning, and the study of biologically active substances during the preclinical phase. The use of machine learning, including deep learning methods, for the identification of drug – target interactions has gained increasing popularity in recent years.

    The objective of this study was to develop a method for recognizing psychotropic effects and drug mechanisms of action (drug – target interactions) based on an analysis of the bioelectrical activity of the brain using artificial intelligence technologies.

    Intracranial electroencephalographic (EEG) signals from rats were recorded (4 channels at a sampling frequency of 500 Hz) after the administration of psychotropic drugs (gabapentin, diazepam, carbamazepine, pregabalin, eslicarbazepine, phenazepam, arecoline, pentylenetetrazole, picrotoxin, pilocarpine, chloral hydrate). The signals were divided into 2-second epochs, then converted into $2000\times 4$ images and input into an autoencoder. The output of the bottleneck layer was subjected to classification and clustering using t-SNE, and then the distances between resulting clusters were calculated. As an alternative, an approach based on feature extraction with dimensionality reduction using principal component analysis and kernel support vector machine (kSVM) classification was used. Models were validated using 5-fold cross-validation.

    The classification accuracy obtained for 11 drugs during cross-validation was $0.580 \pm 0.021$, which is significantly higher than the accuracy of the random classifier $(0.091 \pm 0.045, p < 0.0001)$ and the kSVM $(0.441 \pm 0.035, p < 0.05)$. t-SNE maps were generated from the bottleneck parameters of intracranial EEG signals. The relative proximity of the signal clusters in the parametric space was assessed.

    The present study introduces an original method for biopotential-mediated prediction of effects and mechanism of action (drug – target interaction). This method employs convolutional neural networks in conjunction with a modified selective parameter reduction algorithm. Post-treatment EEGs were compressed into a unified parameter space. Using a neural network classifier and clustering, we were able to recognize the patterns of neuronal response to the administration of various psychotropic drugs.

  5. Chuvilin K.V.
    An efficient algorithm for ${\mathrm{\LaTeX}}$ documents comparing
    Computer Research and Modeling, 2015, v. 7, no. 2, pp. 329-345

    The problem is constructing the differences that arise on ${\mathrm{\LaTeX}}$ documents editing. Each document is represented as a parse tree whose nodes are called tokens. The smallest possible text representation of the document that does not change the syntax tree is constructed. All of the text is splitted into fragments whose boundaries correspond to tokens. A map of the initial text fragment sequence to the similar sequence of the edited document corresponding to the minimum distance is built with Hirschberg algorithm A map of text characters corresponding to the text fragment sequences map is cunstructed. Tokens, that chars are all deleted, or all inserted, or all not changed, are selected in the parse trees. The map for the trees formed with other tokens is built using Zhang–Shasha algorithm.

    Views (last year): 2. Citations: 2 (RSCI).
  6. Shepelev V.D., Kostyuchenkov N.V., Shepelev S.D., Alieva A.A., Makarova I.V., Buyvol P.A., Parsin G.A.
    The development of an intelligent system for recognizing the volume and weight characteristics of cargo
    Computer Research and Modeling, 2021, v. 13, no. 2, pp. 437-450

    Industrial imaging or “machine vision” is currently a key technology in many industries as it can be used to optimize various processes. The purpose of this work is to create a software and hardware complex for measuring the overall and weight characteristics of cargo based on an intelligent system using neural network identification methods that allow one to overcome the technological limitations of similar complexes implemented on ultrasonic and infrared measuring sensors. The complex to be developed will measure cargo without restrictions on the volume and weight characteristics of cargo to be tariffed and sorted within the framework of the warehouse complexes. The system will include an intelligent computer program that determines the volume and weight characteristics of cargo using the machine vision technology and an experimental sample of the stand for measuring the volume and weight of cargo.

    We analyzed the solutions to similar problems. We noted that the disadvantages of the studied methods are very high requirements for the location of the camera, as well as the need for manual operations when calculating the dimensions, which cannot be automated without significant modifications. In the course of the work, we investigated various methods of object recognition in images to carry out subject filtering by the presence of cargo and measure its overall dimensions. We obtained satisfactory results when using cameras that combine both an optical method of image capture and infrared sensors. As a result of the work, we developed a computer program allowing one to capture a continuous stream from Intel RealSense video cameras with subsequent extraction of a three-dimensional object from the designated area and to calculate the overall dimensions of the object. At this stage, we analyzed computer vision techniques; developed an algorithm to implement the task of automatic measurement of goods using special cameras and the software allowing one to obtain the overall dimensions of objects in automatic mode.

    Upon completion of the work, this development can be used as a ready-made solution for transport companies, logistics centers, warehouses of large industrial and commercial enterprises.

  7. Makarov I.S., Bagantsova E.R., Iashin P.A., Kovaleva M.D., Zakharova E.M.
    Development of and research into a rigid algorithm for analyzing Twitter publications and its influence on the movements of the cryptocurrency market
    Computer Research and Modeling, 2023, v. 15, no. 1, pp. 157-170

    Social media is a crucial indicator of the position of assets in the financial market. The paper describes the rigid solution for the classification problem to determine the influence of social media activity on financial market movements. Reputable crypto traders influencers are selected. Twitter posts packages are used as data. The methods of text, which are characterized by the numerous use of slang words and abbreviations, and preprocessing consist in lemmatization of Stanza and the use of regular expressions. A word is considered as an element of a vector of a data unit in the course of solving the problem of binary classification. The best markup parameters for processing Binance candles are searched for. Methods of feature selection, which is necessary for a precise description of text data and the subsequent process of establishing dependence, are represented by machine learning and statistical analysis. First, the feature selection is used based on the information criterion. This approach is implemented in a random forest model and is relevant for the task of feature selection for splitting nodes in a decision tree. The second one is based on the rigid compilation of a binary vector during a rough check of the presence or absence of a word in the package and counting the sum of the elements of this vector. Then a decision is made depending on the superiority of this sum over the threshold value that is predetermined previously by analyzing the frequency distribution of mentions of the word. The algorithm used to solve the problem was named benchmark and analyzed as a tool. Similar algorithms are often used in automated trading strategies. In the course of the study, observations of the influence of frequently occurring words, which are used as a basis of dimension 2 and 3 in vectorization, are described as well.

  8. Borisova L.R., Kuznetsova A.V., Sergeeva N.V., Sen'ko O.V.
    Comparison of Arctic zone RF companies with different Polar Index ratings by economic criteria with the help of machine learning tools
    Computer Research and Modeling, 2020, v. 12, no. 1, pp. 201-215

    The paper presents a comparative analysis of the enterprises of the Arctic Zone of the Russian Federation (AZ RF) on economic indicators in accordance with the rating of the Polar index. This study includes numerical data of 193 enterprises located in the AZ RF. Machine learning methods are applied, both standard, from open source, and own original methods — the method of Optimally Reliable Partitions (ORP), the method of Statistically Weighted Syndromes (SWS). Held split, indicating the maximum value of the functional quality, this study used the simplest family of different one-dimensional partition with a single boundary point, as well as a collection of different two-dimensional partition with one boundary point on each of the two combining variables. Permutation tests allow not only to evaluate the reliability of the data of the revealed regularities, but also to exclude partitions with excessive complexity from the set of the revealed regularities. Patterns connected the class number and economic indicators are revealed using the SDT method on one-dimensional indicators. The regularities which are revealed within the framework of the simplest one-dimensional model with one boundary point and with significance not worse than p < 0.001 are also presented in the given study. The so-called sliding control method was used for reliable evaluation of such diagnostic ability. As a result of these studies, a set of methods that had sufficient effectiveness was identified. The collective method based on the results of several machine learning methods showed the high importance of economic indicators for the division of enterprises in accordance with the rating of the Polar index. Our study proved and showed that those companies that entered the top Rating of the Polar index are generally recognized by financial indicators among all companies in the Arctic Zone. However it would be useful to supplement the list of indicators with ecological and social criteria.

  9. Gesture recognition is an urgent challenge in developing systems of human-machine interfaces. We analyzed machine learning methods for gesture classification based on electromyographic muscle signals to identify the most effective one. Methods such as the naive Bayesian classifier (NBC), logistic regression, decision tree, random forest, gradient boosting, support vector machine (SVM), $k$-nearest neighbor algorithm, and ensembles (NBC and decision tree, NBC and gradient boosting, gradient boosting and decision tree) were considered. Electromyography (EMG) was chosen as a method of obtaining information about gestures. This solution does not require the location of the hand in the field of view of the camera and can be used to recognize finger movements. To test the effectiveness of the selected methods of gesture recognition, a device was developed for recording the EMG signal, which includes three electrodes and an EMG sensor connected to the microcontroller and the power supply. The following gestures were chosen: clenched fist, “thumb up”, “Victory”, squeezing an index finger and waving a hand from right to left. Accuracy, precision, recall and execution time were used to evaluate the effectiveness of classifiers. These parameters were calculated for three options for the location of EMG electrodes on the forearm. According to the test results, the most effective methods are $k$-nearest neighbors’ algorithm, random forest and the ensemble of NBC and gradient boosting, the average accuracy of ensemble for three electrode positions was 81.55%. The position of the electrodes was also determined at which machine learning methods achieve the maximum accuracy. In this position, one of the differential electrodes is located at the intersection of the flexor digitorum profundus and flexor pollicis longus, the second — above the flexor digitorum superficialis.

  10. Musaev A.A., Grigoriev D.A.
    Extracting knowledge from text messages: overview and state-of-the-art
    Computer Research and Modeling, 2021, v. 13, no. 6, pp. 1291-1315

    In general, solving the information explosion problem can be delegated to systems for automatic processing of digital data. These systems are intended for recognizing, sorting, meaningfully processing and presenting data in formats readable and interpretable by humans. The creation of intelligent knowledge extraction systems that handle unstructured data would be a natural solution in this area. At the same time, the evident progress in these tasks for structured data contrasts with the limited success of unstructured data processing, and, in particular, document processing. Currently, this research area is undergoing active development and investigation. The present paper is a systematic survey on both Russian and international publications that are dedicated to the leading trend in automatic text data processing: Text Mining (TM). We cover the main tasks and notions of TM, as well as its place in the current AI landscape. Furthermore, we analyze the complications that arise during the processing of texts written in natural language (NLP) which are weakly structured and often provide ambiguous linguistic information. We describe the stages of text data preparation, cleaning, and selecting features which, alongside the data obtained via morphological, syntactic, and semantic analysis, constitute the input for the TM process. This process can be represented as mapping a set of text documents to «knowledge». Using the case of stock trading, we demonstrate the formalization of the problem of making a trade decision based on a set of analytical recommendations. Examples of such mappings are methods of Information Retrieval (IR), text summarization, sentiment analysis, document classification and clustering, etc. The common point of all tasks and techniques of TM is the selection of word forms and their derivatives used to recognize content in NL symbol sequences. Considering IR as an example, we examine classic types of search, such as searching for word forms, phrases, patterns and concepts. Additionally, we consider the augmentation of patterns with syntactic and semantic information. Next, we provide a general description of all NLP instruments: morphological, syntactic, semantic and pragmatic analysis. Finally, we end the paper with a comparative analysis of modern TM tools which can be helpful for selecting a suitable TM platform based on the user’s needs and skills.

Pages: « first previous next

Indexed in Scopus

Full-text version of the journal is also available on the web site of the scientific electronic library eLIBRARY.RU

The journal is included in the Russian Science Citation Index

The journal is included in the RSCI

International Interdisciplinary Conference "Mathematics. Computing. Education"