Search results for 'training':
Articles found: 58
  1. Bakhvalov Y.N., Kopylov I.V.
    Training and assessment the generalization ability of interpolation methods
    Computer Research and Modeling, 2015, v. 7, no. 5, pp. 1023-1031

    We investigate machine learning methods with a certain kind of decision rule: the inverse-distance interpolation method, interpolation by radial basis functions, the method of multidimensional interpolation and approximation based on the theory of random functions, and kriging. This paper presents a method for rapidly retraining a “model” when new data are added to the existing data. Here the term “model” means the interpolating or approximating function constructed from the training data. This approach reduces the computational complexity of constructing the updated “model” from $O(n^3)$ to $O(n^2)$. We also investigate the possibility of rapidly assessing the generalization ability of the “model” on the training set using leave-one-out cross-validation, eliminating the major drawback of this approach: the need to build a new “model” for each element removed from the training set.

    Views (last year): 7. Citations: 5 (RSCI).
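
    The leave-one-out shortcut mentioned in this abstract can be illustrated with a well-known identity for kernel interpolation: if $A c = y$, the residual at $x_i$ of the model built without the $i$-th point equals $c_i / (A^{-1})_{ii}$. The sketch below is not necessarily the authors' algorithm; the Gaussian kernel, the parameter `eps`, the small `ridge` term, and all function names are assumptions made for illustration.

```python
import numpy as np

def gaussian_kernel(X, Y, eps=1.0):
    # Pairwise squared Euclidean distances between rows of X and rows of Y.
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-eps * d2)

def loo_errors_fast(X, y, eps=1.0, ridge=1e-8):
    # One factorization of the full system gives all n leave-one-out residuals.
    # The ridge term (an assumption) only keeps the matrix well conditioned.
    A = gaussian_kernel(X, X, eps) + ridge * np.eye(len(X))
    A_inv = np.linalg.inv(A)
    c = A_inv @ y
    return c / np.diag(A_inv)

def loo_errors_naive(X, y, eps=1.0, ridge=1e-8):
    # Reference implementation: rebuild the model n times, once per removed point.
    n = len(X)
    errs = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i
        A = gaussian_kernel(X[keep], X[keep], eps) + ridge * np.eye(n - 1)
        c = np.linalg.solve(A, y[keep])
        pred = gaussian_kernel(X[i:i + 1], X[keep], eps) @ c
        errs[i] = y[i] - pred[0]
    return errs
```

    The naive routine solves $n$ systems, while the fast one needs a single inverse; the two should agree up to rounding.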
  2. This article explores a machine learning method based on the theory of random functions. One of the main problems of this method is that the decision rule of the model becomes more complicated as the number of training examples increases. The decision rule of the model is the most probable realization of a random function, and it is represented as a polynomial whose number of terms equals the number of training examples. In this article we show a quick way to reduce the number of training examples and, accordingly, the complexity of the decision rule. The number of training examples is reduced by finding and removing weak elements, which have little effect on the final form of the decision function, and noisy elements of the sample. For each sample element $(x_i,y_i)$ we introduce the concept of value, expressed as the deviation of the decision function of the model built without the $i$-th element, evaluated at the point $x_i$, from the true value $y_i$. We also show that weak elements can be used indirectly in training the model without increasing the number of terms in the decision function. In the experimental part of the article, we show how the changed amount of data affects the generalization ability of the method in a classification task.

    Views (last year): 5.
  3. Zatserkovnyy A.V., Nurminski E.A.
    Neural network analysis of transportation flows of urban aglomeration using the data from public video cameras
    Computer Research and Modeling, 2021, v. 13, no. 2, pp. 305-318

    Correct modeling of the complex dynamics of urban transportation flows requires collecting large volumes of empirical data to specify the types of flow modes and to identify them. At the same time, setting up a large number of observation posts is expensive and not always technically feasible. As a result, traffic control systems and urban planners lack sufficient factual support, with obvious consequences for the quality of their decisions. As one means of large-scale data collection, at least for qualitative situation analysis, wide-area video cameras are used in various situation centers, where their footage is analyzed by human operators responsible for observation and control. Some video cameras make their footage publicly accessible, which makes them a valuable resource for transportation studies. However, there are significant problems with obtaining high-quality data from such cameras, which relate to the theory and practice of image processing. This study is devoted to the practical application of certain mainstream neural network technologies to estimating the essential characteristics of actual transportation flows. The problems arising in processing these data are analyzed, and solutions are suggested. Convolutional neural networks are used for tracking, and methods for obtaining basic parameters of transportation flows from these observations are studied. Simplified neural networks are used to prepare training sets for the deep learning neural network YOLOv4, which is later used to estimate the speed and density of automobile flows.

  4. We consider a model of spontaneous formation of a computational structure in the human brain for solving a given class of tasks in the process of performing a series of similar tasks. The model is based on a special definition of a numerical measure of the complexity of the solution algorithm. This measure has an informational property: the complexity of a computational structure consisting of two independent structures is equal to the sum of the complexities of these structures. The probability of spontaneous occurrence of a structure then depends exponentially on its complexity. The exponential coefficient must be determined experimentally for each type of problem. It may depend on the form in which the source data are presented and on the procedure for reporting the result. This estimation method was applied to the results of a series of experiments that determined the strategy for solving a series of similar problems with a growing amount of initial data. These experiments were described in previously published papers. Two main strategies were considered: sequential execution of the computational algorithm, or the use of parallel computing in those tasks where it is effective. These strategies differ in how calculations are performed. Using an estimate of the complexity of the schemes, the empirical probability of one strategy can be used to calculate the probability of the other. The calculations showed a good match between the calculated and empirical probabilities. This confirms the hypothesis about the spontaneous formation of structures that solve the problem during the initial training of a person. The paper contains a brief description of the experiments, detailed computational schemes, a strict definition of the complexity measure of computational structures, and a derivation of the dependence of the probability of structure formation on its complexity.

  5. Vostrikov D.D., Konin G.O., Lobanov A.V., Matyukhin V.V.
    Influence of the mantissa finiteness on the accuracy of gradient-free optimization methods
    Computer Research and Modeling, 2023, v. 15, no. 2, pp. 259-280

    Gradient-free optimization methods, or zeroth-order methods, are widely used in training neural networks, in reinforcement learning, and in industrial tasks where only the values of a function at a point are available (working with non-analytical functions). In particular, the method of error back propagation in PyTorch works exactly on this principle. It is well known that computer calculations rely on floating-point numbers, and because of this the problem of the finiteness of the mantissa arises.

    In this paper, firstly, we reviewed the most popular methods of gradient approximation: Finite forward/central difference (FFD/FCD), Forward/Central wise component (FWC/CWC), Forward/Central randomization on $l_2$ sphere (FSSG2/CFFG2); secondly, we described current theoretical representations of the noise introduced by the inaccuracy of calculating the function at a point: adversarial noise and random noise; thirdly, we conducted a series of experiments on frequently encountered classes of problems, such as the quadratic problem, logistic regression, and SVM, to try to determine whether the real nature of machine noise corresponds to the existing theory. In practice (at least for the classes of problems considered in this paper), machine noise turned out to be something between adversarial and random noise, and therefore the current theory of how the finiteness of the mantissa affects the search for the optimum in gradient-free optimization problems requires some adjustment.
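
    As a rough illustration of the first two approximation schemes named above (finite forward and central differences), the sketch below estimates a gradient coordinate by coordinate and lets one see how rounding in double precision starts to dominate as the step shrinks. It is not the authors' experimental code; the function names and the step sizes used in the usage note are assumptions.

```python
import numpy as np

def forward_diff_grad(f, x, h):
    # Forward difference: (f(x + h e_i) - f(x)) / h for each coordinate i.
    g = np.empty_like(x, dtype=float)
    fx = f(x)
    for i in range(x.size):
        e = np.zeros_like(x, dtype=float)
        e[i] = h
        g[i] = (f(x + e) - fx) / h
    return g

def central_diff_grad(f, x, h):
    # Central difference: (f(x + h e_i) - f(x - h e_i)) / (2 h) for each coordinate i.
    g = np.empty_like(x, dtype=float)
    for i in range(x.size):
        e = np.zeros_like(x, dtype=float)
        e[i] = h
        g[i] = (f(x + e) - f(x - e)) / (2.0 * h)
    return g
```

    For a quadratic objective the central difference has no truncation error, so comparing its result at, say, `h = 1e-4` and `h = 1e-12` against the exact gradient isolates the contribution of the finite mantissa.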

  6. Nebaba S.G., Markov N.G.
    Convolutional neural networks of YOLO family for mobile computer vision systems
    Computer Research and Modeling, 2024, v. 16, no. 3, pp. 615-631

    The work analyzes known classes of convolutional neural network models and studies promising models selected from them for detecting flying objects in images. Object detection here refers to the detection, localization in space, and classification of flying objects. The work conducts a comprehensive study of the selected models in order to identify the most effective ones for creating mobile real-time computer vision systems. It is shown that, given the formulated requirements for mobile real-time computer vision systems, the most suitable models for detecting flying objects in images are those of the YOLO family, and five models from this family should be considered: YOLOv4, YOLOv4-Tiny, YOLOv4-CSP, YOLOv7 and YOLOv7-Tiny. An appropriate dataset has been developed for training, validation and comprehensive study of these models. Each labeled image of the dataset includes one or more flying objects of four classes: “bird”, “aircraft-type unmanned aerial vehicle”, “helicopter-type unmanned aerial vehicle”, and “unknown object” (objects in airspace not included in the first three classes). The research has shown that all the convolutional neural network models exceed the specified threshold for object detection speed; however, only the YOLOv4-CSP and YOLOv7 models partially satisfy the accuracy requirements for detecting flying objects. The most difficult object class to detect is the “bird” class. The most effective model is YOLOv7, with YOLOv4-CSP in second place. Both models are recommended for use in a mobile real-time computer vision system, provided that they are additionally trained on a larger number of images with objects of the “bird” class so that they satisfy the accuracy requirement for each of the four classes.

  7. Tarasov A.E., Serdobintsev E.V.
    Simulation of rail vehicles ride in Simpack Rail on the curved track
    Computer Research and Modeling, 2019, v. 11, no. 2, pp. 249-263

    The paper studies the determination of one of the dynamic quality parameters (PDK) of railway vehicles, the car body lateral acceleration, using Simpack Rail, a computer simulation system for railway vehicle dynamics. This provides a complex simulation environment with variable velocity depending on the train schedule. The rail vehicle model of a typical 1520 mm gauge freight locomotive section used for the simulation has been verified by the department “Electric multiple unit cars and locomotives” of the Russian University of Transport (RUT (MIIT)). Because of this homologation, questions of model creation and verification in the preprocessor are excluded from this paper. The paper gives a detailed description of cartographic track modeling in the situation plane, the heights plane and the superelevation plane, based on real operating data. The statistical parameters (moments) of the rail-related track excitation and the cartographic track data of the specified track section used in this simulation are given as numerical and graphical results of reading the prepared data files. The car body residual lateral acceleration is measured taking into account the component of the Earth's gravitational acceleration, as an accelerometer would measure it in the real world. Finally, the desired quality parameter determined by simulation is compared with the same parameter obtained from a test drive. The calculation method in both cases is based on the mean value of the absolute maxima picked up within the nonstationary realizations of this parameter. The compared results confirm that this quality factor depends primarily on the velocity and the track geometry. The track simulation in this application uses original track data that closely conform to the test ride track section. The accepted simplifications in the rail vehicle model of the freight electric locomotive section (body properties related to the center of gravity, small displacements between the bodies), with the geometric and force law characteristics of the force elements and constraints kept constant, allow Simpack Rail to simulate the system behavior (reactions) with the necessary validity.

    Views (last year): 20.
  8. Emaletdinova L.Y., Mukhametzyanov Z.I., Kataseva D.V., Kabirova A.N.
    A method of constructing a predictive neural network model of a time series
    Computer Research and Modeling, 2020, v. 12, no. 4, pp. 737-756

    This article studies a method of constructing a predictive neural network model of a time series based on determining the composition of input variables, constructing a training sample, and training itself using the back propagation method. Traditional methods of constructing predictive models of a time series (the autoregressive model, the moving average model, and the autoregressive moving average model) approximate the time series by a linear dependence of the current value of the output variable on a number of its previous values. Such a limitation as linearity of dependence leads to significant errors in forecasting.

    Data mining technologies based on neural network modeling make it possible to approximate the time series by a nonlinear dependence. Moreover, the process of constructing a neural network model (determining the composition of input variables, the number of layers and the number of neurons in the layers, choosing the activation functions of neurons, determining the optimal values of the neuron link weights) allows us to obtain a predictive model in the form of an analytical nonlinear dependence.

    Determining the composition of input variables is one of the key points in constructing neural network models in various application areas, and it affects their adequacy. The composition of the input variables is traditionally selected from physical considerations or by the selection method. In this work it is proposed to use the behavior of the autocorrelation and partial autocorrelation functions to determine the composition of the input variables of the predictive neural network model of the time series.

    This work proposes a method for determining the composition of input variables of neural network models for stationary and non-stationary time series, based on the construction and analysis of autocorrelation functions. Based on the proposed method, an algorithm and a program were developed in the Python programming environment that determine the composition of the input variables of the predictive neural network model (a perceptron) and build the model itself. The proposed method was experimentally tested by constructing a predictive neural network model of a time series that reflects energy consumption in different regions of the United States, openly published by PJM Interconnection LLC (PJM), a regional network organization in the United States. This time series is non-stationary and is characterized by the presence of both a trend and seasonality. Prediction of the next values of the time series based on previous values and the constructed neural network model showed high approximation accuracy, which proves the effectiveness of the proposed method.
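
    The idea of choosing input lags from the autocorrelation function can be sketched as follows. This is only an illustration under stated assumptions, not the authors' program: it uses the sample autocorrelation alone (no partial autocorrelation), assumes the series has already been made stationary, and keeps lags whose autocorrelation exceeds the common approximate 95% bound $1.96/\sqrt{n}$. The function names are hypothetical.

```python
import numpy as np

def autocorrelation(x, max_lag):
    # Sample autocorrelation for lags 0..max_lag (biased estimator, normalized by lag 0).
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    denom = np.dot(x, x)
    n = len(x)
    return np.array([np.dot(x[:n - k], x[k:]) / denom for k in range(max_lag + 1)])

def select_lags(x, max_lag):
    # Keep lags whose autocorrelation lies outside the approximate 95% white-noise band.
    acf = autocorrelation(x, max_lag)
    threshold = 1.96 / np.sqrt(len(x))
    return [k for k in range(1, max_lag + 1) if abs(acf[k]) > threshold]
```

    The selected lags would then define which previous values of the series are fed to the network as inputs; for a series with a trend, differencing before calling `select_lags` is one common preprocessing choice.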

  9. Kutalev A.A., Lapina A.A.
    Modern ways to overcome neural networks catastrophic forgetting and empirical investigations on their structural issues
    Computer Research and Modeling, 2023, v. 15, no. 1, pp. 45-56

    This paper presents the results of experimental validation of some structural issues concerning the practical use of methods to overcome catastrophic forgetting of neural networks. A comparison of current effective methods like EWC (Elastic Weight Consolidation) and WVA (Weight Velocity Attenuation) is made and their advantages and disadvantages are considered. It is shown that EWC is better for tasks where full retention of learned skills is required on all the tasks in the training queue, while WVA is more suitable for sequential tasks with very limited computational resources, or when reuse of representations and acceleration of learning from task to task is required rather than exact retention of the skills. The attenuation of the WVA method must be applied to the optimization step, i.e. to the increments of neural network weights, rather than to the loss function gradient itself, and this is true for any gradient optimization method except the simplest stochastic gradient descent (SGD). The choice of the optimal weight attenuation function between the hyperbolic function and the exponential function is considered. It is shown that hyperbolic attenuation is preferable because, despite comparable quality at optimal values of the hyperparameter of the WVA method, it is more robust to hyperparameter deviations from the optimal value (this hyperparameter in the WVA method provides a balance between preservation of old skills and learning a new skill). Empirical observations are presented that support the hypothesis that the optimal value of this hyperparameter does not depend on the number of tasks in the sequential learning queue. Consequently, this hyperparameter can be tuned on a small number of tasks and then used on longer sequences.

  10. Shabanov A.E., Petrov M.N., Chikitkin A.V.
    A multilayer neural network for determination of particle size distribution in Dynamic Light Scattering problem
    Computer Research and Modeling, 2019, v. 11, no. 2, pp. 265-273

    Solving the Dynamic Light Scattering problem makes it possible to determine the particle size distribution (PSD) from the spectrum of the intensity of scattered light. The experiment yields an intensity curve. The experimentally obtained intensity spectrum is compared with the theoretically expected spectrum, which is the Lorentzian line. The main task is to determine, on the basis of these data, the relative concentrations of particles of each class present in the solution. The article presents a method for constructing and using a neural network trained on synthetic data to determine the PSD in a solution in the range of 1–500 nm. The neural network has a fully connected layer of 60 neurons with the RELU activation function at the output, a layer of 45 neurons with the same activation function, a dropout layer, and 2 layers with 15 and 1 neurons (network output). The article describes how the network was trained and tested on synthetic and experimental data. On the synthetic data, the standard deviation metric (RMSE) gave a value of 1.3157 nm. Experimental data were obtained for particle sizes of 200 nm, 400 nm and a solution with representatives of both sizes. The results of the neural network and the classical linear methods are compared. The disadvantage of the classical methods is that it is difficult to determine the degree of regularization: too much regularization oversmooths the particle size distribution curves, while weak regularization gives oscillating curves and low reliability of the results. The paper shows that the neural network gives a good prediction for particles of large size. For small sizes the prediction is worse, but the error quickly decreases as the particle size increases.

    Views (last year): 16.
