Search results for 'feature selection':
Articles found: 15
  1. Microarray datasets are high-dimensional, with a small number of collected samples compared to thousands of features. This poses a significant challenge that affects the interpretation, applicability and validation of the analytical results. Matrix factorizations have proven to be a useful method for describing data in terms of a small number of meta-features, which reduces noise while still capturing the essential features of the data. Three novel and mutually relevant methods are presented in this paper: 1) gradient-based matrix factorization with two adaptive learning rates (in accordance with the number of factor matrices) and their automatic updates; 2) a nonparametric criterion for selecting the number of factors; and 3) a nonnegative version of the gradient-based matrix factorization which, unlike existing methods, requires no extra computational cost. We demonstrate the effectiveness of the proposed methods on the supervised classification of gene expression data.

    Citations: 4 (RSCI).
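
As a rough illustration of the first and third methods named in the abstract above, the following minimal sketch factorizes a matrix X ≈ AB by gradient descent with a separate learning rate for each factor matrix and an optional nonnegativity projection. The abstract does not give the authors' update rules, so the rate adaptation used here (grow the rate after an accepted step, halve it after a rejected one) and the clipping to zero are assumptions made for illustration only, and all function names are hypothetical.

```python
import random

def matmul(a, b):
    # Plain-Python matrix product of two lists of rows.
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(m):
    return [list(col) for col in zip(*m)]

def residual(x, a, b):
    # Element-wise difference AB - X.
    return [[p - v for v, p in zip(xr, pr)] for xr, pr in zip(x, matmul(a, b))]

def loss(x, a, b):
    # Squared Frobenius norm of the residual.
    return sum(e * e for row in residual(x, a, b) for e in row)

def step(m, grad, lr, nonneg):
    new = [[v - lr * g for v, g in zip(mr, gr)] for mr, gr in zip(m, grad)]
    if nonneg:
        # Assumed nonnegative variant: project onto the nonnegative orthant.
        new = [[max(0.0, v) for v in row] for row in new]
    return new

def factorize(x, k, iters=200, nonneg=True, seed=0):
    rng = random.Random(seed)
    m, n = len(x), len(x[0])
    a = [[rng.random() for _ in range(k)] for _ in range(m)]
    b = [[rng.random() for _ in range(n)] for _ in range(k)]
    lr_a = lr_b = 0.01          # one learning rate per factor matrix
    cur = loss(x, a, b)
    history = [cur]
    for _ in range(iters):
        # Update A: gradient of the loss w.r.t. A is 2 (AB - X) B^T.
        grad_a = [[2 * g for g in row] for row in matmul(residual(x, a, b), transpose(b))]
        cand = step(a, grad_a, lr_a, nonneg)
        new = loss(x, cand, b)
        if new < cur:
            a, cur, lr_a = cand, new, lr_a * 1.1   # accept, grow the rate
        else:
            lr_a *= 0.5                            # reject, shrink the rate
        # Update B: gradient of the loss w.r.t. B is 2 A^T (AB - X).
        grad_b = [[2 * g for g in row] for row in matmul(transpose(a), residual(x, a, b))]
        cand = step(b, grad_b, lr_b, nonneg)
        new = loss(x, a, cand)
        if new < cur:
            b, cur, lr_b = cand, new, lr_b * 1.1
        else:
            lr_b *= 0.5
        history.append(cur)
    return a, b, history
```

Because a step is kept only when it lowers the loss, the recorded loss never increases; this is a convenience of the sketch, not a property claimed by the paper.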
  2. Production efficiency depends directly on the quality of technology management, which in turn relies on accurate and efficient processing of control and measurement information. Developing mathematical methods for studying the system relationships and regularities of functioning, building mathematical models that take into account the structural features of the object under study, and writing software that implements these methods are therefore topical tasks. Practice shows that the list of parameters involved in studying a complex object of modern production ranges from a few dozen to several hundred items, and the degree of influence of each factor is not clear at the outset. Determining the model directly under these circumstances is impossible: the amount of required information may be too great, and most of the effort spent collecting it would be wasted, because the influence of most factors in the original list on the optimization would be negligible. A necessary step in determining a model of a complex object is therefore to reduce the dimension of the factor space. Most industrial plants are hierarchical group processes of mass, high-volume production characterized by hundreds of factors. (Data from the Moldavian steel works were taken as the basis for implementing the mathematical methods and testing the constructed models.) To investigate the system relationships and regularities of functioning of such complex objects, several informative parameters are usually chosen and sampled. This article describes the sequence of steps that bring the initial indices of the steel-smelting process to a form suitable for building a predictive mathematical model.
The implementation of new types also laid a basis for developing an automated production quality management system. The following stages are distinguished: collection and analysis of the initial data, construction of the table of correlated parameters, and reduction of the factor space by means of correlation pleiades and the method of weight factors. The results obtained make it possible to optimize the construction of a model of a multifactor process.

    Views (last year): 6. Citations: 1 (RSCI).
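
The reduction of the factor space by grouping correlated parameters, mentioned in the abstract above, can be sketched as follows. This is a simplified stand-in and not the authors' procedure: parameters are grouped whenever the absolute Pearson correlation between two of them reaches a chosen threshold (connected components of the thresholded correlation graph), and the first member of each group is kept as its representative. The function names, the threshold value, and the choice of representative are assumptions.

```python
from math import sqrt

def pearson(x, y):
    # Pearson correlation of two equally long samples.
    # Assumes neither sample is constant (otherwise the denominator is zero).
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def correlation_groups(columns, threshold):
    # Union-find over parameter indices: two parameters are merged when
    # |r| >= threshold, so each resulting group is one cluster of
    # mutually linked parameters.
    n = len(columns)
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if abs(pearson(columns[i], columns[j])) >= threshold:
                parent[find(j)] = find(i)

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())

def representatives(columns, threshold):
    # Keep one parameter (the lowest index) from every group.
    return [group[0] for group in correlation_groups(columns, threshold)]
```

For example, with three measured parameters of which the second is a linear function of the first, the first two fall into one group and only two parameters remain after reduction.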
  3. Koldoba A.V., Skalko Y.I.
    Numerical simulation of inverse mode propagation in-situ combustion direct-flow waves
    Computer Research and Modeling, 2020, v. 12, no. 5, pp. 993-1006

    One of the promising technologies for enhanced oil recovery in the development of unconventional oil reservoirs is the thermo-gas method. The method is based on injecting an oxygen-containing mixture into the formation and transforming it, through spontaneous in-situ oxidative processes, into a highly efficient displacing agent miscible with the formation oil. In some cases, this method has great potential compared to other methods of enhanced oil recovery. This paper discusses some issues of the propagation of in-situ combustion waves. Depending on the parameters of the reservoir and the injected mixture, such waves can propagate in different modes. In this paper, only the direct-flow inverse propagation mode is considered. In this mode, the combustion wave propagates in the direction of the oxidant flow and the reaction front lags behind the heat wave, in which the substance (hydrocarbon fractions, porous skeleton, etc.) is heated to temperatures sufficient for the oxidation reaction to occur. The paper presents the results of an analytical study and numerical simulation of the structure of the inverse wave of in-situ combustion in two-phase flow in a porous layer. Some simplifying assumptions about the thermal properties of the fluid phases were adopted, which, on the one hand, make the in-situ combustion model tractable for analysis and, on the other, preserve the main features of the process. A solution of the “running wave” type is considered and the conditions for its existence are specified. Two modes of in-situ combustion waves in the reaction-trailing regime are identified: hydrodynamic and kinetic. Numerical simulation of the in-situ combustion wave propagation was carried out using a thermohydrodynamical simulator developed for the numerical integration of non-isothermal multicomponent filtration flows accompanied by phase transitions and chemical reactions.

  4. Petrov M.N., Zimina S.V., Dyachenko D.L., Dubodelov A.V., Simakov S.S.
    Dual-pass Feature-Fused SSD model for detecting multi-scale images of workers on the construction site
    Computer Research and Modeling, 2023, v. 15, no. 1, pp. 57-73

    When recognizing workers in construction-site images obtained from surveillance cameras, the objects of detection typically differ greatly in spatial scale relative to each other and to other objects. The accuracy of detecting small objects can be increased by using the Feature-Fused modification of the SSD detector. Combined with overlapping image slicing at inference, this model copes well with the detection of small objects. However, the practical use of this approach requires manual adjustment of the slicing parameters. This reduces the detection accuracy on scenes that differ from those used in training, as well as the detection accuracy for large objects. In this paper, we propose an algorithm for automatic selection of image slicing parameters depending on the ratio of the characteristic geometric dimensions of objects in the image. We have developed a two-pass version of the Feature-Fused SSD detector for automatic determination of optimal image slicing parameters. On the first pass, a fast truncated version of the detector is used, which makes it possible to determine the characteristic sizes of the objects of interest. On the second pass, the final detection is performed with the slicing parameters selected after the first pass. A dataset was collected with images of workers on a construction site. The dataset includes large, small and diverse images of workers. To compare the detection results of a one-pass algorithm without splitting the input image, a one-pass algorithm with uniform splitting, and a two-pass algorithm with selection of the optimal splitting, we considered tests on detecting large objects separately, very small objects, and scenes with a high density of objects both in the foreground and in the background or only in the background.
In the range of cases we have considered, our approach outperforms the compared approaches, handles the problem of double detections well, and achieves a quality of 0.82–0.91 according to the mAP (mean Average Precision) metric.
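
The two-pass idea described in the abstract above (estimate characteristic object sizes first, then slice the image with overlap accordingly) might look roughly like the sketch below. The abstract does not state the actual selection rule, so the rule used here (tile side equals the median object size times a fixed factor) and the names `choose_tile_size`, `axis_starts`, and `slice_image` are hypothetical and serve only to illustrate the mechanism.

```python
from statistics import median

def choose_tile_size(object_sizes, objects_per_tile_side, max_side):
    # Assumed rule: a tile side spans a fixed number of "typical" objects,
    # where typical means the median size found by the fast first pass.
    tile = int(median(object_sizes) * objects_per_tile_side)
    return max(1, min(tile, max_side))

def axis_starts(length, tile, overlap):
    # Start offsets of tiles along one axis, with fractional overlap.
    stride = max(1, int(tile * (1 - overlap)))
    starts = list(range(0, max(length - tile, 0) + 1, stride))
    if starts[-1] + tile < length:
        # Add a final tile flush with the image border so nothing is missed.
        starts.append(length - tile)
    return starts

def slice_image(width, height, tile, overlap=0.2):
    # Return tile boxes as (x_min, y_min, x_max, y_max), clipped to the image.
    return [(x, y, min(x + tile, width), min(y + tile, height))
            for y in axis_starts(height, tile, overlap)
            for x in axis_starts(width, tile, overlap)]
```

In use, the first-pass detections would supply `object_sizes`, and the boxes returned by `slice_image` would be the crops fed to the full detector on the second pass.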

  5. Zhdanova O.L., Zhdanov V.S., Neverova G.P.
    Modeling the dynamics of plankton community considering phytoplankton toxicity
    Computer Research and Modeling, 2022, v. 14, no. 6, pp. 1301-1323

    We propose a three-component discrete-time model of the phytoplankton-zooplankton community, in which toxic and non-toxic species of phytoplankton compete for resources. The Holling type II functional response allows us to describe the interaction between zooplankton and phytoplankton. With the Ricker competition model, we describe the restriction of phytoplankton biomass growth by the availability of external resources (mineral nutrition, oxygen, light, etc.). Many phytoplankton species, including diatom algae, are known not to release toxins unless they are damaged. Zooplankton pressure on phytoplankton decreases in the presence of toxic substances. For example, copepods are selective in their food choices and avoid consuming toxin-producing phytoplankton. Therefore, in our model, zooplankton (the predator) consumes only non-toxic phytoplankton species as prey, while toxic phytoplankton species only compete with non-toxic ones for resources.

    We study the proposed model analytically and numerically. Dynamic mode maps allow us to investigate stability domains of fixed points, bifurcations, and the evolution of the community. Stability loss of fixed points is shown to occur only through a cascade of period-doubling bifurcations. The Neimark – Sacker scenario leading to the appearance of quasiperiodic oscillations is found to occur as well. Changes in intrapopulation parameters of phytoplankton or zooplankton can lead to abrupt transitions from regular to quasiperiodic dynamics (according to the Neimark – Sacker scenario) and further to cycles with a short period or even stationary dynamics. In the multistability areas, varying the initial condition while keeping all model parameters unchanged can shift the current dynamic mode and/or the community composition.

    The proposed discrete-time model of the community is quite simple and reveals dynamics of interacting species that coincide with features of experimental dynamics. In particular, the system behaves like predator-prey models without evolution: the predator fluctuations lag behind those of the prey by about a quarter of the period. Accounting for the genetic heterogeneity of phytoplankton, in the simplest case of two genetically different forms (toxic and non-toxic), allows the model to demonstrate both long-period antiphase oscillations of predator and prey and cryptic cycles. During a cryptic cycle, the prey density remains almost constant while the predator density fluctuates, which corresponds to the influence of rapid evolution masking the trophic interaction.

  6. Tumanyan A.G., Bartsev S.I.
    Model of formation of primary behavioral patterns with adaptive behavior based on the combination of random search and experience
    Computer Research and Modeling, 2016, v. 8, no. 6, pp. 941-950

    In this paper, we propose an adaptive algorithm that simulates the formation of initial behavioral skills, using the ‘eye-arm’ system of an animat as an example. Such formation of initial behavioral skills occurs, for example, when a child learns to control their hands by grasping the relationship between initially unidentified spots on the retina and the position of a real object. Since body control skills are not ‘hardcoded’ in the brain and spinal cord at the level of instincts, a human child, like the young of most other mammals, has to develop these skills in a search behavior mode. Exploratory behavior begins with trial and error, and its contribution is then gradually reduced as the organism masters its body and its environment. Since no correct behavior patterns exist at this stage of the organism's development, the only way to select the right skills is positive reinforcement upon achieving the goal. A key feature of the proposed algorithm is that, in imprinting mode, it fixes only the final action that led to success or, which is very important, that led to an already imprinted situation known to lead to success. Over time, the continuous chain of correct actions lengthens: previous positive experience is used to the maximum, while negative experience is ‘forgotten’ and not used.

    Thus, random search is gradually replaced by purposeful actions, as is observed in real young animals. The algorithm is thereby able to establish a correspondence between the laws of the world and the ‘inner feelings’, the internal state of the animat. The proposed animat model uses two types of neural networks: 1) the neural network NET1, whose input is the current position of the arm's ‘hand’ and the target point, and whose output is the motor commands directing the ‘hand’ of the animat's manipulator to the target point; 2) the neural network NET2, which receives the target coordinates and the current coordinates of the ‘hand’ as input and outputs the likelihood that the animat already ‘knows’ this situation and ‘knows’ how to react to it. With this architecture, the animat relies on the ‘experience’ of the neural networks in situations where the NET2 response is close to 1 and, conversely, runs a random search when it has no experience of functioning in that area of the visual field (NET2 response close to 0).

    Views (last year): 6. Citations: 2 (RSCI).
  7. Malygina N.V., Surkov P.G.
    On the modeling of water obstacles overcoming by Rangifer tarandus L
    Computer Research and Modeling, 2019, v. 11, no. 5, pp. 895-910

    Seasonal migrations and herd instinct are traditionally recognized as species-specific behavioral traits of wild reindeer (Rangifer tarandus L.). These animals are forced to overcome water obstacles during their migrations. These behavioral peculiarities are regarded as the result of a selection process that has chosen, among the sets of strategies, the only evolutionarily stable one, which determines the reproduction and biological survival of wild reindeer as a species. Natural processes in the Taimyr wild reindeer population are currently occurring against the background of a growing influence of negative factors caused by the escalating industrial development of the Arctic. Hence the need arose to identify the ethological features of these animals fully. This paper presents the results of applying the classical methods of optimal control theory and differential games to the study of wild reindeer migration patterns in overcoming water barriers, including major rivers. Based on the ethological features and behavior forms of these animals, the herd is represented as a controlled dynamic system comprising two classes of individuals, the leader and the rest of the herd, for each of which models describing their movement trajectories are constructed. The models are based on hypotheses that are mathematical formalizations of some animal behavior patterns. This approach made it possible to find the trajectory of the leader using the methods of optimal control theory and, in constructing the trajectories of the other individuals, to apply the principle of control with a guide. The results obtained, which can be used in forming a common “platform” for the systematic construction of adaptive behavior models and as groundwork for the fundamental development of cognitive evolution models, are tested numerically on a model example with observational data from the Werchnyaya Taimyra River.

  8. Shokirov F.S.
    Interaction of a breather with a domain wall in a two-dimensional O(3) nonlinear sigma model
    Computer Research and Modeling, 2017, v. 9, no. 5, pp. 773-787

    The interaction of an oscillating soliton (breather) with a 180-degree Neel domain wall is studied by numerical simulation in the framework of a (2 + 1)-dimensional supersymmetric O(3) nonlinear sigma model. The purpose of this paper is to investigate the nonlinear evolution and stability of a system of interacting localized dynamic and topological solutions. The interaction models are constructed from stationary breather and domain wall solutions obtained in the framework of the two-dimensional sine-Gordon equation by adding specially selected perturbations to the A3-field vector in the isotopic space of the Bloch sphere. In the absence of an external magnetic field, nonlinear sigma models have formal Lorentz invariance, which allows, in particular, constructing moving solutions and analyzing the experimental data on the nonlinear dynamics of a system of interacting solitons. In this paper, based on the obtained moving localized solutions, models of incident and head-on collisions of breathers with a domain wall are constructed; depending on the dynamic parameters of the system, collisions and reflections of solitons from each other, long-range interactions, and the decay of an oscillating soliton into linear perturbation waves are observed. In contrast to the breather solution, which has the dynamics of an internal degree of freedom, the energy integral of the topologically stable soliton is preserved with high accuracy in all experiments. For each type of interaction, the range of velocities of the colliding dynamic and topological solitons is determined as a function of the rotation frequency of the A3-field vector in the isotopic space.
Numerical models are constructed on the basis of methods of the theory of finite difference schemes, using the properties of stereographic projection and taking into account the group-theoretical features of the O(N) class of nonlinear sigma models of field theory. On the perimeter of the two-dimensional modeling area, specially developed boundary conditions are imposed that absorb linear perturbation waves radiated by the interacting soliton fields. Thus, the interaction of localized solutions in an infinite two-dimensional phase space is simulated. A software module has been developed that allows a comprehensive analysis of the evolution of interacting solutions of nonlinear sigma models of field theory, taking into account their group properties in a two-dimensional pseudo-Euclidean space. The isospin dynamics, as well as the energy density and energy integral of a system of interacting dynamic and topological solitons, are analyzed.

    Views (last year): 6.
  9. Makarov I.S., Bagantsova E.R., Iashin P.A., Kovaleva M.D., Zakharova E.M.
    Development of and research into a rigid algorithm for analyzing Twitter publications and its influence on the movements of the cryptocurrency market
    Computer Research and Modeling, 2023, v. 15, no. 1, pp. 157-170

    Social media is a crucial indicator of the position of assets in the financial market. The paper describes a rigid solution to the classification problem of determining the influence of social media activity on financial market movements. Reputable crypto-trader influencers are selected. Packages of Twitter posts are used as data. The texts are characterized by heavy use of slang words and abbreviations; preprocessing consists of lemmatization with Stanza and the use of regular expressions. A word is considered as an element of the vector of a data unit in the course of solving the binary classification problem. The best markup parameters for processing Binance candles are searched for. Methods of feature selection, which is necessary for a precise description of text data and the subsequent process of establishing dependence, are represented by machine learning and statistical analysis. The first method is feature selection based on the information criterion. This approach is implemented in a random forest model and is relevant for the task of selecting features for splitting nodes in a decision tree. The second method is based on the rigid compilation of a binary vector through a rough check of the presence or absence of each word in the package, followed by summing the elements of this vector. A decision is then made depending on whether this sum exceeds a threshold value determined beforehand by analyzing the frequency distribution of mentions of the word. The algorithm used to solve the problem was named benchmark and analyzed as a tool. Similar algorithms are often used in automated trading strategies. In the course of the study, observations of the influence of frequently occurring words, which are used as a basis of dimension 2 and 3 in vectorization, are described as well.
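
The second, rule-based method described in the abstract above (a binary presence vector over chosen words, summed and compared with a threshold) can be illustrated with a short sketch. The keyword list, the tokenization by a simple regular expression, and the function names are assumptions; the abstract's lemmatization step and its procedure for choosing the threshold are not reproduced here.

```python
import re

def presence_vector(posts, keywords):
    # 1 if the keyword occurs in at least one post of the package, else 0.
    words = set()
    for post in posts:
        words.update(re.findall(r"[a-z0-9']+", post.lower()))
    return [1 if keyword in words else 0 for keyword in keywords]

def classify_package(posts, keywords, threshold):
    # Decision rule: the package is labeled positive when the number of
    # keywords present exceeds the predetermined threshold.
    return sum(presence_vector(posts, keywords)) > threshold
```

A package such as `["BTC to the moon!", "Time to buy the dip"]` with the hypothetical keywords `["moon", "buy", "sell"]` gives the vector `[1, 1, 0]`, so it is labeled positive for a threshold of 1 and negative for a threshold of 2.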

  10. Musaev A.A., Grigoriev D.A.
    Extracting knowledge from text messages: overview and state-of-the-art
    Computer Research and Modeling, 2021, v. 13, no. 6, pp. 1291-1315

    In general, solving the information explosion problem can be delegated to systems for automatic processing of digital data. These systems are intended for recognizing, sorting, meaningfully processing and presenting data in formats readable and interpretable by humans. The creation of intelligent knowledge extraction systems that handle unstructured data would be a natural solution in this area. At the same time, the evident progress in these tasks for structured data contrasts with the limited success of unstructured data processing, and, in particular, document processing. Currently, this research area is undergoing active development and investigation. The present paper is a systematic survey on both Russian and international publications that are dedicated to the leading trend in automatic text data processing: Text Mining (TM). We cover the main tasks and notions of TM, as well as its place in the current AI landscape. Furthermore, we analyze the complications that arise during the processing of texts written in natural language (NLP) which are weakly structured and often provide ambiguous linguistic information. We describe the stages of text data preparation, cleaning, and selecting features which, alongside the data obtained via morphological, syntactic, and semantic analysis, constitute the input for the TM process. This process can be represented as mapping a set of text documents to «knowledge». Using the case of stock trading, we demonstrate the formalization of the problem of making a trade decision based on a set of analytical recommendations. Examples of such mappings are methods of Information Retrieval (IR), text summarization, sentiment analysis, document classification and clustering, etc. The common point of all tasks and techniques of TM is the selection of word forms and their derivatives used to recognize content in NL symbol sequences. 
Considering IR as an example, we examine classic types of search, such as searching for word forms, phrases, patterns and concepts. Additionally, we consider the augmentation of patterns with syntactic and semantic information. Next, we provide a general description of all NLP instruments: morphological, syntactic, semantic and pragmatic analysis. Finally, we end the paper with a comparative analysis of modern TM tools which can be helpful for selecting a suitable TM platform based on the user’s needs and skills.


Indexed in Scopus

Full-text version of the journal is also available on the web site of the scientific electronic library eLIBRARY.RU

The journal is included in the Russian Science Citation Index


International Interdisciplinary Conference "Mathematics. Computing. Education"