Search results for 'sequence':
Articles found: 49
  1. Temlyakova E.A., Sorokin A.A.
    Detection of promoter and non-promoter E.coli sequences by analysis of their electrostatic profiles
    Computer Research and Modeling, 2015, v. 7, no. 2, pp. 347-359

    The article is devoted to the idea of using the physical properties of DNA, rather than the sequence alone, for accurate search and annotation of various prokaryotic genomic regions. In particular, we demonstrate the possibility of using the electrostatic potential distribution around a DNA sequence as a classifier for identifying several functional DNA regions. A number of classification models were built that discriminate promoters from non-promoter regions (random sequences, coding regions and promoter-like sequences) with an accuracy of about 83–85%. The regions most valuable for the discrimination were determined and are expected to play a certain role in the recognition of DNA by RNA polymerase.

    Views (last year): 3.
  2. Yakushevich L.V., Balashova V.N., Zakiryanov F.K.
    Features of the DNA kink motion in the asynchronous switching on and off of the constant and periodic fields
    Computer Research and Modeling, 2018, v. 10, no. 4, pp. 545-558

    Investigation of the influence of external fields on living systems is one of the most interesting and rapidly developing areas of modern biophysics. However, the mechanisms of such an impact are still not entirely clear. One approach to the study of this issue is associated with modeling the interaction of external fields with the internal mobility of biological objects. In this paper, this approach is used to study the effect of external fields on the motion of local conformational distortions (kinks) in the DNA molecule. Since this task is closely connected with the problem of the mechanisms regulating the vital processes of cells and cellular systems, we set out to investigate the physical mechanisms regulating the motion of kinks and to answer the question of whether constant and periodic fields can act as regulators of this motion. The paper considers the most general case, in which the constant and periodic fields are switched on and off asynchronously. Three variants of asynchronous switching are studied in detail. In the first variant, the time intervals during which the constant and periodic fields act do not overlap; in the second, they overlap; and in the third, one interval is nested within the other. The calculations were performed for the sequence of plasmid pTTQ18. The kink motion was modeled by the McLaughlin–Scott equation, and the coefficients of the equation were calculated in a quasi-homogeneous approximation. Numerical experiments showed that constant and periodic fields exert a significant influence on the character of the kink motion and regulate it. Switching on a constant field leads to a rapid increase of the kink velocity and to the establishment of a stationary velocity, while switching on a periodic field leads to steady oscillations of the kink at the frequency of the external periodic field.
    It is shown that the behavior of the kink depends on the mutual arrangement of the time intervals in which the external fields act. As it turned out, events occurring in one of the two intervals can affect the events in the other, even when the intervals are sufficiently far apart. It is shown that overlapping of the intervals of action of the constant and periodic fields leads to a significant increase in the path traversed by the kink before it comes to a complete stop. The maximal growth of the path is observed when one interval is nested within the other. In conclusion, we discuss how the obtained model results could be related to one of the most important tasks of biology: the problem of the mechanisms regulating the vital activity of cells and cellular systems.
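
    The three switching variants described above (non-overlapping, overlapping, and nested action intervals) can be told apart with a simple interval comparison. The sketch below is only an illustration: the function name, the (start, end) tuple representation, and the convention that intervals touching at a single endpoint count as non-overlapping are assumptions, not taken from the paper.

```python
def classify_intervals(a, b):
    """Classify two time intervals given as (start, end) tuples.

    Returns 'disjoint', 'overlapping' or 'nested'.
    Assumption: intervals that only touch at an endpoint are 'disjoint'.
    """
    (a0, a1), (b0, b1) = a, b
    if a0 > a1 or b0 > b1:
        raise ValueError("each interval must satisfy start <= end")
    # Variant 1: the fields never act at the same time.
    if a1 <= b0 or b1 <= a0:
        return "disjoint"
    # Variant 3: one action interval lies entirely inside the other.
    if (a0 <= b0 and b1 <= a1) or (b0 <= a0 and a1 <= b1):
        return "nested"
    # Variant 2: the intervals share part of their duration.
    return "overlapping"


# Hypothetical switching times (arbitrary units) for the two fields.
constant_field = (0.0, 30.0)
periodic_field = (10.0, 20.0)
variant = classify_intervals(constant_field, periodic_field)
```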

    Views (last year): 29. Citations: 1 (RSCI).
  3. Musaev A.A., Grigoriev D.A.
    Extracting knowledge from text messages: overview and state-of-the-art
    Computer Research and Modeling, 2021, v. 13, no. 6, pp. 1291-1315

    In general, solving the information explosion problem can be delegated to systems for automatic processing of digital data. These systems are intended for recognizing, sorting, meaningfully processing and presenting data in formats readable and interpretable by humans. The creation of intelligent knowledge extraction systems that handle unstructured data would be a natural solution in this area. At the same time, the evident progress in these tasks for structured data contrasts with the limited success of unstructured data processing, and, in particular, document processing. Currently, this research area is undergoing active development and investigation. The present paper is a systematic survey on both Russian and international publications that are dedicated to the leading trend in automatic text data processing: Text Mining (TM). We cover the main tasks and notions of TM, as well as its place in the current AI landscape. Furthermore, we analyze the complications that arise during the processing of texts written in natural language (NLP) which are weakly structured and often provide ambiguous linguistic information. We describe the stages of text data preparation, cleaning, and selecting features which, alongside the data obtained via morphological, syntactic, and semantic analysis, constitute the input for the TM process. This process can be represented as mapping a set of text documents to «knowledge». Using the case of stock trading, we demonstrate the formalization of the problem of making a trade decision based on a set of analytical recommendations. Examples of such mappings are methods of Information Retrieval (IR), text summarization, sentiment analysis, document classification and clustering, etc. The common point of all tasks and techniques of TM is the selection of word forms and their derivatives used to recognize content in NL symbol sequences. 
    Considering IR as an example, we examine classic types of search, such as searching for word forms, phrases, patterns and concepts. Additionally, we consider the augmentation of patterns with syntactic and semantic information. Next, we provide a general description of all NLP instruments: morphological, syntactic, semantic and pragmatic analysis. Finally, we end the paper with a comparative analysis of modern TM tools which can be helpful for selecting a suitable TM platform based on the user’s needs and skills.

  4. Ignatev N.A., Tuliev U.Y.
    Semantic structuring of text documents based on patterns of natural language entities
    Computer Research and Modeling, 2022, v. 14, no. 5, pp. 1185-1197

    The technology of creating patterns from natural language words (concepts) based on text data in the bag of words model is considered. Patterns are used to reduce the dimension of the original space in the description of documents and search for semantically related words by topic. The process of dimensionality reduction is implemented through the formation of patterns of latent features. The variety of structures of document relations is investigated in order to divide them into themes in the latent space.

    It is assumed that a given set of documents (objects) is divided into two non-overlapping classes, for the analysis of which it is necessary to use a common dictionary. Whether a word belongs to the common dictionary is initially unknown. Objects of the two classes are considered to be in opposition to each other. Quantitative parameters of this opposition are determined through the stability values of each feature and generalized assessments of objects according to non-overlapping sets of features.

    To calculate the stability, the feature values are divided into non-intersecting intervals, the optimal boundaries of which are determined by a special criterion. The maximum stability is achieved under the condition that the boundaries of each interval contain values of one of the two classes.

    The composition of features in sets (patterns of words) is formed from a sequence ordered by stability values. The process of formation of patterns and latent features based on them is implemented according to the rules of hierarchical agglomerative grouping.

    A set of latent features is used for cluster analysis of documents using metric grouping algorithms. The analysis applies the coefficient of content authenticity based on the data on the belonging of documents to classes. The coefficient is a numerical characteristic of the dominance of class representatives in groups.

    To divide documents into topics, it is proposed to use the union of groups in relation to their centers. As patterns for each topic, a sequence of words ordered by frequency of occurrence from a common dictionary is considered.

    The results of a computational experiment on collections of abstracts of scientific dissertations are presented. Sequences of words from the general dictionary on 4 topics are formed.

  5. Nikulin V.N., Odintsova A.S.
    Statistically fair price for the European call options according to the discreet mean/variance model
    Computer Research and Modeling, 2014, v. 6, no. 5, pp. 861-874

    We consider a portfolio consisting of a call option and the corresponding underlying asset under the standard assumption that the stock-market price is a random variable with a lognormal distribution. Minimizing the variance (hedging risk) of the portfolio on the maturity date of the call option, we find the fraction of the asset per unit call option. As a direct consequence, we derive the statistically fair lookback call option price in explicit form. In contrast to the famous Black–Scholes theory, no portfolio can be regarded as risk-free, because no additional transactions are supposed to be conducted over the life of the contract, but a sequence of independent portfolios will reduce the risk to zero asymptotically. This property is illustrated in the experimental section using a dataset of daily stock prices of 37 leading US-based companies for the period from April 2006 to January 2013.
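
    As a rough numerical illustration of the variance-minimization step, the sketch below estimates the fraction of the asset per unit call option by simulation. It relies on the standard least-squares fact that Var(C - φS) is minimized at φ = Cov(C, S) / Var(S); it does not reproduce the explicit formulas of the paper. All function names and parameter values (initial price, drift, volatility, strike) are assumptions chosen for illustration.

```python
import math
import random


def simulate_terminal_prices(s0, drift, vol, horizon, n, seed=0):
    """Draw n lognormally distributed asset prices at the maturity date."""
    rng = random.Random(seed)
    m = (drift - 0.5 * vol ** 2) * horizon
    s = vol * math.sqrt(horizon)
    return [s0 * math.exp(rng.gauss(m, s)) for _ in range(n)]


def covariance(a, b):
    """Sample covariance (population form) of two equally long lists."""
    ma = sum(a) / len(a)
    mb = sum(b) / len(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / len(a)


def min_variance_fraction(prices, strike):
    """Fraction phi minimizing the sample variance of payoff - phi * price."""
    payoffs = [max(p - strike, 0.0) for p in prices]
    return covariance(payoffs, prices) / covariance(prices, prices)


def hedged_variance(prices, strike, fraction):
    """Sample variance of the call payoff hedged with `fraction` of the asset."""
    residual = [max(p - strike, 0.0) - fraction * p for p in prices]
    return covariance(residual, residual)


# Hypothetical market parameters, not taken from the paper's dataset.
prices = simulate_terminal_prices(100.0, 0.05, 0.2, 1.0, 20000, seed=1)
phi = min_variance_fraction(prices, strike=100.0)
residual_risk = hedged_variance(prices, 100.0, phi)
```

    The residual variance stays positive for a single portfolio, which is consistent with the statement that no single portfolio is risk-free in this setting.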

    Views (last year): 1.
  6. Ablaev S.S., Makarenko D.V., Stonyakin F.S., Alkousa M.S., Baran I.V.
    Subgradient methods for non-smooth optimization problems with some relaxation of sharp minimum
    Computer Research and Modeling, 2022, v. 14, no. 2, pp. 473-495

    Non-smooth optimization arises in many applied problems. Developing efficient computational procedures for such problems in high-dimensional spaces is a topical issue. First-order methods (subgradient methods) are well applicable here, but in fairly general situations they provide only weak convergence-rate guarantees for large-scale problems. One approach to this type of problem is to identify a subclass of non-smooth problems that admits relatively optimistic results on the rate of convergence. For example, one possible additional assumption is the condition of a sharp minimum, proposed in the late 1960s by B. T. Polyak. When the minimal value of the function is known, for Lipschitz-continuous problems with a sharp minimum it turned out to be possible to propose a subgradient method with a Polyak step-size, which guarantees a linear rate of convergence in the argument. This approach made it possible to cover a number of important applied problems (for example, the problem of projecting onto a convex compact set). However, both the requirement that the minimal value of the function be known and the sharp minimum condition itself look rather restrictive. In this regard, in this paper we propose a generalized condition for a sharp minimum, somewhat similar to the inexact oracle proposed recently by Devolder, Glineur, and Nesterov. The proposed approach makes it possible to extend the class of applicability of subgradient methods with the Polyak step-size to the situation of inexact information about the value of the minimum, as well as an unknown Lipschitz constant of the objective function. Moreover, the use of local analogs of the global characteristics of the objective function makes it possible to apply results of this type to wider classes of problems.
    We show the possibility of applying the proposed approach to strongly convex nonsmooth problems and make an experimental comparison with the known optimal subgradient method for this class of problems. Moreover, we obtain some results on the applicability of the proposed technique to some types of problems with relaxed convexity: the recently proposed notion of weak $\beta$-quasi-convexity and ordinary quasiconvexity. We also study a generalization of the described technique to the situation in which a $\delta$-subgradient of the objective function is available instead of the usual subgradient. For one of the considered methods, conditions are found under which, in practice, it is possible to avoid projecting the considered iterative sequence onto the feasible set of the problem.
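
    For orientation, the classical subgradient method with the Polyak step-size (the baseline that the paper generalizes, not the generalized method itself) can be sketched as follows. The step is h_k = (f(x_k) - f*) / ||g_k||^2, where g_k is a subgradient at x_k and f* is the known minimal value. The test function f(x) = ||x - c||_1, the vector c, and all function names are assumptions chosen for illustration; this function has a sharp minimum at c with f* = 0.

```python
def l1_distance(x, c):
    """f(x) = ||x - c||_1, a non-smooth function with a sharp minimum at c."""
    return sum(abs(xi - ci) for xi, ci in zip(x, c))


def l1_subgradient(x, c):
    """One subgradient of f at x: the componentwise sign of x - c."""
    return [(xi > ci) - (xi < ci) for xi, ci in zip(x, c)]


def polyak_subgradient(f, subgrad, x0, f_star, iters=200, tol=1e-12):
    """Subgradient method with the Polyak step-size (f_star must be known)."""
    x = list(x0)
    for _ in range(iters):
        gap = f(x) - f_star
        g = subgrad(x)
        g_norm_sq = sum(gi * gi for gi in g)
        if gap <= tol or g_norm_sq == 0:
            break
        step = gap / g_norm_sq  # Polyak step-size
        x = [xi - step * gi for xi, gi in zip(x, g)]
    return x


# Hypothetical problem data.
c = [1.0, -2.0, 0.5]
x_final = polyak_subgradient(
    lambda x: l1_distance(x, c),
    lambda x: l1_subgradient(x, c),
    x0=[5.0, 5.0, 5.0],
    f_star=0.0,
)
```

    No projection is needed here because the example is unconstrained; for a constrained problem each iterate would additionally be projected onto the feasible set.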

  7. Zavodskikh R.K., Efanov N.N.
    Performance prediction for chosen types of loops over one-dimensional arrays with embedding-driven intermediate representations analysis
    Computer Research and Modeling, 2023, v. 15, no. 1, pp. 211-224

    We consider a method for mapping the set of intermediate representations (IR) of C and C++ programs to a vector embedding space in order to create an empirical estimation framework for static performance prediction using the LLVM compiler infrastructure. Using embeddings makes programs easier to compare because it avoids direct comparison of Control Flow Graphs (CFG) and Data Flow Graphs (DFG). The method is based on a series of transformations of the initial IR: instrumentation (injection of artificial instructions in an instrumentation compiler pass, depending on the load offset delta of the current instruction relative to the previous one), mapping of the instrumented IR into a multidimensional vector with IR2Vec, and dimension reduction with the t-SNE (t-distributed stochastic neighbor embedding) method. The D1 cache miss ratio measured with the perf stat tool is used as the performance metric. A heuristic criterion for deciding whether programs have a higher or lower cache miss ratio is given. This criterion is based on the embeddings of programs in 2D space. The instrumentation compiler pass developed in this work is described, including how it generates and injects artificial instructions into the IR within the memory model used. The software pipeline that implements the performance estimation based on the LLVM compiler infrastructure is presented. Computational experiments are performed on synthetic tests, which are sets of programs with the same CFGs but with different sequences of offsets used when accessing a one-dimensional array of a given size. The correlation coefficient between the performance metric and the distance to the embedding of the worst program is measured and found to be negative regardless of the t-SNE initialization. This fact confirms the heuristic criterion. The process of generating such synthetic tests is also considered. Moreover, the variation of the performance metric across the set of programs in such a test is proposed as a metric to be improved by exploring more test generators.

  8. Maslakov A.S.
    Describing processes in photosynthetic reaction center ensembles using a Monte Carlo kinetic model
    Computer Research and Modeling, 2020, v. 12, no. 5, pp. 1207-1221

    The photosynthetic apparatus of a plant cell consists of multiple photosynthetic electron transport chains (ETC). Each ETC is capable of capturing and utilizing light quanta that drive electron transport along the chain. Light assimilation efficiency depends on the plant’s current physiological state. The energy of the quanta that cannot be utilized dissipates as heat or is emitted as fluorescence. Under high light conditions, fluorescence gradually rises to its maximum level. The curve describing that rise is called the fluorescence rise (FR). It has a complex shape, and that shape changes depending on the state of the photosynthetic apparatus. This makes it possible to investigate that state using only non-invasive measurement of the FR.

    When measuring fluorescence in experimental conditions, we get a response from millions of photosynthetic units at a time. In order to reproduce the probabilistic nature of the processes in a photosynthetic ETC, we created a Monte Carlo model of this chain. This model describes an ETC as a sequence of electron carriers in a thylakoid membrane, connected with each other. Those carriers have certain probabilities of capturing light photons, transferring excited states, or reducing each other, depending on the current ETC state. The events that take place in each of the model photosynthetic ETCs are registered, accumulated and used to create fluorescence rise and electron carrier redox states accumulation kinetics. This paper describes the model structure, the principles of its operation and the relations between certain model parameters and the resulting kinetic curves shape. Model curves include photosystem II reaction center fluorescence rise and photosystem I reaction center redox state change kinetics under different conditions.

  9. Lotarev D.T.
    Allocation of steinerpoints in euclidean Steiner tree problem by means of MatLab package
    Computer Research and Modeling, 2015, v. 7, no. 3, pp. 707-713

    The problem of allocating Steiner points in a Euclidean Steiner tree is considered. The cost of the network is the sum of the building costs and the cost of information transportation. The Euclidean Steiner tree problem in the form of topological network design is a good model of this problem.

    The MatLab package provides a way to solve the second part of this problem: allocating the Steiner points under the condition that the adjacency matrix is given. A method for obtaining the solution has been worked out. The Steiner tree is formed by solving a sequence of "three points" Steiner

    Views (last year): 4.
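
    For entry 9, the elementary "three points" Steiner problem can be illustrated in isolation. The sketch below finds the point minimizing the total distance to three given points using the Weiszfeld iteration. This is a swapped-in, generic technique written in Python; it is not the MatLab-based procedure of the paper, and the function names and the sample triangle are assumptions. When every angle of the triangle is below 120 degrees, the three edges meet at the Steiner point at 120-degree angles.

```python
import math


def steiner_point(points, iterations=1000, eps=1e-15):
    """Weiszfeld iteration for the point minimizing the sum of distances."""
    # Start from the centroid of the given points.
    x = sum(p[0] for p in points) / len(points)
    y = sum(p[1] for p in points) / len(points)
    for _ in range(iterations):
        wx = wy = wsum = 0.0
        for px, py in points:
            d = math.hypot(x - px, y - py)
            if d < eps:
                # The iterate coincides with a given point; stop there.
                return (px, py)
            wx += px / d
            wy += py / d
            wsum += 1.0 / d
        x, y = wx / wsum, wy / wsum
    return (x, y)


def angle_at(center, p, q):
    """Angle in degrees between the rays from center to p and to q."""
    v1 = (p[0] - center[0], p[1] - center[1])
    v2 = (q[0] - center[0], q[1] - center[1])
    cos_a = (v1[0] * v2[0] + v1[1] * v2[1]) / (math.hypot(*v1) * math.hypot(*v2))
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_a))))


# Hypothetical terminal points; all triangle angles are below 120 degrees.
triangle = [(0.0, 0.0), (4.0, 0.0), (0.0, 3.0)]
s = steiner_point(triangle)
angles = [
    angle_at(s, triangle[0], triangle[1]),
    angle_at(s, triangle[1], triangle[2]),
    angle_at(s, triangle[2], triangle[0]),
]
```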
