Search results for 'topic modeling':
Articles found: 9
  1. The 3rd BRICS Mathematics Conference
    Computer Research and Modeling, 2019, v. 11, no. 6, pp. 1015-1016
  2. Vorontsov K.V., Potapenko A.A.
    Regularization, robustness and sparsity of probabilistic topic models
    Computer Research and Modeling, 2012, v. 4, no. 4, pp. 693-706

    We propose a generalized probabilistic topic model of text corpora that can incorporate the heuristics of Bayesian regularization, sampling, frequent parameter updates, and robustness in any combination. The well-known models PLSA, LDA, CVB0, SWB, and many others can be considered special cases of the proposed broad family of models. We propose a robust PLSA model and show that it is sparser and performs better than regularized models such as LDA.

    Views (last year): 25. Citations: 12 (RSCI).
  3. Oleynik E.B., Ivashina N.V., Shmidt Y.D.
    Migration processes modelling: methods and tools (overview)
    Computer Research and Modeling, 2021, v. 13, no. 6, pp. 1205-1232

    Migration has a significant impact on the demographic structure of a territory's population and on the state of regional and local labour markets. As a rule, a rapid change in the working-age population of a territory due to migration results in an imbalance between supply and demand on labour markets and a change in the demographic structure of the population. Migration also largely reflects the socio-economic processes taking place in society. Hence, the study of migration factors, of the direction, intensity and structure of migration flows, and the prediction of their magnitude have become topical issues.

    Mathematical tools are often used to analyze and predict migration processes and to assess their consequences; they allow reasonably accurate modelling of migration processes for different territories on the basis of the available statistical data. In recent years, a considerable number of scientific papers on modelling internal and external migration flows with mathematical methods have appeared both in Russia and abroad. Consequently, there is a need to systematize the most commonly used methods and tools of migration modelling in order to form a coherent picture of the main trends and research directions in this field.

    This review considers the main approaches to migration modelling and the main components of its methodology, i.e., stages, methods, models and model classification. A comparative analysis is conducted, and general recommendations on the choice of mathematical tools for modelling are developed. The review contains two sections: migration modelling methods and migration models. The first section describes the main methods used in model development: econometric, cellular automata, system-dynamic, probabilistic, balance, optimization and cluster analysis. Based on an analysis of recent domestic and foreign publications on migration, the most common classes of models (regression, agent-based, simulation, optimization, probabilistic, balance, dynamic and combined) are identified and described. The features, advantages and disadvantages of the different types of migration models are considered.

  4. Petrov A.P., Podlipskaia O.G., Pronchev G.B.
    Modeling the dynamics of public attention to extended processes on the example of the COVID-19 pandemic
    Computer Research and Modeling, 2022, v. 14, no. 5, pp. 1131-1141

    The dynamics of public attention to the COVID-19 epidemic is studied. The level of public attention is described by the daily number of Google search requests made by users from a given country. In the empirical part of the work, data on the number of requests and the number of infected cases are considered for a number of countries. It is shown that in all cases the maximum of public attention occurs earlier than the maximum daily number of newly infected individuals. Thus, for a certain period of time, the epidemic grows while public attention to it declines. It is also shown that the decline in the number of requests is described by an exponential function of time. To describe this empirical pattern, a mathematical model is proposed, which is a modification of the model of the decline in attention after a one-time political event. The model develops the approach that considers decision-making by an individual as a member of the society in which the information process takes place. This approach assumes that an individual’s decision about whether or not to make a request about COVID on a given day is based on two factors. One of them is an attitude that reflects the individual’s long-term interest in a given topic and accumulates the individual’s previous experience, cultural preferences, and social and economic status. The second is the dynamic factor of public attention to the epidemic, which changes during the process under consideration under the influence of informational stimuli. For the subject under consideration, the informational stimuli are related to the epidemic dynamics. The behavioral hypothesis is that if on some day the sum of the attitude and the dynamic factor exceeds a certain threshold value, then on that day the individual in question makes a search request on the topic of COVID.
    The general logic is that the higher the rate of infection growth, the stronger the informational stimulus and the more slowly public attention to the pandemic decreases. Thus, the constructed model made it possible to relate the rate of the exponential decrease in the number of requests to the rate of growth in the number of cases. The regularity found with the help of the model was tested on empirical data. The Student’s statistic was found to be 4.56, which allows us to reject the hypothesis of the absence of a correlation at a significance level of 0.01.
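
The behavioral hypothesis in this abstract (a request is made when attitude plus dynamic factor exceeds a threshold) can be illustrated with a minimal sketch. Everything beyond that rule is an assumption made for illustration and is not taken from the paper: the normally distributed attitudes, the exponentially decaying dynamic factor, and all names and parameter values.

```python
import math
import random


def daily_requests(attitudes, dynamic_factor, threshold):
    """Count individuals for whom attitude + dynamic factor exceeds the threshold."""
    return sum(1 for a in attitudes if a + dynamic_factor > threshold)


def simulate(num_people=10_000, days=30, decay_rate=0.1, threshold=1.0, seed=0):
    """Return the number of search requests per day under the threshold rule.

    Assumptions (hypothetical, not from the abstract):
    - attitudes are fixed per individual and drawn from a standard normal law;
    - the dynamic factor starts at 2.0 and decays exponentially with time.
    """
    rng = random.Random(seed)
    attitudes = [rng.gauss(0.0, 1.0) for _ in range(num_people)]
    counts = []
    for day in range(days):
        dynamic_factor = 2.0 * math.exp(-decay_rate * day)
        counts.append(daily_requests(attitudes, dynamic_factor, threshold))
    return counts


counts = simulate()
```

Because the attitudes are fixed and the assumed dynamic factor only decreases, the daily counts in this sketch can never increase; a slower decay of the dynamic factor (a stronger stimulus) gives a slower decline in requests.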

  5. Vasiliev I.A., Dubinya N.V., Tikhotskiy S.A., Nachev V.A., Alexeev D.A.
    Numerical model of jack-up rig’s mechanical behavior under seismic loading
    Computer Research and Modeling, 2022, v. 14, no. 4, pp. 853-871

    The paper presents the results of numerical modeling of the stress-strain state of jack-up rigs used for the exploitation of offshore hydrocarbon reservoirs. The work studies the equilibrium stress state of a jack-up rig standing on the seafloor and the mechanical behavior of the rig under seismic loading. The loading is caused by a surface elastic wave from a distant earthquake. The stability of the jack-up rig is the main topic of the research, since stability can be lost when seismic loading redistributes stresses and strains in the elements of the rig. The modeling results revealed that seismic loading can indeed lead to an intermittent growth of stresses in particular elements of the rig’s support legs, resulting in stability loss. These results were obtained using a finite element-based numerical scheme. The paper demonstrates the convergence of the modeling results on one test problem: the distribution of stresses and strains in the contact problem of a rigid cylinder indenting an elastic half-space. The comparison between the numerical and analytical solutions showed the numerical scheme to be correct, as the obtained results converged. The paper analyzes the different factors influencing the mechanical behavior of the studied system. These factors include the degree of seismic loading, the mechanical properties of seafloor sediments, and the penetration depth of the support legs. The results of numerical modeling made it possible to formulate preliminary conclusions regarding the need to take site-specific conditions into account when planning the use of jack-up rigs, especially in seismically active regions. The approach presented in the paper can be used to evaluate risks related to the exploitation and development of offshore hydrocarbon reservoirs, while the reported numerical scheme can be used to solve some contact problems of the theory of elasticity that require analysis of dynamic processes.

  6. Ignatev N.A., Tuliev U.Y.
    Semantic structuring of text documents based on patterns of natural language entities
    Computer Research and Modeling, 2022, v. 14, no. 5, pp. 1185-1197

    The technology of creating patterns from natural language words (concepts) based on text data in the bag of words model is considered. Patterns are used to reduce the dimension of the original space in the description of documents and search for semantically related words by topic. The process of dimensionality reduction is implemented through the formation of patterns of latent features. The variety of structures of document relations is investigated in order to divide them into themes in the latent space.

    It is assumed that a given set of documents (objects) is divided into two non-overlapping classes, whose analysis requires a common dictionary. Which words belong to the common dictionary is initially unknown. The objects of the two classes are considered to be in opposition to each other. Quantitative parameters of this opposition are determined through the stability values of each feature and generalized assessments of objects over non-overlapping sets of features.

    To calculate the stability, the feature values are divided into non-intersecting intervals, the optimal boundaries of which are determined by a special criterion. The maximum stability is achieved under the condition that the boundaries of each interval contain values of one of the two classes.

    The composition of features in sets (patterns of words) is formed from a sequence ordered by stability values. The process of formation of patterns and latent features based on them is implemented according to the rules of hierarchical agglomerative grouping.

    A set of latent features is used for cluster analysis of documents using metric grouping algorithms. The analysis applies the coefficient of content authenticity based on the data on the belonging of documents to classes. The coefficient is a numerical characteristic of the dominance of class representatives in groups.

    To divide documents into topics, it is proposed to use the union of groups in relation to their centers. As patterns for each topic, a sequence of words ordered by frequency of occurrence from a common dictionary is considered.

    The results of a computational experiment on collections of abstracts of scientific dissertations are presented. Sequences of words from the general dictionary on 4 topics are formed.

  7. Irkhin I.A., Bulatov V.G., Vorontsov K.V.
    Additive regularization of topic models with fast text vectorization
    Computer Research and Modeling, 2020, v. 12, no. 6, pp. 1515-1528

    The probabilistic topic model of a text document collection finds two matrices: a matrix of conditional probabilities of topics in documents and a matrix of conditional probabilities of words in topics. Each document is represented by a multiset of words also called the “bag of words”, thus assuming that the order of words is not important for revealing the latent topics of the document. Under this assumption, the problem is reduced to a low-rank non-negative matrix factorization governed by likelihood maximization. In general, this problem is ill-posed having an infinite set of solutions. In order to regularize the solution, a weighted sum of optimization criteria is added to the log-likelihood. When modeling large text collections, storing the first matrix seems to be impractical, since its size is proportional to the number of documents in the collection. At the same time, the topical vector representation (embedding) of documents is necessary for solving many text analysis tasks, such as information retrieval, clustering, classification, and summarization of texts. In practice, the topical embedding is calculated for a document “on-the-fly”, which may require dozens of iterations over all the words of the document. In this paper, we propose a way to calculate a topical embedding quickly, by one pass over document words. For this, an additional constraint is introduced into the model in the form of an equation, which calculates the first matrix from the second one in linear time. Although formally this constraint is not an optimization criterion, in fact it plays the role of a regularizer and can be used in combination with other regularizers within the additive regularization framework ARTM. Experiments on three text collections have shown that the proposed method improves the model in terms of sparseness, difference, logLift and coherence measures of topic quality. The open source libraries BigARTM and TopicNet were used for the experiments.
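
A one-pass topical embedding of the kind this abstract describes can be sketched as follows. The abstract does not state the constraint equation, so the formula used here is an assumption: the document vector is taken as the average, over the document's words, of the posterior p(t|w) obtained from the word-in-topic matrix by Bayes' rule. The function name, the dictionary layout of `phi`, and `topic_prior` are hypothetical.

```python
def one_pass_embedding(doc_words, phi, topic_prior):
    """Compute a topic vector for a document in a single pass over its words.

    phi[w][t] is p(w|t) (word-in-topic matrix); topic_prior[t] is p(t).
    Assumed rule (not confirmed by the abstract):
        theta[t] = mean over words w of p(t|w),
        p(t|w) proportional to phi[w][t] * topic_prior[t].
    """
    num_topics = len(topic_prior)
    theta = [0.0] * num_topics
    used = 0
    for w in doc_words:
        row = phi.get(w)
        if row is None:
            continue  # skip out-of-vocabulary words
        weights = [row[t] * topic_prior[t] for t in range(num_topics)]
        total = sum(weights)
        if total == 0.0:
            continue  # word has zero probability in every topic
        for t in range(num_topics):
            theta[t] += weights[t] / total
        used += 1
    if used == 0:
        # No usable words: fall back to a uniform vector.
        return [1.0 / num_topics] * num_topics
    return [x / used for x in theta]


# Toy word-in-topic matrix with two topics (illustrative values only).
phi = {"cat": [0.5, 0.0], "dog": [0.5, 0.0], "stock": [0.0, 1.0]}
theta = one_pass_embedding(["cat", "dog", "stock", "cat"], phi, [0.5, 0.5])
```

The cost is linear in the document length times the number of topics, and nothing is stored per document, which is the practical point the abstract makes about large collections.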

  8. Degtyarev A.B., Myo Min S., Wunna K.
    Cloud computing for virtual testbed
    Computer Research and Modeling, 2015, v. 7, no. 3, pp. 753-758

    Cloud computing is now an important topic in the field of information technology and computer systems. Several companies and educational institutions have deployed cloud infrastructures to gain benefits such as easy data access, software updates at minimal cost, large or unlimited storage, cost efficiency, backup storage and disaster recovery, and other advantages over traditional network infrastructures. The paper presents a study of cloud computing technology for marine environmental data and its processing. Cloud computing of marine environment information is proposed for the integration and sharing of marine information resources. It is highly desirable to perform empirical studies requiring numerous interactions with web servers and transfers of very large archival data files without affecting the operational information system infrastructure. In this paper, we consider cloud computing for a virtual testbed in order to minimize cost. This is related to real-time infrastructure.

    Views (last year): 7.
  9. Ershov N.M., Popova N.N.
    Natural models of parallel computations
    Computer Research and Modeling, 2015, v. 7, no. 3, pp. 781-785

    The course “Natural models of parallel computing”, given to senior students of the Faculty of Computational Mathematics and Cybernetics, Moscow State University, is devoted to the supercomputer implementation of natural computational models and is, in fact, an introduction to the theory of natural computing, a relatively new branch of science formed at the intersection of mathematics, computer science and the natural sciences (especially biology). Topics of natural computing include both classic subjects such as cellular automata and relatively new ones introduced in the last 10–20 years, such as swarm intelligence. Despite their biological origin, all these models are widely applied in fields related to computer data processing. Research in natural computing is closely related to the issues and technology of parallel computing. The theoretical material of the course is accompanied by a discussion of possible parallel computing schemes; in the practical part of the course, students are expected to develop a software implementation using MPI and to run numerical experiments investigating the effectiveness of the chosen parallel computing schemes.

    Views (last year): 17. Citations: 2 (RSCI).

Indexed in Scopus

Full-text version of the journal is also available on the web site of the scientific electronic library eLIBRARY.RU

The journal is included in the Russian Science Citation Index

The journal is included in the RSCI
