Search results for 'predictability':
Articles found: 95
  1. Salem N., Al-Tarawneh K., Hudaib A., Salem H., Tareef A., Salloum H., Mazzara M.
    Generating database schema from requirement specification based on natural language processing and large language model
    Computer Research and Modeling, 2024, v. 16, no. 7, pp. 1703-1713

    A Large Language Model (LLM) is an advanced artificial intelligence algorithm that uses deep learning methodologies and extensive datasets to process, understand, and generate human-like text. These models are capable of performing various tasks, such as summarization, content creation, translation, and predictive text generation, making them highly versatile in applications involving natural language understanding. Generative AI, often associated with LLMs, specifically focuses on creating new content, particularly text, by leveraging the capabilities of these models. Developers can harness LLMs to automate complex processes, such as extracting relevant information from system requirement documents and translating it into a structured database schema. This capability has the potential to streamline the database design phase, saving significant time and effort while ensuring that the resulting schema aligns closely with the given requirements. By integrating LLM technology with Natural Language Processing (NLP) techniques, the efficiency and accuracy of generating database schemas based on textual requirement specifications can be significantly enhanced. The proposed tool will use these capabilities to read system requirement specifications, which may be provided as text descriptions or as Entity-Relationship Diagrams (ERDs). It will then analyze the input and automatically generate a relational database schema in the form of SQL commands. This innovation eliminates much of the manual effort involved in database design, reduces human errors, and accelerates development timelines. The aim of this work is to provide a tool that can be invaluable for software developers, database architects, and organizations aiming to optimize their workflow and align technical deliverables with business requirements.

  2. Shakhgeldyan K.I., Kuksin N.S., Domzhalov I.G., Pak R.L., Geltser B.I.
    Random forest of risk factors as a predictive tool for adverse events in clinical medicine
    Computer Research and Modeling, 2025, v. 17, no. 5, pp. 987-1004

    The aim of the study was to develop an ensemble machine learning method for constructing interpretable predictive models and to validate it using the example of predicting in-hospital mortality (IHM) in patients with ST-segment elevation myocardial infarction (STEMI).

    A retrospective cohort study was conducted using data from 5446 electronic medical records of STEMI patients who underwent percutaneous coronary intervention (PCI). Patients were divided into two groups: 335 (6.2%) patients who died during hospitalization and 5111 (93.8%) patients with a favourable in-hospital outcome. A pool of potential predictors was formed using statistical methods. Through multimetric categorization (minimizing p-values, maximizing the area under the ROC curve (AUC), and SHAP value analysis), decision trees, and multivariable logistic regression (MLR), predictors were transformed into risk factors for IHM. Predictive models for IHM were developed using MLR, Random Forest Risk Factors (RandFRF), Stochastic Gradient Boosting (XGboost), Random Forest (RF), Adaptive boosting, Gradient Boosting, Light Gradient-Boosting Machine, Categorical Boosting (CatBoost), Explainable Boosting Machine and Stacking methods.

    The authors developed the RandFRF method, which integrates the predictive outcomes of modified decision trees, identifies risk factors and ranks them based on their contribution to the risk of adverse outcomes. RandFRF enables the development of predictive models with high discriminative performance (AUC 0.908), comparable to models based on CatBoost and Stacking (AUC 0.904 and 0.908, respectively). In turn, risk factors provide clinicians with information on the patient’s risk group classification and the extent of their impact on the probability of IHM. The risk factors identified by RandFRF can serve not only as a rationale for the prediction results but also as a basis for developing more accurate models.

  3. Zharkova V.V., Schelyaev A.E., Fisher J.V.
    Numerical simulation of sportsman's external flow
    Computer Research and Modeling, 2017, v. 9, no. 2, pp. 331-344

    A numerical simulation of the external flow around a moving athlete is presented. A unique method is developed for obtaining integral aerodynamic characteristics as functions of the flow regime (i.e. angle of attack, flow speed) and body position. Individual anthropometric characteristics and the moving boundaries of the athlete (or sports equipment) during the race are taken into consideration.

    The numerical simulation is carried out using FlowVision CFD. The software is based on the finite volume method, high-performance numerical methods and reliable mathematical models of physical processes. FlowVision uses a Cartesian computational grid, and grid generation is fully automated. Local grid adaptation is used to resolve high pressure gradients and complex object shapes. The flow is simulated by solving the systems of equations that describe the motion of fluid and/or gas in the computational domain, including the mass, momentum and energy conservation equations, the equations of state, and the turbulence model equations. FlowVision permits flow simulation near moving bodies by transforming the computational domain according to the changes in the athlete's shape during motion. The aerodynamic characteristics of a ski jumper are studied during all phases: take-off in motion, in-run and flight. The study defined a simulation method that includes: an inverted statement of the external flow problem (the air flow velocity equals the velocity of motion, and the object is immobile); a technique for specifying the changing boundary of the body; and multiple calculations based on national team members' data. The research identified the main factors affecting jumping performance: aerodynamic forces, rotating moments, etc. The developed method was tested with active athletes. Ski jumpers used this method during preparations for the Sochi 2014 Olympic Games. A comparison of the predicted characteristics with experimental data shows good agreement. The versatility of the method is underlined by flow simulations for a swimmer and a skater. The technology is applicable to all sorts of natural and technical objects.

  4. Tsybulin V.G., Khosaeva Z.K.
    Mathematical model of political differentiation under social tension
    Computer Research and Modeling, 2019, v. 11, no. 5, pp. 999-1012

    We consider a model of the dynamics of a political system of several parties, accompanied and controlled by the growth of social tension. A system of nonlinear ordinary differential equations is proposed with respect to the party fractions and an additional scalar variable characterizing the magnitude of tension in society. The change of each party's fraction is proportional to its current value multiplied by a coefficient that consists of an influx of newcomers, a flow from competing parties, and a loss due to the growth of social tension. The change in tension is made up of party contributions and its own relaxation. The number of parties is fixed; the model has no mechanisms for merging existing parties or creating new ones.

    To study the possible scenarios of the dynamic processes in the model, we develop an approach based on selecting the conditions under which the problem belongs to the class of cosymmetric systems. The existence of cosymmetry for a system of differential equations is ensured by additional constraints on the parameters, and in this case continuous families of stationary and nonstationary solutions can emerge. To analyze the scenarios of cosymmetry breaking, an approach based on the selective function is applied. In the case of one political party there is no multistability: one stable solution corresponds to each set of parameters. For the case of two parties, it is shown that the system under consideration may have two families of equilibria, as well as a family of limit cycles. We present the results of numerical experiments that demonstrate the destruction of the families and the realization of various scenarios leading either to the stabilization of the political system with both parties coexisting or to the disappearance of one of the parties, when part of the population ceases to support it and becomes indifferent.

    This model can be used to predict the inter-party struggle during an election campaign. In this case it is necessary to take into account the dependence of the system's coefficients on time.
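
    The abstract above gives only a verbal description of the equations, so the following is a minimal sketch under stated assumptions rather than the authors' model. It integrates a hypothetical two-party system with the forward Euler method: each party's fraction changes in proportion to its current value times a coefficient (influx, flow from the competitor, loss due to tension), and tension changes through party contributions and its own relaxation. The function name `simulate`, all coefficient values and the exact functional forms are illustrative assumptions.

```python
def simulate(x, s, steps=1000, dt=0.01,
             influx=(0.5, 0.4), transfer=0.1, loss=(0.3, 0.2),
             contrib=(0.6, 0.5), relax=1.0):
    """Forward-Euler integration of a hypothetical two-party system.

    x: two party fractions, s: social tension (scalar).
    Every coefficient and functional form here is an assumption made
    for illustration; none of them is taken from the paper.
    """
    x = list(x)
    for _ in range(steps):
        undecided = 1.0 - x[0] - x[1]
        # Growth coefficient of each party: influx of newcomers, flow from
        # (or to) the competing party, and loss caused by social tension.
        k0 = influx[0] * undecided + transfer * x[1] - loss[0] * s
        k1 = influx[1] * undecided - transfer * x[0] - loss[1] * s
        # Tension: contributions of the parties minus its own relaxation.
        ds = contrib[0] * x[0] + contrib[1] * x[1] - relax * s
        # Each party changes in proportion to its current fraction.
        x[0] += dt * x[0] * k0
        x[1] += dt * x[1] * k1
        s += dt * ds
    return x, s


if __name__ == "__main__":
    fractions, tension = simulate([0.1, 0.1], 0.0)
    print(fractions, tension)
```

    Because the update is proportional to the current fraction, a party that starts at zero stays at zero in this sketch, which mirrors the statement that the model has no mechanism for the birth of new parties.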

  5. Shmidt Y.D., Ivashina N.V., Ozerova G.P.
    Modelling interregional migration flows by the cellular automata
    Computer Research and Modeling, 2020, v. 12, no. 6, pp. 1467-1483

    The article investigates the development and justification of the most adequate tools for forecasting the magnitude and structure of interregional migration flows. Migration processes have a significant impact on the size and demographic structure of the population of territories, and on the state and balance of regional and local labor markets.

    To analyze migration processes and assess their impact, an economic-mathematical tool is required that can model migration processes and flows for different areas with the desired precision. Current methods and approaches to modelling migration processes were reviewed, including their advantages and disadvantages. It is noted that many of these methods require mass aggregated statistical data, which is not always available and does not characterize migrants' behavior at the local level, where the decision to move to a new place of residence is made. This significantly limits the applicability of appropriate modelling techniques and the accuracy of projections of the magnitude and structure of migration flows.

    A cellular automata model of interregional migration flows was developed and tested on data for the Primorye Territory. It integrates a model of household migration behavior under bounded rationality into the general model of an area's migration flow. To implement the household behavior model under bounded rationality, an integral attractiveness index of the regions with economic, social and ecological components is proposed.

    To evaluate the prognostic capacity of the developed model, it was compared with the available cellular automata models used to predict interregional migration flows. An out-of-sample prediction method was applied for this purpose and showed statistically significant superiority of the proposed model. The model makes it possible to obtain forecasts and quantitative characteristics of an area's migration flows based on the real migration behaviour of households at the local level, taking into account their living conditions and behavioural motives.

  6. Krasnov F.V., Smaznevich I.S., Baskakova E.N.
    Bibliographic link prediction using contrast resampling technique
    Computer Research and Modeling, 2021, v. 13, no. 6, pp. 1317-1336

    The paper studies the problem of searching for fragments with missing bibliographic links in a scientific article using automatic binary classification. To train the model, we propose a new contrast resampling technique, the innovation of which is the consideration of the context of the link, taking into account the boundaries of the fragment, which mostly affect the probability of the presence of a bibliographic link in it. The training set was formed of automatically labeled samples that are fragments of three sentences with class labels «without link» and «with link» that satisfy the requirement of contrast: samples of different classes are distanced in the source text. The feature space was built automatically based on term occurrence statistics and was expanded by constructing additional features: entities (names, numbers, quotes and abbreviations) recognized in the text.

    A series of experiments was carried out on the archives of the scientific journals «Law enforcement review» (273 articles) and «Journal Infectology» (684 articles). The classification was carried out by the models Nearest Neighbors, RBF SVM, Random Forest, Multilayer Perceptron, with the selection of optimal hyperparameters for each classifier.

    Experiments have confirmed the hypothesis put forward. The highest accuracy was reached by the neural network classifier (95%), which, however, is not as fast as the linear one, which also showed high accuracy with contrast resampling (91–94%). These values are superior to those reported for NER and Sentiment Analysis on comparable data. The high computational efficiency of the proposed method makes it possible to integrate it into applied systems and to process documents online.

  7. Makarov I.S., Bagantsova E.R., Iashin P.A., Kovaleva M.D., Gorbachev R.A.
    Development of and research on machine learning algorithms for solving the classification problem in Twitter publications
    Computer Research and Modeling, 2023, v. 15, no. 1, pp. 185-195

    Posts on social networks can both predict the movement of the financial market and, in some cases, even determine its direction. The analysis of posts on Twitter contributes to the prediction of cryptocurrency prices. The specificity of the community is reflected in a special vocabulary: posts contain slang expressions and abbreviations, which make it difficult to vectorize the text data, so preprocessing methods such as Stanza lemmatization and regular expressions are considered. This paper describes simple machine learning models that can work despite problems such as a lack of data and a short prediction timeframe. In the binary classification problem, a word is treated as an element of the binary vector of a data unit. Basic words are determined by frequency analysis of word mentions. The markup is based on Binance candlesticks with variable parameters for a more accurate description of the trend of price changes. The paper introduces metrics that reflect the distribution of words depending on whether they belong to the positive or the negative class. To solve the classification problem, we used a dense model with parameters selected by Keras Tuner, logistic regression, a random forest classifier, a naive Bayesian classifier capable of working with a small sample, which is very important for our task, and the k-nearest neighbors method. The constructed models were compared based on the accuracy of the predicted labels. During the investigation we found that the best approach is to use models that predict the price movements of a single coin. Our model deals with posts that mention the LUNA project, which no longer exists. This approach to binary classification of text data is widely used to predict the price of an asset and the trend of its movement, which is often used in automated trading.
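
    The representation described above, where each basic word is one element of a binary vector, can be sketched as follows. This is an illustrative sketch only: the helper names `tokenize`, `build_vocabulary` and `to_binary_vector` are hypothetical, the regular-expression tokenizer is a simplified stand-in for the Stanza lemmatization mentioned in the abstract, and the sample posts are invented.

```python
import re
from collections import Counter


def tokenize(post):
    # Lowercase and keep alphabetic tokens only; a simplified stand-in
    # for the lemmatization step, assumed here for illustration.
    return re.findall(r"[a-z]+", post.lower())


def build_vocabulary(posts, size):
    # Count in how many posts each word is mentioned and keep the most
    # frequent ones; ties are broken alphabetically for determinism.
    counts = Counter(word for post in posts for word in set(tokenize(post)))
    ranked = sorted(counts, key=lambda word: (-counts[word], word))
    return ranked[:size]


def to_binary_vector(post, vocabulary):
    # Element i is 1 if the i-th basic word occurs in the post, else 0.
    words = set(tokenize(post))
    return [1 if word in words else 0 for word in vocabulary]


if __name__ == "__main__":
    posts = ["LUNA to the moon", "luna is dumping", "buy the dip"]
    vocabulary = build_vocabulary(posts, size=2)
    print(vocabulary, to_binary_vector("The dip again", vocabulary))
```

    Vectors of this form can then be passed to any of the classifiers listed in the abstract; the sketch stops at the feature-construction step.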

  8. Solovyov S.A., Rose J., Dzyublyk I.V., Trokhimenko E.P.
    Predictive models of efficacy and public health impact of vaccination with rotavirus vaccine in Ukraine
    Computer Research and Modeling, 2012, v. 4, no. 2, pp. 407-421

    The results of computational and theoretical studies assessing the efficacy and public health impact of vaccination with a rotavirus vaccine in Ukraine are presented. The required indicators are: the genotype-specific vaccine efficacy and the number of severe illness cases prevented, hospitalizations, outpatient visits and deaths. The results were obtained in the form of a decision tree based on a Markov model, using mathematical modelling with computer simulation. The results showed a significant positive effect of vaccination compared to no vaccination, given a high level of vaccine coverage in Ukraine.

  9. Shumov V.V.
    Mathematical models of combat and military operations
    Computer Research and Modeling, 2020, v. 12, no. 4, pp. 907-920

    Modeling the fight against terrorist, pirate and robbery acts at sea is an urgent scientific task due to the prevalence of acts of force and the insufficient number of works on this issue. The actions of pirates and terrorists are diverse. Using a base ship, they can attack ships up to 450–500 miles from the coast. Having chosen the target, they pursue it and use weapons to board the ship. Actions to free a ship captured by pirates or terrorists include: blocking the ship, predicting where pirates might be on the ship, penetrating it (from board to board, by air or from under water) and clearing the ship’s premises. An analysis of the special literature on the actions of pirates and terrorists showed that the act of force (and the actions to neutralize it) consists of two stages: first, blocking the vessel, which consists in forcing it to stop, and second, neutralizing the team (terrorist groups, pirates), including penetrating the ship and clearing it. The stages of the cycle are matched by indicators: the probability of blocking and the probability of neutralization. The variables of the act of force model are the number of vessels (ships, boats) of the attackers and defenders, as well as the strength of the attackers' capture group and of the crew of the ship that is the victim of the attack. Model parameters (indicators of naval and combat superiority) were estimated by the maximum likelihood method using an international database of incidents at sea. The values of these parameters are 7.6–8.5. Such high values of the superiority parameters reflect the parties' ability to act in acts of force. An analytical method for calculating the superiority parameters is proposed and statistically substantiated.
    The following indicators are taken into account in the model: the ability of the parties to detect the enemy, the speed and maneuverability characteristics of the vessels, the height of the vessel and the characteristics of the boarding equipment, the characteristics of weapons and protective equipment, etc. Using the Becker model and the theory of discrete choice, the probability of failure of the act of force is estimated. The significance of the obtained models for combating acts of force at sea lies in the possibility of quantitatively substantiating measures to protect the ship from pirate and terrorist attacks and deterrence measures aimed at preventing attacks (the presence of armed guards on board the ship, assistance from warships and helicopters).

  10. Zavodskikh R.K., Efanov N.N.
    Performance prediction for chosen types of loops over one-dimensional arrays with embedding-driven intermediate representations analysis
    Computer Research and Modeling, 2023, v. 15, no. 1, pp. 211-224

    A method for mapping a set of intermediate representations (IR) of C and C++ programs into a vector embedding space is considered, with the aim of creating an empirical estimation framework for static performance prediction using the LLVM compiler infrastructure. Using embeddings makes programs easier to compare because it avoids direct comparison of Control Flow Graphs (CFG) and Data Flow Graphs (DFG). The method is based on a series of transformations of the initial IR: instrumentation, i.e. injection of artificial instructions in an instrumentation compiler pass depending on the load offset delta of the current instruction compared to the previous one; mapping of the instrumented IR into a multidimensional vector with IR2Vec; and dimension reduction with the t-SNE (t-distributed stochastic neighbor embedding) method. The D1 cache miss ratio measured with the perf stat tool is used as the performance metric. A heuristic criterion for deciding whether programs have a higher or lower cache miss ratio is given. This criterion is based on the embeddings of programs in 2D space. The instrumentation compiler pass developed in this work is described: how it generates and injects artificial instructions into the IR within the memory model used. The software pipeline that implements the performance estimation based on the LLVM compiler infrastructure is given. Computational experiments are performed on synthetic tests, which are sets of programs with the same CFGs but with different sequences of offsets used when accessing a one-dimensional array of a given size. The correlation coefficient between the performance metric and the distance to the worst program’s embedding is measured and found to be negative regardless of the t-SNE initialization. This supports the heuristic criterion. The process of generating such synthetic tests is also considered. Moreover, the variety of the performance metric within the program set of such a test is proposed as a metric to be improved by exploring more test generators.
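
    The final check described above, a correlation between the performance metric and the distance to the worst program's embedding, can be illustrated with a short sketch. It assumes that 2D embeddings and cache miss ratios are already available as plain Python lists, takes the "worst" program to be the one with the highest miss ratio, and uses the Pearson coefficient; the abstract does not state which coefficient was used, and the function names and sample numbers are hypothetical.

```python
import math


def pearson(xs, ys):
    # Plain Pearson correlation coefficient of two equal-length samples.
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    norm_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    norm_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (norm_x * norm_y)


def miss_ratio_vs_distance(embeddings, miss_ratios):
    # Assumption: the worst program is the one with the highest miss ratio.
    worst = max(range(len(miss_ratios)), key=miss_ratios.__getitem__)
    # Euclidean distance of every embedding to the worst program's embedding.
    distances = [math.dist(point, embeddings[worst]) for point in embeddings]
    return pearson(miss_ratios, distances)


if __name__ == "__main__":
    # Invented data: miss ratio falls as programs move away from the worst one.
    embeddings = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
    miss_ratios = [0.9, 0.6, 0.3, 0.1]
    print(miss_ratio_vs_distance(embeddings, miss_ratios))
```

    A negative value means that programs embedded farther from the worst one tend to have a lower cache miss ratio, which is the pattern the heuristic criterion relies on.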


Indexed in Scopus

Full-text version of the journal is also available on the web site of the scientific electronic library eLIBRARY.RU

The journal is included in the Russian Science Citation Index


International Interdisciplinary Conference "Mathematics. Computing. Education"