-
Tasks and algorithms for optimal clustering of multidimensional objects by a variety of heterogeneous indicators and their applications in medicine
Computer Research and Modeling, 2024, v. 16, no. 3, pp. 673-693
The work describes the author's formal statements of the clustering problem for a given number of clusters, algorithms for solving them, and the results of applying this toolkit in medicine.
Solving instances of the formulated problems with exact algorithms, up to a proof of optimality, is not feasible in acceptable time even for relatively low dimensions, because the problems belong to the NP class.
For this reason, we proposed a hybrid algorithm that combines the advantages of exact methods based on clustering by pairwise distances at the initial stage with the speed of methods that solve the simplified problem of partitioning by cluster centers at the final stage. Developing this direction further, we built a sequential hybrid clustering algorithm that uses random search in the swarm intelligence paradigm. The article describes this algorithm and presents the results of computations for applied clustering problems.
To assess the effectiveness of the developed tools for optimal clustering of multidimensional objects by a variety of heterogeneous indicators, a number of computational experiments were performed on data sets that include socio-demographic, clinical and anamnestic, electroencephalographic and psychometric data on the cognitive status of patients of a cardiology clinic. The experiments provide empirical evidence that local search algorithms in the swarm intelligence paradigm are effective within a hybrid algorithm for solving optimal clustering problems.
The computational results indicate that the main problem of applying discrete optimization methods, the limit on the size of problem instances that can be solved, has been resolved in practice. We have shown that this limitation is removed while the clustering results remain acceptably close to the optimal ones. The applied significance of the obtained clustering results also stems from the fact that the developed optimal clustering toolkit is supplemented by an assessment of the stability of the formed clusters. In addition to known factors (the presence of stenosis or older age), this makes it possible to identify those patients whose cognitive resources are insufficient to overcome the influence of surgical anesthesia, which results in a unidirectional postoperative deterioration of complex visual-motor reaction, attention and memory. This effect indicates that the proposed tools can support a differentiated classification of patients.
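The two-stage idea in this abstract can be illustrated with a minimal Python sketch. It is only one possible reading, not the author's algorithm: the choice of subsample (the first `sample_size` objects), the pairwise objective (sum of within-cluster distances), the Lloyd-style refinement, and all function names are assumptions made for illustration. The swarm-based random search is not modeled here.

```python
import itertools
import math

def pairwise_cost(points, labels):
    """Sum of distances over all pairs of objects that share a cluster."""
    return sum(math.dist(points[i], points[j])
               for i, j in itertools.combinations(range(len(points)), 2)
               if labels[i] == labels[j])

def exact_partition(points, k):
    """Exhaustive search over all k**n labelings; feasible only for tiny n."""
    best_labels, best_cost = None, math.inf
    for labels in itertools.product(range(k), repeat=len(points)):
        if len(set(labels)) < k:      # require exactly k non-empty clusters
            continue
        cost = pairwise_cost(points, labels)
        if cost < best_cost:
            best_labels, best_cost = list(labels), cost
    return best_labels

def update_centers(points, labels, centers):
    """Mean of each cluster; an empty cluster keeps its previous center."""
    new_centers = []
    for c, old in enumerate(centers):
        members = [p for p, lab in zip(points, labels) if lab == c]
        if members:
            new_centers.append(tuple(sum(x) / len(members) for x in zip(*members)))
        else:
            new_centers.append(old)
    return new_centers

def assign(points, centers):
    """Label each object with the index of its nearest center."""
    return [min(range(len(centers)), key=lambda c: math.dist(p, centers[c]))
            for p in points]

def hybrid_clustering(points, k, sample_size=8, max_iter=50):
    # Stage 1 (assumption): solve the pairwise-distance problem exactly
    # on a small subsample, here simply the first `sample_size` objects.
    # The subsample must contain at least k objects.
    sample = points[:sample_size]
    sample_labels = exact_partition(sample, k)
    centers = update_centers(sample, sample_labels, [None] * k)
    # Stage 2 (assumption): fast center-based refinement on all objects.
    labels = assign(points, centers)
    for _ in range(max_iter):
        centers = update_centers(points, labels, centers)
        new_labels = assign(points, centers)
        if new_labels == labels:
            break
        labels = new_labels
    return labels
```

The exhaustive stage grows as `k**n`, which shows in miniature why exact search is limited to small instances, while the center-based stage costs only a few passes over the data.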
-
On the computer experiments of Kasman
Computer Research and Modeling, 2019, v. 11, no. 3, pp. 503-513
In 2007 Kasman conducted a series of original computer experiments with sine-Gordon kinks moving along artificial DNA sequences. Two sequences were considered. Each consisted of two parts separated by a boundary. The left part of the first sequence contained repeating TTA triplets that encode leucines, and the right part contained repeating CGC triplets that encode arginines. In the second sequence, the left part contained repeating CTG triplets encoding leucines, and the right part contained repeating AGA triplets encoding arginines. Modeling the kink movement revealed an interesting effect. The kink moving in one of the sequences stopped without reaching the end of the sequence and then “bounced off” as if it had hit a wall. At the same time, the kink movement in the other sequence did not stop during the entire time of the experiment. These computer experiments, however, used a simple DNA model proposed by Salerno. It takes into account differences in the interactions of complementary bases within pairs, but does not take into account differences in the moments of inertia of nitrogenous bases and in the distances between the centers of mass of the bases and the sugar-phosphate chain. The question of whether the Kasman effect persists when more accurate DNA models are used is still open. In this paper, we investigate the Kasman effect on the basis of a more accurate DNA model that takes both of these differences into account. We obtained the energy profiles of Kasman's sequences and constructed the trajectories of kinks launched in these sequences with different initial values of the energy. The results of our investigations confirmed the existence of the Kasman effect, but only in a limited interval of initial values of the kink energy and for a certain direction of the kink's movement. In other cases, this effect was not observed. We also discuss which of the studied sequences are energetically preferable for the excitation and propagation of kinks.
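For orientation, the kink mentioned in this abstract is the standard soliton solution of the sine-Gordon equation. The block below gives the textbook homogeneous, dimensionless form. This is an assumption made for illustration, since the DNA models discussed in the abstract have sequence-dependent coefficients that are not given here. The symbols φ (angular displacement), v (kink velocity, |v| < 1) and x_0 (initial kink position) are likewise illustrative.

```latex
\begin{align}
  \varphi_{tt} - \varphi_{xx} + \sin\varphi &= 0, \\
  \varphi(x,t) &= 4\arctan\exp\!\left(\frac{x - x_0 - vt}{\sqrt{1 - v^2}}\right), \\
  E &= \frac{8}{\sqrt{1 - v^2}}.
\end{align}
```

In this quasi-particle picture the kink has rest energy 8 and its total energy grows with velocity. One plausible reading of the reflection is energetic: a kink whose total energy is below the rest energy required beyond the boundary cannot enter that region. That would be consistent with the effect appearing only for a limited range of initial energies and one direction of motion.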
-
Assessing the validity of clustering of panel data by Monte Carlo methods (using as example the data of the Russian regional economy)
Computer Research and Modeling, 2020, v. 12, no. 6, pp. 1501-1513
The paper considers a method for studying panel data based on agglomerative hierarchical clustering, that is, grouping objects by the similarities and differences in their features into a hierarchy of nested clusters. We used two alternative methods for calculating Euclidean distances between objects: the distance between values averaged over the observation interval, and the distance using data for all considered years. Three alternative methods for calculating the distances between clusters were compared. In the first, the distance between two clusters is taken to be the distance between their nearest elements; in the second, the average over pairs of elements; in the third, the distance between the most distant elements. The efficiency of two clustering quality indices, the Dunn index and the Silhouette index, was studied for selecting the optimal number of clusters and evaluating the statistical significance of the obtained solutions. The method of assessing the statistical reliability of the cluster structure consisted in comparing the quality of clustering on a real sample with the quality of clustering on artificially generated samples of panel data with the same number of objects, features and lengths of time series. The artificial samples were generated from a fixed probability distribution, using simulation methods that imitate Gaussian white noise and a random walk. Calculations with the Silhouette index showed that a random walk is characterized not only by spurious regression but also by “spurious clustering”. Clustering was considered reliable for a given number of clusters if the index value on the real sample exceeded the 95% quantile for the artificial data. A set of time series of indicators characterizing production in the regions of the Russian Federation was used as the sample of real data. For these data, only the Silhouette index shows reliable clustering at the level p < 0.05. Calculations also showed that index values for real data are generally closer to the values for random walks than for white noise, but differ significantly from both. Since a three-dimensional feature space is used, the quality of clustering was also evaluated visually. Visually, one can distinguish groups of points located close to each other, which the applied hierarchical clustering algorithm also identifies as clusters.
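The Monte Carlo check described in this abstract can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the authors' code. It uses the "all years" distance variant, with each object flattened into one row of features by years. It draws unit-variance Gaussian steps for both null models and uses a simple empirical quantile. All function names and defaults (`n_sim`, `q`, `seed`) are hypothetical, and the Dunn index is omitted.

```python
import math
import random
import statistics

def distance_matrix(rows):
    """Euclidean distances between objects; each row holds all features for all years."""
    return [[math.dist(a, b) for b in rows] for a in rows]

def agglomerative(dist, k, linkage=statistics.fmean):
    """Naive agglomerative clustering down to k clusters.
    linkage: min (nearest elements), statistics.fmean (average over pairs),
    or max (most distant elements)."""
    clusters = [[i] for i in range(len(dist))]
    while len(clusters) > k:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = linkage([dist[i][j] for i in clusters[a] for j in clusters[b]])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a].extend(clusters.pop(b))
    labels = [0] * len(dist)
    for c, members in enumerate(clusters):
        for i in members:
            labels[i] = c
    return labels

def silhouette(dist, labels):
    """Mean silhouette width for k >= 2; singleton clusters contribute 0."""
    groups = {}
    for i, c in enumerate(labels):
        groups.setdefault(c, []).append(i)
    total = 0.0
    for i, c in enumerate(labels):
        own = groups[c]
        if len(own) == 1:
            continue
        a = sum(dist[i][j] for j in own) / (len(own) - 1)
        b = min(sum(dist[i][j] for j in g) / len(g)
                for other, g in groups.items() if other != c)
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(labels)

def white_noise(n_obj, n_feat, n_years, rng):
    """Independent Gaussian values for every object, feature and year."""
    return [[rng.gauss(0, 1) for _ in range(n_feat * n_years)] for _ in range(n_obj)]

def random_walk(n_obj, n_feat, n_years, rng):
    """Each feature of each object is a cumulative sum of Gaussian steps."""
    panel = []
    for _ in range(n_obj):
        row = []
        for _ in range(n_feat):
            level = 0.0
            for _ in range(n_years):
                level += rng.gauss(0, 1)
                row.append(level)
        panel.append(row)
    return panel

def null_quantile(generator, n_obj, n_feat, n_years, k, n_sim=200, q=0.95, seed=0):
    """Empirical q-quantile of the silhouette index over simulated panels."""
    rng = random.Random(seed)
    scores = []
    for _ in range(n_sim):
        d = distance_matrix(generator(n_obj, n_feat, n_years, rng))
        scores.append(silhouette(d, agglomerative(d, k)))
    scores.sort()
    return scores[min(n_sim - 1, int(q * n_sim))]

def is_reliable(real_rows, n_feat, n_years, k, generator, **kwargs):
    """True if the silhouette of the real sample exceeds the null quantile."""
    d = distance_matrix(real_rows)
    real_score = silhouette(d, agglomerative(d, k))
    return real_score > null_quantile(generator, len(real_rows), n_feat, n_years, k, **kwargs)
```

A call such as `is_reliable(rows, n_feat=3, n_years=10, k=4, generator=random_walk)` would compare a real panel against the random-walk null. Because the silhouette index depends on the relative scaling of features, real data would normally be standardized before such a comparison.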