Application of beta regression to the CD44 alternative splicing problem
Aberrant alternative splicing of the CD44 gene drives colorectal cancer progression and facilitates the emergence of cancer stem cells. Although this transmembrane glycoprotein is recognized as a major catalyst of malignancy, deciphering its multi-isoform regulatory networks remains a complex analytical challenge. To address this gap, the study presents a machine learning framework for decoding these mechanisms. The author constructed a neural network regressor based on beta regression to model isoform proportions, which are bounded between 0 and 1. The network jointly estimates the mean and the precision parameters of the underlying beta distribution. In addition, it employs elastic net regularization to perform feature selection on high-dimensional molecular expression data.
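The objective described above can be illustrated with a minimal sketch. It assumes the common mean-precision parameterization of the beta distribution, Beta(μφ, (1 − μ)φ), a logit link for the mean, a log link for the precision, and a single linear layer in place of the neural network. All function names and the hyperparameters `lam` and `alpha` are hypothetical; the abstract does not specify the actual architecture or penalty weights.

```python
import numpy as np
from math import lgamma

# Element-wise log-gamma built from the standard library (no SciPy needed).
_lgamma = np.vectorize(lgamma)

def beta_nll(y, mu, phi):
    """Mean negative log-likelihood of y under Beta(mu*phi, (1-mu)*phi).

    y and mu must lie strictly inside (0, 1); phi must be positive.
    """
    a = mu * phi
    b = (1.0 - mu) * phi
    log_pdf = (_lgamma(a + b) - _lgamma(a) - _lgamma(b)
               + (a - 1.0) * np.log(y) + (b - 1.0) * np.log1p(-y))
    return -np.mean(log_pdf)

def elastic_net_penalty(w, lam=0.01, alpha=0.5):
    """Elastic net: a weighted mix of L1 and squared L2 penalties (assumed form)."""
    l1 = np.sum(np.abs(w))
    l2 = np.sum(w ** 2)
    return lam * (alpha * l1 + 0.5 * (1.0 - alpha) * l2)

def predict(X, w_mu, b_mu, w_phi, b_phi):
    """Two linear heads: logit link for the mean, log link for the precision."""
    mu = 1.0 / (1.0 + np.exp(-(X @ w_mu + b_mu)))
    phi = np.exp(X @ w_phi + b_phi)
    return mu, phi

def objective(X, y, w_mu, b_mu, w_phi, b_phi, lam=0.01, alpha=0.5):
    """Penalized loss: beta NLL plus elastic net on both weight vectors."""
    mu, phi = predict(X, w_mu, b_mu, w_phi, b_phi)
    penalty = elastic_net_penalty(w_mu, lam, alpha) + elastic_net_penalty(w_phi, lam, alpha)
    return beta_nll(y, mu, phi) + penalty
```

In this sketch the L1 term is what drives individual feature weights to zero, which is how an elastic net penalty can act as a feature selector over many expression features.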
The framework is evaluated on gene expression profiles from colorectal cancer patients. The primary objective is to identify RNA-binding proteins that act as regulatory splicing factors. The experimental design contrasts two modeling strategies. The first is an independent "one-vs-all" approach that treats each transcript variant as an isolated regression target. The second is a structured "isoform tree" method that mirrors the hierarchical exon inclusion relationships. Validation experiments on synthetically generated datasets confirmed the soundness of the network: the model accurately recovered the true distribution parameters and exhibited no systematic bias. Empirical comparisons subsequently showed that the independent "one-vs-all" layout consistently outperforms the hierarchical tree configuration in predictive stability and accuracy.
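The idea behind validation on synthetic data can be sketched as follows: draw samples from a beta distribution with known mean and precision, then check that an estimator recovers them. This sketch substitutes a simple method-of-moments estimator for the neural network and is not the author's validation procedure; the parameter values and function names are hypothetical.

```python
import numpy as np

def simulate_beta(mu, phi, n, seed=0):
    """Draw n samples from Beta(mu*phi, (1-mu)*phi) with a fixed seed."""
    rng = np.random.default_rng(seed)
    return rng.beta(mu * phi, (1.0 - mu) * phi, size=n)

def moment_estimates(y):
    """Method-of-moments estimates of the mean and precision.

    Uses Var(y) = mu * (1 - mu) / (1 + phi) for the mean-precision
    parameterization, solved for phi.
    """
    mu_hat = np.mean(y)
    var_hat = np.var(y)
    phi_hat = mu_hat * (1.0 - mu_hat) / var_hat - 1.0
    return mu_hat, phi_hat

# Hypothetical ground-truth values chosen only for illustration.
true_mu, true_phi = 0.3, 20.0
samples = simulate_beta(true_mu, true_phi, n=200_000)
mu_hat, phi_hat = moment_estimates(samples)
print(f"mu: true={true_mu}, estimated={mu_hat:.4f}")
print(f"phi: true={true_phi}, estimated={phi_hat:.2f}")
```

Repeating such a check over many seeds and parameter settings, and averaging the estimation errors, is one way to look for systematic bias.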
The computational analysis maps the regulatory landscape of the CD44 gene. The framework validates several established splicing factors while uncovering new candidate proteins, including ACO1, NUDT21, and AGO2. Based on these statistical associations, the paper introduces a biological hypothesis. This concept functionally connects intracellular iron metabolism via the ACO1 protein with the shifting balance of CD44 variants. These discoveries provide deeper insights into oncogenic splicing regulation. Ultimately, they highlight molecular targets for future therapeutic interventions aimed at suppressing the cancer stem cell phenotype.
Copyright © 2026 Pirogov A.A.
Indexed in Scopus
Full-text version of the journal is also available on the web site of the scientific electronic library eLIBRARY.RU
The journal is included in the Russian Science Citation Index
The journal is included in the RSCI