All issues
- 2024 Vol. 16
- 2023 Vol. 15
- 2022 Vol. 14
- 2021 Vol. 13
- 2020 Vol. 12
- 2019 Vol. 11
- 2018 Vol. 10
- 2017 Vol. 9
- 2016 Vol. 8
- 2015 Vol. 7
- 2014 Vol. 6
- 2013 Vol. 5
- 2012 Vol. 4
- 2011 Vol. 3
- 2010 Vol. 2
- 2009 Vol. 1
Automated citation graph building from a corpora of scientific documents
In this paper the problem of automated building of a citation graph from a collection of scientific documents is considered as a sequence of machine learning tasks. The overall data processing technology is described which consists of six stages: preprocessing, metainformation extraction, bibliography lists extraction, splitting bibliography lists into separate bibliography records, standardization of each bibliography record, and record linkage. The goal of this paper is to provide a survey of approaches and algorithms suitable for each stage, motivate the choice of the best combination of algorithms, and adapt some of them for multilingual bibliographies processing. For some of the tasks new algorithms and heuristics are proposed and evaluated on the mixed English and Russian documents corpora.
Indexed in Scopus
Full-text version of the journal is also available on the web site of the scientific electronic library eLIBRARY.RU
The journal is included in the Russian Science Citation Index
The journal is included in the RSCI
International Interdisciplinary Conference "Mathematics. Computing. Education"