Dual-pass Feature-Fused SSD model for detecting multi-scale images of workers on the construction site


When workers are detected in construction-site images from surveillance cameras, the objects of interest typically differ greatly in spatial scale, both from each other and from other objects in the scene. The detection accuracy for small objects can be increased by using the Feature-Fused modification of the SSD detector. Combined with overlapping image slicing at inference time, this model copes well with small objects. However, in practice this approach requires manual adjustment of the slicing parameters. This reduces detection accuracy on scenes that differ from those used in training, and on large objects. In this paper, we propose an algorithm that selects the image slicing parameters automatically, depending on the ratio of the characteristic geometric dimensions of the objects in the image. We have developed a two-pass version of the Feature-Fused SSD detector that determines the optimal slicing parameters automatically. On the first pass, a fast truncated version of the detector is used to determine the characteristic sizes of the objects of interest. On the second pass, the final detection is performed with the slicing parameters selected after the first pass. A dataset of images of workers on a construction site was collected. It includes images of workers at large, small, and mixed scales. To compare a one-pass algorithm without splitting the input image, a one-pass algorithm with uniform splitting, and a two-pass algorithm with selection of the optimal splitting, we ran detection tests on scenes with only large objects, with very small objects, with a high density of objects in both the foreground and the background, and with objects only in the background.
In the range of cases we considered, our approach outperforms the approaches it was compared with, handles the problem of double detections well, and achieves 0.82–0.91 on the mAP (mean Average Precision) metric.
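The two-pass idea can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not state the selection rule, so the rule used here (tile side equals the median first-pass box height divided by a target ratio) and all names and defaults (`choose_tile_size`, `slice_image`, `target_ratio`, `min_tile`, `overlap`) are assumptions made only for illustration. The detector itself is left out; the first pass is represented by a plain list of box heights in pixels.

```python
from statistics import median


def choose_tile_size(box_heights, image_size, target_ratio=0.25, min_tile=300):
    """Pick a square tile side from first-pass box heights (hypothetical rule).

    The tile is sized so that a typical object spans about `target_ratio`
    of the tile side; the result is clamped to [min_tile, longest image side].
    """
    width, height = image_size
    longest = max(width, height)
    if not box_heights:
        return longest  # nothing found on the first pass: do not slice
    tile = int(round(median(box_heights) / target_ratio))
    return max(min_tile, min(tile, longest))


def tile_starts(length, tile, overlap):
    """Start offsets of overlapping windows that cover [0, length)."""
    if tile >= length:
        return [0]
    step = max(1, tile - round(tile * overlap))
    starts = list(range(0, length - tile, step))
    starts.append(length - tile)  # last window sits flush with the border
    return starts


def slice_image(image_size, tile, overlap=0.2):
    """Return tiles as (x0, y0, x1, y1) boxes in image coordinates."""
    width, height = image_size
    return [
        (x, y, min(x + tile, width), min(y + tile, height))
        for y in tile_starts(height, tile, overlap)
        for x in tile_starts(width, tile, overlap)
    ]


# Small workers (about 100 px tall) on a 1920x1080 frame: slice finely.
tile = choose_tile_size([90, 100, 110], (1920, 1080))
tiles = slice_image((1920, 1080), tile)

# One large worker (600 px tall): the tile grows to the whole frame.
whole = slice_image((1920, 1080), choose_tile_size([600], (1920, 1080)))
```

Under these assumptions, small first-pass boxes lead to a fine grid of overlapping tiles for the second pass, while large boxes make the grid collapse to a single tile covering the whole frame, which is the behaviour the abstract attributes to automatic parameter selection. Merging detections from overlapping tiles (for example with non-maximum suppression) is a separate step and is not shown.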

Keywords: computer vision, construction site, single shot detector
Citation in English: Petrov M.N., Zimina S.V., Dyachenko D.L., Dubodelov A.V., Simakov S.S. Dual-pass Feature-Fused SSD model for detecting multi-scale images of workers on the construction site // Computer Research and Modeling, 2023, vol. 15, no. 1, pp. 57-73
DOI: 10.20537/2076-7633-2023-15-1-57-73

Indexed in Scopus

Full-text version of the journal is also available on the web site of the scientific electronic library eLIBRARY.RU

The journal is included in the Russian Science Citation Index

The journal is included in the RSCI
