SCIENTIA SINICA Informationis, Volume 50, Issue 5: 692-703 (2020). https://doi.org/10.1360/N112019-00034

Obstacle visual sensing based on deep learning for low-altitude small unmanned aerial vehicles

  • Received: Feb 15, 2019
  • Accepted: May 5, 2019
  • Published: Apr 16, 2020


This paper proposes a real-time obstacle sensing method for unmanned aerial vehicles (UAVs) that combines deep learning with target tracking and integrates monocular and binocular vision. First, a deep learning detector recognizes and classifies objects in the first frame captured by the cameras. These first-frame detections are then followed across subsequent frames by a target tracking algorithm, which keeps the detection pipeline real-time. In parallel, binocular stereo vision performs a three-dimensional reconstruction of the current frame to recover the spatial structure of the environment. A point-clustering strategy combined with information fusion then resolves the type, spatial position, and outline of each obstacle. Finally, the method was verified on a physical prototype; the results show that real-time obstacle sensing can be achieved by a UAV equipped with a single binocular camera.
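The abstract's core scheduling idea is to pay the cost of full detection only occasionally and to bridge the gaps with a fast tracker. The sketch below illustrates that detect-then-track loop; `detect_objects` and `Tracker` are hypothetical stubs standing in for the paper's YOLOv2 detector and KCF tracker, and the re-detection period is an assumed parameter, not a value from the paper.

```python
def detect_objects(frame):
    """Full detection pass (slow but accurate): returns (label, bbox) pairs.
    Stub standing in for a deep-learning detector such as YOLOv2."""
    return [("obstacle", (40, 30, 80, 60))]  # bbox as (x, y, w, h)

class Tracker:
    """Minimal tracker stub standing in for KCF: just drifts the box right."""
    def __init__(self, bbox):
        self.bbox = bbox

    def update(self, frame):
        x, y, w, h = self.bbox
        self.bbox = (x + 1, y, w, h)  # pretend the target moved one pixel
        return self.bbox

def perceive(frames, redetect_every=30):
    """Detect on the first frame, track afterwards, re-detect periodically
    so tracking drift cannot accumulate indefinitely."""
    results, trackers = [], []
    for i, frame in enumerate(frames):
        if i % redetect_every == 0:  # frames 0, 30, 60, ...
            dets = detect_objects(frame)
            trackers = [(lbl, Tracker(box)) for lbl, box in dets]
            results.append([(lbl, t.bbox) for lbl, t in trackers])
        else:  # cheap per-frame update keeps the loop real-time
            results.append([(lbl, t.update(frame)) for lbl, t in trackers])
    return results

out = perceive([None] * 61, redetect_every=30)
```

Each re-detection re-initializes the trackers from fresh detections, which is why the box snaps back to the detector's output at frames 0, 30, and 60 in this toy run.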



  • Figure 1

    Block diagram of the vision-based real-time obstacle perception method

  • Figure 2

    (Color online) Samples in the data set

  • Figure 3

    (Color online) Target detection and recognition results using YOLOv2

  • Figure 4

    (Color online) 3D reconstruction based on binocular stereo vision

  • Figure 5

    Obstacle extraction using information fusion method

  • Figure 6

    Perception of the physical prototype. (a) Real-time detection result using YOLOv2 and KCF tracking algorithm; (b) point cloud for 3D environment reconstruction using binocular stereo vision

  • Figure 7

    (Color online) Detection and tracking results for the physical object using the proposed algorithm. (a) Image at the 1st frame; (b) image at the 60th frame; (c) image at the 120th frame

  • Table 1   Perception results of the physical prototype in indoor environments ($X$, $Y$, $Z$ are the obstacle center coordinates)

                                $X$ (m)   $Y$ (m)   $Z$ (m)   Width (m)   Height (m)
    Group 1  Real value           0.000     0.000     1.500     0.370       0.370
             Measured value      -0.012     0.021     1.493     0.374       0.374
             Error               -0.012     0.021    -0.007     0.004       0.004
    Group 2  Real value           0.000     0.400     1.980     0.370       0.370
             Measured value       0.009     0.443     1.981     0.374       0.374
             Error                0.009     0.043     0.001     0.004       0.004
    Group 3  Real value          -0.400     0.400     2.500     0.370       0.370
             Measured value      -0.378     0.433     2.550     0.374       0.382
             Error                0.022     0.033     0.050     0.004       0.012
    Group 4  Real value          -0.400     0.400     3.000     0.370       0.370
             Measured value      -0.447     0.447     2.987     0.402       0.382
             Error               -0.047     0.047     0.013     0.032       0.012
    Group 5  Real value           0.000     0.400     4.000     0.370       0.370
             Measured value       0.037     0.442     3.951     0.397       0.420
             Error                0.037     0.042     0.049     0.027       0.050
  • Table 2   Statistical results of position error mean values and standard deviations$^{\rm a)}$

             $X$-EM (cm)  $X$-SD (cm)  $Y$-EM (cm)  $Y$-SD (cm)  $Z$-EM (cm)  $Z$-SD (cm)
    Group 1      1.4          0.5          1.5          0.4          1.1          0.3
    Group 2      1.4          0.4          1.3          0.4          1.1          0.4
    Group 3      1.8          0.6          2.2          0.4          2.2          0.1
    Group 4      2.3          0.6          3.2          1.0          3.8          0.5
    Group 5      2.6          0.6          2.8          0.6          5.4          0.9

    a) $X/Y/Z$-EM denotes the $X/Y/Z$-axis position error mean value; $X/Y/Z$-SD denotes the $X/Y/Z$-axis position error standard deviation.
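The center coordinates and outline sizes reported in Table 1 follow from the standard rectified-stereo back-projection model, $Z = fB/d$. The sketch below illustrates that geometry; the focal length, baseline, and principal point are illustrative values chosen for the example, not the prototype's calibration.

```python
def triangulate(u, v, disparity, f=700.0, baseline=0.12, cu=320.0, cv=240.0):
    """Back-project pixel (u, v) with the given disparity (pixels) into
    camera-frame coordinates under the rectified-stereo model:
    Z = f * B / d,  X = (u - cu) * Z / f,  Y = (v - cv) * Z / f."""
    z = f * baseline / disparity
    x = (u - cu) * z / f
    y = (v - cv) * z / f
    return x, y, z

def obstacle_size(w_px, h_px, z, f=700.0):
    """Convert a bounding-box size in pixels to metres at depth z."""
    return w_px * z / f, h_px * z / f

# A point at the principal point with a 56 px disparity: Z = 700 * 0.12 / 56
x, y, z = triangulate(320.0, 240.0, 56.0)
w, h = obstacle_size(175.0, 175.0, z)
```

Note how the depth error grows with distance, as in Tables 1 and 2: because $Z$ varies as $1/d$, a fixed one-pixel disparity error costs roughly $Z^2/(fB)$ metres of depth, so far-away obstacles (Groups 4 and 5) are measured less precisely than near ones.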

Copyright 2020 CHINA SCIENCE PUBLISHING & MEDIA LTD. All rights reserved.
