SCIENTIA SINICA Informationis, Volume 48, Issue 8: 1022-1034 (2018) https://doi.org/10.1360/N112017-00208

Object detection models of remote sensing images using deep neural networks with weakly supervised training method

More info
  • Received Jan 2, 2018
  • Accepted Jan 19, 2018
  • Published Aug 8, 2018

Abstract

This paper proposes an object detection framework for remote sensing images. It aims to overcome the difficulties that small targets and complex backgrounds cause in object detection on remote sensing images. The framework consists of two deep neural network models: a fully convolutional network (FCN) model and a convolutional neural network (CNN) model. First, the FCN model extracts proposals that may contain targets from the remote sensing images, thus avoiding an exhaustive search over the whole image. Next, the CNN model classifies the proposals, extracting high-level features to improve classification accuracy. Subsequently, a new dataset of remote sensing images for object detection is provided. All models in the framework are trained with image-level annotations: a simplified weakly supervised training method is used because object-level annotations are difficult to obtain for object detection on remote sensing images. Finally, a novel proposal fusing algorithm is proposed, which adjusts the positions of proposals while fusing overlapping ones. The framework is tested on the proposed satellite aircraft dataset and a public aircraft dataset. The experimental results show that the framework improves both the recognition rate and the detection efficiency compared with other object detection frameworks based on deep neural networks.
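The proposal step described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that the FCN yields a per-pixel score map and that proposals are the bounding boxes of connected regions above a threshold; the abstract does not specify the actual post-processing, and the function name `extract_proposals` and the `threshold` parameter are hypothetical.

```python
from collections import deque


def extract_proposals(score_map, threshold=0.5):
    """Return (x1, y1, x2, y2) boxes around connected regions of high score.

    Assumption: score_map is a non-empty 2D list of per-pixel target scores,
    standing in for an FCN output. x2 and y2 are exclusive.
    """
    height, width = len(score_map), len(score_map[0])
    seen = [[False] * width for _ in range(height)]
    boxes = []
    for y in range(height):
        for x in range(width):
            if seen[y][x] or score_map[y][x] < threshold:
                continue
            # Breadth-first flood fill over 4-connected neighbours.
            queue = deque([(y, x)])
            seen[y][x] = True
            x1 = x2 = x
            y1 = y2 = y
            while queue:
                cy, cx = queue.popleft()
                x1, x2 = min(x1, cx), max(x2, cx)
                y1, y2 = min(y1, cy), max(y2, cy)
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and not seen[ny][nx]
                            and score_map[ny][nx] >= threshold):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            boxes.append((x1, y1, x2 + 1, y2 + 1))
    return boxes


# Toy score map with two separate high-score regions.
toy_map = [
    [0.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 0.9, 0.8, 0.0, 0.0],
    [0.0, 0.7, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.6],
]
print(extract_proposals(toy_map))  # [(1, 1, 3, 3), (4, 3, 5, 4)]
```

Only the two candidate regions would then be passed to the classifier, rather than every window position in the image, which is the sense in which an exhaustive search is avoided.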


Funded by

National Natural Science Foundation of China (41471280, 61701290, 61701289)


References

[1] Xu C F, Duan H B. Artificial bee colony (ABC) optimized edge potential function (EPF) approach to target recognition for low-altitude aircraft. Pattern Recogn Lett, 2010, 31: 1759-1772

[2] Liu G, Sun X, Fu K. Aircraft recognition in high-resolution satellite images using coarse-to-fine shape prior. IEEE Geosci Remote Sens Lett, 2013, 10: 573-577

[3] Sun H, Sun X, Wang H Q. Automatic target detection in high-resolution remote sensing images using spatial sparse coding bag-of-words model. IEEE Geosci Remote Sens Lett, 2012, 9: 109-113

[4] He K M, Zhang X Y, Ren S Q, et al. Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, 2016. 770-778

[5] Ren S Q, He K M, Girshick R. Faster R-CNN: towards real-time object detection with region proposal networks. IEEE Trans Pattern Anal Mach Intell, 2017, 39: 1137-1149

[6] Wu H, Zhang H, Zhang J F, et al. Fast aircraft detection in satellite images based on convolutional neural networks. In: Proceedings of the 2015 IEEE International Conference on Image Processing (ICIP), Quebec City, 2015. 4210-4214

[7] Wu H, Zhang H, Zhang J F, et al. Typical target detection in satellite images based on convolutional neural networks. In: Proceedings of the 2015 IEEE International Conference on Systems, Man, and Cybernetics (SMC), Hong Kong, 2015. 2956-2961

[8] Zhang P, Niu X, Dou Y, et al. Airport detection from remote sensing images using transferable convolutional neural networks. In: Proceedings of the 2016 International Joint Conference on Neural Networks (IJCNN), Vancouver, 2016. 2590-2595

[9] Cao Y S, Niu X, Dou Y. Region-based convolutional neural networks for object detection in very high resolution remote sensing images. In: Proceedings of the 12th International Conference on Natural Computation, Fuzzy Systems and Knowledge Discovery (ICNC-FSKD), Changsha, 2016. 548-554

[10] Girshick R, Donahue J, Darrell T, et al. Rich feature hierarchies for accurate object detection and semantic segmentation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Columbus, 2014. 580-587

[11] Uijlings J R, Sande K E, Gevers T. Selective search for object recognition. Int J Comput Vision, 2013, 104: 154-171

[12] Cheng G, Zhou P C, Han J W. Learning rotation-invariant convolutional neural networks for object detection in VHR optical remote sensing images. IEEE Trans Geosci Remote Sens, 2016, 54: 7405-7415

[13] Chen X Y, Xiang S M, Liu C L, et al. Aircraft detection by deep belief nets. In: Proceedings of the 2nd IAPR Asian Conference on Pattern Recognition (ACPR), Okinawa, 2013. 54-58

[14] Diao W H, Sun X, Zheng X W. Efficient saliency-based object detection in remote sensing images using deep belief networks. IEEE Geosci Remote Sens Lett, 2016, 13: 137-141

[15] Long J, Shelhamer E, Darrell T. Fully convolutional networks for semantic segmentation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, 2015. 3431-3440

[16] Zhang D W, Han J W, Cheng G. Weakly supervised learning for target detection in remote sensing images. IEEE Geosci Remote Sens Lett, 2015, 12: 701-705

[17] Han J W, Zhang D W, Cheng G. Object detection in optical remote sensing images based on weakly supervised learning and high-level feature learning. IEEE Trans Geosci Remote Sens, 2015, 53: 3325-3337

[18] Zhang F, Du B, Zhang L P. Weakly supervised learning based on coupled convolutional neural networks for aircraft detection. IEEE Trans Geosci Remote Sens, 2016, 54: 5553-5563

[19] Cheng G, Han J W. A survey on object detection in optical remote sensing images. ISPRS J Photogramm Remote Sens, 2016, 117: 11-28

[20] Simonyan K, Zisserman A. Very deep convolutional networks for large-scale image recognition. arXiv preprint, 2014

[21] Szegedy C, Liu W, Jia Y Q, et al. Going deeper with convolutions. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, 2015

[22] Russakovsky O, Deng J, Su H. ImageNet large scale visual recognition challenge. Int J Comput Vision, 2015, 115: 211-252

[23] Everingham M, Gool L, Williams C K. The pascal visual object classes (VOC) challenge. Int J Comput Vision, 2010, 88: 303-338

[24] Jia Y Q, Shelhamer E, Donahue J, et al. Caffe: convolutional architecture for fast feature embedding. In: Proceedings of the 22nd ACM International Conference on Multimedia, Orlando, 2014. 675-678

[25] Cheng M M, Zhang Z M, Lin W Y, et al. BING: binarized normed gradients for objectness estimation at 300 fps. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Columbus, 2014. 3286-3293

  • Figure 1

    (Color online) The object detection framework of remote sensing images with machine learning and deep learning

  • Figure 2

    (Color online) The framework of the WS-DNN object detection model

  • Figure 3

    (Color online) The image samples of the training set and validation set in (a) satellite aircrafts dataset and (b) aircrafts dataset [6]

  • Figure 4

    (Color online) The image samples of the testing set in (a) satellite aircrafts dataset and (b) aircrafts dataset [6]

  • Figure 5

    (Color online) The training process of the CNN and FCN models

  • Figure 6

    (Color online) The comparison of processing results of different FCN models. (a) Original image; (b) Pascal-FCN-32 model; (c) Pascal-FCN-16 model; (d) Pascal-FCN-8 model

  • Figure 7

    (Color online) The comparison of the BBFT and NMS algorithms. (a) Image region 1; (b) NMS and (c) BBFT results of image region 1; (d) image region 2; (e) NMS and (f) BBFT results of image region 2

  • Figure 8

    (Color online) Some detection results of the WS-DNN model. Detections of image (a) 1, (b) 2, (c) 3

  • Table 1   The comparison of different proposal extracting algorithms
    Algorithm | FPR-SAD (%) | MR-SAD (%) | Runtime-SAD (s) | FPR-AD (%) | MR-AD (%) | Runtime-AD (s)
    Sliding window | 91.21 | 0 | 0.59 | 95.29 | 0 | 1.01
    Selective search [11] | 90.27 | 0.03 | 27.96 | 87.85 | 0 | 24.06
    FCN | 78.16 | 0 | 7.73 | 84.08 | 0 | 11.15
  • Table 2   The experimental results of the WS-DNN model on satellite aircrafts dataset
    Evaluating method | Results (%)
    FAR | 7.69
    MR | 3.42
    PR | 93.22
  • Table 3   The comparison of different object detection methods on aircrafts dataset
    Given MR (%) | FAR of BING-CNN [6] (%) | PR of BING-CNN [6] (%) | FAR of WS-DNN (%) | PR of WS-DNN (%)
    25 | 7.28 | 90.50 | 4.65 | 94.12
    20 | 9.27 | 90.00 | 8.74 | 90.53
    15 | 17.66 | 84.00 | 10.26 | 89.33
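
Figure 7 contrasts the proposed fusing algorithm (BBFT) with NMS. The sketch below shows standard greedy non-maximum suppression next to a simple score-weighted box fusion, to make the difference between discarding overlapped boxes and merging them concrete. The fusion shown here is a hypothetical stand-in and not the paper's BBFT, whose details are not given on this page; the function names and the `iou_thr` parameter are likewise assumptions.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_thr=0.5):
    """Greedy NMS: keep the best box, drop lower-scored boxes overlapping it."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[k]) <= iou_thr for k in keep):
            keep.append(i)
    return [boxes[i] for i in keep]


def fuse(boxes, scores, iou_thr=0.5):
    """Hypothetical fusion: merge each overlapping group into one box whose
    coordinates are the score-weighted mean. Assumes positive scores."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    used = set()
    fused = []
    for i in order:
        if i in used:
            continue
        # Group the seed box with every unused box that overlaps it enough.
        group = [i] + [j for j in order
                       if j != i and j not in used
                       and iou(boxes[i], boxes[j]) > iou_thr]
        used.update(group)
        total = sum(scores[j] for j in group)
        fused.append(tuple(
            sum(scores[j] * boxes[j][c] for j in group) / total
            for c in range(4)))
    return fused


# Two overlapping proposals around one target plus one isolated proposal.
boxes = [(0, 0, 10, 10), (2, 0, 12, 10), (50, 50, 60, 60)]
scores = [0.6, 0.4, 0.9]
print(nms(boxes, scores))   # the lower-scored overlapping box is discarded
print(fuse(boxes, scores))  # the overlapping pair is merged into one shifted box
```

With NMS the surviving box keeps the position of the highest-scored proposal, whereas fusion moves the result toward the other overlapping proposals, which is one plausible way a fusing step could adjust proposal positions.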

Copyright 2019 Science China Press Co., Ltd. All rights reserved.