SCIENTIA SINICA Informationis, Volume 48, Issue 7: 888-902(2018) https://doi.org/10.1360/N112017-00290

Double discriminator generative adversarial networks and their application in detecting nests built in catenary and semi-supervised learning

  • Received: Mar 20, 2018
  • Accepted: May 3, 2018
  • Published: Jul 20, 2018


In image-based detection of catenary anomalies, detection of bird's nests is a typical case. However, images containing nests make up only a small portion of the data, which makes nest detection a typical imbalanced classification problem. When a machine learning algorithm is applied to imbalanced data classification, the ability to learn data features is of great importance. Generative adversarial networks (GANs) can learn rich data features from unlabeled data, which has been widely confirmed and exploited. Nonetheless, owing to limitations of their structure and theory, GANs are not an ideal model for image classification. In this paper, the GAN model is improved for image classification tasks; the improved model is named "double discriminator generative adversarial networks" (DDGANs). DDGANs yield satisfactory classification results for nest detection and also serve as an effective semi-supervised learning model. Experiments on the standard MNIST dataset show that accuracy and convergence rate are significantly improved compared with other models.



  • Figure 1

    Structure diagram of GANs

  • Figure 4

    Structure of DDGANs

  • Figure 5

    Performance of classifier and Shannon entropy


    Algorithm 1 Minibatch stochastic gradient descent training of DDGANs

    for number of training iterations do

    Sample a minibatch of $m$ noise samples $\{{{z}}^{(1)},\ldots,{{z}}^{(m)}\}$ from the noise prior $p_{z}({{z}})$.

    Sample a minibatch of $m$ category samples $\{{{c}}^{(1)},\ldots,{{c}}^{(m)}\}$ from the category prior $p_{c}({{c}})$.

    Sample a minibatch of $m$ unlabeled examples $\{{{x}}_{ul}^{(1)},\ldots,{{x}}_{ul}^{(m)}\}$ from the data generating distribution $p_{\rm~data}({{x}}_{ul})$.

    Sample a minibatch of $m$ labeled examples $\{({{x}}_{l}^{(1)},{{y}}^{(1)}),\ldots,({{x}}_{l}^{(m)},{{y}}^{(m)})\}$ from the data generating distribution $p_{\rm~data}({{x}}_{l})$.

    Update $D_{1}$ by ascending its stochastic gradient:
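The four sampling steps of Algorithm 1 can be sketched with toy NumPy stand-ins for the data pools and priors (the pool sizes, minibatch size $m=64$, noise dimension 100, and flattened 784-pixel images are illustrative choices, not values fixed by the excerpt):

```python
import numpy as np

rng = np.random.default_rng(0)
m, noise_dim, n_classes, img_dim = 64, 100, 10, 784

# Hypothetical stand-ins for the labeled and unlabeled data pools.
unlabeled_pool = rng.normal(size=(1000, img_dim))
labeled_pool = rng.normal(size=(200, img_dim))
labels_pool = rng.integers(0, n_classes, size=200)

def sample_minibatches():
    """One round of the four sampling steps in Algorithm 1."""
    z = rng.normal(size=(m, noise_dim))                   # noise prior p_z(z)
    c = np.eye(n_classes)[rng.integers(0, n_classes, m)]  # category prior p_c(c), one-hot
    x_ul = unlabeled_pool[rng.integers(0, len(unlabeled_pool), m)]
    idx = rng.integers(0, len(labeled_pool), m)
    x_l, y = labeled_pool[idx], labels_pool[idx]
    return z, c, x_ul, x_l, y

z, c, x_ul, x_l, y = sample_minibatches()
```

In a full training step, $z$ and $c$ would drive the generator, $x_{ul}$ and the generated images the real/fake discriminator $D_1$, and the labeled pairs the classifier discriminator $D_2$, before the gradient updates.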

  • Table 1   The structure of the discriminator network
    Discriminator in ACGAN                    | $D_{2}$ in DDGANs
    Input image                               | Input image
    5$\times$5 convolutional layer, 32 lReLU  | 5$\times$5 convolutional layer, 32 lReLU
    3$\times$3 max-pool, stride 2             | 3$\times$3 max-pool, stride 2
    3$\times$3 convolutional layer, 64 lReLU  | 3$\times$3 convolutional layer, 64 lReLU
    3$\times$3 convolutional layer, 64 lReLU  | 3$\times$3 convolutional layer, 64 lReLU
    3$\times$3 max-pool, stride 2             | 3$\times$3 max-pool, stride 2
    3$\times$3 convolutional layer, 128 lReLU | 3$\times$3 convolutional layer, 128 lReLU
    1$\times$1 convolutional layer, 10 lReLU  | 1$\times$1 convolutional layer, 10 lReLU
    flatten                                   | flatten
    128 fc ELU with l2(0.01) regularizer      | 128 fc ELU with l2(0.01) regularizer
    10 fc softmax and 1 fc sigmoid            | 10 fc softmax
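As a quick sanity check on the $D_{2}$ column, the convolutional parameter counts can be tallied in a few lines, assuming a single-channel MNIST-style input and one bias per filter (neither is stated in the table); the fully connected layers are left out because the table does not fix the padding, and hence the flattened size, of the final feature map:

```python
def conv_params(k, c_in, c_out):
    """Weights (k x k x c_in per filter) plus one bias per filter."""
    return k * k * c_in * c_out + c_out

# (kernel, in_channels, out_channels) read off the D2 column of Table 1.
layers = [
    (5, 1, 32),
    (3, 32, 64),
    (3, 64, 64),
    (3, 64, 128),
    (1, 128, 10),
]
counts = [conv_params(*layer) for layer in layers]
print(counts, sum(counts))  # per-layer counts and their total
```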
  • Table 2   Test results of detecting nests in catenary
    Algorithm TP FP FN TN Recall rate (%)
    SIFT 341 322 159 678 68.2
    SURF 347 358 153 642 69.4
    CNN 414 157 86 843 82.8
    KEFEH 469 73 31 927 93.8
    DDGANs 490 85 10 915 98.0
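The recall column in Table 2 follows the usual definition, recall = TP / (TP + FN), which the other counts confirm row by row:

```python
def recall(tp, fn):
    """Recall in percent: true positives over all actual positives."""
    return 100.0 * tp / (tp + fn)

# (TP, FN) pairs taken from Table 2.
rows = {"SIFT": (341, 159), "SURF": (347, 153), "CNN": (414, 86),
        "KEFEH": (469, 31), "DDGANs": (490, 10)}
for name, (tp, fn) in rows.items():
    print(name, round(recall(tp, fn), 1))
```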
  • Table 3   Classification results of semi-supervised learning on MNIST dataset (accuracy, %; rows by number of labeled examples)
    100  | 83.19 87.13 91.90 97.88 96.67 98.09 98.12
    1000 | 93.84 94.67 96.32 98.68 97.60 98.27 99.26

Copyright 2020 Science China Press Co., Ltd. All rights reserved.
