
SCIENCE CHINA Information Sciences, Volume 63, Issue 11: 212102 (2020) https://doi.org/10.1007/s11432-019-2811-8

Discriminative fine-grained network for vehicle re-identification using two-stage re-ranking

  • Received Aug 26, 2019
  • Accepted Jan 31, 2020
  • Published Oct 13, 2020

Abstract

Vehicle re-identification for video surveillance has attracted growing attention. Existing methods struggle to distinguish different instances of the same car model: they fail to recognize the subtle differences among such instances, and a missed subtle difference can produce an incorrect ranking. This paper proposes a discriminative fine-grained network (DFN) for vehicle re-identification, combined with a two-stage re-ranking framework, to address these issues. The DFN is a hybrid of a Siamese network and a fine-grained network, and it extracts discriminative features for vehicle instances that differ only subtly. The two-stream Siamese network is well suited to general object re-identification, while the fine-grained network detects subtle differences. The two-stage re-ranking method yields a more reliable ranking list by using the Jaccard metric and merging the first and second re-ranking lists, where the second list is formed using the sample mean feature. Experimental results on the VeRi-776 and VehicleID datasets show that the proposed method outperforms the state-of-the-art vehicle re-identification methods it is compared against.
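The abstract names the ingredients of the two-stage re-ranking (a Jaccard metric, a second list built from the sample mean feature, and a merge of the two lists) but does not give the details. The following is a minimal NumPy sketch of how such a pipeline could fit together, not the authors' implementation. The function names, the neighbourhood size `k`, the number of top matches `top_m`, the blending weight `lam`, and the equal-weight merge of the two lists are all hypothetical choices made for illustration.

```python
import numpy as np

def pairwise_dist(a, b):
    """Euclidean distances between the rows of a and the rows of b."""
    diff = a[:, None, :] - b[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=2))

def jaccard_dist(feats, k):
    """Jaccard distance between the k-nearest-neighbour sets of all samples."""
    d = pairwise_dist(feats, feats)
    knn = np.argsort(d, axis=1)[:, :k]            # k nearest samples per row
    member = np.zeros(d.shape, dtype=bool)
    np.put_along_axis(member, knn, True, axis=1)  # membership matrix of the sets
    m = member.astype(float)
    inter = m @ m.T                               # sizes of the set intersections
    union = 2 * k - inter                         # every set has exactly k members
    return 1.0 - inter / union

def two_stage_rerank(query, gallery, k=3, top_m=2, lam=0.5):
    """Return a (num_query, num_gallery) matrix of merged distances (assumed scheme)."""
    eps = 1e-12
    nq = len(query)
    d0 = pairwise_dist(query, gallery)
    # Stage 1 (assumed form): blend the original and the Jaccard distances.
    dj = jaccard_dist(np.vstack([query, gallery]), k)[:nq, nq:]
    d1 = lam * d0 / (d0.max() + eps) + (1.0 - lam) * dj
    # Stage 2 (assumed form): re-query with the mean feature of the query
    # and its top_m stage-1 matches.
    top = np.argsort(d1, axis=1)[:, :top_m]
    mean_feat = (query + gallery[top].sum(axis=1)) / (top_m + 1)
    d2 = pairwise_dist(mean_feat, gallery)
    d2 = d2 / (d2.max() + eps)
    # Merge the two lists with equal weights (an illustrative choice).
    return 0.5 * (d1 + d2)
```

Calling `two_stage_rerank(query, gallery)` with arrays of shape `(num_query, dim)` and `(num_gallery, dim)` returns one row of merged distances per query; sorting a row in ascending order gives that query's final ranking.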


Acknowledgment

This work was supported by the National Natural Science Foundation of China (Grant Nos. 61762061, 62076117), the National Key R&D Program of China (Grant Nos. 2017YFB0801701, 2017YFB0802805), the Natural Science Foundation of Jiangxi Province (Grant No. 20161ACB20004), and the Jiangxi Key Laboratory of Smart City (Grant No. 20192BCD40002).


References

[1] Gou C, Wang K, Yao Y. Vehicle License Plate Recognition Based on Extremal Regions and Restricted Boltzmann Machines. IEEE Trans Intell Transp Syst, 2016, 17: 1096-1107

[2] Min W, Li X, Wang Q. New approach to vehicle license plate location based on new model YOLO-L and plate pre-identification. 17

[3] Wang Y, Zhao C, Liu X. Fast Cartoon-Texture Decomposition Filtering Based License Plate Detection Method. Math Problems Eng, 2018, 2018(8): 1-9

[4] Wang T, Gong S, Zhu X. Person Re-Identification by Discriminative Selection in Video Ranking. IEEE Trans Pattern Anal Mach Intell, 2016, 38: 2501-2514

[5] Zhao R, Ouyang W, Wang X. Person Re-Identification by Saliency Learning. IEEE Trans Pattern Anal Mach Intell, 2017, 39: 356-370

[6] Cho Y J, Yoon K J. Improving person re-identification via pose-aware multi-shot matching. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2016.

[7] Zhao H Y, Tian M Q, Sun S Y, et al. Spindle Net Person Re-identification with Human Body Region Guided. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2017.

[8] Feris R S, Siddiquie B, Petterson J. Large-Scale Vehicle Detection, Indexing, and Search in Urban Surveillance Videos. IEEE Trans Multimedia, 2012, 14: 28-42

[9] Liu X C, Liu W, Ma H D, et al. Large-scale vehicle re-identification in urban surveillance videos. In: Proceedings of IEEE International Conference on Multimedia and Expo, 2016.

[10] Loy C C, Xiang T, Gong S G. Multi-camera activity correlation analysis. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2009.

[11] Shen Y T, Xiao T, Li H S, et al. Learning deep neural networks for vehicle re-id with visual-spatio-temporal path proposals. In: Proceedings of IEEE International Conference on Computer Vision, 2017.

[12] Wang Z D, Tang L M, Liu X H, et al. Orientation Invariant Feature Embedding and Spatial Temporal Regularization for Vehicle Re-identification. In: Proceedings of IEEE International Conference on Computer Vision, 2017.

[13] Gao Y, Beijbom O, Zhang N, et al. Compact Bilinear Pooling. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2016.

[14] Matsukawa T, Okabe T, Suzuki E. Hierarchical Gaussian Descriptors with Application to Person Re-Identification. IEEE Trans Pattern Anal Mach Intell, 2019: 1-1

[15] Chen D P, Yuan Z J, Chen B D, et al. Similarity Learning with Spatial Constraints for Person Re-identification. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2016.

[16] Zheng L, Shen L Y, Tian L, et al. Scalable Person Re-identification: A Benchmark. In: Proceedings of IEEE International Conference on Computer Vision, 2015.

[17] Rama Varior R, Wang G, Lu J. Learning Invariant Color Features for Person Reidentification. IEEE Trans Image Process, 2016, 25: 3395-3410

[18] Liao S C, Hu Y, Zhu X Y, et al. Person Re-identification by Local Maximal Occurrence Representation and Metric Learning. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2015.

[19] Min W, Cui H, Rao H. Detection of Human Falls on Furniture Using Scene Analysis Based on Deep Learning and Activity Characteristics. IEEE Access, 2018, 6: 9324-9335

[20] Liao Y, Xiong P, Min W. Dynamic Sign Language Recognition Based on Video Sequence With BLSTM-3D Residual Networks. IEEE Access, 2019, 7: 38044-38054

[21] Min W, Fan M, Li J. Real-time face recognition based on pre-identification and multi-scale classification. IET Comput Vision, 2019, 36: 165-171

[22] Zhang K, Liu N, Yuan X F, et al. Fine-grained age estimation in the wild with attention LSTM networks. In: Proceedings of Computer Vision and Pattern Recognition, 2018.

[23] Ji Z, Xiong K, Pang Y. Video Summarization With Attention-Based Encoder-Decoder Networks. IEEE Trans Circuits Syst Video Technol, 2020, 30: 1709-1717

[24] Ji Z, Sun Y, Yu Y. Attribute-Guided Network for Cross-Modal Zero-Shot Hashing. IEEE Trans Neural Netw Learning Syst, 2020, 31: 321-330

[25] Zhao R, Ouyang W L, Wang X G. Learning Mid-level Filters for Person Re-identification. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2014.

[26] Zheng Z D, Zheng L, Yang Y. A Discriminatively Learned CNN Embedding for Person Reidentification. ACM Transactions on Multimedia Computing Communications and Applications, 2018, 14: 1-20.

[27] Cheng D, Gong Y H, Zhou S P, et al. Person re-identification by multi-channel parts-based CNN with improved triplet loss function. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2016.

[28] Xu S J, Cheng Y, Gu K, et al. Jointly Attentive Spatial-Temporal Pooling Networks for Video-based Person Re-Identification. In: Proceedings of IEEE International Conference on Computer Vision, 2017.

[29] Zhao C, Chen K, Zang D. Uncertainty-optimized deep learning model for small-scale person re-identification. Sci China Inf Sci, 2019, 62: 220102

[30] Paisitkriangkrai S, Shen C H, Hengel A V D. Learning to rank in person re-identification with metric ensembles. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2015.

[31] Yi D, Zhen L, Liao S C, et al. Deep Metric Learning for Person Re-Identification. In: Proceedings of the 1st International Conference on Pattern Recognition Applications and Methods, 2014.

[32] Liu H Y, Tian Y H, Wang Y W, et al. Deep Relative Distance Learning: Tell the Difference Between Similar Vehicles. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2016.

[33] Tang Y, Wu D, Jin Z, et al. Multi-modal metric learning for vehicle re-identification in traffic surveillance environment. In: Proceedings of IEEE International Conference on Image Processing, 2017.

[34] Zapletal D, Herout A. Vehicle re-identification for automatic video traffic surveillance. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2016.

[35] Yang L J, Lou P, Loy C C, et al. A Large-Scale Car Dataset for Fine-Grained Categorization and Verification. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2015.

[36] Zhang Y H, Liu D, Zha Z J. Improving triplet-wise training of convolutional neural network for vehicle re-identification. In: Proceedings of IEEE International Conference on Multimedia and Expo, 2017.

[37] Zhou Y, Shao L. Vehicle Re-Identification by Adversarial Bi-Directional LSTM Network. In: Proceedings of IEEE Winter Conference on Applications of Computer Vision, 2018.

[38] Zhou Y, Shao L. Viewpoint-aware Attentive Multi-view Inference for Vehicle Re-identification. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2018.

[39] Zhou Y, Shao L. Cross-View GAN based vehicle generation for re-identification. In: Proceedings of British Machine Vision Conference, 2017.

[40] Zhu J, Zeng H, Jin X. Joint horizontal and vertical deep learning feature for vehicle re-identification. Sci China Inf Sci, 2019, 62: 199101

[41] Köstinger M, Hirzer M, Wohlhart P, et al. Large scale metric learning from equivalence constraints. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2012.

[42] Li Z, Tang J. Weakly Supervised Deep Metric Learning for Community-Contributed Image Retrieval. IEEE Trans Multimedia, 2015, 17: 1989-1999

[43] Zhong Z, Zheng L, Cao D L, et al. Re-ranking Person Re-identification with K-reciprocal Encoding. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2017.

[44] Ding S, Lin L, Wang G. Deep feature learning with relative distance comparison for person re-identification. Pattern Recognition, 2015, 48: 2993-3003

[45] Li Z, Chang S Y, Liang F, et al. Learning Locally-Adaptive Decision Functions for Person Verification. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2013.

[46] Valev K, Schumann A, Sommer L, et al. A systematic evaluation of recent deep learning architectures for fine-grained vehicle classification. In: Proceedings of Computer Vision and Pattern Recognition, 2018.

[47] Ma Z, Chang D, Li X. Channel Max Pooling Layer for Fine-Grained Vehicle Classification. In: Proceedings of Computer Vision and Pattern Recognition, 2019.

[48] Wang Q, Ding Y D. A Novel Fine-Grained Method for Vehicle Type Recognition Based on the Locally Enhanced PCANet Neural Network. J Comput Sci Technol, 2018, 33: 335-350

[49] Yu S, Wu Y, Li W. A model for fine-grained vehicle classification based on deep learning. Neurocomputing, 2017, 257: 97-103

[50] Hu B, Lai J H, Guo C C. Location-aware fine-grained vehicle type recognition using multi-task deep networks. Neurocomputing, 2017, 243: 60-68

[51] Zhang Q, Zhuo L, Hu X, et al. Fine-grained vehicle recognition using hierarchical fine-tuning strategy for Urban Surveillance Videos. In: Proceedings of International Conference on Progress in Informatics and Computing, 2017.

[52] Liu X C, Liu W, Mei T, et al. A deep learning-based approach to progressive vehicle re-identification for urban surveillance. In: Proceedings of European Conference on Computer Vision, 2016.

  • Figure 1

    (Color online) The challenges associated with vehicle re-identification.

  • Figure 2

    (Color online) Overview of the proposed architecture for vehicle re-identification. First, the dataset is fed into the network. Next, the discriminative fine-grained network is applied; it comprises the Siamese network (upper part) and the fine-grained network (lower part). Finally, the two-stage re-ranking part merges the feature vectors of the two sub-networks and obtains the final distance through a two-stage calculation.

  • Figure 3

    (Color online) The influence of the subtle feature information on vehicle re-identification.

  • Figure 4

    (Color online) Example of the selection of candidate $\overline{p}$ and the definition of the robust set $H^{*}(\mathrm{C}, 20)$ in the second step of re-ranking.

  • Figure 5

    (Color online) CMC curves of different methods. (a) VeRi-776; (b) the small test set of VehicleID; (c) the medium test set of VehicleID; (d) the large test set of VehicleID.

  • Table 1

    Comparison of the state-of-the-art methods on the VeRi-776 dataset

    Method               rank1 (%)   mAP (%)
    LOMO [18]            24.59       9.68
    FACT [9]             51.89       18.69
    Siamese-Visual [11]  41.12       29.48
    BOW-CN [16]          33.82       9.63
    VAMI [38]            77.03       50.13
    XVGAN [39]           60.20       24.65
    DLCNN [26]           82.42       49.88
    Ours                 88.14       61.85
  • Table 2

    Comparison of the state-of-the-art methods on the VehicleID dataset

    Method               Small                   Medium                  Large
                         rank1 (%)   rank5 (%)   rank1 (%)   rank5 (%)   rank1 (%)   rank5 (%)
    LOMO [18]            19.92       32.83       19.52       29.91       15.72       25.56
    FACT [9]             49.93       68.37       45.01       64.75       40.12       60.59
    VGG+CCL [32]         43.92       65.01       38.84       61.91       34.58       55.72
    MixedDiff+CCL [32]   48.52       74.55       43.94       67.96       40.85       62.79
    VAMI [38]            63.08       83.12       52.69       75.08       47.28       70.06
    XVGAN [39]           52.79       80.69       49.47       71.42       44.92       66.71
    DLCNN [26]           73.01       82.70       66.50       77.06       61.00       73.17
    Ours                 77.02       85.04       71.81       80.81       66.29       78.42
  • Table 3

    Comparison of the results obtained using the methods with and without re-ranking on VeRi-776

    Method               rank1 (%)   mAP (%)
    Base                 88.14       61.85
    Base+Zhong [43]      89.03       65.19
    Base+TR              90.11       66.10
  • Table 4

    Comparison of the results obtained using the methods with and without re-ranking on VehicleID

    Method               Small                   Medium                  Large
                         rank1 (%)   rank5 (%)   rank1 (%)   rank5 (%)   rank1 (%)   rank5 (%)
    Base                 77.02       85.04       71.81       80.81       66.29       78.42
    Base+Zhong [43]      77.89       85.28       72.38       81.06       67.92       79.17
    Base+TR              79.00       86.01       74.06       82.19       69.50       79.79
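
The tables above report rank-k accuracy and mAP, and Figure 5 plots CMC curves. As a rough illustration of how these quantities are commonly computed from a query-to-gallery distance matrix, here is a minimal NumPy sketch. The function name `cmc_and_map` and its arguments are hypothetical, and benchmark protocols often add filtering rules (for example, excluding gallery images from the query's own camera) that this sketch omits, so it is not the paper's evaluation code.

```python
import numpy as np

def cmc_and_map(dist, query_ids, gallery_ids, max_rank=5):
    """Return (cmc, mAP) for a (num_query, num_gallery) distance matrix.

    cmc[r - 1] is the fraction of queries whose first correct match
    appears within the top r gallery results (rank-r accuracy).
    """
    dist = np.asarray(dist, dtype=float)
    query_ids = np.asarray(query_ids)
    gallery_ids = np.asarray(gallery_ids)
    max_rank = min(max_rank, dist.shape[1])
    cmc = np.zeros(max_rank)
    aps = []
    for q in range(dist.shape[0]):
        order = np.argsort(dist[q])                 # closest gallery items first
        hits = gallery_ids[order] == query_ids[q]   # True where the identity matches
        if not hits.any():
            continue                                # skip queries with no true match
        first = int(np.argmax(hits))                # position of the first correct match
        if first < max_rank:
            cmc[first:] += 1
        # Average precision: mean of the precision values at each correct match.
        precision = np.cumsum(hits) / (np.arange(hits.size) + 1)
        aps.append(precision[hits].mean())
    if not aps:
        raise ValueError("no query has a matching gallery identity")
    return cmc / len(aps), float(np.mean(aps))
```

With this sketch, `cmc[0]` corresponds to the rank1 column and `cmc[4]` to the rank5 column, and plotting `cmc` against the rank gives a CMC curve.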

Copyright 2020 China Science Publishing & Media Ltd. All rights reserved.
