SCIENCE CHINA Information Sciences, Volume 60, Issue 11: 113101(2017) https://doi.org/10.1007/s11432-017-9237-1

Patch-based topic model for group detection

  • Received Jul 12, 2017
  • Accepted Sep 19, 2017
  • Published Oct 10, 2017

Abstract

Pedestrians in crowd scenes tend to connect with each other and form coherent groups. To investigate collective behaviors in crowds, many studies have addressed group detection. However, most existing methods are limited in their ability to discover the underlying semantic priors of individuals. By segmenting the crowd image into patches, this paper proposes the Patch-based Topic Model (PTM) for group detection. The main contributions of this study are threefold: (1) the crowd dynamics are represented by a patch-level descriptor, which provides a macroscopic-level representation; (2) the semantic topic label of each patch is inferred by integrating the Latent Dirichlet Allocation (LDA) model with a Markov Random Field (MRF); (3) the optimal group number is determined automatically with an intra-class distance evaluation criterion. Experimental results on real-world crowd videos demonstrate the superior performance of the proposed method over state-of-the-art methods.
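
Contribution (1) can be illustrated with a small sketch. The caption of Figure 1 describes the patch-level descriptor as the distribution of feature points over the orientation space, so the Python code below builds one orientation histogram per patch. The function name `patch_descriptors`, the fixed square patch grid, the `(x, y, vx, vy)` input layout and the default of 8 orientation bins are all assumptions made for illustration; the paper's own patch segmentation and binning may differ.

```python
import math
from collections import defaultdict


def patch_descriptors(points, patch_size=32, n_bins=8):
    """Build one orientation histogram per patch (illustrative sketch).

    points: iterable of (x, y, vx, vy) tuples, one per tracked feature point,
            where (vx, vy) is the point's displacement between frames.
    Returns {(patch_row, patch_col): [count per orientation bin]}.
    The square grid and the bin count are assumptions, not the paper's settings.
    """
    hists = defaultdict(lambda: [0] * n_bins)
    bin_width = 2 * math.pi / n_bins
    for x, y, vx, vy in points:
        if vx == 0 and vy == 0:
            continue  # a static point has no defined orientation
        angle = math.atan2(vy, vx) % (2 * math.pi)  # map to [0, 2*pi)
        # Clamp so that rounding can never produce an index equal to n_bins.
        b = min(int(angle / bin_width), n_bins - 1)
        key = (int(y // patch_size), int(x // patch_size))
        hists[key][b] += 1
    return dict(hists)
```

Each histogram plays the role of a "document" of orientation "words", which is the form of input a topic model such as LDA expects. Patches that contain no moving feature points simply do not appear in the returned dictionary.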


Acknowledgment

This work was supported by the National Key Research and Development Program of China (Grant No. 2017YFB1002202), the National Natural Science Foundation of China (Grant Nos. 61773316, 61379094), the Fundamental Research Funds for the Central Universities (Grant No. 3102017AX010), and the Open Research Fund of the Key Laboratory of Spectral Imaging Technology, Chinese Academy of Sciences.


  • Figure 1

    (Color online) Pipeline of the proposed PTM. First, the patch-level descriptor is constructed from the distribution of the feature points over the orientation space. Then, the obtained descriptor is fed into the LDA model, and an MRF prior is imposed on the hidden priors to enforce spatial coherence. After model inference, the semantic motion prior within each patch is learned. Finally, the feature points are combined according to the prior of the corresponding patch. Points in different colors indicate different detected groups.

  • Figure 2

    ACC curves of our method under different $\lambda$ and $\beta$ values.

  • Figure 3

    (Color online) Representative results of group detection. (a) Ground truth; (b) segmented patches; (c) topics learned by the proposed model, where patches with the same topic are shown in the same color; (d)–(g) group detection results of the proposed PTM, CF, CT and MCC. Points in different colors indicate different detected groups. The results of our method are consistent with the ground truth.

  • Algorithm 1

    Algorithm 1 Procedure of the proposed method

    Input: image patches, feature points, parameter $\lambda$, threshold $\beta$

    1: for $K = 1, 2, \ldots, 8$ do
    2:     Compute the patch-level descriptor;
    3:     Learn the topic of each patch with $K$ topics;
    4:     Combine the patches with the same topic to obtain groups;
    5:     Calculate the intra-class distance (ID) of the groups;
    6: end for
    7: Select the groups with the smallest ID as the final result.

    Output: detected groups
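
The loop in Algorithm 1 can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: `select_grouping`, `learn_topics` and `group_score` are hypothetical names, and the two callables stand in for the LDA-with-MRF inference and the ID criterion, neither of which is specified in enough detail on this page to reproduce. The sketch also assumes the descriptors do not depend on $K$ and therefore takes them as an input computed once.

```python
def select_grouping(descriptors, learn_topics, group_score, k_values=range(1, 9)):
    """Try each candidate topic number and keep the grouping with the smallest score.

    descriptors:  {patch_id: descriptor}, computed once beforehand (assumption).
    learn_topics: callable (descriptors, k) -> {patch_id: topic};
                  hypothetical stand-in for the topic inference step.
    group_score:  callable (descriptors, groups) -> float;
                  hypothetical stand-in for the ID criterion.
    Returns (best_score, best_k, best_groups).
    """
    best = None
    for k in k_values:
        topics = learn_topics(descriptors, k)
        # Merge patches that share a topic into one group.
        groups = {}
        for patch_id, topic in topics.items():
            groups.setdefault(topic, []).append(patch_id)
        score = group_score(descriptors, groups)
        # Strict comparison: on a tie, the smaller k tried first is kept.
        if best is None or score < best[0]:
            best = (score, k, groups)
    return best
```

Passing the inference and scoring steps as callables keeps the selection logic separate from the model, so either piece can be replaced without touching the loop. The default `k_values` mirrors the range $K = 1, 2, \ldots, 8$ given in the algorithm.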
