
SCIENCE CHINA Information Sciences, Volume 61, Issue 1: 012205 (2018). https://doi.org/10.1007/s11432-016-9150-x

Automatic salient object sequence rebuilding for video segment analysis

  • Received: May 3, 2017
  • Accepted: Jun 30, 2017
  • Published: Sep 29, 2017

Abstract

Detecting salient object sequences in video is challenging when the salient object changes between consecutive frames. In this study, we address the salient object sequence rebuilding problem through video segment analysis. We reformulate the problem as a binary labeling problem, analyze the potential salient object sequences in the video with a clustering method, and separate the salient object sequence from the background with an energy optimization method. The proposed approach determines whether temporally consecutive pixels belong to the same salient object sequence. A conditional random field is learned to effectively integrate the salient features with the sequence-consistency constraints, and a dynamic programming algorithm is developed to solve the energy minimization problem efficiently. Experimental results confirm that our approach addresses the salient object rebuilding problem in automatic visual attention applications and video content analysis.
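The abstract describes casting sequence rebuilding as binary labeling and solving the resulting energy minimization with dynamic programming over the frame sequence. As a rough illustration of that idea only (the unary costs and the Potts-style switching penalty below are hypothetical, not the paper's actual energy terms), a Viterbi-style pass can minimize a per-frame labeling energy with a temporal-consistency term:

```python
def min_energy_labeling(unary, pairwise_weight):
    """Viterbi-style dynamic programming for binary temporal labeling.

    unary[t][l] is the cost of assigning label l (0 = background,
    1 = salient) at frame t; pairwise_weight is a Potts-style penalty
    charged whenever the label changes between consecutive frames.
    Returns the minimum-energy label sequence and its energy.
    """
    T = len(unary)
    # cost[t][l]: minimal energy of any labeling of frames 0..t ending in l
    cost = [list(unary[0])]
    back = []  # back[t-1][l]: best predecessor label for frame t ending in l
    for t in range(1, T):
        row, bk = [], []
        for l in (0, 1):
            # transition cost is 0 if the label is unchanged, else the penalty
            cands = [cost[t - 1][p] + (0 if p == l else pairwise_weight)
                     for p in (0, 1)]
            p_best = min((0, 1), key=lambda p: cands[p])
            row.append(cands[p_best] + unary[t][l])
            bk.append(p_best)
        cost.append(row)
        back.append(bk)
    # backtrack the optimal label sequence from the cheapest final label
    l = min((0, 1), key=lambda x: cost[-1][x])
    labels = [l]
    for t in range(T - 2, -1, -1):
        l = back[t][l]
        labels.append(l)
    labels.reverse()
    return labels, min(cost[-1])
```

Each frame contributes an appearance (saliency) cost, and the pairwise term discourages spurious label flips between frames, which is the sequence-consistency behavior the abstract describes; the coarse-to-fine scheme of Figure 5 would further restrict this search on a large 3D pixel graph.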


Acknowledgment

This work was supported by the National Key R&D Program of China (Grant No. 2016YFB1001001), the National Natural Science Foundation of China (Grant No. 61603022), the China Postdoctoral Science Foundation, and the Aeronautical Science Foundation of China (Grant No. 20135851042).


References

[1] Borji A, Itti L. State-of-the-art in visual attention modeling. IEEE Trans Pattern Anal Mach Intell, 2013, 35: 185--207.

[2] Ma L, Chen L, Zhang X J, et al. A waterborne salient ship detection method on SAR imagery. Sci China Inf Sci, 2015, 58: 089301.

[3] Liu T, Yuan Z J, Sun J, et al. Learning to detect a salient object. IEEE Trans Pattern Anal Mach Intell, 2011, 33: 353--367.

[4] Liu T, Zheng N N, Yuan Z J, et al. Video attention: learning to detect a salient object sequence. In: Proceedings of the International Conference on Pattern Recognition, Tampa, 2008. 1--4.

[5] Feng J, Wei Y, Tao L, et al. Salient object detection by composition. In: Proceedings of the IEEE International Conference on Computer Vision, Barcelona, 2011. 1028--1035.

[6] Yang C, Zhang L, Lu H, et al. Saliency detection via graph-based manifold ranking. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Portland, 2013. 3166--3173.

[7] Santella A, Agrawala M, DeCarlo D, et al. Gaze-based interaction for semi-automatic photo cropping. In: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, Montreal, 2006. 771--780.

[8] Chen L Q, Xie X, Fan X, et al. A visual attention model for adapting images on small displays. Multimedia Syst, 2003, 9: 353--364.

[9] Rother C, Bordeaux L, Hamadi Y, et al. AutoCollage. In: Proceedings of the International Conference and Exhibition on Computer Graphics and Interactive Techniques (SIGGRAPH), Boston, 2006. 847--852.

[10] Jiang H, Wang J, Yuan Z, et al. Automatic salient object segmentation based on context and shape prior. In: Proceedings of the British Machine Vision Conference. Durham: BMVA Press, 2011.

[11] Jiang H, Wang J, Yuan Z, et al. Salient object detection: a discriminative regional feature integration approach. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Portland, 2013. 2083--2090.

[12] Zhao Y, Lu S J, Qian H L, et al. Robust mesh deformation with salient features preservation. Sci China Inf Sci, 2016, 59: 052106.

[13] Wu X M, Du M N, Chen W H, et al. Salient object detection via region contrast and graph regularization. Sci China Inf Sci, 2016, 59: 032104.

[14] Comaniciu D, Ramesh V, Meer P. Real-time tracking of non-rigid objects using mean shift. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Hilton Head Island, 2000. 142--149.

[15] Wei Y, Sun J, Tang X, et al. Interactive offline tracking for color objects. In: Proceedings of the IEEE International Conference on Computer Vision, Rio de Janeiro, 2007.

[16] Zhang G, Yuan Z, Zheng N N, et al. Visual saliency based object tracking. In: Proceedings of the Asian Conference on Computer Vision. Berlin: Springer, 2009. 193--203.

[17] Liu D, Chen T. A topic-motion model for unsupervised video object discovery. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Minneapolis, 2007.

[18] Yun X, Jing Z L, Xiao G, et al. A compressive tracking based on time-space Kalman fusion model. Sci China Inf Sci, 2016, 59: 012106.

[19] Yang Y X, Yang J, Zhang Z X, et al. High-speed visual target tracking with mixed rotation invariant description and skipping searching. Sci China Inf Sci, 2017, 60: 062401.

[20] Doucet A, de Freitas N, Gordon N. Sequential Monte Carlo Methods in Practice. Berlin: Springer, 2001.

[21] Lafferty J, McCallum A, Pereira F. Conditional random fields: probabilistic models for segmenting and labeling sequence data. In: Proceedings of the International Conference on Machine Learning. San Francisco: Morgan Kaufmann, 2001. 282--289.

[22] Shi J, Malik J. Normalized cuts and image segmentation. IEEE Trans Pattern Anal Mach Intell, 2000, 22: 888--905.

[23] Blake A, Rother C, Brown M, et al. Interactive image segmentation using an adaptive GMMRF model. In: Proceedings of the European Conference on Computer Vision. Berlin: Springer, 2004. 428--441.

[24] Sun J, Zhang W, Tang X, et al. Bi-directional tracking using trajectory segment analysis. In: Proceedings of the IEEE International Conference on Computer Vision, Beijing, 2005.

[25] Zhang Y, Su A, Zhu X, et al. Salient object detection approach in UAV video. In: Proceedings of the 8th International Symposium on Multispectral Image Processing and Pattern Recognition. Bellingham: SPIE, 2013. vol. 8224.

  • Figure 1

    (Color online) An example in which the salient object changes across consecutive images (frames #10, #13, #15, #22).

  • Figure 2

    (Color online) An example in which multiple salient objects appear; the sequence index is defined to distinguish between the different salient object sequences. Previous approaches [3,4] assume a single salient object sequence and output one rectangle covering all the salient objects.

  • Figure 3

    Salient object features from Figure 1.

  • Figure 4

    (Color online) Clustering result of SSA with different segments marked by red and green points. (a) SSA for Figure 1; (b) SSA for Figure 10.

  • Figure 5

    (Color online) A coarse-to-fine algorithm speeds up the dynamic programming of a large 3D graph.

  • Figure 6

    Flow chart of the proposed algorithm.

  • Figure 7

    (Color online) Effectiveness of SSA. (a) Salient object detection algorithm from [3] with a single image; (b) salient object tracking without SSA; (c) our approach.

  • Figure 8

    (Color online) Examples used to compare the effectiveness of our approach with SSA. (a) Car sequence with cottages; (b) two different people appearing successively; (c) a person walks past a car; (d) a person walks in front of a sculpture.

  • Figure 9

    (Color online) Clustering result of SSA with different segments marked by red and green points, from the examples in Figure 8. (a) SSA for Figure 8(a); (b) SSA for Figure 8(b); (c) SSA for Figure 8(c); (d) SSA for Figure 8(d).

  • Figure 10

    (Color online) Comparison of algorithms. From left to right: frames #5, #12, #13, #17. (a) Zhang's approach [16]; (b) our approach.

  • Figure 11

    (Color online) Salient object tracking with a UAV vision system. The salient object is rebuilt well in frames #318 and #338. (a) From left to right: frames #216, #236, #256, #276; (b) from left to right: frames #296, #308, #318, #338.

Copyright 2019 Science China Press Co., Ltd. All rights reserved.
