SCIENCE CHINA Information Sciences, Volume 60, Issue 12: 123101(2017) https://doi.org/10.1007/s11432-017-9252-5

Semantic segmentation of high-resolution images

More info
  • Received May 28, 2017
  • Accepted Oct 13, 2017
  • Published Nov 7, 2017

Abstract

Image semantic segmentation is a research topic that has emerged recently. Although existing approaches achieve satisfactory accuracy, their large memory consumption limits them to low-resolution images. In this paper, we present a semantic segmentation method for high-resolution images. First, we downsample the input image and obtain a low-resolution semantic segmentation image using state-of-the-art methods. Next, we apply joint bilateral upsampling to the low-resolution result to obtain a high-resolution semantic segmentation image. Because semantic segmentation data are discrete labels, we adapt joint bilateral upsampling by using voting instead of interpolation in the filtering computation. Compared to state-of-the-art methods, our method significantly reduces memory cost without reducing result quality.
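The upsampling step described above can be sketched in NumPy as follows. This is only an illustration under stated assumptions, not the authors' implementation: the function name `jbu_vote`, its parameters (`radius`, `sigma_s`, `sigma_r`), the Gaussian form of both kernels, and the pixel-centre coordinate mapping are all hypothetical choices. For each high-resolution pixel, nearby low-resolution labels cast votes weighted by spatial distance and by colour similarity in the high-resolution guide image, and the label with the largest total wins.

```python
import numpy as np

def jbu_vote(labels_lo, guide_hi, num_classes, radius=2, sigma_s=1.0, sigma_r=0.1):
    """Joint bilateral upsampling of a label map with voting (illustrative sketch).

    labels_lo : (h, w) integer labels from the low-resolution segmentation.
    guide_hi  : (H, W) or (H, W, C) high-resolution guide image, values in [0, 1].
    All parameter names and defaults are assumptions made for this example.
    """
    h, w = labels_lo.shape
    H, W = guide_hi.shape[:2]
    guide = guide_hi.reshape(H, W, -1).astype(np.float64)

    # Continuous low-res coordinates of every high-res pixel centre.
    yl = (np.arange(H) + 0.5) * h / H - 0.5
    xl = (np.arange(W) + 0.5) * w / W - 0.5
    cy = np.rint(yl).astype(int)
    cx = np.rint(xl).astype(int)

    rows, cols = np.meshgrid(np.arange(H), np.arange(W), indexing="ij")
    votes = np.zeros((H, W, num_classes))

    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            # Low-res neighbour q, clamped at the border
            # (clamped neighbours may be counted more than once).
            qy = np.clip(cy + dy, 0, h - 1)
            qx = np.clip(cx + dx, 0, w - 1)

            # Spatial weight, measured in low-res pixel units.
            d2 = (qy - yl)[:, None] ** 2 + (qx - xl)[None, :] ** 2
            w_s = np.exp(-d2 / (2 * sigma_s ** 2))

            # Range weight: guide colour at p versus guide colour at q's position.
            gy = np.clip(np.rint((qy + 0.5) * H / h - 0.5).astype(int), 0, H - 1)
            gx = np.clip(np.rint((qx + 0.5) * W / w - 0.5).astype(int), 0, W - 1)
            diff = guide - guide[gy[:, None], gx[None, :]]
            w_r = np.exp(-(diff ** 2).sum(axis=-1) / (2 * sigma_r ** 2))

            # Each neighbour votes for its own label; no label values are blended.
            lab = labels_lo[qy[:, None], qx[None, :]]
            votes[rows, cols, lab] += w_s * w_r

    return votes.argmax(axis=-1)
```

Voting is used because averaging label indices would produce meaningless intermediate classes. Note that this sketch keeps a dense `(H, W, num_classes)` vote array for simplicity; a memory-conscious version would process the image in tiles or track only the running best label per pixel.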


Acknowledgment

This work was supported by National Natural Science Foundation of China (Grant No. 61521002), a research grant from the Beijing Higher Institution Engineering Research Center, and the Tsinghua-Tencent Joint Laboratory for Internet Innovation Technology.


References

[1] Carneiro G, Chan A B, Moreno P J, et al. Supervised learning of semantic classes for image annotation and retrieval. IEEE Trans Pattern Anal Mach Intell, 2007, 29: 394--410.

[2] Gould S, Fulton R, Koller D. Decomposing a scene into geometric and semantically consistent regions. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Kyoto, 2009. 1--8.

[3] Ren X, Bo L, Fox D. RGB-(D) scene labeling: features and algorithms. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Providence, 2012. 2759--2766.

[4] Farabet C, Couprie C, Najman L. Learning hierarchical features for scene labeling. IEEE Trans Pattern Anal Mach Intell, 2013, 35: 1915--1929.

[5] Long J, Shelhamer E, Darrell T. Fully convolutional networks for semantic segmentation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston, 2015. 3431--3440.

[6] Kopf J, Cohen M F, Lischinski D, et al. Joint bilateral upsampling. ACM Trans Graph, 2007, 26: 96.

[7] Tomasi C, Manduchi R. Bilateral filtering for gray and color images. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Bombay, 1998. 839--846.

[8] Zhou B, Zhao H, Puig X, et al. Scene parsing through ADE20K dataset. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, 2017.

[9] Li X, Liu K, Dong Y. Superpixel-based foreground extraction with fast adaptive trimaps. IEEE Trans Cybern, 2017, doi: 10.1109/TCYB.2017.2747143.

[10] Huang H, Fang X, Ye Y, et al. Practical automatic background substitution for live video. Comput Vis Media, 2017, 3: 273--284.

[11] Li X, Liu K, Dong Y, et al. Patch alignment manifold matting. IEEE Trans Neural Netw Learn Syst, 2017, doi: 10.1109/TNNLS.2017.2727140.

[12] Zheng Z H, Zhang H T, Zhang F L, et al. Image-based clothes changing system. Comput Vis Media, 2017, in press.

[13] Maerki N, Perazzi F, Wang O, et al. Bilateral space video segmentation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, 2016. 743--751.

  • Figure 1

    (Color online) Examples of an indoor video: (a) two frames of the input video; (b) high-resolution semantic segmentation results by [5,13]; (c) low-resolution semantic segmentation results by [5,13]; (d) our high-resolution semantic segmentation results.

  • Figure 2

    (Color online) Examples of street view panoramas: (a) input images; (b) low-resolution semantic segmentation results by [5]; (c) our high-resolution semantic segmentation results.
