SCIENCE CHINA Information Sciences, Volume 62, Issue 12: 229101(2019) https://doi.org/10.1007/s11432-018-9828-3

Multi-view based neural network for semantic segmentation on 3D scenes

  • Received: Oct 28, 2018
  • Accepted: Jan 31, 2019
  • Published: Sep 4, 2019

Abstract

There is no abstract available for this article.


Acknowledgment

This work was supported by GRF (Grant No. 16203518), Hong Kong RGC (Grant Nos. 16208614, T22-603/15N), Hong Kong ITC (Grant No. PSKL12EG02), and National Basic Research Program of China (973 Program) (Grant No. 2012CB316300).


References

[1] Wang J L, Lu Y H, Liu J B. A robust three-stage approach to large-scale urban scene recognition. Sci China Inf Sci, 2017, 60: 103101.

[2] Long J, Shelhamer E, Darrell T. Fully convolutional networks for semantic segmentation. In: Proceedings of 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015.

[3] Kalogerakis E, Averkiou M, Maji S, et al. 3D shape segmentation with projective convolutional networks. In: Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017. 6630--6639.

[4] Kalogerakis E, Hertzmann A, Singh K. Learning 3D mesh segmentation and labeling. ACM Trans Graph, 2010, 29: 102.

[5] Dai A, Chang A X, Savva M, et al. ScanNet: richly-annotated 3D reconstructions of indoor scenes. In: Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017. 2432--2443.

[6] Noh H, Hong S, Han B. Learning deconvolution network for semantic segmentation. In: Proceedings of 2015 IEEE International Conference on Computer Vision (ICCV), 2015. 1520--1528.

[7] He K M, Zhang X Y, Ren S Q, et al. Deep residual learning for image recognition. In: Proceedings of 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016. 770--778.

[8] Riemenschneider H, Bodis-Szomoru A, Weissenberg J, et al. Learning where to classify in multi-view semantic segmentation. In: Proceedings of European Conference on Computer Vision (ECCV), 2014. 516--532.

[9] Gadde R, Jampani V, Marlet R. Efficient 2D and 3D facade segmentation using auto-context. 2016. arXiv

  • Figure 1

    (Color online) (a) The proposed multi-view neural network architecture for semantic segmentation of 3D scenes. The encoder-decoder structure is built on a ResNet101 backbone, and a multi-stage decoder restores the resolution. The feature aggregation module aggregates features from the multi-view feature maps. The multi-view optimization module refines the semantic segmentation result with a conditional random field (CRF). (b) Performance comparison (mean intersection over union) with other methods on the 3D scenes of the RueMonge dataset.
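
    The metric named in panel (b), mean intersection over union, can be illustrated with a minimal sketch. It assumes one integer class label per element (for example, per mesh face or per point); the function name mean_iou, the toy labels, and the choice to skip classes absent from both prediction and ground truth are assumptions for illustration, not the evaluation protocol used in the paper.

```python
from typing import Sequence


def mean_iou(pred: Sequence[int], target: Sequence[int], num_classes: int) -> float:
    """Average the per-class intersection over union (IoU).

    Hypothetical helper: labels are assumed to be integers in
    [0, num_classes), one per element (e.g. per mesh face).
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")

    intersection = [0] * num_classes   # elements where both agree on class c
    pred_count = [0] * num_classes     # elements predicted as class c
    target_count = [0] * num_classes   # elements labeled as class c

    for p, t in zip(pred, target):
        pred_count[p] += 1
        target_count[t] += 1
        if p == t:
            intersection[p] += 1

    ious = []
    for c in range(num_classes):
        union = pred_count[c] + target_count[c] - intersection[c]
        if union > 0:  # assumption: ignore classes absent from both inputs
            ious.append(intersection[c] / union)

    return sum(ious) / len(ious) if ious else 0.0


# Toy labels for six elements and three classes (illustrative only).
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
score = mean_iou(pred, target, num_classes=3)
print(f"mean IoU = {score:.3f}")
```

    In this toy case the per-class IoU values are 1/3, 2/3, and 1/2. Benchmarks differ in whether elements are weighted (for example, by face area) and in how absent classes are handled, so a real evaluation should follow the dataset's own protocol.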

Copyright 2020 Science China Press Co., Ltd. All rights reserved.