
ResDLPS-Net: Joint residual-dense optimization for large-scale point cloud semantic segmentation.

Authors :
Du, Jing
Cai, Guorong
Wang, Zongyue
Huang, Shangfeng
Su, Jinhe
Marcato Junior, José
Smit, Julian
Li, Jonathan
Source :
ISPRS Journal of Photogrammetry & Remote Sensing. Dec 2021, Vol. 182, p37-51. 15p.
Publication Year :
2021

Abstract

• A feature extraction module is proposed to efficiently extract the geometric and semantic features of each point. An attention mechanism is then deployed to aggregate the learned features.
• The proposed ResDLPS-Net is optimized by joint training of residual connections and dense convolutional connections.
• Experiments demonstrate that ResDLPS-Net outperforms state-of-the-art deep learning networks on the indoor dataset S3DIS and the large-scale outdoor dataset Toronto-3D. Notably, the Mean Intersection over Union (mIoU) of ResDLPS-Net on the Toronto-3D dataset is 80.27%.

Semantic segmentation methods based on three-dimensional (3D) point clouds are mostly limited to input point clouds that have been divided into blocks for training. This is mainly attributed to the constant trade-off between computational resources and accuracy required to process large-scale point clouds directly. Specifically, the block-dividing strategy increases data preprocessing time and may disrupt the complete geometry of objects. Therefore, this paper proposes a large-scale point cloud semantic segmentation network without a block-dividing operation, referred to as ResDLPS-Net. The network takes the complete point cloud of a whole large scene as input and processes up to nearly a million points on a single GPU. In particular, a novel feature extraction module is designed to efficiently extract neighbor, geometric, and semantic features. The learned features are then aggregated through an attention mechanism to form local feature descriptors. In addition, ResDLPS-Net is jointly trained with residual connections and dense convolutional connections to optimize the feature aggregation operation. As a result, ResDLPS-Net performs strongly on multiple object classes, such as windows, road markings, and fences. For example, its mIoU for road markings on the Toronto-3D dataset is 37.76% higher than that of the state-of-the-art algorithm. Moreover, the proposed method outperforms most deep learning methods on three well-known benchmark datasets, including the indoor dataset S3DIS and the large-scale outdoor datasets Semantic3D and Toronto-3D. ResDLPS-Net achieves the best performance on the S3DIS dataset, with an average accuracy (mA) of 82.3% and an overall accuracy (OA) of 88.1%. Notably, ResDLPS-Net attains an mIoU of 80.27% on the Toronto-3D dataset, which is 6.00% higher than the best currently published results. [ABSTRACT FROM AUTHOR]
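The abstract describes attention-based aggregation of learned per-point features combined with jointly optimized residual and dense convolutional connections. The sketch below illustrates one plausible way such a block could be wired in PyTorch; the module names (AttentivePooling, ResDenseAggregation), tensor shapes, and the exact fusion scheme are illustrative assumptions and are not taken from the paper.

```python
# Illustrative sketch only: a generic attention-based local feature aggregation
# block combined with a residual (additive) and a dense (concatenative) connection.
# Module and parameter names are hypothetical; this is NOT the authors' code.
import torch
import torch.nn as nn


class AttentivePooling(nn.Module):
    """Aggregate K neighbor features per point with learned attention weights."""

    def __init__(self, channels: int):
        super().__init__()
        # Shared linear layer that scores each neighbor feature (assumed design choice).
        self.score_fn = nn.Linear(channels, channels, bias=False)
        self.mlp = nn.Sequential(nn.Linear(channels, channels), nn.ReLU())

    def forward(self, neighbor_feats: torch.Tensor) -> torch.Tensor:
        # neighbor_feats: (B, N, K, C) -- features of K neighbors for each of N points
        scores = torch.softmax(self.score_fn(neighbor_feats), dim=2)  # attention over K
        pooled = (scores * neighbor_feats).sum(dim=2)                 # (B, N, C)
        return self.mlp(pooled)


class ResDenseAggregation(nn.Module):
    """Fuse attentively pooled features via residual and dense connections."""

    def __init__(self, channels: int):
        super().__init__()
        self.attentive_pool = AttentivePooling(channels)
        # Projects the dense (concatenated) features back to the working width.
        self.fuse = nn.Sequential(nn.Linear(2 * channels, channels), nn.ReLU())

    def forward(self, point_feats: torch.Tensor, neighbor_feats: torch.Tensor) -> torch.Tensor:
        # point_feats: (B, N, C), neighbor_feats: (B, N, K, C)
        aggregated = self.attentive_pool(neighbor_feats)
        residual = point_feats + aggregated                      # residual (additive) path
        dense = torch.cat([point_feats, aggregated], dim=-1)     # dense (concatenative) path
        return residual + self.fuse(dense)                       # both paths trained jointly


if __name__ == "__main__":
    B, N, K, C = 2, 1024, 16, 32
    block = ResDenseAggregation(C)
    points = torch.randn(B, N, C)
    neighbors = torch.randn(B, N, K, C)
    print(block(points, neighbors).shape)  # torch.Size([2, 1024, 32])
```

In this sketch the residual path preserves the input features unchanged while the dense path lets the network reuse them alongside the aggregated descriptor, which is one common way to realize the joint residual-dense optimization the abstract refers to.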

Details

Language :
English
ISSN :
0924-2716
Volume :
182
Database :
Academic Search Index
Journal :
ISPRS Journal of Photogrammetry & Remote Sensing
Publication Type :
Academic Journal
Accession number :
153526443
Full Text :
https://doi.org/10.1016/j.isprsjprs.2021.09.024