ResDLPS-Net: Joint residual-dense optimization for large-scale point cloud semantic segmentation.
- Source :
- ISPRS Journal of Photogrammetry & Remote Sensing. Dec 2021, Vol. 182, p37-51. 15p.
- Publication Year :
- 2021
Abstract
- Highlights:
- • A feature extraction module is proposed to efficiently extract the geometric and semantic features of each point; an attention mechanism is then deployed to aggregate the learned features.
- • The proposed ResDLPS-Net is optimized by joint training of residual connections and dense convolutional connections.
- • Experiments demonstrate that ResDLPS-Net outperforms state-of-the-art deep learning networks on the indoor dataset S3DIS and the outdoor large-scale dataset Toronto-3D; notably, its Mean Intersection over Union (mIoU) on Toronto-3D is 80.27%.
- Semantic segmentation methods based on three-dimensional (3D) point clouds are mostly limited to input point clouds that have been divided into blocks for training. This is mainly due to the constant trade-off between computational resources and accuracy when directly processing large-scale point clouds. Specifically, the block-dividing strategy adds data preprocessing time and may disturb the complete geometry of objects. Therefore, this paper proposes a large-scale point cloud semantic segmentation network without a block-dividing operation, referred to as ResDLPS-Net. The network takes the complete point cloud of a whole large scene as input and processes up to nearly a million points on a single GPU. In particular, a novel feature extraction module is designed to efficiently extract neighbor, geometric, and semantic features. The learned features are then aggregated through an attention mechanism to form local feature descriptors. In addition, ResDLPS-Net is jointly trained with residual connections and dense convolutional connections to optimize the feature aggregation operation. As a result, ResDLPS-Net performs well on multiple object classes, such as windows, road markings, and fences.
For example, its mIoU for road markings on the Toronto-3D dataset is 37.76% higher than that of the state-of-the-art algorithm. Moreover, the proposed method outperforms most deep learning methods on three well-known benchmark datasets: the indoor dataset S3DIS and the outdoor large-scale scene datasets Semantic3D and Toronto-3D. ResDLPS-Net achieves the best performance on the S3DIS dataset, with an average accuracy (mA) of 82.3% and an overall accuracy (OA) of 88.1%. Notably, it attains a mIoU of 80.27% on the Toronto-3D dataset, 6.00% higher than the best result published to date. [ABSTRACT FROM AUTHOR]
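The abstract describes aggregating each point's learned neighbor features through an attention mechanism rather than max or mean pooling. The following is a minimal NumPy sketch of that general idea only; the function and weight names (`attentive_pool`, `w_score`) are hypothetical and the actual ResDLPS-Net layers are not specified in this record.

```python
import numpy as np

def softmax(s, axis=0):
    # Numerically stable softmax along the given axis.
    e = np.exp(s - s.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attentive_pool(neigh_feats, w_score):
    """Hypothetical attention pooling over a point's K neighbors.

    neigh_feats: (K, C) learned features of the K nearest neighbors
    w_score:     (C, C) assumed learned scoring weights
    Returns a single (C,) local feature descriptor.
    """
    # Per-neighbor, per-channel attention weights, normalized over neighbors.
    scores = softmax(neigh_feats @ w_score, axis=0)
    # Weighted sum keeps contributions from all neighbors, unlike max pooling.
    return (scores * neigh_feats).sum(axis=0)

rng = np.random.default_rng(1)
k, c = 16, 8
feats = rng.normal(size=(k, c))
w = rng.normal(size=(c, c)) * 0.1
desc = attentive_pool(feats, w)
print(desc.shape)  # (8,)
```

Because the weights are learned, the network can decide per channel which neighbors matter, which is why attentive pooling is often preferred over hard max pooling for point clouds.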
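The "joint residual-dense optimization" in the title combines two standard connection patterns: residual skips (element-wise addition of the block input to its output) and dense connections (concatenating earlier feature maps so later layers see them all). The sketch below, in plain NumPy, shows one way such a block could be wired; the names (`mlp`, `res_dense_block`) and layer sizes are assumptions for illustration, not the paper's actual architecture.

```python
import numpy as np

def mlp(x, w, b):
    # Shared point-wise MLP: (N, C_in) -> (N, C_out) with ReLU activation.
    return np.maximum(x @ w + b, 0.0)

def res_dense_block(x, w1, b1, w2, b2):
    """Hypothetical block mixing residual and dense connections.

    Dense path: intermediate features are concatenated with the input,
    so the second transform sees all earlier feature maps.
    Residual path: the input is added back to the final output.
    """
    h1 = mlp(x, w1, b1)                       # first point-wise transform
    dense = np.concatenate([x, h1], axis=1)   # dense connection: (N, 2C)
    h2 = mlp(dense, w2, b2)                   # transform back to (N, C)
    return h2 + x                             # residual connection

rng = np.random.default_rng(0)
n_points, c = 1024, 16
x = rng.normal(size=(n_points, c))
w1, b1 = rng.normal(size=(c, c)) * 0.1, np.zeros(c)
w2, b2 = rng.normal(size=(2 * c, c)) * 0.1, np.zeros(c)
out = res_dense_block(x, w1, b1, w2, b2)
print(out.shape)  # (1024, 16)
```

The output shape matches the input shape, so blocks of this form can be stacked; the residual path eases gradient flow while the dense path encourages feature reuse.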
Details
- Language :
- English
- ISSN :
- 0924-2716
- Volume :
- 182
- Database :
- Academic Search Index
- Journal :
- ISPRS Journal of Photogrammetry & Remote Sensing
- Publication Type :
- Academic Journal
- Accession number :
- 153526443
- Full Text :
- https://doi.org/10.1016/j.isprsjprs.2021.09.024