23 results
Search Results
2. Research on Remote-Sensing Identification Method of Typical Disaster-Bearing Body Based on Deep Learning and Spatial Constraint Strategy.
- Author
- Wang, Lei, Xu, Yingjun, Chen, Qiang, Wu, Jidong, Luo, Jianhui, Li, Xiaoxuan, Peng, Ruyi, and Li, Jiaxin
- Subjects
- DEEP learning, DATABASES, IMAGE recognition (Computer vision), DATA integrity, DAMS
- Abstract
The census and management of hazard-bearing entities, along with the integrity of data quality, form crucial foundations for disaster risk assessment and zoning. To address the challenge of feature confusion that is prevalent in single-image remote-sensing recognition methods, this paper introduces a novel method, Spatially Constrained Deep Learning (SCDL), that combines deep learning with spatial constraint strategies for the extraction of disaster-bearing bodies, focusing on dams as a typical example. The methodology involves the creation of a dam dataset using a database of dams, followed by the training of YOLOv5, Varifocal Net, Faster R-CNN, and Cascade R-CNN models. These models are trained separately, and high-confidence dam locations are extracted through parameter thresholding. Furthermore, three spatial constraint strategies are employed to mitigate the impact of other factors, particularly confusing features, in the background region. To assess the method's applicability and efficiency, Qinghai Province serves as the experimental area, with dam images from the Google Earth Pro database used as validation samples. The experimental results demonstrate that the recognition accuracy of SCDL reaches 94.73%, effectively addressing interference from background factors. Notably, the proposed method identifies six dams not recorded in the GOODD database, while also detecting six dams in the database that were previously unrecorded. Additionally, four dams mislocated in the database are corrected, contributing to the enhancement and supplementation of the global dam geo-reference database and providing robust support for disaster risk assessment. In conclusion, leveraging open geographic data products, the comprehensive framework presented in this paper, encompassing deep learning target detection technology and spatial constraint strategies, enables more efficient and accurate intelligent retrieval of disaster-bearing bodies, specifically dams.
The findings offer valuable insights and inspiration for future advancements in related fields. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
3. Spectral-Spatial Mamba for Hyperspectral Image Classification.
- Author
- Huang, Lingbo, Chen, Yushi, and He, Xin
- Subjects
- IMAGE recognition (Computer vision), TRANSFORMER models, POWER transformers, COMPUTATIONAL complexity, DEEP learning
- Abstract
Recently, the transformer has gradually attracted interest for its excellence in modeling the long-range dependencies of spatial-spectral features in hyperspectral images (HSIs). However, the transformer suffers from quadratic computational complexity due to its self-attention mechanism, which makes it heavier than other models and has limited its adoption in HSI processing. Fortunately, the recently emerging state space model-based Mamba shows great computational efficiency while achieving the modeling power of transformers. Therefore, in this paper, we propose, for the first time, spectral-spatial Mamba (SS-Mamba) for HSI classification. Specifically, SS-Mamba mainly comprises a spectral-spatial token generation module and several stacked spectral-spatial Mamba blocks (SS-MBs). First, the token generation module converts a given HSI cube into spatial and spectral token sequences, which are then sent to the stacked SS-MBs. Each SS-MB includes two basic Mamba blocks and a spectral-spatial feature enhancement module. The spatial and spectral tokens are processed separately by the two basic Mamba blocks, respectively. Moreover, the feature enhancement module modulates the spatial and spectral tokens using the center-region information of each HSI sample. The spectral and spatial tokens thus cooperate with each other and achieve information fusion within each block. The experimental results on widely used HSI datasets reveal that the proposed SS-Mamba requires less processing time than the transformer. The Mamba-based method thus opens a new window for HSI classification. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
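The token-generation step described in the SS-Mamba abstract above (turning an HSI cube into separate spatial and spectral token sequences) can be sketched as follows. This is a minimal NumPy illustration; the function name and the plain reshaping are hypothetical simplifications of the paper's learned module.

```python
import numpy as np

def hsi_to_tokens(cube):
    """Split an HSI cube of shape (H, W, B) into two token sequences.

    Spatial tokens: one token per pixel, each a B-dim spectrum (length H*W).
    Spectral tokens: one token per band, each an H*W-dim spatial map (length B).
    """
    h, w, b = cube.shape
    spatial_tokens = cube.reshape(h * w, b)
    spectral_tokens = cube.reshape(h * w, b).T
    return spatial_tokens, spectral_tokens

cube = np.random.rand(7, 7, 30)   # a 7x7 patch with 30 spectral bands
spa, spe = hsi_to_tokens(cube)
```

In the paper the two sequences are then processed by separate Mamba blocks; here the point is only that the same cube yields both a pixel-wise and a band-wise sequence.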
4. A Deep Learning Classification Scheme for PolSAR Image Based on Polarimetric Features.
- Author
- Zhang, Shuaiying, Cui, Lizhen, Dong, Zhen, and An, Wentao
- Subjects
- DEEP learning, SYNTHETIC aperture radar, IMAGE recognition (Computer vision), CLASSIFICATION
- Abstract
Polarimetric features extracted from polarimetric synthetic aperture radar (PolSAR) images contain abundant backscattering information about objects. Utilizing this information for PolSAR image classification can improve accuracy and enhance object monitoring. In this paper, a deep learning classification method based on polarimetric channel power features for PolSAR is proposed. The distinctive characteristic of this method is that the polarimetric features input into the deep learning network are the power values of the polarimetric channels, which contain complete polarimetric information. Two other input data schemes are designed for comparison with the proposed method. The neural network uses the extracted polarimetric features to classify images, and classification accuracy analysis is employed to compare the strengths and weaknesses of the power-based scheme. It is worth mentioning that the polarimetric features of the proposed input scheme have been derived through rigorous mathematical deduction, and each polarimetric feature has a clear physical meaning. Testing the different input schemes on a Gaofen-3 (GF-3) PolSAR image, the experimental results show that the proposed method outperforms existing methods and can improve classification accuracy to a certain extent, validating its effectiveness in large-scale area classification. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
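The power-based input scheme in the abstract above rests on a simple quantity: the per-pixel power of each polarimetric channel, P = |S|^2. A minimal sketch follows; the helper name and the three-channel (HH, HV, VV) layout are illustrative assumptions, not the paper's full analytically derived feature set.

```python
import numpy as np

def channel_powers(s_hh, s_hv, s_vv):
    """Per-pixel power of three polarimetric channels, P = |S|^2.

    Each input is a complex-valued scattering-coefficient image;
    the output stacks the real, non-negative powers band-wise.
    """
    return np.stack(
        [np.abs(s_hh) ** 2, np.abs(s_hv) ** 2, np.abs(s_vv) ** 2], axis=-1
    )

rng = np.random.default_rng(0)
shape = (4, 4)
s_hh = rng.normal(size=shape) + 1j * rng.normal(size=shape)
s_hv = rng.normal(size=shape) + 1j * rng.normal(size=shape)
s_vv = rng.normal(size=shape) + 1j * rng.normal(size=shape)
p = channel_powers(s_hh, s_hv, s_vv)   # (4, 4, 3) real, non-negative
```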
5. MEA-EFFormer: Multiscale Efficient Attention with Enhanced Feature Transformer for Hyperspectral Image Classification.
- Author
- Sun, Qian, Zhao, Guangrui, Fang, Yu, Fang, Chenrong, Sun, Le, and Li, Xingying
- Subjects
- IMAGE recognition (Computer vision), CONVOLUTIONAL neural networks, DEEP learning, TRANSFORMER models, FEATURE extraction
- Abstract
Hyperspectral image classification (HSIC) has garnered increasing attention among researchers. While classical networks like convolutional neural networks (CNNs) have achieved satisfactory results with the advent of deep learning, they are confined to processing local information. Vision transformers, despite being effective at establishing long-distance dependencies, face challenges in extracting high-representation features for high-dimensional images. In this paper, we present the multiscale efficient attention with enhanced feature transformer (MEA-EFFormer), which is designed for the efficient extraction of spectral–spatial features, leading to effective classification. MEA-EFFormer employs a multiscale efficient attention feature extraction module to initially extract 3D convolution features and applies effective channel attention to refine spectral information. Following this, 2D convolution features are extracted and integrated with local binary pattern (LBP) spatial information to augment their representation. Then, the processed features are fed into a spectral–spatial enhancement attention (SSEA) module that facilitates interactive enhancement of spectral–spatial information across the three dimensions. Finally, these features undergo classification through a transformer encoder. We evaluate MEA-EFFormer against several state-of-the-art methods on three datasets and demonstrate its outstanding HSIC performance. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
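MEA-EFFormer, per the abstract above, augments its 2D convolution features with local binary pattern (LBP) spatial information. The classic 3x3 form of the LBP descriptor can be computed as follows (a self-contained sketch; the paper may use a different LBP variant or radius).

```python
import numpy as np

def lbp_3x3(img):
    """Basic 3x3 local binary pattern.

    Each interior pixel is encoded by comparing its 8 neighbors to the
    center: a neighbor >= center contributes one bit to an 8-bit code.
    Returns codes for the (H-2, W-2) interior region.
    """
    h, w = img.shape
    center = img[1:-1, 1:-1]
    code = np.zeros_like(center, dtype=np.int64)
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    for bit, (dy, dx) in enumerate(offsets):
        neighbor = img[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        code |= (neighbor >= center).astype(np.int64) << bit
    return code

img = np.random.rand(6, 6)
codes = lbp_3x3(img)   # (4, 4) codes in [0, 255]
```

In a texture pipeline the codes are typically histogrammed per window and concatenated with the spectral features.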
6. Hyperspectral Image Classification Based on Mutually Guided Image Filtering.
- Author
- Zhan, Ying, Hu, Dan, Yu, Xianchuan, and Wang, Yufeng
- Subjects
- IMAGE recognition (Computer vision), ARTIFICIAL neural networks, CONVOLUTIONAL neural networks, FEATURE extraction, GENERATIVE adversarial networks, HYPERSPECTRAL imaging systems, REMOTE sensing
- Abstract
Hyperspectral remote sensing images (HSIs) have both spectral and spatial characteristics, and the adept exploitation of these attributes is central to enhancing the classification accuracy of HSIs. In order to effectively utilize spatial and spectral features to classify HSIs, this paper proposes a spatial feature extraction method for HSIs based on a mutually guided image filter (muGIF) combined with band-distance-grouped principal components. Firstly, to address the problem that previous guided image filtering cannot effectively handle inconsistent structures between the guidance and target information, a method for extracting spatial features using muGIF is proposed. Then, to address the information loss caused by using a single principal component as the guidance image in traditional GIF-based spatial–spectral classification, a spatial feature-extraction framework based on band-distance-grouped principal components is proposed. The method groups the bands according to band distance and extracts the principal components of each band subset as the guidance map for filtering that subset of the HSI. A deep convolutional neural network model and a generative adversarial network model are then constructed for the filtered HSIs and trained using samples for spatial–spectral classification. Experiments show that, compared with traditional methods and several popular filter-based spatial–spectral HSI classification methods, the proposed muGIF-based methods effectively extract spatial–spectral features and improve the classification accuracy of HSIs. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
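The band-grouping idea in the abstract above (split the bands into subsets and take each subset's leading principal component as that subset's guidance image) can be sketched as follows. This is a simplified stand-in: contiguous equal splits replace the paper's band-distance grouping, and the function name is hypothetical.

```python
import numpy as np

def grouped_pc_guides(cube, n_groups):
    """First principal component of each contiguous band group.

    cube: HSI of shape (H, W, B). Bands are split into n_groups
    contiguous subsets, and the leading PC projection of each subset
    (via SVD of the mean-centered pixels-by-bands matrix) serves as
    that group's guidance image.
    """
    h, w, b = cube.shape
    flat = cube.reshape(-1, b)
    guides = []
    for group in np.array_split(np.arange(b), n_groups):
        x = flat[:, group] - flat[:, group].mean(axis=0)
        _, _, vt = np.linalg.svd(x, full_matrices=False)
        pc1 = x @ vt[0]                 # projection onto leading PC
        guides.append(pc1.reshape(h, w))
    return np.stack(guides)             # (n_groups, H, W)

cube = np.random.rand(8, 8, 20)
g = grouped_pc_guides(cube, 4)
```

Each of the four guide images would then drive the (mutually) guided filtering of its own band subset.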
7. Sea Ice Extraction via Remote Sensing Imagery: Algorithms, Datasets, Applications and Challenges.
- Author
- Huang, Wenjun, Yu, Anzhu, Xu, Qing, Sun, Qun, Guo, Wenyue, Ji, Song, Wen, Bowei, and Qiu, Chunping
- Subjects
- SEA ice, DEEP learning, REMOTE sensing, IMAGE recognition (Computer vision), GEOGRAPHIC information systems, ALGORITHMS
- Abstract
Deep learning, a dominant technique in artificial intelligence, has completely changed image understanding over the past decade. As a consequence, the sea ice extraction (SIE) problem has reached a new era. We present a comprehensive review of four important aspects of SIE: algorithms, datasets, applications, and future trends. Our review focuses on research published from 2016 to the present, with a specific focus on deep-learning-based approaches in the last five years. We divide all related algorithms into three categories: conventional image classification approaches, machine-learning-based approaches, and deep-learning-based methods. We review the accessible ice datasets, including SAR-based datasets, optical datasets, and others. The applications are presented in four aspects: climate research, navigation, geographic information system (GIS) production, and others. This paper also provides insightful observations and inspiring future research directions. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
8. Multi-View Scene Classification Based on Feature Integration and Evidence Decision Fusion.
- Author
- Zhou, Weixun, Shi, Yongxin, and Huang, Xiao
- Subjects
- FEATURE extraction, IMAGE recognition (Computer vision), IMAGE fusion, CONVOLUTIONAL neural networks, DEEP learning
- Abstract
Leveraging multi-view remote sensing images in scene classification tasks significantly enhances the precision of such classifications. This approach, however, poses challenges due to the simultaneous use of multi-view images, which often leads to a misalignment between the visual content and semantic labels, thus complicating the classification process. In addition, as the number of image viewpoints increases, image quality problems further limit the effectiveness of multi-view image classification. Traditional scene classification methods predominantly employ SoftMax deep learning techniques, which lack the capability to assess the quality of remote sensing images or to provide explicit explanations for the network's predictive outcomes. To address these issues, this paper introduces a novel end-to-end multi-view decision fusion network specifically designed for remote sensing scene classification. The network integrates information from multi-view remote sensing images under the guidance of image credibility and uncertainty, and when conflicts arise during multi-view image fusion, it greatly alleviates them and provides more reasonable and credible predictions for the multi-view scene classification results. Initially, multi-scale features are extracted from the multi-view images using convolutional neural networks (CNNs). Following this, an asymptotic adaptive feature fusion module (AAFFM) is constructed to gradually integrate these multi-scale features. An adaptive spatial fusion method is then applied to assign different spatial weights to the multi-scale feature maps, thereby significantly enhancing the model's feature discrimination capability. Finally, an evidence decision fusion module (EDFM), utilizing evidence theory and the Dirichlet distribution, is developed. This module quantitatively assesses the uncertainty in the multi-perspective image classification process.
Through the fusing of multi-perspective remote sensing image information in this module, a rational explanation for the prediction results is provided. The efficacy of the proposed method was validated through experiments conducted on the AiRound and CV-BrCT datasets. The results show that our method not only improves single-view scene classification results but also advances multi-view remote sensing scene classification results by accurately characterizing the scene and mitigating the conflicting nature of the fusion process. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
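The evidence decision fusion module described above builds on evidence theory and the Dirichlet distribution. The core bookkeeping of that framework (subjective logic) can be sketched as follows; the softplus evidence mapping and the single-view treatment are illustrative assumptions, not the paper's exact module.

```python
import numpy as np

def dirichlet_uncertainty(logits):
    """Belief masses and uncertainty from classifier evidence.

    Non-negative evidence e = softplus(logits); Dirichlet parameters
    alpha = e + 1; belief b_k = e_k / S and uncertainty u = K / S,
    where S = sum(alpha) and K is the number of classes. By
    construction sum(b) + u = 1, and low total evidence pushes u
    toward 1 (an unconfident view, down-weighted during fusion).
    """
    evidence = np.log1p(np.exp(logits))   # softplus keeps evidence >= 0
    alpha = evidence + 1.0
    s = alpha.sum()
    belief = evidence / s
    u = alpha.size / s
    return belief, u

belief, u = dirichlet_uncertainty(np.array([2.0, 0.1, -1.0]))
```

In a multi-view setting each view yields its own (belief, u) pair, and views with high u contribute less to the fused decision.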
9. Spatial-Spectral BERT for Hyperspectral Image Classification.
- Author
- Ashraf, Mahmood, Zhou, Xichuan, Vivone, Gemine, Chen, Lihui, Chen, Rong, and Majdard, Reza Seifi
- Subjects
- IMAGE recognition (Computer vision), LANGUAGE models, DEEP learning, TRANSFORMER models, CONVOLUTIONAL neural networks, SPECTRAL imaging
- Abstract
Several deep learning and transformer models have been recommended in previous research to deal with the classification of hyperspectral images (HSIs). Among them, one of the most innovative is the bidirectional encoder representation from transformers (BERT), which applies a distance-independent approach to capture the global dependency among all pixels in a selected region. However, this model does not consider the local spatial-spectral and spectral sequential relations. In this paper, a dual-dimensional (i.e., spatial and spectral) BERT (the so-called D2BERT) is proposed, which improves the existing BERT model by capturing more global and local dependencies between sequential spectral bands regardless of distance. In the proposed model, two BERT branches work in parallel to investigate relations among pixels and spectral bands, respectively. In addition, the layer intermediate information is used for supervision during the training phase to enhance the performance. We used two widely employed datasets for our experimental analysis. The proposed D2BERT shows superior classification accuracy and computational efficiency with respect to some state-of-the-art neural networks and the previously developed BERT model for this task. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
10. Hyperspectral Image Classification on Large-Scale Agricultural Crops: The Heilongjiang Benchmark Dataset, Validation Procedure, and Baseline Results.
- Author
- Zhang, Hongzhe, Feng, Shou, Wu, Di, Zhao, Chunhui, Liu, Xi, Zhou, Yuan, Wang, Shengnan, Deng, Hongtao, and Zheng, Shuang
- Subjects
- IMAGE recognition (Computer vision), CROPS, AGRICULTURE, DEEP learning, RESEARCH personnel, INTERCROPPING
- Abstract
Over the past few decades, researchers have shown sustained and robust investment in exploring methods for hyperspectral image classification (HSIC). The utilization of hyperspectral imagery (HSI) for crop classification in agricultural areas has been widely demonstrated to be feasible, flexible, and cost-effective. However, numerous coexisting issues in agricultural scenarios, such as limited annotated samples, uneven distribution of crops, and mixed cropping, cannot be explored in depth with the mainstream datasets. The limitations of these datasets have severely restricted the widespread application of HSIC methods in agricultural scenarios. A benchmark dataset named Heilongjiang (HLJ) for HSIC is introduced in this paper, which is designed for large-scale crop classification. For practical applications, the HLJ dataset covers a wide range of genuine agricultural regions in Heilongjiang Province; it provides rich spectral diversity through two images from different time periods and vast geographical areas with intercropped multiple crops. Simultaneously, considering the urgent demand of deep learning models, the two images in the HLJ dataset have 319,685 and 318,942 annotated samples, along with 151 and 149 spectral bands, respectively. To validate the suitability of the HLJ dataset as a baseline dataset for HSIC, we employed eight classical classification models in fundamental experiments on the HLJ dataset. Most of the methods achieved an overall accuracy of more than 80% with 10% of the labeled samples used for training. Furthermore, the advantages of the HLJ dataset and the impact of real-world factors on experimental results are comprehensively elucidated. The comprehensive baseline experimental evaluation and analysis affirm the research potential of the HLJ dataset as a large-scale crop classification dataset. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
11. Multiscale Feature Search-Based Graph Convolutional Network for Hyperspectral Image Classification.
- Author
- Wu, Ke, Zhan, Yanting, An, Ying, and Li, Suyi
- Subjects
- IMAGE recognition (Computer vision), FEATURE extraction, DEEP learning, MULTISCALE modeling
- Abstract
With the development of hyperspectral sensors, the availability of hyperspectral images (HSIs) has increased significantly, prompting advancements in deep learning-based hyperspectral image classification (HSIC) methods. Recently, graph convolutional networks (GCNs) have been proposed to process graph-structured data in non-Euclidean domains and have been used for HSIC. Superpixel segmentation must be implemented first in GCN-based methods; however, it is difficult to manually select the optimal superpixel segmentation size to obtain useful information for classification. To solve this problem, we constructed an HSIC model based on a multiscale feature search-based graph convolutional network (MFSGCN) in this study. Firstly, pixel-level features of HSIs are extracted sequentially using 3D asymmetric decomposition convolution and 2D convolution. Then, superpixel-level features at different scales are extracted using multilayer GCNs. Finally, the neural architecture search (NAS) method is used to automatically assign different weights to different scales of superpixel features. Thus, a more discriminative feature map is obtained for classification. Compared with other GCN-based networks, the MFSGCN network can automatically capture features and obtain higher classification accuracy. The proposed MFSGCN model was implemented on three commonly used HSI datasets and compared to some state-of-the-art methods. The results confirm that MFSGCN effectively improves accuracy. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
12. Improving Hyperspectral Image Classification with Compact Multi-Branch Deep Learning.
- Author
- Islam, Md. Rashedul, Islam, Md. Touhid, Uddin, Md Palash, and Ulhaq, Anwaar
- Subjects
- IMAGE recognition (Computer vision), DEEP learning, FEATURE extraction, FACTOR analysis
- Abstract
The progress in hyperspectral image (HSI) classification owes much to the integration of various deep learning techniques. However, the inherent 3D cube structure of HSIs presents a unique challenge, necessitating an innovative approach for the efficient utilization of spectral data in classification tasks. This research focuses on HSI classification through the adoption of a recently validated deep-learning methodology. Challenges in HSI classification encompass issues related to dimensionality, data redundancy, and computational expenses, with CNN-based methods prevailing due to architectural limitations. In response to these challenges, we introduce a groundbreaking model known as "Crossover Dimensionality Reduction and Multi-branch Deep Learning" (CMD) for hyperspectral image classification. The CMD model employs a multi-branch deep learning architecture incorporating Factor Analysis and MNF for crossover feature extraction, with the selection of optimal features from each technique. Experimental findings underscore the CMD model's superiority over existing methods, emphasizing its potential to enhance HSI classification outcomes. Notably, the CMD model exhibits exceptional performance on benchmark datasets such as Salinas Scene (SC), Pavia University (PU), Kennedy Space Center (KSC), and Indian Pines (IP), achieving impressive overall accuracy rates of 99.35% and 99.18% using only 5% of the training data. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
13. Hyperspectral Image Classification Based on Double-Branch Multi-Scale Dual-Attention Network.
- Author
- Zhang, Heng, Liu, Hanhu, Yang, Ronghao, Wang, Wei, Luo, Qingqu, and Tu, Changda
- Subjects
- IMAGE recognition (Computer vision), GEOLOGY, CONVOLUTIONAL neural networks, DEEP learning, PETROLOGY
- Abstract
Although extensive research shows that CNNs achieve good classification results in HSI classification, they still struggle to effectively extract spectral sequence information from HSIs. Additionally, the high-dimensional features of HSIs, the limited number of labeled samples, and the common sample imbalance significantly restrict classification performance improvement. To address these issues, this article proposes a double-branch multi-scale dual-attention (DBMSDA) network that fully extracts spectral and spatial information from HSIs and fuses them for classification. The designed multi-scale spectral residual self-attention (MSeRA), as a fundamental component of dense connections, can fully extract high-dimensional and intricate spectral information from HSIs, even with limited labeled samples and imbalanced distributions. Additionally, this article adopts a dataset partitioning strategy to prevent information leakage. Finally, this article introduces a hyperspectral geological lithology dataset to evaluate the accuracy and applicability of deep learning methods in geology. Experimental results on the geological lithology hyperspectral dataset and three other public datasets demonstrate that the DBMSDA method exhibits superior classification performance and robust generalization ability compared to existing methods. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
14. Adaptive Learnable Spectral–Spatial Fusion Transformer for Hyperspectral Image Classification.
- Author
- Wang, Minhui, Sun, Yaxiu, Xiang, Jianhong, Sun, Rui, and Zhong, Yu
- Subjects
- IMAGE recognition (Computer vision), CONVOLUTIONAL neural networks, TRANSFORMER models, FEATURE extraction, DEEP learning, MULTISPECTRAL imaging
- Abstract
In hyperspectral image classification (HSIC), every pixel of the HSI is assigned to a land cover category. While convolutional neural network (CNN)-based methods for HSIC have significantly enhanced performance, they encounter challenges in learning the relevance of deep semantic features and grappling with escalating computational costs as network depth increases. In contrast, the transformer framework is adept at capturing the relevance of high-level semantic features, presenting an effective solution to address the limitations encountered by CNN-based approaches. This article introduces a novel adaptive learnable spectral–spatial fusion transformer (ALSST) to enhance HSI classification. The model incorporates a dual-branch adaptive spectral–spatial fusion gating mechanism (ASSF), which captures spectral–spatial fusion features effectively from images. The ASSF comprises two key components: the point depthwise attention module (PDWA) for spectral feature extraction and the asymmetric depthwise attention module (ADWA) for spatial feature extraction. The model efficiently obtains spectral–spatial fusion features by multiplying the outputs of these two branches. Furthermore, we integrate the layer scale and DropKey into the traditional transformer encoder and multi-head self-attention (MHSA) to form a new transformer with a layer scale and DropKey (LD-Former). This innovation enhances data dynamics and mitigates performance degradation in deeper encoder layers. The experiments detailed in this article are executed on four renowned datasets: Trento (TR), MUUFL (MU), Augsburg (AU), and the University of Pavia (UP). The findings demonstrate that the ALSST model secures optimal performance, surpassing some existing models, with overall accuracies (OA) of 99.70%, 89.72%, 97.84%, and 99.78% on the four datasets, respectively. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
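DropKey, which the abstract above integrates into multi-head self-attention, drops attention keys by masking logits before the softmax rather than zeroing attention weights after it, so the surviving keys renormalize to sum to one. A single-head NumPy sketch follows (illustrative only, not the ALSST implementation; the projections are omitted).

```python
import numpy as np

def attention_with_dropkey(q, k, v, drop_rate, rng):
    """Scaled dot-product attention with DropKey-style masking.

    Logits for randomly chosen keys are set to -inf before the softmax,
    so the remaining keys' weights renormalize to sum to 1.
    """
    d = q.shape[-1]
    logits = q @ k.T / np.sqrt(d)                 # (n_q, n_k)
    mask = rng.random(logits.shape) < drop_rate
    # never mask every key for a query, or the softmax would be undefined
    mask &= ~mask.all(axis=-1, keepdims=True)
    logits = np.where(mask, -np.inf, logits)
    logits = logits - logits.max(axis=-1, keepdims=True)  # stability
    w = np.exp(logits)
    w /= w.sum(axis=-1, keepdims=True)
    return w @ v

rng = np.random.default_rng(0)
q = rng.normal(size=(5, 8))
k = rng.normal(size=(6, 8))
v = rng.normal(size=(6, 8))
out = attention_with_dropkey(q, k, v, drop_rate=0.3, rng=rng)
```

Masking before the softmax acts as a regularizer on the attention distribution itself, which is the property the LD-Former exploits in deeper layers.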
15. A Multi-Hyperspectral Image Collaborative Mapping Model Based on Adaptive Learning for Fine Classification.
- Author
- Zhang, Xiangrong, Liu, Zitong, Zhang, Xianhao, and Liu, Tianzhu
- Subjects
- MULTISPECTRAL imaging, IMAGE recognition (Computer vision), IMAGE fusion, SPECTRAL sensitivity, DEEP learning, SPATIAL resolution, CLASSIFICATION
- Abstract
Hyperspectral (HS) data, encompassing hundreds of spectral channels for the same area, offer a wealth of spectral information and are increasingly utilized across various fields. However, their limitations in spatial resolution and imaging width pose challenges for precise recognition and fine classification in large scenes. Conversely, multispectral (MS) data excel in providing spatial details for vast landscapes but lack spectral precision. In this article, we propose an adaptive learning-based mapping model, including an image fusion module, a spectral super-resolution network, and an adaptive learning network. The spectral super-resolution network learns the mapping between multispectral and hyperspectral images based on the attention mechanism. The image fusion module leverages spatial and spectral consistency in the training data, providing pseudo labels for spectral super-resolution training. The adaptive learning network incorporates spectral response priors via unsupervised learning, adjusting the output of the super-resolution network to preserve spectral information in the reconstructed data. Experiments show that the model eliminates the need for manually set image priors and complex parameter selection, and can adjust the network structure and parameters dynamically, enhancing reconstructed image quality and enabling the fine classification of large-scale scenes with high spatial resolution. Compared with recent dictionary learning and deep learning spectral super-resolution methods, our approach exhibits superior performance in terms of both image similarity and classification accuracy. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
16. Improved Landsat Operational Land Imager (OLI) Cloud and Shadow Detection with the Learning Attention Network Algorithm (LANA).
- Author
- Zhang, Hankui K., Luo, Dong, and Roy, David P.
- Subjects
- LANDSAT satellites, CONVOLUTIONAL neural networks, IMAGE recognition (Computer vision), ALGORITHMS, CLASSIFICATION algorithms, HOUGH transforms
- Abstract
Landsat cloud and cloud shadow detection has a long heritage based on the application of empirical spectral tests to single image pixels, including the Landsat product Fmask algorithm, which uses spectral tests applied to optical and thermal bands to detect clouds and uses the sun-sensor-cloud geometry to detect shadows. Since the Fmask was developed, convolutional neural network (CNN) algorithms, and in particular U-Net algorithms (a type of CNN with a U-shaped network structure), have been developed and are applied to pixels in square patches to take advantage of both spatial and spectral information. The purpose of this study was to develop and assess a new U-Net algorithm that classifies Landsat 8/9 Operational Land Imager (OLI) pixels with higher accuracy than the Fmask algorithm. The algorithm, termed the Learning Attention Network Algorithm (LANA), is a form of U-Net but with an additional attention mechanism (a type of network structure) that, unlike conventional U-Net, uses more spatial pixel information across each image patch. The LANA was trained using 16,861 512 × 512 30 m pixel annotated Landsat 8 OLI patches extracted from 27 images and 69 image subsets that are publicly available and have been used by others for cloud mask algorithm development and assessment. The annotated data were manually refined to improve the annotation and were supplemented with another four annotated images selected to include clear, completely cloudy, and developed land images. The LANA classifies image pixels as either clear, thin cloud, cloud, or cloud shadow. To evaluate the classification accuracy, five annotated Landsat 8 OLI images (composed of >205 million 30 m pixels) were classified, and the results compared with the Fmask and a publicly available U-Net model (U-Net Wieland). The LANA had a 78% overall classification accuracy considering cloud, thin cloud, cloud shadow, and clear classes. 
As the LANA, Fmask, and U-Net Wieland algorithms have different class legends, their classification results were harmonized to the same three common classes: cloud, cloud shadow, and clear. Considering these three classes, the LANA had the highest (89%) overall accuracy, followed by Fmask (86%), and then U-Net Wieland (85%). The LANA had the highest F1-scores for cloud (0.92), cloud shadow (0.57), and clear (0.89), and the other two algorithms had lower F1-scores, particularly for cloud (Fmask 0.90, U-Net Wieland 0.88) and cloud shadow (Fmask 0.45, U-Net Wieland 0.52). In addition, a time-series evaluation was undertaken to examine the prevalence of undetected clouds and cloud shadows (i.e., omission errors). The band-specific temporal smoothness index (TSIλ) was applied to a year of Landsat 8 OLI surface reflectance observations after discarding pixel observations labelled as cloud or cloud shadow. This was undertaken independently at each gridded pixel location in four 5000 × 5000 30 m pixel Landsat analysis-ready data (ARD) tiles. The TSIλ results broadly reflected the classification accuracy results and indicated that the LANA had the smallest cloud and cloud shadow omission errors, whereas the Fmask had the greatest cloud omission error and the second greatest cloud shadow omission error. Detailed visual examination, true color image examples and classification results are included and confirm these findings. The TSIλ results also highlight the need for algorithm developers to undertake product quality assessment in addition to accuracy assessment. The LANA model, training and evaluation data, and application codes are publicly available for other researchers. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
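The abstract above does not give the exact TSIλ formula; as a minimal illustrative sketch, assuming smoothness is measured by the mean absolute second difference of the cloud-masked reflectance time series (the function name and the toy data are hypothetical):

```python
import numpy as np

def temporal_smoothness_index(reflectance, mask):
    """Hypothetical band-specific temporal smoothness index (TSI) sketch.

    reflectance: (T,) surface reflectance time series for one band/pixel.
    mask: (T,) boolean, True where the observation was labelled cloud or
          cloud shadow and should be discarded before computing smoothness.
    Larger values indicate a rougher series, i.e. more residual
    (undetected) cloud/shadow contamination.
    """
    clear = reflectance[~mask]          # keep only clear-labelled observations
    if clear.size < 3:
        return np.nan                   # not enough observations to assess
    # mean absolute second difference: sensitive to isolated spikes
    return float(np.mean(np.abs(clear[2:] - 2 * clear[1:-1] + clear[:-2])))

# A smooth series with one cloud spike that the mask removes
series = np.array([0.10, 0.11, 0.12, 0.60, 0.13, 0.14])
cloudy = np.array([False, False, False, True, False, False])
tsi_masked = temporal_smoothness_index(series, cloudy)
tsi_unmasked = temporal_smoothness_index(series, np.zeros(6, bool))
```

Masking the labelled cloud spike yields a much smoother (lower) index, which is why residual roughness in the masked series flags undetected clouds and shadows.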
17. A Novel Transformer Network with a CNN-Enhanced Cross-Attention Mechanism for Hyperspectral Image Classification.
- Author
-
Wang, Xinyu, Sun, Le, Lu, Chuhan, and Li, Baozhu
- Subjects
IMAGE recognition (Computer vision) ,CONVOLUTIONAL neural networks ,TRANSFORMER models ,DEEP learning ,IMAGE processing - Abstract
Recently, with the remarkable advancements of deep learning in the field of image processing, convolutional neural networks (CNNs) have garnered widespread attention from researchers in the domain of hyperspectral image (HSI) classification. Moreover, due to the high performance demonstrated by the transformer architecture in classification tasks, there has been a proliferation of neural networks combining CNNs and transformers for HSI classification. However, the majority of current methods extract spatial–spectral features from single-scale HSI data for each pixel, overlooking the rich multi-scale feature information inherent to the data. To address this problem, we designed a novel transformer network with a CNN-enhanced cross-attention (TNCCA) mechanism for HSI classification. It is a dual-branch network that utilizes different scales of HSI input data to extract shallow spatial–spectral features using a multi-scale 3D and 2D hybrid convolutional neural network. After converting the feature maps into tokens, a series of 2D convolutions and dilated convolutions are employed to generate two sets of Q (queries), K (keys), and V (values) at different scales in a cross-attention module. This transformer with CNN-enhanced cross-attention explores multi-scale CNN-enhanced features and fuses them from both branches. Experimental evaluations conducted on three widely used HSI datasets, under the constraint of limited sample size, demonstrate excellent classification performance of the proposed network. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
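The core cross-attention idea in the TNCCA abstract, queries from one branch attending over keys/values from the other, can be sketched in plain NumPy. The projection weights, shapes, and token counts here are illustrative, not the paper's architecture:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(tokens_a, tokens_b, d_k=16, seed=0):
    """Minimal cross-attention sketch: queries from branch A, keys and
    values from branch B, so A's tokens are enriched with B's features."""
    rng = np.random.default_rng(seed)
    d = tokens_a.shape[-1]
    w_q = rng.standard_normal((d, d_k)) / np.sqrt(d)   # illustrative weights
    w_k = rng.standard_normal((d, d_k)) / np.sqrt(d)
    w_v = rng.standard_normal((d, d_k)) / np.sqrt(d)
    q = tokens_a @ w_q                    # (Na, d_k) queries from branch A
    k = tokens_b @ w_k                    # (Nb, d_k) keys from branch B
    v = tokens_b @ w_v                    # (Nb, d_k) values from branch B
    attn = softmax(q @ k.T / np.sqrt(d_k))   # (Na, Nb) attention weights
    return attn @ v, attn

a = np.random.default_rng(1).standard_normal((8, 32))   # branch-A tokens
b = np.random.default_rng(2).standard_normal((4, 32))   # branch-B tokens
fused, attn = cross_attention(a, b)
```

In the paper the Q/K/V sets are produced by 2D and dilated convolutions at two scales; this sketch replaces those with random linear projections to isolate the attention mechanics.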
18. AL-MRIS: An Active Learning-Based Multipath Residual Involution Siamese Network for Few-Shot Hyperspectral Image Classification.
- Author
-
Yang, Jinghui, Qin, Jia, Qian, Jinxi, Li, Anqi, and Wang, Liguo
- Subjects
IMAGE recognition (Computer vision) ,DEEP learning ,SHOT peening ,SUPPLY & demand - Abstract
In hyperspectral image (HSI) classification scenarios, deep learning-based methods have achieved excellent classification performance, but often rely on large-scale training datasets to ensure accuracy. However, in practical applications, the acquisition of hyperspectral labeled samples is time consuming, labor intensive and costly, which leads to a scarcity of obtained labeled samples. When training samples are insufficient, few-shot conditions limit model training and ultimately degrade HSI classification performance. To solve the above issues, an active learning (AL)-based multipath residual involution Siamese network for few-shot HSI classification (AL-MRIS) is proposed. First, an AL-based Siamese network framework is constructed. The Siamese network, which has relatively low demand for sample data, is adopted for classification, and the AL strategy is integrated to select more representative samples to improve the model's discriminative ability and reduce the costs of labeling samples in practice. Then, the multipath residual involution (MRIN) module is designed for the Siamese subnetwork to obtain the comprehensive features of the HSI. The involution operation is used to capture the fine-grained features and effectively aggregate the contextual semantic information of the HSI through dynamic weights. The MRIN module comprehensively considers the local features, dynamic features and global features through multipath residual connections, which improves the representation ability of HSIs. Moreover, a cosine distance-based contrastive loss is proposed for the Siamese network. By utilizing the directional similarity of high-dimensional HSI data, the discriminability of the Siamese classification network is improved.
Extensive experimental results show that the proposed AL-MRIS method can achieve excellent classification performance with few-shot training samples, and compared with several state-of-the-art classification methods, the AL-MRIS method obtains the highest classification accuracy. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
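The cosine distance-based contrastive loss described in the AL-MRIS abstract can be sketched as follows; the margin value, the squaring, and the pairwise form are assumptions, as the abstract does not give the exact formulation:

```python
import numpy as np

def cosine_contrastive_loss(z1, z2, same_class, margin=0.5):
    """Cosine-distance contrastive loss for one Siamese pair (sketch).

    Pairs from the same class are pulled toward cosine distance 0; pairs
    from different classes are pushed until distance exceeds `margin`.
    """
    cos = np.dot(z1, z2) / (np.linalg.norm(z1) * np.linalg.norm(z2))
    dist = 1.0 - cos                       # cosine distance in [0, 2]
    if same_class:
        return dist ** 2                   # pull similar pairs together
    return max(0.0, margin - dist) ** 2    # push dissimilar pairs apart

anchor = np.array([1.0, 0.0, 0.0])
positive = np.array([0.9, 0.1, 0.0])       # nearly the same direction
negative = np.array([0.0, 1.0, 0.0])       # orthogonal direction
```

Using cosine distance rather than Euclidean distance exploits exactly the directional similarity of high-dimensional HSI embeddings that the abstract emphasizes.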
19. Multi-Scale Feature Fusion Network with Symmetric Attention for Land Cover Classification Using SAR and Optical Images.
- Author
-
Xu, Dongdong, Li, Zheng, Feng, Hao, Wu, Fanlu, and Wang, Yongcheng
- Subjects
ZONING ,OPTICAL images ,LAND cover ,DEEP learning ,IMAGE recognition (Computer vision) ,COMPUTATIONAL complexity - Abstract
The complementary characteristics of SAR and optical images are beneficial in improving the accuracy of land cover classification. Deep learning-based models have achieved some notable results. However, how to effectively extract and fuse the unique features of multi-modal images for pixel-level classification remains challenging. In this article, a two-branch supervised semantic segmentation framework without any pretrained backbone is proposed. Specifically, a novel symmetric attention module is designed with improved strip pooling. The multiple long receptive fields can better perceive irregular objects and obtain more anisotropic contextual information. Meanwhile, to solve the semantic absence and inconsistency of different modalities, we construct a multi-scale fusion module, which is composed of atrous spatial pyramid pooling, varisized convolutions and skip connections. A joint loss function is introduced to constrain the backpropagation and reduce the impact of class imbalance. Validation experiments were conducted on the DFC2020 and WHU-OPT-SAR datasets. The proposed model achieved the best quantitative values on the metrics of OA, Kappa and mIoU, and its class accuracy was also excellent. It is worth mentioning that the number of parameters and the computational complexity of the method are relatively low. The adaptability of the model was verified on an RGB–thermal segmentation task. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
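The symmetric attention module above builds on strip pooling, i.e. pooling along entire rows and columns to obtain long, anisotropic receptive fields. A minimal NumPy sketch of the basic operation, without the paper's improvements or learned weights:

```python
import numpy as np

def strip_pooling(feat):
    """Strip pooling (sketch): pool a 2D feature map along entire rows
    and entire columns, then broadcast the two strips back over the map,
    giving each position a long horizontal and vertical receptive field."""
    h_strip = feat.mean(axis=1, keepdims=True)   # (H, 1): pool each row
    v_strip = feat.mean(axis=0, keepdims=True)   # (1, W): pool each column
    return h_strip + v_strip                     # broadcasts to (H, W)

feat = np.arange(12, dtype=float).reshape(3, 4)  # toy single-channel map
out = strip_pooling(feat)
```

Each output position mixes the mean of its full row and full column, which is what lets strip pooling perceive elongated, irregular objects better than square pooling windows.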
20. TAG-Net: Target Attitude Angle-Guided Network for Ship Detection and Classification in SAR Images.
- Author
-
Pan, Dece, Wu, Youming, Dai, Wei, Miao, Tian, Zhao, Wenchao, Gao, Xin, and Sun, Xian
- Subjects
DEEP learning ,IMAGE recognition (Computer vision) ,SYNTHETIC aperture radar ,RADARSAT satellites ,SHIPS ,ATTITUDE (Psychology) - Abstract
Synthetic aperture radar (SAR) ship detection and classification has gained unprecedented attention due to its important role in maritime transportation. Many deep learning-based detectors and classifiers have been successfully applied and achieved great progress. However, ships in SAR images present discrete and multi-centric features, and their scattering characteristics and edge information are sensitive to variations in target attitude angles (TAAs). These factors pose challenges for existing methods to obtain satisfactory results. To address these challenges, a novel target attitude angle-guided network (TAG-Net) is proposed in this article. The core idea of TAG-Net is to leverage TAA information as guidance and use an adaptive feature-level fusion strategy to dynamically learn more representative features that can handle the target imaging diversity caused by TAA. This is achieved through a TAA-aware feature modulation (TAFM) module. It uses the TAA information and foreground information as prior knowledge and establishes the relationship between the ship scattering characteristics and TAA information. This enables a reduction in the intra-class variability and highlights ship targets. Additionally, considering the different requirements of the detection and classification tasks for the scattering information, we propose a layer-wise attention-based task decoupling detection head (LATD). Unlike general deep learning methods that use shared features for both detection and classification tasks, LATD extracts multi-level features and uses layer attention to achieve feature decoupling and select the most suitable features for each task. Finally, we introduce a novel salient-enhanced feature balance module (SFB) to provide richer semantic information and capture the global context to highlight ships in complex scenes, effectively reducing the impact of background noise. 
A large-scale ship detection dataset (LSSDD+) is used to verify the effectiveness of TAG-Net, and our method achieves state-of-the-art performance. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
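The layer-wise attention idea behind TAG-Net's LATD head, letting each task re-weight a shared multi-level feature pyramid with its own attention, can be caricatured in a few lines. The per-task logits and uniform feature maps here are purely illustrative:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def layer_attention(features, task_logits):
    """Layer-wise attention sketch: a task-specific softmax over pyramid
    levels selects which feature levels dominate that task's input."""
    w = softmax(task_logits)                        # (L,) per-level weights
    return sum(wi * f for wi, f in zip(w, features))

pyramid = [np.full((4, 4), v) for v in (1.0, 2.0, 3.0)]      # 3 toy levels
det = layer_attention(pyramid, np.array([2.0, 0.0, 0.0]))    # favors level 0
cls = layer_attention(pyramid, np.array([0.0, 0.0, 2.0]))    # favors level 2
```

The point of the decoupling is visible even in this toy: the two tasks receive different mixtures of the same shared pyramid instead of one shared feature map.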
21. Spectrally Segmented-Enhanced Neural Network for Precise Land Cover Object Classification in Hyperspectral Imagery.
- Author
-
Islam, Touhid, Islam, Rashedul, Uddin, Palash, and Ulhaq, Anwaar
- Subjects
IMAGE recognition (Computer vision) ,CONVOLUTIONAL neural networks ,SUPPORT vector machines ,DEEP learning ,OPTIMAL stopping (Mathematical statistics) ,LAND cover - Abstract
The paradigm shift brought by deep learning in land cover object classification in hyperspectral images (HSIs) is undeniable, particularly in addressing the intricate 3D cube structure inherent in HSI data. Leveraging convolutional neural networks (CNNs), despite their architectural constraints, offers a promising solution for precise spectral data classification. However, challenges persist in HSI classification, including the curse of dimensionality, data redundancy, overfitting, and computational costs. To tackle these hurdles, we introduce the spectrally segmented-enhanced neural network (SENN), a novel model integrating segmentation-based, multi-layer CNNs, SVM classification, and spectrally segmented dimensionality reduction. SENN adeptly fuses spectral–spatial data and is particularly crucial for agricultural land classification. By strategically fusing CNNs and support vector machines (SVMs), SENN enhances class differentiation while mitigating overfitting through dropout and early stopping techniques. Our contributions extend to effective dimensionality reduction, precise CNN-based classification, and enhanced performance via CNN-SVM fusion. SENN harnesses spectral information to surmount these challenges in HSI classification, marking a significant advancement in accuracy and efficiency within this domain. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
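The spectrally segmented dimensionality reduction named in the SENN abstract can be sketched by splitting the band axis into contiguous segments and summarizing each one; the mean-per-segment summary here is an assumption, since the abstract does not specify the reduction:

```python
import numpy as np

def spectral_segments(cube, n_segments):
    """Spectrally segmented reduction (sketch): split the band axis into
    contiguous segments and summarize each by its mean band, shrinking
    B bands to n_segments pseudo-bands per pixel."""
    h, w, b = cube.shape
    edges = np.linspace(0, b, n_segments + 1, dtype=int)  # segment bounds
    return np.stack(
        [cube[:, :, lo:hi].mean(axis=2) for lo, hi in zip(edges[:-1], edges[1:])],
        axis=2,
    )

cube = np.random.default_rng(0).random((5, 5, 40))   # toy 40-band HSI cube
reduced = spectral_segments(cube, 4)                 # 4 pseudo-bands
```

Segment-wise reduction like this directly targets the curse of dimensionality and band redundancy the abstract lists, while keeping the spectral ordering of the cube.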
22. Lightweight-VGG: A Fast Deep Learning Architecture Based on Dimensionality Reduction and Nonlinear Enhancement for Hyperspectral Image Classification.
- Author
-
Fei, Xuan, Wu, Sijia, Miao, Jianyu, Wang, Guicai, and Sun, Le
- Subjects
DEEP learning ,IMAGE recognition (Computer vision) ,IMAGE intensifiers ,PRINCIPAL components analysis - Abstract
In the past decade, deep learning methods have proven to be highly effective in the classification of hyperspectral images (HSI), consistently outperforming traditional approaches. However, the large number of spectral bands in HSI data can lead to interference during the learning process. To address this issue, dimensionality reduction techniques can be employed to minimize data redundancy and improve HSI classification performance. Hence, we have developed an efficient lightweight learning framework consisting of two main components. Firstly, we utilized band selection and principal component analysis to reduce the dimensionality of HSI data, thereby reducing redundancy while retaining essential features. Subsequently, the pre-processed data was input into a modified VGG-based learning network for HSI classification. This method incorporates an improved dynamic activation function for the multi-layer perceptron to enhance non-linearity, and reduces the number of nodes in the fully connected layers of the original VGG architecture to improve speed while maintaining accuracy. This modified network structure, referred to as lightweight-VGG (LVGG), was specifically designed for HSI classification. Comprehensive experiments conducted on three publicly available HSI datasets consistently demonstrated that the LVGG method exhibited similar or better performance compared to other typical methods in the field of HSI classification. Our approach not only addresses the challenge of interference in deep learning methods for HSI classification, but also offers a lightweight and efficient solution for achieving high classification accuracy. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
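The principal component analysis step of the LVGG pipeline above can be sketched with a plain SVD-based projection; band selection and the modified VGG network itself are omitted, and the toy cube is illustrative:

```python
import numpy as np

def pca_reduce(cube, n_components):
    """PCA band reduction (sketch): flatten pixels, center the band axis,
    and project onto the top principal components obtained via SVD."""
    h, w, b = cube.shape
    x = cube.reshape(-1, b)
    x = x - x.mean(axis=0)                        # center each band
    _, _, vt = np.linalg.svd(x, full_matrices=False)
    return (x @ vt[:n_components].T).reshape(h, w, n_components)

cube = np.random.default_rng(0).random((6, 6, 30))   # toy 30-band HSI cube
reduced = pca_reduce(cube, 5)                        # keep 5 components
```

Because SVD returns singular values in descending order, the retained components capture the most variance first, which is what lets PCA discard redundant bands before the lightweight network sees the data.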
23. Hyperspectral Image Classification Using Spectral–Spatial Double-Branch Attention Mechanism.
- Author
-
Kang, Jianfang, Zhang, Yaonan, Liu, Xinchao, and Cheng, Zhongxin
- Subjects
DEEP learning ,IMAGE recognition (Computer vision) ,CONVOLUTIONAL neural networks ,FEATURE extraction - Abstract
In recent years, deep learning methods utilizing convolutional neural networks have been extensively employed in hyperspectral image (HSI) classification applications. Nevertheless, while a substantial number of stacked 3D convolutions can indeed achieve high classification accuracy, they also introduce a significant number of parameters to the model, resulting in inefficiency. Furthermore, such intricate models often exhibit limited classification accuracy when confronted with restricted sample data, i.e., small sample problems. Therefore, we propose a spectral–spatial double-branch network (SSDBN) with an attention mechanism for HSI classification. The SSDBN is designed with two independent branches to extract spectral and spatial features, respectively, incorporating multi-scale 2D convolution modules, long short-term memory (LSTM), and an attention mechanism. The flexible use of 2D convolution, instead of 3D convolution, significantly reduces the model's parameter count, while the effective spectral–spatial double-branch feature extraction method allows SSDBN to perform exceptionally well in handling small sample problems. When tested on 5%, 0.5%, and 5% of the Indian Pines, Pavia University, and Kennedy Space Center datasets, SSDBN achieved classification accuracies of 97.56%, 96.85%, and 98.68%, respectively. Additionally, we conducted a comparison of training and testing times, with results demonstrating the remarkable efficiency of SSDBN. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
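The double-branch idea, separate spectral and spatial extractors whose outputs are fused for classification, can be caricatured in a few lines. The real SSDBN uses multi-scale 2D convolutions, LSTM, and attention; this toy substitutes simple statistics to show only the branch-and-concatenate structure:

```python
import numpy as np

def double_branch_features(patch):
    """Spectral–spatial double branch (sketch): one branch summarizes the
    center pixel's spectrum, the other summarizes spatial structure per
    band; the two vectors are concatenated into one descriptor."""
    h, w, b = patch.shape
    spectral = patch[h // 2, w // 2, :]           # center-pixel spectrum (B,)
    spatial = patch.std(axis=(0, 1))              # per-band spatial contrast (B,)
    return np.concatenate([spectral, spatial])    # (2B,) fused descriptor

patch = np.random.default_rng(0).random((7, 7, 10))  # toy 7x7 patch, 10 bands
feat = double_branch_features(patch)
```

Keeping the two branches independent until the final fusion is what lets each specialize, which the abstract credits for SSDBN's robustness under small-sample training.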