1,249 results for "scene classification"
Search Results
202. Scene Classification Using Deep Learning Technique
- Author
-
Shah, Aayushi and Rana, Keyur; edited by Singh, Pradeep Kumar, Pawłowski, Wiesław, Tanwar, Sudeep, Kumar, Neeraj, Rodrigues, Joel J. P. C., and Obaidat, Mohammad Salameh
- Published
- 2020
- Full Text
- View/download PDF
203. Indoor–Outdoor Scene Classification with Residual Convolutional Neural Network
- Author
-
Kumari, Seema, Jha, Ranjeet Ranjan, Bhavsar, Arnav, and Nigam, Aditya; edited by Chaudhuri, Bidyut B., Nakagawa, Masaki, Khanna, Pritee, and Kumar, Sanjeev
- Published
- 2020
- Full Text
- View/download PDF
204. Integrated YOLO Based Object Detection for Semantic Outdoor Natural Scene Classification
- Author
-
Laulkar, C. A. and Kulkarni, P. J.; edited by Iyer, Brijesh, Rajurkar, A. M., and Gudivada, Venkat
- Published
- 2020
- Full Text
- View/download PDF
205. Road Weather Condition Estimation Using Fixed and Mobile Based Cameras
- Author
-
Ozcan, Koray, Sharma, Anuj, Knickerbocker, Skylar, Merickel, Jennifer, Hawkins, Neal, and Rizzo, Matthew; edited by Arai, Kohei and Kapoor, Supriya
- Published
- 2020
- Full Text
- View/download PDF
206. Scene Classification of Remote Sensing Images Based on ConvNet Features and Multi-grained Forest
- Author
-
Tombe, Ronald, Viriri, Serestina, and Dombeu, Jean Vincent Fonou; edited by Bebis, George, Yin, Zhaozheng, Kim, Edward, Bender, Jan, Subr, Kartic, Kwon, Bum Chul, Zhao, Jian, Kalkofen, Denis, and Baciu, George
- Published
- 2020
- Full Text
- View/download PDF
207. Broad-Classifier for Remote Sensing Scene Classification with Spatial and Channel-Wise Attention
- Author
-
Chen, Zhihua, Liu, Yunna, Zhang, Han, Sheng, Bin, Li, Ping, and Xue, Guangtao; edited by Magnenat-Thalmann, Nadia, Stephanidis, Constantine, Wu, Enhua, Thalmann, Daniel, Sheng, Bin, Kim, Jinman, Papagiannakis, George, and Gavrilova, Marina
- Published
- 2020
- Full Text
- View/download PDF
208. Active Scene Classification via Dynamically Learning Prototypical Views
- Author
-
Daniels, Zachary A. and Metaxas, Dimitris N.; edited by Darema, Frederica, Blasch, Erik, Ravela, Sai, and Aved, Alex
- Published
- 2020
- Full Text
- View/download PDF
209. PolSAR Scene Classification via Low-Rank Constrained Multimodal Tensor Representation.
- Author
-
Ren, Bo, Chen, Mengqian, Hou, Biao, Hong, Danfeng, Ma, Shibin, Chanussot, Jocelyn, and Jiao, Licheng
- Subjects
- POLARIMETRY, SYNTHETIC aperture radar, LOW-rank matrices, SYNTHETIC apertures, WEATHER
- Abstract
Polarimetric synthetic aperture radar (PolSAR) data can be acquired at all times and are not impacted by weather conditions. They can efficiently capture geometrical and geographical structures on the ground. However, due to the complexity of the data and the difficulty of data availability, PolSAR image scene classification remains a challenging task. To this end, in this paper, a low-rank constrained multimodal tensor representation method (LR-MTR) is proposed to integrate PolSAR data in multimodal representations. To preserve the multimodal polarimetric information simultaneously, the target decompositions in a scene from multiple spaces (e.g., Freeman, H/A/ α , Pauli, etc.) are exploited to provide multiple pseudo-color images. Furthermore, a representation tensor is constructed via the representation matrices and constrained by the low-rank norm to keep the cross-information from multiple spaces. A projection matrix is also calculated by minimizing the differences between the whole cascaded data set and the features in the corresponding space. It also reduces the redundancy of those multiple spaces and solves the out-of-sample problem in the large-scale data set. To support the experiments, two new PolSAR image data sets are built via ALOS-2 full polarization data, covering the areas of Shanghai, China, and Tokyo, Japan. Compared with state-of-the-art (SOTA) dimension reduction algorithms, the proposed method achieves the best quantitative performance and demonstrates superiority in fusing multimodal PolSAR features for image scene classification. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
210. A Lightweight Self-Supervised Representation Learning Algorithm for Scene Classification in Spaceborne SAR and Optical Images.
- Author
-
Xiao, Xiao, Li, Changjian, and Lei, Yinjie
- Subjects
- SPACE-based radar, SUPERVISED learning, OPTICAL images, CLASSIFICATION algorithms, MACHINE learning, CONVOLUTIONAL neural networks, SYNTHETIC aperture radar
- Abstract
Despite the increasing amount of spaceborne synthetic aperture radar (SAR) images and optical images, only a few annotated data can be used directly for scene classification tasks based on convolution neural networks (CNNs). For this situation, self-supervised learning methods can improve scene classification accuracy through learning representations from extensive unlabeled data. However, existing self-supervised scene classification algorithms are hard to deploy on satellites, due to the high computation consumption. To address this challenge, we propose a simple, yet effective, self-supervised representation learning (Lite-SRL) algorithm for the scene classification task. First, we design a lightweight contrastive learning structure for Lite-SRL, we apply a stochastic augmentation strategy to obtain augmented views from unlabeled spaceborne images, and Lite-SRL maximizes the similarity of augmented views to learn valuable representations. Then, we adopt the stop-gradient operation to make Lite-SRL's training process not rely on large queues or negative samples, which can reduce the computation consumption. Furthermore, in order to deploy Lite-SRL on low-power on-board computing platforms, we propose a distributed hybrid parallelism (DHP) framework and a computation workload balancing (CWB) module for Lite-SRL. Experiments on representative datasets including OpenSARUrban, WHU-SAR6, NWPU-Resisc45, and AID dataset demonstrate that Lite-SRL can improve the scene classification accuracy under limited annotated data, and it is generalizable to both SAR and optical images. Meanwhile, compared with six state-of-the-art self-supervised algorithms, Lite-SRL has clear advantages in overall accuracy, number of parameters, memory consumption, and training latency. Eventually, to evaluate the proposed work's on-board operational capability, we transplant Lite-SRL to the low-power computing platform NVIDIA Jetson TX2. [ABSTRACT FROM AUTHOR]
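As an illustration of the stop-gradient contrastive idea described in the abstract above, the following is a minimal PyTorch-style sketch; the backbone, projector/predictor sizes, and augmentation pipeline are placeholders, not the authors' Lite-SRL implementation.
```python
# Minimal sketch of a stop-gradient contrastive objective of the kind the
# abstract describes; layer sizes and the backbone are illustrative only.
import torch
import torch.nn as nn
import torch.nn.functional as F

class StopGradContrastive(nn.Module):
    def __init__(self, backbone: nn.Module, feat_dim: int = 512, proj_dim: int = 128):
        super().__init__()
        self.backbone = backbone                      # any CNN/MLP feature extractor
        self.projector = nn.Sequential(
            nn.Linear(feat_dim, proj_dim), nn.ReLU(), nn.Linear(proj_dim, proj_dim))
        self.predictor = nn.Sequential(
            nn.Linear(proj_dim, proj_dim), nn.ReLU(), nn.Linear(proj_dim, proj_dim))

    def forward(self, view1, view2):
        # two stochastically augmented views of the same unlabeled image
        z1 = self.projector(self.backbone(view1))
        z2 = self.projector(self.backbone(view2))
        p1, p2 = self.predictor(z1), self.predictor(z2)
        # stop-gradient on the target branch: no negatives or large queues needed
        loss = -(F.cosine_similarity(p1, z2.detach(), dim=-1).mean()
                 + F.cosine_similarity(p2, z1.detach(), dim=-1).mean()) / 2
        return loss

backbone = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 512))  # toy backbone
model = StopGradContrastive(backbone)
print(model(torch.rand(4, 3, 32, 32), torch.rand(4, 3, 32, 32)).item())
```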
- Published
- 2022
- Full Text
- View/download PDF
211. NIGAN: A Framework for Mountain Road Extraction Integrating Remote Sensing Road-Scene Neighborhood Probability Enhancements and Improved Conditional Generative Adversarial Network.
- Author
-
Chen, Weitao, Zhou, Gaodian, Liu, Zhuoyue, Li, Xianju, Zheng, Xiongwei, and Wang, Lizhe
- Subjects
- GENERATIVE adversarial networks, REMOTE sensing, DATA mining, DEEP learning, NEIGHBORHOODS
- Abstract
Mountain roads are a source of important basic geographic data used in various fields. The automatic extraction of roads from high-resolution remote sensing imagery using deep learning has attracted considerable attention, but interference from context information limits extraction accuracy, especially for roads in mountainous areas. Furthermore, when pursuing research in a new district, many algorithms are difficult to train due to a lack of data. To address these issues, a framework based on remote sensing road-scene neighborhood probability enhancement and an improved conditional generative adversarial network (NIGAN) is proposed in this article. The framework comprises two sections: 1) a road-scene classification section, in which a remote sensing road-scene neighborhood confidence enhancement method classifies road scenes of the study area to reduce the impact of non-road information on subsequent fine-road segmentation; and 2) a fine-road segmentation section, in which an improved dilated convolution module, helpful for extracting small objects such as roads, is added to the conditional generative adversarial network (CGAN) to enlarge the receptive field, attend to global information, and segment roads from the results of the road-scene classification section. To validate the NIGAN framework, new mountain road-scene and label datasets were constructed, and diverse comparison experiments were performed. The results indicate that the NIGAN framework improves the integrity and accuracy of mountain road-scene extraction in diverse and complex conditions and remains valid with small samples. In addition, the mountain road-scene datasets can serve as benchmark datasets for studying mountain road extraction. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
212. GenericConv: A Generic Model for Image Scene Classification Using Few-Shot Learning.
- Author
-
Soudy, Mohamed, Afify, Yasmine M., and Badr, Nagwa
- Subjects
- IMAGE analysis, MACHINE tools, COMPUTER vision, CLASSIFICATION, MACHINE learning
- Abstract
Scene classification is one of the most complex tasks in computer-vision. The accuracy of scene classification is dependent on other subtasks such as object detection and object classification. Accurate results may be accomplished by employing object detection in scene classification since prior information about objects in the image will lead to an easier interpretation of the image content. Machine and transfer learning are widely employed in scene classification achieving optimal performance. Despite the promising performance of existing models in scene classification, there are still major issues. First, the training phase for the models necessitates a large amount of data, which is a difficult and time-consuming task. Furthermore, most models are reliant on data previously seen in the training set, resulting in ineffective models that can only identify samples that are similar to the training set. As a result, few-shot learning has been introduced. Although few attempts have been reported applying few-shot learning to scene classification, they resulted in perfect accuracy. Motivated by these findings, in this paper we implement a novel few-shot learning model—GenericConv—for scene classification that has been evaluated using benchmarked datasets: MiniSun, MiniPlaces, and MIT-Indoor 67 datasets. The experimental results show that the proposed model GenericConv outperforms the other benchmark models on the three datasets, achieving accuracies of 52.16 ± 0.015, 35.86 ± 0.014, and 37.26 ± 0.014 for five-shots on MiniSun, MiniPlaces, and MIT-Indoor 67 datasets, respectively. [ABSTRACT FROM AUTHOR]
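For readers unfamiliar with the five-shot protocol mentioned above, the following is a minimal NumPy sketch of an N-way K-shot episode evaluated with a nearest-centroid rule; the random features stand in for whatever a model such as GenericConv would extract, and the rule is a generic baseline rather than the paper's classifier.
```python
# Illustrative N-way K-shot episode evaluation with a nearest-centroid rule.
import numpy as np

def evaluate_episode(support_feats, support_labels, query_feats, query_labels):
    """support_feats: (N*K, D); query_feats: (Q, D); labels are 0..N-1."""
    classes = np.unique(support_labels)
    centroids = np.stack([support_feats[support_labels == c].mean(axis=0)
                          for c in classes])                      # (N, D) class prototypes
    # assign each query to the nearest class centroid (Euclidean distance)
    dists = ((query_feats[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
    preds = classes[dists.argmin(axis=1)]
    return (preds == query_labels).mean()

rng = np.random.default_rng(0)
acc = evaluate_episode(rng.normal(size=(25, 64)), np.repeat(np.arange(5), 5),
                       rng.normal(size=(75, 64)), np.repeat(np.arange(5), 15))
print(f"episode accuracy: {acc:.3f}")   # 5-way 5-shot, 15 queries per class
```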
- Published
- 2022
- Full Text
- View/download PDF
213. A HYBRID DILATION APPROACH FOR REMOTE SENSING SCENE IMAGE CLASSIFICATION.
- Author
-
Balarabe, Anas Tukur and Jordanov, Ivan
- Subjects
- RECEIVER operating characteristic curves, REMOTE sensing, CLASSIFICATION, DEEP learning
- Abstract
While fine-tuning a transfer learning model alleviates the need for a vast amount of training data, it still comes with a few challenges. One of them is the range of image dimensions that the input layer of a model accepts. This issue is of interest, especially in tasks that require the use of a transfer learning model. In scene classification, for instance, images could come in varying sizes that could be too large/small to be fed into the first layer of the architecture. While resizing could be used to trim images to a required shape, that is usually not possible for images with tiny dimensions, for example, in the case of the EuroSAT dataset. This paper proposes an Xception model-based framework that accepts images of arbitrary size and then resizes or interpolates them before extracting and enhancing the discriminative features using an adaptive dilation module. After applying the approach for scene classification problems and carrying out a number of experiments and simulations, we achieved 98.55% accuracy on the EuroSAT dataset, 99.22% on UCM, 96.15% on AID and 96.04% on the SIRI-WHU dataset, respectively. We also monitored the micro-average and macro-average ROC curve scores for all the datasets to further evaluate the proposed model's effectiveness. [ABSTRACT FROM AUTHOR]
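A hedged PyTorch sketch of the two ingredients mentioned above, resizing arbitrarily sized inputs to the backbone's expected resolution and enhancing features with a multi-rate dilated-convolution module; the dilation rates, channel counts, and the 299-pixel Xception input size are illustrative assumptions rather than the paper's exact configuration.
```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DilationModule(nn.Module):
    def __init__(self, channels: int, rates=(1, 2, 4)):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Conv2d(channels, channels, kernel_size=3, padding=r, dilation=r)
            for r in rates)

    def forward(self, x):
        # sum of parallel dilated branches keeps the spatial size unchanged
        return sum(F.relu(b(x)) for b in self.branches)

def prepare(image: torch.Tensor, size: int = 299) -> torch.Tensor:
    # bilinear resize/interpolation to the fixed input size (299 for Xception)
    return F.interpolate(image.unsqueeze(0), size=(size, size),
                         mode="bilinear", align_corners=False)

x = prepare(torch.rand(3, 64, 64))        # e.g. a small EuroSAT-like patch
feats = DilationModule(3)(x)
print(feats.shape)                         # torch.Size([1, 3, 299, 299])
```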
- Published
- 2022
- Full Text
- View/download PDF
214. Intelligent Deep Data Analytics Based Remote Sensing Scene Classification Model.
- Author
-
Althobaiti, Ahmed, Alhumaidi Alotaibi, Abdullah, Abdel-Khalek, Sayed, Alsuhibany, Suliman A., and Mansour, Romany F.
- Subjects
- REMOTE sensing, LANDSLIDES, FOREST fires, DATABASES, DRONE aircraft, DECISION making, EARTH sciences
- Abstract
Latest advancements in the integration of camera sensors pave the way for new Unmanned Aerial Vehicle (UAV) applications, such as analyzing geographical (spatial) variations of earth science in mitigating harmful environmental impacts and climate change. UAVs have achieved significant attention as a remote sensing environment, capturing high-resolution images of different scenes such as land, forest fire, flooding threats, road collision, landslides, and so on to enhance data analysis and decision making. Dynamic scene classification has attracted much attention in the examination of earth data captured by UAVs. This paper proposes a new multi-modal fusion based earth data classification (MMF-EDC) model. The MMF-EDC technique aims to identify the patterns that exist in the earth data and classify them into appropriate class labels. The MMF-EDC technique involves a fusion of histogram of oriented gradients (HOG), local binary patterns (LBP), and residual network (ResNet) models. This fusion process integrates many feature vectors, and an entropy-based fusion process is carried out to enhance the classification performance. In addition, the quantum artificial flora optimization (QAFO) algorithm is applied as a hyperparameter optimization technique. The AFO algorithm, inspired by the reproduction and migration of flora, helps to decide the optimal parameters of the ResNet model, namely the learning rate, the number of hidden layers, and their numbers of neurons. Besides, a Variational Autoencoder (VAE) based classification model is applied to assign appropriate class labels to a useful set of feature vectors. The proposed MMF-EDC model has been tested using the UCM and WHU-RS datasets and exhibits promising classification results on the applied remote sensing images, with accuracies of 0.989 and 0.994 on the test UCM and WHU-RS datasets, respectively. [ABSTRACT FROM AUTHOR]
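A minimal sketch of the handcrafted-plus-deep feature fusion that the MMF-EDC pipeline describes at a high level; the deep feature is a random stand-in, and the paper's entropy-based fusion is replaced here by simple L2 normalization and concatenation (scikit-image's hog and local_binary_pattern are assumed to be available).
```python
import numpy as np
from skimage.feature import hog, local_binary_pattern

def fused_descriptor(gray_image: np.ndarray, deep_feat: np.ndarray) -> np.ndarray:
    h = hog(gray_image, orientations=9, pixels_per_cell=(8, 8),
            cells_per_block=(2, 2))                                 # HOG descriptor
    lbp = local_binary_pattern(gray_image, P=8, R=1.0, method="uniform")
    lbp_hist, _ = np.histogram(lbp, bins=10, range=(0, 10), density=True)
    parts = [h, lbp_hist, deep_feat]
    parts = [p / (np.linalg.norm(p) + 1e-12) for p in parts]        # scale balancing
    return np.concatenate(parts)

img = np.random.rand(64, 64)                       # grayscale patch stand-in
feat = fused_descriptor(img, deep_feat=np.random.rand(128))
print(feat.shape)
```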
- Published
- 2022
- Full Text
- View/download PDF
215. Improved transfer learning of CNN through fine-tuning and classifier ensemble for scene classification.
- Author
-
Thirumaladevi, S., Veera Swamy, K., and Sailaja, M.
- Subjects
- CONVOLUTIONAL neural networks, PLURALITY voting, REMOTE sensing
- Abstract
In high-resolution remote sensing imagery, scene classification is a challenging problem because image structures are similar and the available datasets are all small. Training a new convolutional neural network (CNN) on small datasets is prone to overfitting and yields poor results. To overcome this, we adopt a transfer learning and fine-tuning strategy, considering the pre-trained AlexNet, VGG 19, and VGG 16 CNNs. First, we design a network by replacing the classifier-stage layers with revised ones through transfer learning. Second, we apply fine-tuning from right to left, retraining the classifier stage and part of the feature extraction stage (the last convolutional block). Third, we form a classifier ensemble using a majority-voting strategy to obtain better classification results. The UCM and SIRI-WHU datasets were used and compared with state-of-the-art methods. Finally, to check the usefulness of the proposed methods, sub-datasets with similarly labeled class names were formed from the AID and WHU-RS19 datasets. The performance of the proposed classifiers is assessed by overall accuracy computed from the confusion matrix and by F1-score. The proposed methods improve accuracy from 93.57 to 99.04% for UCM and from 91.34 to 99.16% for SIRI-WHU. [ABSTRACT FROM AUTHOR]
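The fine-tuning and majority-voting steps described above can be sketched as follows; the torchvision VGG16 layer indices and the toy vote example are assumptions for illustration, not the authors' exact setup.
```python
# Freeze a pre-trained VGG16 except for the classifier head and the last
# convolutional block, and combine several fine-tuned models by majority vote.
import numpy as np
import torch.nn as nn
from torchvision import models

def finetune_vgg16(num_classes: int) -> nn.Module:
    model = models.vgg16(weights="IMAGENET1K_V1")     # requires a recent torchvision
    for p in model.parameters():
        p.requires_grad = False                       # freeze everything first
    for p in model.features[24:].parameters():        # last conv block (conv5_x)
        p.requires_grad = True
    model.classifier[6] = nn.Linear(4096, num_classes)  # replaced classifier head
    return model

def majority_vote(predictions: np.ndarray) -> np.ndarray:
    """predictions: (n_models, n_samples) integer class labels."""
    n_classes = predictions.max() + 1
    votes = np.apply_along_axis(lambda col: np.bincount(col, minlength=n_classes),
                                axis=0, arr=predictions)
    return votes.argmax(axis=0)

print(majority_vote(np.array([[0, 1, 2], [0, 2, 2], [1, 1, 2]])))   # -> [0 1 2]
```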
- Published
- 2022
- Full Text
- View/download PDF
216. A Weakly Pseudo-Supervised Decorrelated Subdomain Adaptation Framework for Cross-Domain Land-Use Classification.
- Author
-
Zhu, Qiqi, Sun, Yuwen, Guan, Qingfeng, Wang, Lizhe, and Lin, Weihua
- Subjects
- REMOTE-sensing images, MACHINE learning, SPATIAL resolution, REMOTE sensing, CLASSIFICATION, PHYSIOLOGICAL adaptation
- Abstract
High spatial resolution (HSR) remote-sensing image scene classification is a crucial way for land-use interpretation. However, most of the current scene classification methods assume that the training and test sets of remote-sensing images follow the same feature distribution. In practical application, this assumption is difficult to guarantee. Domain adaptation (DA) is a machine-learning paradigm that can effectively alleviate such problems. However, previous studies mostly focused on aligning the global distribution of the source domain (SD) and the target domain (TD), which loses the inter-subdomain contextual relations between both domains and ignores the redundancy among features. However, most DA methods usually only use the manually designed measurement criteria to establish the relationship between the SD and the TD, which is insufficient or complicated. In this article, a weakly pseudo-supervised decorrelated subdomain adaptation (WPS-DSA) framework is proposed for HSR cross-domain land-use classification. In WPS-DSA, a feature extractor based on the subdomain adaptation network is used to extract the inter-subdomain characteristics of both domains. To weaken the influence of the features redundancy among remote-sensing images, the switchable whitening (SW) module is introduced. In addition, a domain hierarchical sampling (DHS) mechanism is designed to strengthen the connection between SD and the TD simply. Moreover, the WuHan-ShangHai (WH-SH) dataset that is sampled from two typical Chinese cities is constructed to verify the generalization of the proposed framework. The experimental results of the cross-domain tasks on three publicly available HSR datasets and the WH-SH DA dataset display considerable performance and generalization ability of WPS-DSA. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
217. Pair-Wise Similarity Knowledge Distillation for RSI Scene Classification.
- Author
-
Zhao, Haoran, Sun, Xin, Gao, Feng, and Dong, Junyu
- Subjects
- MACHINE learning, CONVOLUTIONAL neural networks, MULTISPECTRAL imaging, REMOTE sensing
- Abstract
Remote sensing image (RSI) scene classification aims to identify the semantic categories of remote sensing images based on their contents. Owing to the strong learning capability of deep convolutional neural networks (CNNs), RSI scene classification methods based on CNNs have drawn much attention and achieved remarkable performance. However, such outstanding deep neural networks are usually computationally expensive and time-consuming, making them impossible to apply on resource-constrained edge devices, such as the embedded systems used on drones. To tackle this problem, we introduce a novel pair-wise similarity knowledge distillation method, which could reduce the model complexity while maintaining satisfactory accuracy, to obtain a compact and efficient deep neural network for RSI scene classification. Different from the existing knowledge distillation methods, we design a novel distillation loss to transfer the valuable discriminative information, which could reduce the within-class variations and restrain the between-class similarity, from the cumbersome model to the compact model. This method could obtain the compact student model with higher performance compared with existing knowledge distillation methods in RSI scene classification. To be specific, we distill the probability outputs between sample pairs with the same label and match the probability outputs between the teacher and student models. Experiments on three public benchmark datasets for RSI scene classification, i.e., AID, UCMerced, and NWPU-RESISC datasets, verify that the proposed method could effectively distill the knowledge and result in a higher performance. [ABSTRACT FROM AUTHOR]
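A hedged PyTorch sketch of a pair-wise distillation loss in the spirit described above: the student matches the teacher's softened probability outputs per sample and additionally matches the teacher's similarity between same-label sample pairs; the temperature and weighting are illustrative, and the paper's exact loss may differ.
```python
import torch
import torch.nn.functional as F

def pairwise_kd_loss(student_logits, teacher_logits, labels, T: float = 4.0):
    p_s = F.log_softmax(student_logits / T, dim=1)
    p_t = F.softmax(teacher_logits / T, dim=1)
    kd = F.kl_div(p_s, p_t, reduction="batchmean") * T * T      # per-sample matching

    # similarity between probability outputs of every sample pair in the batch
    q_s = F.softmax(student_logits / T, dim=1)
    sim_s = q_s @ q_s.t()
    sim_t = p_t @ p_t.t()
    same = (labels[:, None] == labels[None, :]).float()          # same-label mask
    pair = ((sim_s - sim_t) ** 2 * same).sum() / same.sum().clamp(min=1.0)
    return kd + pair

loss = pairwise_kd_loss(torch.randn(8, 10), torch.randn(8, 10),
                        torch.randint(0, 10, (8,)))
print(loss.item())
```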
- Published
- 2022
- Full Text
- View/download PDF
218. Class-Shared SparsePCA for Few-Shot Remote Sensing Scene Classification.
- Author
-
Wang, Jiayan, Wang, Xueqin, Xing, Lei, Liu, Bao-Di, and Li, Zongmin
- Subjects
- REMOTE sensing, PRINCIPAL components analysis
- Abstract
In recent years, few-shot remote sensing scene classification has attracted significant attention, aiming to obtain excellent performance under the condition of insufficient sample numbers. A few-shot remote sensing scene classification framework contains two phases: (i) the pre-training phase seeks to adopt base data to train a feature extractor, and (ii) the meta-testing phase uses the pre-training feature extractor to extract novel data features and design classifiers to complete classification tasks. Because of the difference in the data category, the pre-training feature extractor cannot adapt to the novel data category, named negative transfer problem. We propose a novel method for few-shot remote sensing scene classification based on shared class Sparse Principal Component Analysis (SparsePCA) to solve this problem. First, we propose, using self-supervised learning, to assist-train a feature extractor. We construct a self-supervised assisted classification task to improve the robustness of the feature extractor in the case of fewer training samples and make it more suitable for the downstream classification task. Then, we propose a novel classifier for the few-shot remote sensing scene classification named Class-Shared SparsePCA classifier (CSSPCA). The CSSPCA projects novel data features into subspace to make reconstructed features more discriminative and complete the classification task. We have conducted many experiments on remote sensing datasets, and the results show that the proposed method dramatically improves classification accuracy. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
219. Multispectral Scene Classification via Cross-Modal Knowledge Distillation.
- Author
-
Liu, Hao, Qu, Ying, and Zhang, Liqiang
- Subjects
- KNOWLEDGE transfer, MACHINE learning, REMOTE sensing, TEACHERS' assistants, CLASSIFICATION, THEMATIC mapper satellite
- Abstract
Scene classification, which aims to assign semantic labels to image patches, is a fundamental task for numerous remote sensing (RS) applications. Although deep neural networks (DNNs) have demonstrated unique strength in scene classification, their performance is still limited by the lack of training samples in the RS field. Recent studies show that the performance of scene classification can be improved by taking advantage of knowledge transferred from models pretrained on RGB images. However, modality differences between input images hinder knowledge transfer across models, especially when the inputs have distinct spectral bands. To tackle these challenges, we propose a cross-modal knowledge distillation framework that improves multispectral scene classification by transferring prior knowledge from teacher models pretrained on RGB images to a student network with limited samples. Moreover, a teacher assistant (TA) network is introduced to further improve classification performance by bridging the gap between the teacher and student networks. The proposed strategy is evaluated on models with multimodal inputs having distinct spectral bands and demonstrates superior performance compared to state-of-the-art methods. [ABSTRACT FROM AUTHOR]
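The teacher-to-assistant-to-student chain described above can be illustrated with a short PyTorch sketch of one temperature-scaled distillation step, applied first with the teacher's logits and then with the teacher assistant's; models and data here are placeholders, not the paper's networks.
```python
import torch
import torch.nn.functional as F

def distill_step(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    # soft targets from the (frozen) teacher plus the usual hard-label loss
    soft = F.kl_div(F.log_softmax(student_logits / T, dim=1),
                    F.softmax(teacher_logits / T, dim=1),
                    reduction="batchmean") * T * T
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard

# stage 1: teacher -> TA;  stage 2: TA -> multispectral student (same loss, new pair)
student_logits, ta_logits = torch.randn(4, 6), torch.randn(4, 6)
labels = torch.randint(0, 6, (4,))
print(distill_step(student_logits, ta_logits.detach(), labels).item())
```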
- Published
- 2022
- Full Text
- View/download PDF
220. Satellite Video Scene Classification Using Low-Rank Sparse Representation Two-Stream Networks.
- Author
-
Wang, Tengfei, Gu, Yanfeng, and Gao, Guoming
- Subjects
- FEATURE extraction, REMOTE sensing, SIGNAL-to-noise ratio, VIDEOS, CLASSIFICATION
- Abstract
Satellite video scene classification (SVSC) is a challenging work in remote sensing. The main procedure of SVSC is spatial–temporal feature extraction. Unfortunately, massive numbers of dim small moving targets and the low signal-to-noise ratio (SNR) of satellite video bring great challenges to feature extraction. It is difficult to apply traditional feature extraction methods to SVSC because they are used to classify the actions of high-quality video. According to the theory of low-rank sparse decomposition, a Low-rank Sparse Representation Two-stream Network (LSRTN) is designed to increase the classification accuracy of two-stream networks. First, we propose a Low-rank Sparse Component Analysis Network (LSCAN) to decompose satellite videos into low-rank background images and sparse moving target sequences. The LSCAN possesses the advantage of low-rank sparse decomposition to solve small targets and has the capability to adjust the features using the data. Moreover, the LSCAN can efficiently improve the feature extraction of low SNR video. Second, a two-stream structure that was proven to be effective for multiclass video classification was applied to obtain the spatial features and temporal features in each stream. Finally, a fully connected layer integrates the features to classify the satellite video scenes. To utilize the label information, we refine the loss function to adjust the degree of low-rank sparse characteristics and ensure the classification accuracy of training. The experimental results demonstrate that the proposed method achieves better performance than the baseline methods for the SVSC task. [ABSTRACT FROM AUTHOR]
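A toy NumPy illustration of the low-rank plus sparse decomposition that underlies the LSCAN component described above: frames stacked as columns are split into a low-rank background and a sparse moving-target term by alternating a truncated SVD with soft-thresholding; this is a generic decomposition, not the learned network.
```python
import numpy as np

def lowrank_sparse(X: np.ndarray, rank: int = 1, lam: float = 0.1, iters: int = 20):
    S = np.zeros_like(X)
    for _ in range(iters):
        U, s, Vt = np.linalg.svd(X - S, full_matrices=False)
        L = (U[:, :rank] * s[:rank]) @ Vt[:rank]                    # low-rank background
        S = np.sign(X - L) * np.maximum(np.abs(X - L) - lam, 0.0)   # sparse targets
    return L, S

frames = np.tile(np.random.rand(100, 1), (1, 30))    # 100 pixels x 30 frames, static
frames[40:42, 10:20] += 1.0                           # a small bright moving target
L, S = lowrank_sparse(frames)
print(np.abs(S).max(), np.count_nonzero(S))           # target energy lands in S
```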
- Published
- 2022
- Full Text
- View/download PDF
221. Remote sensing scene classification based on high-order graph convolutional network
- Author
-
Yue Gao, Jun Shi, Jun Li, and Ruoyu Wang
- Subjects
- remote sensing, scene classification, feature representation, graph convolutional network, Oceanography, GC1-1581, Geology, QE1-996.5
- Abstract
Remote sensing scene classification has gained increasing interest in remote sensing image understanding and feature representation is the crucial factor for classification methods. Convolutional Neural Network (CNN) generally uses hierarchical deep structure to automatically learn the feature representation from the whole images and thus has been widely applied in scene classification. However, it may fail to consider the discriminative components within the image during the learning process. Moreover, the potential relationships of scene semantics are likely to be ignored. In this paper, we present a novel remote sensing scene classification method based on high-order graph convolutional network (H-GCN). Our method uses the attention mechanism to focus on the key components inside the image during CNN feature learning. More importantly, high-order graph convolutional network is applied to investigate the class dependencies. The graph structure is built where each node is described by the mean of attentive CNN features from each semantic class. The semantic class dependencies are propagated with mixing neighbor information of nodes at different orders and thus the more informative representation of nodes can be gained. The node representations of H-GCN and attention CNN features are finally integrated as the discriminative feature representation for scene classification. Experimental results on benchmark datasets demonstrate the feasibility and effectiveness of our method for remote sensing scene classification.
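A NumPy sketch of high-order graph propagation over class nodes, following the description above: node features are per-class means of (attentive) CNN features, and propagation mixes several powers of the normalized adjacency; the graph, weights, and number of orders are random placeholders rather than the paper's learned quantities.
```python
import numpy as np

def normalize_adj(A):
    A_hat = A + np.eye(A.shape[0])                   # add self-loops
    d = A_hat.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    return D_inv_sqrt @ A_hat @ D_inv_sqrt

def high_order_gcn_layer(X, A, W, orders=(1, 2, 3)):
    A_norm = normalize_adj(A)
    # mix neighbor information from several propagation orders
    mixed = sum(np.linalg.matrix_power(A_norm, k) @ X for k in orders) / len(orders)
    return np.maximum(mixed @ W, 0.0)                # ReLU

n_classes, feat_dim, hidden = 10, 64, 32
X = np.random.rand(n_classes, feat_dim)              # per-class mean CNN features
A = (np.random.rand(n_classes, n_classes) > 0.7).astype(float)
A = np.maximum(A, A.T)                                # symmetric class-dependency graph
H = high_order_gcn_layer(X, A, np.random.randn(feat_dim, hidden))
print(H.shape)                                        # (10, 32) node representations
```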
- Published
- 2021
- Full Text
- View/download PDF
222. Mutual Enhancement of Environment Recognition and Semantic Segmentation in Indoor Environment
- Author
-
Challa, Venkata Vamsi
- Abstract
Background: The dynamic field of computer vision and artificial intelligence has continually evolved, pushing the boundaries in areas like semantic segmentation and environmental recognition, pivotal for indoor scene analysis. This research investigates the integration of these two technologies, examining their synergy and implications for enhancing indoor scene understanding. The application of this integration spans various domains, including smart home systems for enhanced ambient living, navigation assistance for cleaning robots, and advanced surveillance for security. Objectives: The primary goal is to assess the impact of integrating semantic segmentation data on the accuracy of environmental recognition algorithms in indoor environments. Additionally, the study explores how environmental context can enhance the precision and accuracy of contour-aware semantic segmentation. Methods: The research employed an extensive methodology, utilizing various machine learning models, including standard algorithms, Long Short-Term Memory networks, and ensemble methods. Transfer learning with models like EfficientNet B3, MobileNetV3, and Vision Transformer was a key aspect of the experimentation. The experiments were designed to measure the effect of semantic segmentation on environmental recognition and its reciprocal influence. Results: The findings indicated that the integration of semantic segmentation data significantly enhanced the accuracy of environmental recognition algorithms. Conversely, incorporating environmental context into contour-aware semantic segmentation led to notable improvements in precision and accuracy, reflected in metrics such as Mean Intersection over Union (MIoU). Conclusion: This research underscores the mutual enhancement between semantic segmentation and environmental recognition, demonstrating how each technology significantly boosts the effectiveness of the other in indoor scene analysis. The integration of semantic segmentation data notably
- Published
- 2024
223. Universal adversarial defense in remote sensing based on pre-trained denoising diffusion models
- Author
-
Yu, W. (0000-0003-1111-572X), Xu, Y. (0000-0002-6857-0152), and Ghamisi, P. (0000-0003-1203-741X)
- Abstract
Deep neural networks (DNNs) have achieved tremendous success in many remote sensing (RS) applications. However, their vulnerability to the threat of adversarial perturbations should not be neglected. Unfortunately, current adversarial defense approaches in RS studies usually suffer from performance fluctuation and unnecessary re-training costs due to the need for prior knowledge of the adversarial perturbations among RS data. To circumvent these challenges, we propose a universal adversarial defense approach in RS imagery (UAD-RS) using pre-trained diffusion models to defend the common DNNs against multiple unknown adversarial attacks. Specifically, the generative diffusion models are first pre-trained on different RS datasets to learn generalized representations in various data domains. After that, a universal adversarial purification framework is developed using the forward and reverse process of the pre-trained diffusion models to purify the perturbations from adversarial samples. Furthermore, an adaptive noise level selection (ANLS) mechanism is built to capture the optimal noise level of the diffusion model that can achieve the best purification results closest to the clean samples according to their Frechet Inception Distance (FID) in deep feature space. As a result, only a single pre-trained diffusion model is needed for the universal purification of adversarial samples on each dataset, which significantly alleviates the re-training efforts for each attack setting and maintains high performance without the prior knowledge of adversarial perturbations. Experiments on four heterogeneous RS datasets regarding scene classification and semantic segmentation verify that UAD-RS outperforms state-of-the-art adversarial purification approaches with a universal defense against seven commonly existing adversarial perturbations.
- Published
- 2024
224. Multi-Modal Neural Feature Fusion for Automatic Driving Through Perception-Aware Path Planning
- Author
-
Zhenyu Li, Aiguo Zhou, Jiakun Pu, and Jiangyang Yu
- Subjects
- Automatic driving, path planning, pose estimation, scene classification, VIO, obstacle detection, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
- Abstract
Path planning is a significant and challenging task in the domain of automatic driving. Many applications, such as autonomous driving, robotic navigation, and aircraft object tracking in complex and changing urban road scenes, need accurate and robust path planning based on detecting obstacles in the forward direction. Because traditional methods rely only on path search without considering environmental factors, such vehicle path planning cannot deal with complex and changeable environments. To address these problems, we propose a perception-aware multi-modal feature fusion approach that combines visual-inertial odometry (VIO) poses and semantic obstacles in the forward scene of the vehicle to plan driving paths. The proposed method takes environment awareness as the guide and combines it with a path search algorithm to perform path optimization in complex environments. The approach first uses a long short-term memory network (LSTM) to build a VIO that fuses visual and inertial data for pose estimation. To detect obstacles, it uses a lightweight segmentation model to extract semantic 3D landmarks. Finally, a path search strategy combining an A* algorithm with visual information is proposed to plan driving paths for intelligent vehicles. We evaluate the proposed path planning method on simulated scenes and public datasets (KITTI and Cityscapes) using an embedded computing board (Jetson Xavier NX) installed on a small vehicle. We also compare against path planning that uses only the greedy or heuristic algorithm without visual information and show that our method copes adequately with different complex scenes.
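A minimal grid-based A* search of the kind the planning stage builds on; in the paper the obstacle grid would come from the semantic segmentation output, whereas here it is hard-coded and the heuristic is the Manhattan distance.
```python
import heapq, itertools

def astar(grid, start, goal):
    rows, cols = len(grid), len(grid[0])
    h = lambda p: abs(p[0] - goal[0]) + abs(p[1] - goal[1])   # Manhattan heuristic
    counter = itertools.count()                               # tie-breaker for the heap
    open_set = [(h(start), next(counter), 0, start, None)]
    came_from, g_best = {}, {start: 0}
    while open_set:
        _, _, g, node, parent = heapq.heappop(open_set)
        if node in came_from:
            continue
        came_from[node] = parent
        if node == goal:                                      # reconstruct the path
            path = []
            while node is not None:
                path.append(node)
                node = came_from[node]
            return path[::-1]
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nxt = (node[0] + dr, node[1] + dc)
            if 0 <= nxt[0] < rows and 0 <= nxt[1] < cols and grid[nxt[0]][nxt[1]] == 0:
                ng = g + 1
                if ng < g_best.get(nxt, float("inf")):
                    g_best[nxt] = ng
                    heapq.heappush(open_set, (ng + h(nxt), next(counter), ng, nxt, node))
    return None

grid = [[0, 0, 0, 0],
        [1, 1, 0, 1],                                          # 1 = detected obstacle
        [0, 0, 0, 0]]
print(astar(grid, (0, 0), (2, 0)))
```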
- Published
- 2021
- Full Text
- View/download PDF
225. Fusing Deep Features by Kernel Collaborative Representation for Remote Sensing Scene Classification
- Author
-
Xiaoning Chen, Mingyang Ma, Yong Li, and Wei Cheng
- Subjects
- Collaborative representation classification (CRC), feature fusion, kernel trick, remote sensing, scene classification, Ocean engineering, TC1501-1800, Geophysics. Cosmic physics, QC801-809
- Abstract
Remote sensing scene classification has received wide attention because of its broad applications. Recently, convolutional neural networks (CNNs) have made significant breakthroughs in remote sensing image scene classification. However, the accuracy achieved by using only a fully connected layer of a CNN as the classifier is not satisfactory, especially for few-shot remote sensing images. In this article, we propose a feature-fusion-based kernel collaborative representation classification (FF-KCRC) framework for few-shot remote sensing images, which makes full use of the synergy between samples and the similarity between different types of image features to improve scene classification accuracy. Specifically, we first design an effective feature extraction strategy to obtain more discriminative image features from CNNs, in which transfer learning is used to transfer the weights of pretrained CNNs to alleviate the few-shot training problem. Then, we design the FF-KCRC framework to make full use of the synergy between different categories and to fuse the classification results of different features, where the "kernel trick" is used to address the problem of linear inseparability. Extensive experiments have been conducted on publicly available remote sensing image datasets, and the results show that the proposed FF-KCRC achieves state-of-the-art results.
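A NumPy sketch of kernel collaborative representation classification as outlined above: a test feature is coded over all training features with a ridge-regularized kernel solution and assigned to the class whose coefficients give the smallest reconstruction residual in the RBF feature space; random vectors stand in for the fused CNN features, and the feature-fusion stage is omitted.
```python
import numpy as np

def rbf(A, B, gamma=0.5):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def kcrc_predict(X_train, y_train, x_test, lam=0.1, gamma=0.5):
    K = rbf(X_train, X_train, gamma)                        # (n, n) Gram matrix
    k = rbf(X_train, x_test[None, :], gamma)[:, 0]          # (n,) test kernel vector
    alpha = np.linalg.solve(K + lam * np.eye(len(K)), k)    # collaborative code
    best, best_res = None, np.inf
    for c in np.unique(y_train):
        idx = np.where(y_train == c)[0]
        a_c = alpha[idx]
        # squared residual in feature space: k(x,x) - 2 k_c.a_c + a_c K_cc a_c
        res = 1.0 - 2 * k[idx] @ a_c + a_c @ K[np.ix_(idx, idx)] @ a_c
        if res < best_res:
            best, best_res = c, res
    return best

rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 1, (20, 4)), rng.normal(3, 1, (20, 4))])
y = np.repeat([0, 1], 20)
print(kcrc_predict(X, y, rng.normal(3, 1, 4)))              # expected: 1
```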
- Published
- 2021
- Full Text
- View/download PDF
226. A Multi-Level Convolution Pyramid Semantic Fusion Framework for High-Resolution Remote Sensing Image Scene Classification and Annotation
- Author
-
Xiongli Sun, Qiqi Zhu, and Qianqing Qin
- Subjects
- High spatial resolution image, scene classification, bag of visual words, feature pyramid, multi-level, remote sensing, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
- Abstract
High spatial resolution (HSR) imagery scene classification has become a hot research topic in remote sensing. Scene classification methods based on handcrafted features, such as the bag-of-visual-words (BoVW) model, describe an image by extracting local features of the scene and mapping them to a dictionary space, but they usually use a shallow structure and lose the spatial distribution characteristics of the scene. Methods based on deep learning extract hierarchical features to describe the scene and can maintain spatial position information well. However, deep features at different levels have scale-recognition restrictions for multi-scale ground objects and cannot understand complex scenes well. In this paper, the multi-level convolutional pyramid semantic fusion (MCPSF) framework is proposed for HSR imagery scene classification. Unlike previous scene classification methods, which integrate features of different levels directly even though the fused features differ greatly in both sparsity and eigenvalue magnitude, MCPSF integrates multi-level semantic features extracted by the BoVW model and a convolutional neural network (CNN) model. In MCPSF, two convolution pyramid feature expression strategies are proposed to enhance the ability to capture multi-scale land objects, i.e., the local and convolutional pyramid based BoVW (LCPB) model and the local and convolutional pyramid based pooling-stretched (LCPP) model. The effectiveness of the proposed method is verified on the 21-class UC Merced (UCM) dataset and the 30-class Aerial Image Dataset (AID). The framework was also transferred to a case study of scene annotation in Wuhan. The proposed framework significantly improves performance when compared with other state-of-the-art methods.
- Published
- 2021
- Full Text
- View/download PDF
227. Attention Consistent Network for Remote Sensing Scene Classification
- Author
-
Xu Tang, Qiushuo Ma, Xiangrong Zhang, Fang Liu, Jingjing Ma, and Licheng Jiao
- Subjects
- Convolutional neural network (CNN), remote sensing (RS), scene classification, Ocean engineering, TC1501-1800, Geophysics. Cosmic physics, QC801-809
- Abstract
Remote sensing (RS) image scene classification is an important research topic in the RS community, which aims to assign semantics to land covers. Recently, owing to the strong capability of convolutional neural networks (CNNs) in feature representation, a growing number of CNN-based classification methods have been proposed for RS images. Although they achieve strong performance, there is still some room for improvement. First, apart from global information, local features are crucial for distinguishing RS images. Existing networks are good at capturing global features thanks to the hierarchical structure and nonlinear fitting capacity of CNNs, but local features are not always emphasized. Second, to obtain satisfactory classification results, the distances between RS images from the same/different classes should be minimized/maximized; nevertheless, these key points in pattern classification do not get the attention they deserve. To overcome the limitations mentioned above, we propose a new CNN named attention consistent network (ACNet), based on the Siamese network, in this article. First, due to the dual-branch structure of ACNet, the input data are image pairs obtained by spatial rotation; this helps our model fully explore global features from RS images. Second, we introduce different attention techniques to comprehensively mine object information from RS images. Third, considering the influence of spatial rotation and the similarities between RS images, we develop an attention consistent model to unify the salient regions and to compact/separate the RS images from the same/different semantic categories. Finally, classification results are obtained using the learned features. Three popular RS scene datasets are selected to validate our ACNet. Compared with some existing networks, the proposed method achieves better performance. The encouraging results illustrate that ACNet is effective for RS image scene classification. The source code of this method can be found at https://github.com/TangXu-Group/Remote-Sensing-Images-Classification/tree/main/GLCnet.
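The attention-consistency idea can be sketched as follows in PyTorch: the same backbone sees an image and its 90-degree rotation, a spatial attention map is computed for each, the rotated map is rotated back, and the two maps are pushed to agree; the tiny backbone and the attention definition are simplified placeholders, not ACNet's architecture.
```python
import torch
import torch.nn as nn
import torch.nn.functional as F

backbone = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
                         nn.Conv2d(16, 16, 3, padding=1), nn.ReLU())

def spatial_attention(feat):
    # channel-pooled, softmax-normalized spatial saliency map
    a = feat.mean(dim=1, keepdim=True)
    return torch.softmax(a.flatten(2), dim=-1).view_as(a)

def attention_consistency_loss(img):
    rot = torch.rot90(img, k=1, dims=(2, 3))           # rotated view of the pair
    a1 = spatial_attention(backbone(img))
    a2 = spatial_attention(backbone(rot))
    a2_back = torch.rot90(a2, k=-1, dims=(2, 3))        # align with the original view
    return F.mse_loss(a1, a2_back)                      # unify the salient regions

print(attention_consistency_loss(torch.rand(2, 3, 32, 32)).item())
```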
- Published
- 2021
- Full Text
- View/download PDF
228. CSDS: End-to-End Aerial Scenes Classification With Depthwise Separable Convolution and an Attention Mechanism
- Author
-
Xinyu Wang, Liming Yuan, Haixia Xu, and Xianbin Wen
- Subjects
- Channel-spatial attention, convolutional neural network (CNN), depthwise separable convolution (DS-Conv), scene classification, Ocean engineering, TC1501-1800, Geophysics. Cosmic physics, QC801-809
- Abstract
Compared with natural scenes, aerial scenes are usually composed of numerous objects densely distributed within the aerial view, and thus, more key local semantic features are needed to describe them. However, when existing CNNs are used for remote sensing image classification, they typically focus on the global semantic features of the image, and especially for deep models, shallow and intermediate features are easily lost. This article proposes a channel–spatial attention mechanism based on a depthwise separable convolution (CSDS) network for aerial scene classification to solve these challenges. First, we construct a depthwise separable convolution (DS-Conv) and pyramid residual connection architecture. DS-Conv extracts features from each channel and merges them, effectively reducing the number of necessary calculations, and the pyramid residual connections connect the features from multiple layers and create associations. Then, the channel–spatial attention algorithm causes the model to obtain more effective features in the channel and spatial domains. Finally, an improved cross-entropy loss function is used to reduce the impact of similar categories on backpropagation. Comparative experiments on three public datasets show that the CSDS network can achieve results comparable to those of other state-of-the-art methods. In addition, visualization of feature extraction results by the Grad-CAM algorithm and ablation experiments for each module reflect the powerful feature learning and representation capabilities of the proposed CSDS network.
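For reference, a depthwise separable convolution of the kind referred to above factorizes a standard convolution into a per-channel 3x3 convolution followed by a 1x1 pointwise convolution that merges the channels; the PyTorch block below is a generic definition with arbitrary channel sizes, not the CSDS building block itself.
```python
import torch
import torch.nn as nn

class DepthwiseSeparableConv(nn.Module):
    def __init__(self, in_ch: int, out_ch: int, stride: int = 1):
        super().__init__()
        # per-channel (depthwise) 3x3 convolution: groups = in_channels
        self.depthwise = nn.Conv2d(in_ch, in_ch, kernel_size=3, stride=stride,
                                   padding=1, groups=in_ch, bias=False)
        # 1x1 pointwise convolution merges the channels
        self.pointwise = nn.Conv2d(in_ch, out_ch, kernel_size=1, bias=False)
        self.bn = nn.BatchNorm2d(out_ch)
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.act(self.bn(self.pointwise(self.depthwise(x))))

x = torch.rand(1, 32, 56, 56)
print(DepthwiseSeparableConv(32, 64)(x).shape)          # torch.Size([1, 64, 56, 56])
```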
- Published
- 2021
- Full Text
- View/download PDF
229. MSMatch: Semisupervised Multispectral Scene Classification With Few Labels
- Author
-
Pablo Gomez and Gabriele Meoni
- Subjects
- Deep learning, multispectral image classification, scene classification, semisupervised learning, Ocean engineering, TC1501-1800, Geophysics. Cosmic physics, QC801-809
- Abstract
Supervised learning techniques are at the center of many tasks in remote sensing. Unfortunately, these methods, especially recent deep learning methods, often require large amounts of labeled data for training. Even though satellites acquire large amounts of data, labeling the data is often tedious, expensive, and requires expert knowledge. Hence, improved methods that require fewer labeled samples are needed. We present MSMatch, the first semisupervised learning approach competitive with supervised methods on scene classification on the EuroSAT and UC Merced Land Use benchmark datasets. We test both RGB and multispectral images of EuroSAT and perform various ablation studies to identify the critical parts of the model. The trained neural network outperforms previous methods by up to 19.76% and 5.59% on EuroSAT and the UC Merced Land Use datasets, respectively. With just five labeled examples per class, we attain 90.71% and 95.86% accuracy on the UC Merced Land Use dataset and EuroSAT, respectively. Our results show that MSMatch is capable of greatly reducing the requirements for labeled data. It translates well to multispectral data and should enable various applications that are currently infeasible due to a lack of labeled data. We provide the source code of MSMatch online to enable easy reproduction and quick adoption.
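A hedged sketch of the confidence-thresholded pseudo-labelling step that FixMatch-style semisupervised methods such as MSMatch build on: confident predictions on weakly augmented unlabeled images become targets for their strongly augmented versions; the model, augmentations, and threshold below are placeholders, not the MSMatch configuration.
```python
import torch
import torch.nn.functional as F

def unlabeled_loss(model, weak_batch, strong_batch, threshold: float = 0.95):
    with torch.no_grad():
        probs = F.softmax(model(weak_batch), dim=1)
        conf, pseudo = probs.max(dim=1)
        mask = (conf >= threshold).float()          # keep only confident pseudo-labels
    logits_s = model(strong_batch)                  # strongly augmented views
    loss = F.cross_entropy(logits_s, pseudo, reduction="none")
    return (loss * mask).mean()

model = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(3 * 64 * 64, 10))
weak, strong = torch.rand(8, 3, 64, 64), torch.rand(8, 3, 64, 64)
print(unlabeled_loss(model, weak, strong).item())
```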
- Published
- 2021
- Full Text
- View/download PDF
230. Remote Sensing Scene Classification Using Sparse Representation-Based Framework With Deep Feature Fusion
- Author
-
Shaohui Mei, Keli Yan, Mingyang Ma, Xiaoning Chen, Shun Zhang, and Qian Du
- Subjects
- Deep feature learning, remote sensing (RS), scene classification, small training size, sparse representation, Ocean engineering, TC1501-1800, Geophysics. Cosmic physics, QC801-809
- Abstract
Scene classification of high-resolution remote sensing (RS) images has attracted increasing attentions due to its vital role in a wide range of applications. Convolutional neural networks (CNNs) have recently been applied on many computer vision tasks and have significantly boosted the performance including imagery scene classification, object detection, and so on. However, the classification performance heavily relies on the features that can accurately represent the scene of images, thus, how to fully explore the feature learning ability of CNNs is of crucial importance for scene classification. Another problem in CNNs is that it requires a large number of labeled samples, which is impractical in RS image processing. To address these problems, a novel sparse representation-based framework for small-sample-size RS scene classification with deep feature fusion is proposed. Specially, multilevel features are first extracted from different layers of CNNs to fully exploit the feature learning ability of CNNs. Note that the existing well-trained CNNs, e.g., AlexNet, VGGNet, and ResNet50, are used for feature extraction, in which no labeled samples is required. Then, sparse representation-based classification is designed to fuse the multilevel features, which is especially effective when only a small number of training samples are available. Experimental results over two benchmark datasets, e.g., UC-Merced and WHU-RS19, demonstrated that the proposed method can effectively fuse different levels of features learned in CNNs, and clearly outperform several state-of-the-art methods especially with limited training samples.
- Published
- 2021
- Full Text
- View/download PDF
231. NaSC-TG2: Natural Scene Classification With Tiangong-2 Remotely Sensed Imagery
- Author
-
Zhuang Zhou, Shengyang Li, Wei Wu, Weilong Guo, Xuan Li, Guisong Xia, and Zifei Zhao
- Subjects
- Benchmark dataset, deep learning, remote sensing, scene classification, Tiangong-2, Ocean engineering, TC1501-1800, Geophysics. Cosmic physics, QC801-809
- Abstract
Scene classification is one of the most important applications of remote sensing. Researchers have proposed various datasets and innovative methods for remote sensing scene classification in recent years. However, most of the existing remote sensing scene datasets are collected uniquely from a single data source: Google Earth. In addition, scenes in different datasets are mainly human-made landscapes with high similarity. The lack of richness and diversity of data sources limits the research and applications of remote sensing classification. This article describes a large-scale dataset named “NaSC-TG2,” which is a novel benchmark dataset for remote sensing natural scene classification built from Tiangong-2 remotely sensed imagery. The goal of this dataset is to expand and enrich the annotation data for advancing remote sensing classification algorithms, especially for the natural scene classification. The dataset contains 20 000 images, which are equally divided into ten scene classes. The dataset has three primary advantages: 1) it is large scale, especially in terms of the number of each class, and the numbers of scenes are evenly distributed; 2) it has a large number of intraclass differences and high interclass similarity, because all images are carefully selected from different regions and seasons; and 3) it offers natural scenes with novel spatial scale and imaging performance compared with other datasets. All images are acquired from the new generation of wideband imaging spectrometer of Tiangong-2. In addition to RGB images, the corresponding multispectral scene images are also provided. This dataset is useful in supporting the development and evaluation of classification algorithms, as demonstrated in the present study.
- Published
- 2021
- Full Text
- View/download PDF
232. Compact Global-Local Convolutional Network With Multifeature Fusion and Learning for Scene Classification in Synthetic Aperture Radar Imagery
- Author
-
Kang Ni, Pengfei Liu, and Peng Wang
- Subjects
- Affine subspace, convolutional feature learning, convolutional neural network (CNN), scene classification, synthetic aperture radar (SAR), Ocean engineering, TC1501-1800, Geophysics. Cosmic physics, QC801-809
- Abstract
Feature learning of convolutional neural networks (CNNs) has gained considerable attention and achieved good performance on synthetic aperture radar (SAR) image scene classification. However, the performance of the existing convolutional feature learning methods is limited for generating the distinguishable feature representations because such techniques inherently suffer from shortcomings, i.e., they do not consider the local feature distribution of deep orderless feature statistics and deep orderless multifeature learning style. To alleviate these drawbacks, we propose a compact global-local convolutional network with multifeature fusion and learning (CGML) for SAR image scene classification, which contains double branches of convolutional feature learning net (C-net) and local feature distribution learning net (L-net). L-net employs the localized and parameterized affine subspace coding layer for local feature distribution learning and captures the feature statistics of each cluster center via detailed local feature division. The standard convolutional feature map is utilized for the convolutional feature learning in C-net. Subsequently, the compact multifeature fusion and learning strategy captures the compact global second-order orderless feature representation and allows the double branches to interact with each other via the tensor sketch algorithm. Especially, the feature learning strategy of L-net is defined in affine subspace which fully characterizes the feature distribution inside each cluster space. Finally, we concatenate the outputs of the multifeature fusion and learning network, then pool and feed them into softmax loss. Based on extensive evaluations on TerraSAR-X1 and TerraSAR-X2 image scene classification datasets, CGML can yield superior performances when compared with those of several state-of-the-art networks.
- Published
- 2021
- Full Text
- View/download PDF
233. On Creating Benchmark Dataset for Aerial Image Interpretation: Reviews, Guidances, and Million-AID
- Author
-
Yang Long, Gui-Song Xia, Shengyang Li, Wen Yang, Michael Ying Yang, Xiao Xiang Zhu, Liangpei Zhang, and Deren Li
- Subjects
- Annotation, benchmark datasets, Million Aerial Image Dataset (Million-AID), remote sensing (RS) image interpretation, scene classification, Ocean engineering, TC1501-1800, Geophysics. Cosmic physics, QC801-809
- Abstract
The past years have witnessed great progress in remote sensing (RS) image interpretation and its wide applications. With RS images becoming more accessible than ever before, there is an increasing demand for the automatic interpretation of these images. In this context, benchmark datasets serve as an essential prerequisite for developing and testing intelligent interpretation algorithms. After reviewing existing benchmark datasets in the RS image interpretation research community, this article discusses the problem of how to efficiently prepare a suitable benchmark dataset for RS image interpretation. Specifically, we first analyze the current challenges of developing intelligent algorithms for RS image interpretation with bibliometric investigations. We then present general guidance on creating benchmark datasets in an efficient manner. Following this guidance, we also provide an example of building an RS image dataset, i.e., the Million Aerial Image Dataset (Million-AID; Online. Available: https://captain-whu.github.io/DiRS/), a new large-scale benchmark dataset containing a million instances for RS image scene classification. Several challenges and perspectives in RS image annotation are finally discussed to facilitate research in benchmark dataset construction. We hope this article will provide the RS community with an overall perspective on constructing large-scale and practical image datasets for further research, especially data-driven research.
- Published
- 2021
- Full Text
- View/download PDF
234. Classification of Remote Sensing Images Using EfficientNet-B3 CNN Model With Attention
- Author
-
Haikel Alhichri, Asma S. Alswayed, Yakoub Bazi, Nassim Ammour, and Naif A. Alajlan
- Subjects
- Remote sensing, scene classification, EfficientNet-B3, convolutional neural networks (CNNs), attention mechanisms, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
- Abstract
Scene classification is a highly useful task in Remote Sensing (RS) applications. Many efforts have been made to improve the accuracy of RS scene classification. Scene classification is a challenging problem, especially for large datasets with tens of thousands of images with a large number of classes and taken under different circumstances. One problem that is observed in scene classification is the fact that for a given scene, only one part of it indicates which class it belongs to, whereas the other parts are either irrelevant or they actually tend to belong to another class. To address this issue, this paper proposes a deep attention Convolutional Neural Network (CNN) for scene classification in remote sensing. CNN models use successive convolutional layers to learn feature maps from larger and larger regions (or receptive fields) of the scene. The attention mechanism computes a new feature map as a weighted average of these original feature maps. In particular, we propose a solution, named EfficientNet-B3-Attn-2, based on the pre-trained EfficientNet-B3 CNN enhanced with an attention mechanism. A dedicated branch is added to layer 262 of the network, to compute the required weights. These weights are learned automatically by training the whole CNN model end-to-end using the backpropagation algorithm. In this way, the network learns to emphasize important regions of the scene and suppress the regions that are irrelevant to the classification. We tested the proposed EfficientNet-B3-Attn-2 on six popular remote sensing datasets, namely UC Merced, KSA, OPTIMAL-31, RSSCN7, WHU-RS19, and AID datasets, showing its strong capabilities in classifying RS scenes.
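A simplified sketch of the attention idea described above: a small side branch predicts a spatial weight map over the backbone's feature maps, and their weighted average is used for classification; where the branch attaches (layer 262 of EfficientNet-B3) and its exact form are not reproduced here, and the 1536-channel feature size is only the usual EfficientNet-B3 output, assumed for illustration.
```python
import torch
import torch.nn as nn

class AttentionPooling(nn.Module):
    def __init__(self, channels: int, num_classes: int):
        super().__init__()
        self.attn = nn.Conv2d(channels, 1, kernel_size=1)       # weight branch
        self.fc = nn.Linear(channels, num_classes)

    def forward(self, feat):                                     # feat: (B, C, H, W)
        w = torch.softmax(self.attn(feat).flatten(2), dim=-1)    # (B, 1, H*W) weights
        pooled = (feat.flatten(2) * w).sum(dim=-1)               # weighted average, (B, C)
        return self.fc(pooled)

feat = torch.rand(2, 1536, 7, 7)        # e.g. EfficientNet-B3 final feature maps
print(AttentionPooling(1536, 6)(feat).shape)                     # torch.Size([2, 6])
```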
- Published
- 2021
- Full Text
- View/download PDF
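To make the attention idea in the entry above concrete, here is a minimal PyTorch sketch of a spatial-attention branch that re-weights CNN feature maps. The module name, layer sizes, and the exact form of the attention are illustrative assumptions, not the authors' EfficientNet-B3-Attn-2 implementation.

```python
import torch
import torch.nn as nn

class SpatialAttention(nn.Module):
    """Compute a spatial weight map and return re-weighted features.

    A simplified stand-in for the attention branch described in the
    abstract above; layer sizes are illustrative, not the authors' design.
    """
    def __init__(self, channels: int):
        super().__init__()
        self.score = nn.Sequential(
            nn.Conv2d(channels, channels // 8, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // 8, 1, kernel_size=1),
        )

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        # feats: (B, C, H, W) -> softmax weights over the H*W positions
        b, c, h, w = feats.shape
        logits = self.score(feats).view(b, 1, h * w)
        weights = torch.softmax(logits, dim=-1).view(b, 1, h, w)
        return feats * weights  # emphasise informative regions

# Example: attach to any backbone feature map and pool for classification.
feats = torch.randn(2, 1536, 10, 10)       # e.g. EfficientNet-B3 top features
attn = SpatialAttention(1536)
pooled = attn(feats).sum(dim=(2, 3))       # weighted average over positions -> (B, 1536)
```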
235. Adaptive Deep Co-Occurrence Feature Learning Based on Classifier-Fusion for Remote Sensing Scene Classification
- Author
-
Ronald Tombe and Serestina Viriri
- Subjects
Adaptive deep co-occurrence learning ,deep feature extraction ,ensemble learning ,machine learning ,multigrained forests ,scene classification ,Ocean engineering ,TC1501-1800 ,Geophysics. Cosmic physics ,QC801-809 - Abstract
Remote sensing scene classification has numerous applications in land cover and land use. However, classifying scene images into their correct categories is a challenging task. This challenge is attributable to the diverse semantics of remote sensing images, which makes effective feature extraction and learning complex. Effective image feature representation is essential in image analysis and interpretation for accurate scene image classification with machine learning algorithms. The recent literature shows that convolutional neural networks are powerful at feature extraction for remote sensing scene classification, and that classifier fusion attains superior results to individual classifiers. This article proposes adaptive deep co-occurrence feature learning (ADCFL). The ADCFL method utilizes a convolutional neural network to extract spatial feature information from an image in a co-occurrence manner with filters, and this information is then fed to a multigrained forest for feature learning and classification through majority votes with ensemble classifiers. An evaluation of the effectiveness of ADCFL is conducted on the public datasets Resisc45 and UC Merced. The classification accuracy results attained by ADCFL demonstrate that the proposed method achieves improved results.
- Published
- 2021
- Full Text
- View/download PDF
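A rough sketch of the classifier-fusion stage described in the entry above: deep features (random arrays stand in here so the sketch runs end to end) are fed to an ensemble of forests combined by majority voting. The learners and hyperparameters are illustrative; the paper's multigrained forest is not reproduced.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier, ExtraTreesClassifier, VotingClassifier
from sklearn.model_selection import train_test_split

# Placeholder for deep co-occurrence features extracted by a CNN.
rng = np.random.default_rng(0)
X = rng.normal(size=(600, 512))          # 600 scenes, 512-D deep features
y = rng.integers(0, 10, size=600)        # 10 hypothetical scene classes

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

# Forest ensemble with hard (majority) voting, loosely mirroring the
# multigrained-forest + classifier-fusion idea; settings are illustrative.
ensemble = VotingClassifier(
    estimators=[
        ("rf_shallow", RandomForestClassifier(n_estimators=100, max_depth=8, random_state=0)),
        ("rf_deep", RandomForestClassifier(n_estimators=100, max_depth=None, random_state=1)),
        ("extra", ExtraTreesClassifier(n_estimators=100, random_state=2)),
    ],
    voting="hard",
)
ensemble.fit(X_tr, y_tr)
print("held-out accuracy:", ensemble.score(X_te, y_te))
```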
236. Multilevel Capsule Weighted Aggregation Network Based on a Decoupled Dynamic Filter for Remote Sensing Scene Classification
- Author
-
Chunyuan Wang, Yang Wu, Yihan Wang, Yiping Chen, and Yan Gao
- Subjects
Scene classification ,decoupled dynamic filter ,weighted aggregation ,capsule network ,Electrical engineering. Electronics. Nuclear engineering ,TK1-9971 - Abstract
Improving remote sensing scene classification (RSSC) by effectively extracting discriminative representations for complex and diverse scenes remains a challenging task. The capsule network (CapsNet) can encode the spatial relationship of features in an image and exhibits encouraging performance. Nevertheless, the original CapsNet is unsuitable for RSSC with complex image backgrounds. In addition, conventional neural network methods only use features from the last convolutional layer and discount the intermediate features with complementary information. To exploit the additional information in intermediate convolutional layers and improve feature aggregation, this paper proposes a multilevel capsule weighted aggregation network (MCWANet) based on a decoupled dynamic filter (DDF), in which a new multilevel capsule encoding module and a new capsule sorting pooling (CSPool) method are combined with the advantageous attributes of a residual DDF block and weighted capsule aggregation. Extensive experiments on two challenging datasets, AID and NWPU-RESISC45, demonstrate that multilevel and multiscale features can be extracted and fused into semantically strong feature representations and that the proposed MCWANet performs competitively in RSSC.
- Published
- 2021
- Full Text
- View/download PDF
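Under simplifying assumptions, the weighted aggregation of multilevel features mentioned in the entry above can be sketched as learnable, softmax-normalised weights over per-level descriptors; the capsule encoding, decoupled dynamic filter, and capsule sorting pooling of MCWANet are not modelled here.

```python
import torch
import torch.nn as nn

class WeightedLevelAggregation(nn.Module):
    """Fuse descriptors from several convolutional stages with learnable,
    softmax-normalised weights (a sketch of the weighted-aggregation idea only)."""
    def __init__(self, num_levels: int):
        super().__init__()
        self.level_logits = nn.Parameter(torch.zeros(num_levels))

    def forward(self, level_feats):
        # each tensor: (B, D) descriptor pooled from one intermediate layer
        w = torch.softmax(self.level_logits, dim=0)
        stacked = torch.stack(level_feats, dim=0)       # (L, B, D)
        return (w.view(-1, 1, 1) * stacked).sum(dim=0)  # (B, D)

# Example with three intermediate levels projected to a common dimension.
feats = [torch.randn(4, 256) for _ in range(3)]
fused = WeightedLevelAggregation(num_levels=3)(feats)
print(fused.shape)  # torch.Size([4, 256])
```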
237. Lie Group spatial attention mechanism model for remote sensing scene classification.
- Author
-
Xu, Chengjun, Zhu, Guobin, and Shu, Jingqian
- Subjects
- *
REMOTE sensing , *CONVOLUTIONAL neural networks , *MACHINE learning - Abstract
Utilizing discriminative features to represent data samples is a significant step, and the remote sensing domain is no exception. Most existing convolutional neural network (CNN) models have achieved great results. However, they mainly focus on global high-level features, ignoring local, shallower features and the relationships between features, which are crucial for scene classification. In this study, a novel Lie Group spatial attention mechanism model is introduced. First, it uses Lie Group machine learning and a CNN to preserve features at different levels. Then, the Lie Group spatial attention mechanism is used to suppress irrelevant features and enhance local semantic features. Finally, the Lie Fisher classifier is used for prediction. Extensive experiments on two public and challenging data sets demonstrate that our model enhances feature characterization capabilities and achieves competitive accuracy with other state-of-the-art methods. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
238. Remote Sensing Image Scene Classification Using Multiscale Feature Fusion Covariance Network With Octave Convolution.
- Author
-
Bai, Lin, Liu, Qingxin, Li, Cuiling, Ye, Zhen, Hui, Meng, and Jia, Xiuping
- Subjects
- *
REMOTE sensing , *FEATURE extraction , *SOURCE code , *CONVOLUTIONAL neural networks - Abstract
In remote sensing scene classification (RSSC), features can be extracted with different spatial frequencies where high-frequency features usually represent detailed information and low-frequency features usually represent global structures. However, it is challenging to extract meaningful semantic information for RSSC tasks by just utilizing high- or low-frequency features. The spatial composition of remote sensing images (RSIs) is more complex than that of natural images, and the scales of objects vary significantly. In this article, a multiscale feature fusion covariance network (MF2CNet) with octave convolution (Oct Conv) is proposed, which can extract multifrequency and multiscale features from RSIs. First, the multifrequency feature extraction (MFE) module is used to obtain fine-grained frequency features by Oct Conv. Then, the features of different layers in MF2CNet are fused by the multiscale feature fusion (MF2) module. Finally, instead of using global average pooling (GAP), global covariance pooling (GCP) extracts high-order information from RSIs to capture richer statistics of deep features. In the proposed MF2CNet, the obtained multifrequency and multiscale features can effectively improve the performance of CNNs. Experimental results on four public RSI datasets show that MF2CNet has advantages in RSSC over current state-of-the-art methods. The source codes of this method can be found at https://github.com/liuqingxin-chd/MF2CNet. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
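The global covariance pooling (GCP) step mentioned in the entry above replaces global average pooling with second-order statistics of the feature maps. A simplified PyTorch sketch follows; the matrix normalisation usually applied in GCP methods is omitted.

```python
import torch

def global_covariance_pooling(feats: torch.Tensor, eps: float = 1e-5) -> torch.Tensor:
    """Second-order (covariance) pooling of CNN feature maps.

    Replaces global average pooling with the channel-wise covariance over
    spatial positions; a simplified version of the GCP idea, without
    matrix power normalisation.
    """
    b, c, h, w = feats.shape
    x = feats.view(b, c, h * w)
    x = x - x.mean(dim=2, keepdim=True)                # centre over spatial positions
    cov = x @ x.transpose(1, 2) / (h * w - 1)          # (B, C, C)
    cov = cov + eps * torch.eye(c, device=feats.device)
    # Use the upper triangle as a fixed-length descriptor for a classifier.
    iu = torch.triu_indices(c, c)
    return cov[:, iu[0], iu[1]]                        # (B, C*(C+1)/2)

pooled = global_covariance_pooling(torch.randn(2, 64, 7, 7))
print(pooled.shape)  # torch.Size([2, 2080])
```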
239. Domain Adaptation via a Task-Specific Classifier Framework for Remote Sensing Cross-Scene Classification.
- Author
-
Zheng, Zhendong, Zhong, Yanfei, Su, Yu, and Ma, Ailong
- Subjects
- *
REMOTE sensing , *GENERATIVE adversarial networks , *CONVOLUTIONAL neural networks , *CLASSIFICATION algorithms , *PHYSIOLOGICAL adaptation , *HYPERSPECTRAL imaging systems - Abstract
The scene classification of high spatial resolution (HSR) imagery involves labeling an HSR image with a specific high-level semantic class according to the composition of the semantic objects and their spatial relationships. As such, scene classification has attracted increased attention in recent years, and many different algorithms have now been proposed for the cross-scene classification task. However, the recently proposed scene classification methods based on deep convolutional neural networks (CNNs) still suffer from domain shift problems, because the training data and validation data do not follow the assumption of being independent and identically distributed. The employment of generative adversarial networks has been found to be an effective way to bridge the domain shift/gap. However, the existing cross-scene classification methods do not use the classification information in the target domain, and the domain classifier is task-independent for different scene classification tasks. In this article, to solve this problem, a domain adaptation via a task-specific classifier (DATSNET) framework is proposed for HSR image scene classification. Task-specific classifiers and minimizing and maximizing ("minimaxing") of the classifier discrepancy are integrated in the DATSNET framework. The task-specific classifiers are proposed to align the distributions of the source domain features and target domain features by utilizing task-specific decision boundaries in the target domain. In order to align the two task-specific classifiers' feature distributions, minimaxing the defined discrepancy between the different classifiers in an adversarial manner is proposed to obtain better task-specific classifier boundaries in the target domain and a better-aligned feature distribution in both domains. The experimental results obtained with different remote sensing cross-scene classification tasks demonstrate that the proposed method achieves a significantly improved performance compared with the other state-of-the-art remote sensing cross-scene classification algorithms. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
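The minimaxing of classifier discrepancy described in the entry above can be illustrated with a standard discrepancy measure between two task-specific classifiers, as used in maximum-classifier-discrepancy-style adaptation; whether DATSNET uses exactly this form is an assumption made for illustration.

```python
import torch
import torch.nn.functional as F

def classifier_discrepancy(logits_a: torch.Tensor, logits_b: torch.Tensor) -> torch.Tensor:
    """Mean absolute difference between the softmax outputs of two
    task-specific classifiers on target-domain samples (illustrative form)."""
    return (F.softmax(logits_a, dim=1) - F.softmax(logits_b, dim=1)).abs().mean()

# Adversarial schedule sketch:
#  step A: train the feature extractor and both classifiers on labelled source data;
#  step B: fix the extractor, update the classifiers to MAXIMISE the
#          discrepancy on unlabelled target data;
#  step C: fix the classifiers, update the extractor to MINIMISE it.
logits_a = torch.randn(8, 21)   # e.g. 21 scene classes
logits_b = torch.randn(8, 21)
print(classifier_discrepancy(logits_a, logits_b))
```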
240. A Self-Training Hierarchical Prototype-based Ensemble Framework for Remote Sensing Scene Classification.
- Author
-
Gu, Xiaowei, Zhang, Ce, Shen, Qiang, Han, Jungong, Angelov, Plamen P., and Atkinson, Peter M.
- Subjects
- *
REMOTE sensing , *SUPERVISED learning , *CLASSIFICATION , *DEEP learning , *DATA mining , *TEXT recognition , *OPTICAL remote sensing - Abstract
• A novel semi-supervised ensemble framework with a cross-checking mechanism is proposed for remote sensing scene classification; • An enhanced self-training hierarchical prototype-based classifier is introduced as the base learner; • A novel pseudo-labelling mechanism is proposed to utilise the multi-granular information mined from data in decision-making; • The proposed ensemble framework can achieve top performance with a reduced requirement for labelled images. Remote sensing scene classification plays a critical role in a wide range of real-world applications. Technically, however, scene classification is an extremely challenging task due to the huge complexity of remotely sensed scenes and the difficulty of acquiring labelled data for training models such as supervised deep learning. To tackle these issues, a novel semi-supervised ensemble framework is proposed here using the self-training hierarchical prototype-based classifier as the base learner for chunk-by-chunk prediction. The framework has the ability to build a powerful ensemble model from both labelled and unlabelled images with minimum supervision. Different feature descriptors are employed in the proposed ensemble framework to offer multiple independent views of images. Thus, the diversity of base learners is guaranteed for ensemble classification. To further increase the overall accuracy, a novel cross-checking strategy was introduced to enable the base learners to exchange pseudo-labelling information during the self-training process, and maximize the correctness of pseudo-labels assigned to unlabelled images. Extensive numerical experiments on popular benchmark remote sensing scenes demonstrated the effectiveness of the proposed ensemble framework, especially where the number of labelled images available is limited. For example, the classification accuracy achieved on the OPTIMAL-31, PatternNet and RSI-CB256 datasets was up to 99.91%, 98.67% and 99.07% with only 40% of the image sets used as labelled training images, surpassing, or at least on par with, mainstream benchmark approaches trained with double the number of labelled images. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
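A toy sketch of the cross-checking pseudo-labelling idea in the entry above: two base learners, each trained on its own feature view, pseudo-label an unlabelled image only when they agree with sufficient confidence. The learners, features, and threshold are illustrative stand-ins, not the paper's hierarchical prototype-based classifier.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def cross_checked_pseudo_labels(views, labelled_idx, y_labelled, unlabelled_idx, min_conf=0.6):
    """Pseudo-label unlabelled images only when both view-specific learners
    agree and are sufficiently confident (the cross-checking step, two-view toy version)."""
    preds, confs = [], []
    for X in views:
        clf = LogisticRegression(max_iter=1000).fit(X[labelled_idx], y_labelled)
        proba = clf.predict_proba(X[unlabelled_idx])
        preds.append(clf.classes_.take(proba.argmax(axis=1)))
        confs.append(proba.max(axis=1))
    agree = (preds[0] == preds[1]) & (np.minimum(confs[0], confs[1]) >= min_conf)
    return unlabelled_idx[agree], preds[0][agree]

rng = np.random.default_rng(0)
view_a, view_b = rng.normal(size=(200, 32)), rng.normal(size=(200, 48))  # two feature views
y = rng.integers(0, 5, size=200)
labelled, unlabelled = np.arange(40), np.arange(40, 200)
idx, pseudo = cross_checked_pseudo_labels([view_a, view_b], labelled, y[labelled], unlabelled)
print(len(idx), "images pseudo-labelled for the next self-training round")
```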
241. Semi-Supervised Remote-Sensing Image Scene Classification Using Representation Consistency Siamese Network.
- Author
-
Miao, Wang, Geng, Jie, and Jiang, Wen
- Subjects
- *
REMOTE-sensing images , *ARTIFICIAL neural networks , *DEEP learning , *GENERATIVE adversarial networks , *CONVOLUTIONAL neural networks , *OPTICAL remote sensing - Abstract
Deep learning has achieved excellent performance in remote-sensing image scene classification, since a large number of datasets with annotations can be applied for training. However, in actual applications, there are only a few annotated samples and a large number of unannotated samples in remote-sensing images, which leads to overfitting of the deep model and affects the performance of scene classification. In order to address these problems, a semi-supervised representation consistency Siamese network (SS-RCSN) is proposed for remote-sensing image scene classification. First, considering the intraclass diversity and interclass similarity of remote-sensing images, an Involution-generative adversarial network (GAN) is utilized to extract discriminative features from remote-sensing images via unsupervised learning. Then, a Siamese network with a representation consistency loss is proposed for semi-supervised classification, which aims to reduce the differences between labeled and unlabeled data. Experimental results on the UC Merced dataset, RESISC-45 dataset, aerial image dataset (AID), and RS dataset demonstrate that our method yields superior classification performance compared with other semi-supervised learning (SSL) methods. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
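The representation consistency term described in the entry above can be sketched as a cosine-similarity loss between the two Siamese branches, combined with cross-entropy on the labelled subset; the exact loss and weighting used by SS-RCSN may differ, so treat this as an assumption-laden sketch.

```python
import torch
import torch.nn.functional as F

def representation_consistency_loss(z1: torch.Tensor, z2: torch.Tensor) -> torch.Tensor:
    """Encourage the two Siamese branches to produce similar representations
    for two views of the same (possibly unlabelled) image; here 1 - cosine similarity."""
    return 1.0 - F.cosine_similarity(z1, z2, dim=1).mean()

# Combined objective on a mixed batch: cross-entropy on the labelled images
# plus the consistency term on all images, weighted by a hyperparameter.
z1, z2 = torch.randn(16, 128), torch.randn(16, 128)          # branch representations
logits, labels = torch.randn(4, 45), torch.randint(0, 45, (4,))  # labelled subset
loss = F.cross_entropy(logits, labels) + 0.5 * representation_consistency_loss(z1, z2)
print(loss.item())
```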
242. Influence of Different Activation Functions on Deep Learning Models in Indoor Scene Images Classification.
- Author
-
Anami, Basavaraj S. and Sagarnal, Chetan V.
- Abstract
Deep learning has made significant breakthroughs in computer vision and object recognition, especially in improving recognition accuracy. Scene recognition algorithms have evolved over the years because of developments in machine learning and deep convolutional neural networks (DCNNs). In this paper, the classification of indoor scenes using three deep learning models, namely ResNet, MobileNet, and EfficientNet, is attempted. The influence of activation functions on classification accuracy is explored. Three activation functions, namely tanh, ReLU, and sigmoid, are deployed in the work. The MIT-67 indoor dataset is split into scenes with and without people to test the effect on classification accuracy. The novelty of the work lies in splitting the dataset, based on the spatial layout, into two groups, namely with people and without people. Among the three pre-trained models, EfficientNet gave the best results. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
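A minimal sketch of the kind of comparison reported in the entry above: the same classification head is built three times with tanh, ReLU, and sigmoid activations over fixed backbone features. The hidden size, the class count (67 for MIT-67), and the idea of swapping only the head activation are simplifying assumptions, not the authors' exact setup.

```python
import torch
import torch.nn as nn

def make_head(activation: nn.Module, in_dim: int = 1280, num_classes: int = 67) -> nn.Sequential:
    """Classification head with a swappable hidden activation so the same
    backbone features can be compared under tanh, ReLU and sigmoid."""
    return nn.Sequential(
        nn.Linear(in_dim, 512),
        activation,
        nn.Dropout(0.3),
        nn.Linear(512, num_classes),
    )

features = torch.randn(8, 1280)   # stand-in for backbone (e.g. MobileNet) features
for name, act in [("tanh", nn.Tanh()), ("relu", nn.ReLU()), ("sigmoid", nn.Sigmoid())]:
    head = make_head(act)
    print(name, head(features).shape)
```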
243. Land type authenticity check of vector patches using a self-trained deep learning model.
- Author
-
Guo, Zihui, Liu, Wei, Xu, Jiawei, Li, Erzhu, Li, Xing, Zhang, Lianpeng, and Zhang, Jiaxing
- Subjects
- *
DEEP learning , *BIVECTORS , *VECTOR data - Abstract
Quality checks are important to ensure the accuracy and reliability of spatial data. However, current methods primarily focus on attributes and geometric information, and methods to verify the accuracy of land-type classification for vector patches are lacking. Therefore, in this paper, we proposed a framework for the automatic verification of the land type of vector patches, including the segmentation of complex vector patches, automatic acquisition of training samples, and automatic extraction of suspicious patches based on deep learning scene interpretation results. First, an innovative segmentation method based on an oriented bounding box was proposed. Then, scene interpretation of the segmented processing units was performed. Finally, the scene interpretation results and the original vector patch data were combined, and segmentation units that did not match the category information were automatically identified. The suspicious patches were extracted, and the land-type authenticity of vector patches was automatically checked. Experimental results showed that, in the study area, the accuracy, precision, recall, and F1 values for the model based on first-level land categories were 0.989, 0.842, 0.906, and 0.873, respectively, and those for second-level land categories were 0.994, 0.770, 0.891, and 0.826, respectively. Accordingly, the newly developed method provides reliable technical support for checking the land-type authenticity of vector patches. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
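The final flagging step described in the entry above (combining scene-interpretation results with the recorded land type and extracting mismatching units) can be sketched as a simple comparison; the field names and confidence threshold below are hypothetical, not taken from the paper.

```python
# Illustrative flagging step: compare the model-predicted scene class of each
# segmentation unit with the land type recorded in the vector attribute table
# and mark confident mismatches as suspicious. Field names are hypothetical.
patches = [
    {"patch_id": 1, "recorded_type": "cropland", "predicted_type": "cropland", "confidence": 0.96},
    {"patch_id": 2, "recorded_type": "forest",   "predicted_type": "built-up", "confidence": 0.91},
    {"patch_id": 3, "recorded_type": "water",    "predicted_type": "water",    "confidence": 0.55},
]

def flag_suspicious(patches, min_confidence=0.8):
    """Return patch ids whose predicted land type disagrees with the recorded
    type at high confidence; low-confidence predictions are left for manual review."""
    return [p["patch_id"] for p in patches
            if p["confidence"] >= min_confidence
            and p["predicted_type"] != p["recorded_type"]]

print(flag_suspicious(patches))  # -> [2]
```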
244. Scene Classification of Remotely Sensed Images using Optimized RSISC-16 Net Deep Convolutional Neural Network Model
- Author
-
P. Deepan, L. R. Sudha, K. Kalaivani, and J. Ganesh
- Subjects
Optimized RSISC-16 ,Scene classification ,remote sensing image ,and convolutional neural network ,Management information systems ,T58.6-58.62 - Abstract
Remote Sensing Image (RSI) analysis has seen a massive increase in popularity over the last few decades, due to the advancement of deep learning models. A wide variety of deep learning models have emerged for the task of scene classification in remote sensing image analysis, and the majority have shown significant success. However, there remains considerable scope for improving efficiency in characterizing complex patterns in remote sensing imagery. We address this by expanding the architecture of VGG-16 Net and fine-tuning hyperparameters such as batch size, dropout probabilities, and activation functions to create the optimized Remote Sensing Image Scene Classification (RSISC-16 Net) deep learning model for scene classification. Hyperparameter optimization is carried out using the Talos tool, which increases efficiency and reduces the risk of over-fitting. Our proposed RSISC-16 Net model outperforms the VGG-16 Net model, according to experimental results.
- Published
- 2022
- Full Text
- View/download PDF
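A hedged sketch of the fine-tuning setup described in the entry above: a VGG-16 backbone is extended with a classification head whose dropout and activation come from a hyperparameter dictionary of the sort a Talos scan (or any search tool) would supply. The exact RSISC-16 Net architecture and search space are not reproduced.

```python
import torch
import torch.nn as nn
from torchvision.models import vgg16

def build_rs_scene_model(params: dict, num_classes: int = 21) -> nn.Module:
    """VGG-16 backbone extended with a tunable classification head.

    `params` mimics a hyperparameter-scan dictionary (hidden activation and
    dropout rate); sizes and choices are illustrative.
    """
    activations = {"relu": nn.ReLU(), "tanh": nn.Tanh(), "elu": nn.ELU()}
    backbone = vgg16(weights=None)            # random init here; load pretrained weights in practice
    backbone.classifier = nn.Sequential(      # replace the stock head
        nn.Linear(512 * 7 * 7, 1024),
        activations[params["activation"]],
        nn.Dropout(params["dropout"]),
        nn.Linear(1024, num_classes),
    )
    return backbone

model = build_rs_scene_model({"activation": "relu", "dropout": 0.4})
print(model(torch.randn(1, 3, 224, 224)).shape)   # torch.Size([1, 21])
```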
245. Adversarial Remote Sensing Scene Classification Based on Lie Group Feature Learning
- Author
-
Chengjun Xu, Jingqian Shu, and Guobin Zhu
- Subjects
generative adversarial network ,Lie group ,remote sensing ,scene classification ,Science - Abstract
Convolutional Neural Networks have been widely used in remote sensing scene classification. Since this kind of model needs a large number of training samples containing category information, a Generative Adversarial Network (GAN) is usually used to address the lack of samples. However, a GAN mainly generates scene data samples that do not contain category information. To address this problem, a novel supervised adversarial Lie Group feature learning network is proposed. In the case of limited data samples, the model can effectively generate data samples with category information. There are two main differences between our method and the traditional GAN. First, our model takes category information and data samples as its input and optimizes a category-information constraint in the loss function, so that data samples containing category information can be generated. Second, an object-scale sample generation strategy is introduced, which can generate data samples of different scales and ensure that the generated samples contain richer feature information. After large-scale experiments on two publicly available and challenging datasets, we find that our method achieves better scene classification accuracy even with limited data samples.
- Published
- 2023
- Full Text
- View/download PDF
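The key difference from a plain GAN noted in the entry above, conditioning generation on category information, can be sketched with a label-conditioned generator; the Lie Group feature learning and the object-scale generation strategy of the paper are not modelled here.

```python
import torch
import torch.nn as nn

class ConditionalGenerator(nn.Module):
    """Generator that takes a noise vector together with a class label so
    the generated scene samples carry category information (plain cGAN sketch)."""
    def __init__(self, noise_dim=100, num_classes=30, img_pixels=64 * 64 * 3):
        super().__init__()
        self.label_embed = nn.Embedding(num_classes, 50)
        self.net = nn.Sequential(
            nn.Linear(noise_dim + 50, 512),
            nn.ReLU(inplace=True),
            nn.Linear(512, img_pixels),
            nn.Tanh(),
        )

    def forward(self, z: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
        cond = torch.cat([z, self.label_embed(labels)], dim=1)   # concatenate noise and label embedding
        return self.net(cond).view(-1, 3, 64, 64)

gen = ConditionalGenerator()
fake = gen(torch.randn(4, 100), torch.randint(0, 30, (4,)))
print(fake.shape)  # torch.Size([4, 3, 64, 64])
```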
246. Deep Feature Aggregation Framework Driven by Graph Convolutional Network for Scene Classification in Remote Sensing.
- Author
-
Xu, Kejie, Huang, Hong, Deng, Peifang, and Li, Yuan
- Subjects
- *
CONVOLUTIONAL neural networks , *REMOTE sensing , *DEEP learning , *LAND use - Abstract
Scene classification of high spatial resolution (HSR) images can provide data support for many practical applications, such as land planning and utilization, and it has been a crucial research topic in the remote sensing (RS) community. Recently, deep learning methods driven by massive data have shown an impressive ability for feature learning in HSR scene classification, especially convolutional neural networks (CNNs). Although traditional CNNs achieve good classification results, it is difficult for them to effectively capture potential context relationships. Graphs have a powerful capacity to represent the relevance of data, and graph-based deep learning methods can spontaneously learn intrinsic attributes contained in RS images. Inspired by these facts, we develop a deep feature aggregation framework driven by a graph convolutional network (DFAGCN) for HSR scene classification. First, an off-the-shelf CNN pretrained on ImageNet is employed to obtain multilayer features. Second, a graph convolutional network-based model is introduced to effectively reveal patch-to-patch correlations of convolutional feature maps, from which more refined features can be harvested. Finally, a weighted concatenation method is adopted to integrate multiple features (i.e., multilayer convolutional features and fully connected features) by introducing three weighting coefficients, and a linear classifier is then employed to predict the semantic classes of query images. Experimental results on the UCM, AID, RSSCN7, and NWPU-RESISC45 data sets demonstrate that the proposed DFAGCN framework obtains more competitive performance than some state-of-the-art scene classification methods in terms of OAs. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
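The weighted concatenation step described in the entry above can be sketched with learnable per-feature coefficients followed by a linear classifier; the graph convolutional refinement of patch-to-patch correlations is omitted, and the feature dimensions are illustrative.

```python
import torch
import torch.nn as nn

class WeightedConcatFusion(nn.Module):
    """Concatenate several feature vectors after scaling each with its own
    learnable coefficient, then classify with a linear layer (sketch only)."""
    def __init__(self, dims, num_classes):
        super().__init__()
        self.coeffs = nn.Parameter(torch.ones(len(dims)))   # one weight per feature source
        self.classifier = nn.Linear(sum(dims), num_classes)

    def forward(self, feats):
        scaled = [w * f for w, f in zip(self.coeffs, feats)]
        return self.classifier(torch.cat(scaled, dim=1))

# Example: two convolutional-stage descriptors plus a fully connected feature.
feats = [torch.randn(4, 256), torch.randn(4, 512), torch.randn(4, 4096)]
logits = WeightedConcatFusion([256, 512, 4096], num_classes=45)(feats)
print(logits.shape)  # torch.Size([4, 45])
```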
247. Self-supervised in-domain representation learning for remote sensing image scene classification.
- Author
-
Ghanbarzadeh A and Soleimani H
- Abstract
Transferring ImageNet pre-trained weights to various remote sensing tasks has produced acceptable results and reduced the need for labeled samples. However, the domain differences between ground imagery and remote sensing images limit the performance of such transfer learning. The difficulty of annotating remote sensing images is well known, as it requires domain experts and more time, whereas unlabeled data are readily available. Recently, self-supervised learning, a subset of unsupervised learning, emerged and significantly improved representation learning. Recent research has demonstrated that self-supervised learning methods capture visual features that are more discriminative and transferable than the supervised ImageNet weights. We are motivated by these facts to pre-train in-domain representations of remote sensing imagery using contrastive self-supervised learning and transfer the learned features to other related remote sensing datasets. Specifically, we used the SimSiam algorithm to pre-train in-domain knowledge of remote sensing datasets and then transferred the obtained weights to other scene classification datasets. Thus, we have obtained state-of-the-art results on five land cover classification datasets with varying numbers of classes and spatial resolutions. In addition, by conducting appropriate experiments, including feature pre-training using datasets with different attributes, we have identified the most influential factors that make a dataset a good choice for obtaining in-domain features. We have transferred the features obtained by pre-training SimSiam on remote sensing datasets to various downstream tasks and used them as initial weights for fine-tuning. Moreover, we have linearly evaluated the obtained representations in cases where the number of samples per class is limited. Our experiments demonstrate that using a higher-resolution dataset during the self-supervised pre-training stage results in learning more discriminative and general representations., Competing Interests: The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper., (© 2024 The Authors.)
- Published
- 2024
- Full Text
- View/download PDF
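For reference, the core SimSiam objective used for the in-domain pre-training described in the entry above is a symmetric negative cosine similarity with stop-gradient. The sketch below assumes standard projector/predictor outputs and illustrative dimensions; the backbone, augmentations, and training loop are not shown.

```python
import torch
import torch.nn.functional as F

def simsiam_loss(p1, z1, p2, z2):
    """Symmetric negative cosine similarity with stop-gradient, the core of
    the SimSiam objective.

    p* are predictor outputs and z* are projector outputs for the two
    augmented views of the same remote sensing image.
    """
    def d(p, z):
        return -F.cosine_similarity(p, z.detach(), dim=1).mean()   # stop-gradient on z
    return 0.5 * d(p1, z2) + 0.5 * d(p2, z1)

# Stand-in tensors for one training step (dimensions are illustrative).
z1, z2 = torch.randn(32, 2048), torch.randn(32, 2048)
p1, p2 = torch.randn(32, 2048), torch.randn(32, 2048)
print(simsiam_loss(p1, z1, p2, z2).item())
```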
248. Improving remote sensing scene classification using dung Beetle optimization with enhanced deep learning approach.
- Author
-
Alamgeer M, Al Mazroa A, S Alotaibi S, Alanazi MH, Alonazi M, and S Salama A
- Abstract
Remote sensing (RS) scene classification has received significant consideration because of its extensive use by the RS community. Scene classification in satellite images has widespread uses in remote surveillance, environmental observation, remote scene analysis, urban planning, and earth observation. Because of the immense benefits of the land scene classification task, various approaches have been presented recently for automatically classifying land scenes from remote sensing images (RSIs). Several approaches based on convolutional neural networks (CNNs) have been presented for classifying complex RS scenes; however, they could only partially capture the context of RSIs due to problematic textures, cluttered backgrounds, the tiny size of objects, and considerable differences in object scale. This article designs a Remote Sensing Scene Classification using Dung Beetle Optimization with Enhanced Deep Learning (RSSC-DBOEDL) approach. The purpose of the RSSC-DBOEDL technique is to categorize the different varieties of scenes that exist in RSIs. In the presented RSSC-DBOEDL technique, an enhanced MobileNet model is first deployed as a feature extractor. The DBO method is applied for hyperparameter tuning of the enhanced MobileNet model. The RSSC-DBOEDL technique then uses a multi-head attention-based long short-term memory (MHA-LSTM) technique to classify the scenes in the RSI. The RSSC-DBOEDL approach has been evaluated on benchmark RSI datasets. The simulation results show superior accuracies of 98.75 % and 95.07 % on the UC Merced and EuroSAT datasets, respectively, compared with other existing methods across distinct measures., Competing Interests: The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper., (© 2024 The Authors.)
- Published
- 2024
- Full Text
- View/download PDF
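The MHA-LSTM classification stage described in the entry above can be sketched, under assumptions about how MobileNet feature maps are turned into a sequence, with PyTorch's built-in multi-head attention and LSTM modules; dimensions are illustrative and the dung beetle optimization search is not shown.

```python
import torch
import torch.nn as nn

class MHALSTMClassifier(nn.Module):
    """Multi-head attention followed by an LSTM over a sequence of patch
    features (e.g. spatial positions of a MobileNet feature map), ending in
    a scene classifier (dimensions are illustrative)."""
    def __init__(self, feat_dim=1280, num_classes=21, heads=4, hidden=256):
        super().__init__()
        self.attn = nn.MultiheadAttention(feat_dim, heads, batch_first=True)
        self.lstm = nn.LSTM(feat_dim, hidden, batch_first=True)
        self.fc = nn.Linear(hidden, num_classes)

    def forward(self, seq):                 # seq: (B, T, feat_dim)
        attended, _ = self.attn(seq, seq, seq)   # self-attention over the sequence
        _, (h, _) = self.lstm(attended)          # keep the final hidden state
        return self.fc(h[-1])

# A 7x7 MobileNet feature map flattened into a 49-step sequence.
seq = torch.randn(2, 49, 1280)
print(MHALSTMClassifier()(seq).shape)  # torch.Size([2, 21])
```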
249. Snapshot ensemble-based residual network (SnapEnsemResNet) for remote sensing image scene classification
- Author
-
Siddiqui, Muhammad Ibraheem, Khan, Khurram, Fazil, Adnan, and Zakwan, Muhammad
- Published
- 2023
- Full Text
- View/download PDF
250. Semantic embedding: scene image classification using scene-specific objects
- Author
-
Parseh, Mohammad Javad, Rahmanimanesh, Mohammad, Keshavarzi, Parviz, and Azimifar, Zohreh
- Published
- 2023
- Full Text
- View/download PDF