Search Results (57 results)
2. Ship Detection with Deep Learning in Optical Remote-Sensing Images: A Survey of Challenges and Advances.
- Author: Zhao, Tianqi, Wang, Yongcheng, Li, Zheng, Gao, Yunxiao, Chen, Chi, Feng, Hao, and Zhao, Zhikang
- Subjects: *DEEP learning, *REMOTE-sensing images, *OPTICAL remote sensing, *OPTICAL images, *CONVOLUTIONAL neural networks, *TRANSFORMER models, *FEATURE extraction
- Abstract
Ship detection aims to automatically identify whether ships are present in an image and to classify and localize them precisely. Whether based on early hand-crafted methods or on deep learning, ship detection is dedicated to exploring the inherent characteristics of ships to enhance recall. Nowadays, high-precision ship detection plays a crucial role in civilian and military applications. To provide a comprehensive review of ship detection in optical remote-sensing images (SDORSIs), this paper summarizes the challenges as a guide: complex marine environments, insufficient discriminative features, large scale variations, dense and rotated distributions, large aspect ratios, and imbalance between positive and negative samples. We meticulously review the improvement methods and analyze their strengths and weaknesses in detail. We compile ship information from common optical remote-sensing image datasets and compare algorithm performance. We also compare and analyze the feature extraction capabilities of CNN- and Transformer-based backbones, seeking new directions for the development of SDORSIs. Promising prospects are provided to facilitate further research. [ABSTRACT FROM AUTHOR]
- Published: 2024
- Full Text: View/download PDF
3. CroplandCDNet: Cropland Change Detection Network for Multitemporal Remote Sensing Images Based on Multilayer Feature Transmission Fusion of an Adaptive Receptive Field.
- Author: Wu, Qiang, Huang, Liang, Tang, Bo-Hui, Cheng, Jiapei, Wang, Meiqi, and Zhang, Zixuan
- Subjects: *CONVOLUTIONAL neural networks, *CHANGE-point problems, *FARMS, *MARKOV random fields, *REMOTE-sensing images, *FEATURE extraction
- Abstract
Dynamic monitoring of cropland using high-spatial-resolution remote sensing images is a powerful means to protect cropland resources. However, when a change detection method based on a convolutional neural network uses many convolution and pooling operations to mine deep cropland features, the accumulation of irrelevant features and the loss of key features lead to poor detection results. To solve this problem, a novel cropland change detection network (CroplandCDNet) is proposed in this paper; it combines an adaptive receptive field with multiscale feature transmission fusion to detect cropland change information accurately. CroplandCDNet first extracts multiscale cropland features from bitemporal remote sensing images through its feature extraction module and then embeds the receptive-field-adaptive SK attention (SKA) module to emphasize cropland change. The SKA module uses spatial context information to dynamically adjust the convolution kernel size for cropland features at different scales. Finally, multiscale features and difference features are transmitted and fused layer by layer to obtain the cropland change map. In the experiments, the proposed method is compared with six advanced change detection methods on the cropland change detection dataset (CLCD). CroplandCDNet achieves the best F1 score and overall accuracy (OA), at 76.04% and 94.47%, respectively, while its precision (76.46%) and recall (75.63%) are second best among all models. A generalization experiment on the Jilin-1 dataset further verified the reliability of CroplandCDNet for cropland change detection. [ABSTRACT FROM AUTHOR]
- Published: 2024
- Full Text: View/download PDF
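The SKA module in the CroplandCDNet abstract above adjusts the receptive field by weighting branches with different kernel sizes. A toy, pure-Python sketch of the general selective-kernel idea (the function names and the 1-D moving-average stand-in for convolution are illustrative, not the authors' implementation):

```python
import math

def softmax(xs):
    m = max(xs)
    exp = [math.exp(x - m) for x in xs]
    z = sum(exp)
    return [e / z for e in exp]

def moving_average(signal, k):
    """Crude 1-D stand-in for a convolution with kernel size k."""
    half = k // 2
    out = []
    for i in range(len(signal)):
        window = signal[max(0, i - half):i + half + 1]
        out.append(sum(window) / len(window))
    return out

def selective_kernel(signal, kernel_sizes=(3, 5)):
    """Fuse branches with different receptive fields using attention
    weights derived from each branch's global average (the SK idea)."""
    branches = [moving_average(signal, k) for k in kernel_sizes]
    descriptors = [sum(b) / len(b) for b in branches]  # global average pooling
    weights = softmax(descriptors)                     # per-branch attention
    fused = [sum(w * b[i] for w, b in zip(weights, branches))
             for i in range(len(signal))]
    return fused, weights
```

In the real module the attention weights are produced per channel by a small fully connected network; the global-average descriptor above keeps the sketch self-contained.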
4. Learning Point Processes and Convolutional Neural Networks for Object Detection in Satellite Images.
- Author: Mabon, Jules, Ortner, Mathias, and Zerubia, Josiane
- Subjects: *OBJECT recognition (Computer vision), *CONVOLUTIONAL neural networks, *POINT processes, *REMOTE-sensing images, *GABOR filters, *ARTIFICIAL satellites
- Abstract
Convolutional neural networks (CNNs) have shown great results on object detection tasks by learning texture and pattern extraction filters. However, object-level interactions are harder to capture without increasing the complexity of the architecture. Point process models, on the other hand, solve detection for the configuration of objects as a whole, allowing the image data and priors on object interactions to be factored in together. In this paper, we propose combining the information extracted by a CNN with object priors within a Markov marked point process framework, and we propose a method to learn the parameters of this energy-based model. We apply the model to the detection of small vehicles in optical satellite imagery, where noise and small object sizes require the image information to be complemented with object interaction priors. [ABSTRACT FROM AUTHOR]
- Published: 2024
- Full Text: View/download PDF
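The marked-point-process framing above scores an entire configuration at once: a data term from the CNN plus an interaction prior. A minimal sketch of such an energy function (the point-and-radius form and the fixed penalty are illustrative, not the paper's model):

```python
def configuration_energy(points, scores, min_dist=2.0, penalty=1.0):
    """Energy of a detection configuration: lower is better.
    The data term rewards high CNN detection scores; the prior term
    penalizes pairs of detections that are implausibly close."""
    data_term = -sum(scores)
    prior_term = 0.0
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            dx = points[i][0] - points[j][0]
            dy = points[i][1] - points[j][1]
            if (dx * dx + dy * dy) ** 0.5 < min_dist:
                prior_term += penalty
    return data_term + prior_term
```

Inference then searches for the configuration minimizing this energy, which is where the point-process machinery (birth-death sampling, learned parameters) comes in.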
5. On-Board Multi-Class Geospatial Object Detection Based on Convolutional Neural Network for High Resolution Remote Sensing Images.
- Author: Shen, Yanyun, Liu, Di, Chen, Junyi, Wang, Zhipan, Wang, Zhe, and Zhang, Qingling
- Subjects: *OBJECT recognition (Computer vision), *CONVOLUTIONAL neural networks, *REMOTE-sensing images, *REMOTE sensing, *DATA transmission systems, *URBAN planning, *OPTICAL remote sensing
- Abstract
Multi-class geospatial object detection in high-resolution remote sensing images has significant potential in domains such as industrial production, military warning, disaster monitoring, and urban planning. However, the traditional remote sensing object detection process involves several time-consuming steps: image acquisition, image download, ground processing, and object detection. These steps may be unsuitable for tasks with short timeliness requirements, such as military warning and disaster monitoring. Additionally, transmission of massive data from satellites to the ground is limited by bandwidth, resulting in time delays and redundant information such as cloud-covered images. To address these challenges and use information efficiently, this paper proposes a comprehensive on-board multi-class geospatial object detection scheme. First, the satellite imagery is sliced, and the PID-Net (Proportional-Integral-Derivative Network) method detects and filters out cloud-covered tiles. Next, our Manhattan Intersection over Union (MIOU) loss-based YOLO (You Only Look Once) v7-Tiny method detects remote sensing objects in the remaining tiles. Finally, the detection results are mapped back to the original image, and the truncated NMS (Non-Maximum Suppression) method filters out repeated and noisy boxes. To validate the reliability of the scheme, this paper creates a new dataset, DOTA-CD (Dataset for Object Detection in Aerial Images-Cloud Detection). Experiments on both ground and on-board equipment using the AIR-CD, DOTA, and DOTA-CD datasets demonstrate the effectiveness of our method. [ABSTRACT FROM AUTHOR]
- Published: 2023
- Full Text: View/download PDF
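The last step of the pipeline above removes duplicate boxes with a truncated NMS variant; the standard greedy NMS it builds on can be sketched as follows (a reference sketch, not the authors' code):

```python
def iou(a, b):
    """Intersection over union for axis-aligned boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    if inter == 0:
        return 0.0
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def nms(boxes, scores, iou_thresh=0.5):
    """Greedy NMS: keep the highest-scoring box, drop its overlaps, repeat."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        order = [i for i in order if iou(boxes[best], boxes[i]) < iou_thresh]
    return keep
```

The paper's truncated variant additionally accounts for boxes cut by tile boundaries when detections are mapped back to the full image; that logic is not reproduced here.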
6. A CatBoost-Based Model for the Intensity Detection of Tropical Cyclones over the Western North Pacific Based on Satellite Cloud Images.
- Author: Zhong, Wei, Zhang, Deyuan, Sun, Yuan, and Wang, Qian
- Subjects: *TROPICAL cyclones, *REMOTE-sensing images, *CONVOLUTIONAL neural networks, *STANDARD deviations, *BRIGHTNESS temperature
- Abstract
A CatBoost-based intelligent tropical cyclone (TC) intensity detection model was built to quantify the intensity of TCs over the Western North Pacific (WNP), using cloud-top brightness temperature (CTBT) data from Fengyun-2F (FY-2F) and Fengyun-2G (FY-2G) and best-track data from the China Meteorological Administration (CMA-BST) for recent years (2015–2018). The CatBoost algorithm features a greedy combination strategy, an ordering principle that mitigates gradient bias and prediction shift, and oblivious trees for fast scoring. Compared with previous studies based on pure convolutional neural network (CNN) models, the CatBoost-based model showed better skill in detecting TC intensity, with a root mean square error (RMSE) of 3.74 m s−1. Beyond the three features above, two design choices contributed: the model introduces prior physical factors (e.g., cloud structure and shape, deep convection, and background fields) into its training process, and it expands the dataset from 2342 to 13,471 samples through hourly interpolation of the original data. This paper also investigated the model's errors across categories of TC intensity. The results show that the model has systematic biases, namely overestimation (underestimation) of intensities for TCs weaker (stronger) than typhoon level, and its errors in detecting weaker (stronger) TCs were smaller (larger). This implies that factors beyond the CTBT should be included to further reduce errors in detecting strong TCs. [ABSTRACT FROM AUTHOR]
- Published: 2023
- Full Text: View/download PDF
7. Convolutional Neural Network-Based Method for Agriculture Plot Segmentation in Remote Sensing Images.
- Author: Qi, Liang, Zuo, Danfeng, Wang, Yirong, Tao, Ye, Tang, Runkang, Shi, Jiayu, Gong, Jiajun, and Li, Bangyu
- Subjects: *IMAGE segmentation, *REMOTE sensing, *REMOTE-sensing images, *LAND use, *FEATURE extraction, *AGRICULTURAL productivity
- Abstract
Accurate delineation of individual agricultural plots, the foundational units for agriculture-based activities, is crucial for effective government oversight of agricultural productivity and land utilization. To improve the accuracy of plot segmentation in high-resolution remote sensing images, this paper collects GF-2 satellite remote sensing images, builds datasets with ArcGIS 10.3.1, and trains UNet, SegNet, DeepLabV3+, and TransUNet networks for comparative analysis. The TransUNet network, which achieves the best segmentation results, is then optimized in both its residual module and its skip connections to further improve plot segmentation performance. This article introduces deformable convolutions (Deformable ConvNets) into the residual module to improve the original ResNet50 feature extraction network and adds the convolutional block attention module (CBAM) at the skip connections. Experimental results indicate that the optimized TransUNet-based plot segmentation algorithm achieves an accuracy of 86.02%, a recall of 83.32%, an F1-score of 84.67%, and an intersection over union (IoU) of 86.90%. Compared with the original TransUNet, whose F1-score is 81.94% and whose IoU is 69.41%, the optimized network significantly improves plot segmentation performance, verifying the effectiveness and reliability of the algorithm. [ABSTRACT FROM AUTHOR]
- Published: 2024
- Full Text: View/download PDF
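The accuracy, recall, F1-score, and IoU figures reported above all follow from pixel-level confusion counts; for reference:

```python
def segmentation_metrics(tp, fp, fn, tn):
    """Standard pixel-level metrics from confusion-matrix counts:
    tp/fp/fn/tn = true/false positives and negatives."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / (precision + recall),
        "iou": tp / (tp + fp + fn),  # intersection over union
    }
```

Note that IoU discards true negatives, which is why a model can rank differently on IoU than on overall accuracy.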
8. DMAU-Net: An Attention-Based Multiscale Max-Pooling Dense Network for the Semantic Segmentation in VHR Remote-Sensing Images.
- Author: Yang, Yang, Dong, Junwu, Wang, Yanhui, Yu, Bibo, and Yang, Zhigang
- Subjects: *REMOTE-sensing images, *CONVOLUTIONAL neural networks, *REMOTE sensing, *IMAGE recognition (Computer vision), *IMAGE segmentation, *FEATURE extraction
- Abstract
High-resolution remote-sensing images contain rich feature information, including texture, structure, shape, and other geometric details, and the relationships among target features are complex. These factors make it difficult for classical convolutional neural networks to obtain ideal results when classifying ground objects in remote-sensing images. To address this issue, we propose an attention-based multiscale max-pooling dense network (DMAU-Net), based on U-Net, for ground object classification. In the encoder, an integrated max-pooling module with dense connections enhances the quality of the feature maps and thus the feature extraction capability of the network. In the decoder, we introduce the Efficient Channel Attention (ECA) module, which strengthens effective features and suppresses irrelevant information. To validate the ground object classification performance of the proposed network, we conducted experiments on the Vaihingen and Potsdam datasets provided by the International Society for Photogrammetry and Remote Sensing (ISPRS) and compared DMAU-Net with other mainstream semantic segmentation models. The experimental results show that DMAU-Net effectively improves the accuracy of ground object classification in high-resolution remote-sensing images. The feature boundaries it produces are clear and regionally complete, enhancing the ability to delineate object edges. [ABSTRACT FROM AUTHOR]
- Published: 2023
- Full Text: View/download PDF
9. Cloud Removal from Satellite Images Using a Deep Learning Model with the Cloud-Matting Method.
- Author: Ma, Deying, Wu, Renzhe, Xiao, Dongsheng, and Sui, Baikai
- Subjects: *REMOTE-sensing images, *DEEP learning, *CONTROL groups, *OPTICAL remote sensing, *CONVOLUTIONAL neural networks, *OPTICAL limiting
- Abstract
Clouds severely limit the application of optical remote-sensing images. In this paper, we remove clouds from satellite images with a novel method that, from the perspective of image superposition, models ground-surface reflections and cloud-top reflections as a linear mixture of image elements. We use a two-step convolutional neural network to extract cloud transparency information and then recover the ground-surface information of thin-cloud regions. Given the poor balance of the generated samples, this paper also improves the binary Tversky loss function and applies it to multi-class tasks. The model was validated on a simulated dataset and on the ALCD dataset. The results show that the model outperformed the control groups in both cloud detection and cloud removal. By building cloud matting on top of cloud detection, the model locates clouds in images more precisely. In addition, when thick and thin clouds coexist, the model successfully recovers the surface information of thin-cloud regions without damaging the original image information. [ABSTRACT FROM AUTHOR]
- Published: 2023
- Full Text: View/download PDF
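The binary Tversky loss the authors generalize trades off false negatives against false positives via two coefficients; with alpha = beta = 0.5 it reduces to the Dice loss. A minimal sketch of the binary form (the paper's multi-class extension is not reproduced here):

```python
def tversky_loss(pred, target, alpha=0.5, beta=0.5, eps=1e-7):
    """pred: flat list of soft predictions in [0, 1];
    target: flat list of {0, 1} labels.
    alpha weights false negatives, beta weights false positives."""
    tp = sum(p * t for p, t in zip(pred, target))
    fn = sum((1 - p) * t for p, t in zip(pred, target))
    fp = sum(p * (1 - t) for p, t in zip(pred, target))
    tversky_index = (tp + eps) / (tp + alpha * fn + beta * fp + eps)
    return 1.0 - tversky_index
```

Raising alpha above beta makes missed cloud pixels cost more than false alarms, which is the usual lever for imbalanced segmentation.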
10. Improvements in Forest Segmentation Accuracy Using a New Deep Learning Architecture and Data Augmentation Technique.
- Author: He, Yan, Jia, Kebin, and Wei, Zhihao
- Subjects: *DEEP learning, *DATA augmentation, *CARBON cycle, *IMAGE segmentation, *REMOTE sensing, *CONVOLUTIONAL neural networks, *REMOTE-sensing images
- Abstract
Forests are critical to mitigating global climate change and regulating climate through their role in the global carbon and water cycles, so accurate monitoring of forest cover is essential. With the development of deep learning, image segmentation networks based on convolutional neural networks have shown significant advantages in remote sensing image analysis. However, deep learning networks typically require a large amount of manually labelled ground truth for training, and widely used segmentation networks struggle to extract details from large-scale, high-resolution satellite imagery. Improving the accuracy of forest image segmentation therefore remains a challenge. To reduce the cost of manual labelling, this paper proposes a data augmentation method that expands the training data by modifying the spatial distribution of forest remote sensing images. In addition, to improve the network's ability to extract multi-scale detailed features and to exploit the NIR band of satellite images, we propose a high-resolution forest remote sensing image segmentation network that fuses multi-scale features from a double input. Experimental results on the Sanjiangyuan plateau forest dataset show that our method achieves an IoU of 90.19%, outperforming prevalent image segmentation networks. These results demonstrate that the proposed approaches extract forests from remote sensing images more effectively and accurately. [ABSTRACT FROM AUTHOR]
- Published: 2023
- Full Text: View/download PDF
11. Urban Tree Canopy Mapping Based on Double-Branch Convolutional Neural Network and Multi-Temporal High Spatial Resolution Satellite Imagery.
- Author: Chen, Shuaiqiang, Chen, Meng, Zhao, Bingyu, Mao, Ting, Wu, Jianjun, and Bao, Wenxuan
- Subjects: *URBAN trees, *CONVOLUTIONAL neural networks, *REMOTE-sensing images, *SPATIAL resolution, *THROUGHFALL, *DEEP learning, *AUTUMN
- Abstract
Accurate knowledge of urban forest patterns contributes to well-managed urbanization, but accurate urban tree canopy mapping remains challenging because of the complexity of urban structure. In this paper, a new method that combines a double-branch U-Net with multi-temporal satellite images containing phenological information is introduced to map urban tree canopies accurately. On our GF-2 image dataset, a double-branch U-Net built on a feature fusion strategy for multi-temporal images improves pixel-level accuracy by 2.3% in IoU (intersection over union) and 1.3% in F1-score over the mono-temporal U-Net that performs best in existing urban tree canopy mapping studies. We also found that the feature fusion strategy outperforms the early fusion and decision fusion strategies in processing multi-temporal images for this task. Comparing image combinations across seasons, we found that combining summer and autumn images gives the highest accuracy in the study area. Our research provides not only a high-precision urban tree canopy mapping method but also directions for improving accuracy through both model structure and data choices when applying deep learning to urban tree canopy mapping. [ABSTRACT FROM AUTHOR]
- Published: 2023
- Full Text: View/download PDF
12. Dynamic Loss Reweighting Method Based on Cumulative Classification Scores for Long-Tailed Remote Sensing Image Classification.
- Author: Liu, Jiahang, Feng, Ruilei, Chen, Peng, Wang, Xiaozhen, and Ni, Yue
- Subjects: *CONVOLUTIONAL neural networks, *REMOTE sensing, *REMOTE-sensing images
- Abstract
Convolutional neural networks have been widely used in remote sensing classification and have achieved good results. Most of these methods rely on datasets with relatively balanced samples, but such ideal datasets are rare in practice. Long-tailed datasets are very common, and the severely uneven number of samples across categories often leads to poor results, especially for categories with few samples. To address this problem, a novel remote sensing image classification method based on loss reweighting for long-tailed data is proposed in this paper to improve classification accuracy on tail categories. First, instead of the usual weighting by per-category sample counts, cumulative classification scores are used to construct the category weights; these scores effectively combine the number of samples with the difficulty of classification. Then, the imbalance information contained in the relationships between the rows and columns of the cumulative classification score matrix is extracted and used to construct the classification weights for samples from different categories. Finally, the traditional cross-entropy loss function is improved and combined with these category weights to build a new loss reweighting mechanism for long-tailed data. Extensive experiments with different balance ratios on several public datasets, including HistAerial, SIRI-WHU, NWPU-RESISC45, PatternNet, and AID, verify the effectiveness of the proposed method. Compared with similar methods, our method achieves higher classification accuracy and stronger robustness. [ABSTRACT FROM AUTHOR]
- Published: 2023
- Full Text: View/download PDF
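The reweighting idea above, larger loss weights for tail categories, can be sketched as follows. The weight construction here is a deliberately simple stand-in (inverse score, mean-normalized), not the paper's row/column analysis of the cumulative classification score matrix:

```python
import math

def weights_from_scores(cumulative_scores):
    """Toy stand-in: categories with lower cumulative classification
    scores (rare or hard classes) get larger weights, normalized so
    the mean weight is 1."""
    inv = [1.0 / s for s in cumulative_scores]
    mean = sum(inv) / len(inv)
    return [v / mean for v in inv]

def weighted_cross_entropy(probs, label, class_weights):
    """Cross-entropy for one sample, scaled by its category's weight."""
    return -class_weights[label] * math.log(probs[label])
```

With such weights, a misclassified tail-class sample contributes more to the gradient than a head-class sample, counteracting the imbalance.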
13. Capacity Estimation of Solar Farms Using Deep Learning on High-Resolution Satellite Imagery.
- Author: Ravishankar, Rashmi, AlMahmoud, Elaf, Habib, Abdulelah, and de Weck, Olivier L.
- Subjects: *DEEP learning, *REMOTE-sensing images, *SOLAR power plants, *CONVOLUTIONAL neural networks, *MACHINE learning, *OPTICAL remote sensing
- Abstract
Global solar photovoltaic capacity has consistently doubled every 18 months over the last two decades, going from 0.3 GW in 2000 to 643 GW in 2019, and is forecast to reach 4240 GW by 2040. However, these numbers are uncertain, and virtually all reporting on deployments lacks a unified source of information or validation. In this paper, we propose, optimize, and validate a deep learning framework to detect and map solar farms by applying a state-of-the-art semantic segmentation convolutional neural network to satellite imagery. As the final step in the pipeline, we propose a model to estimate the energy generation capacity of the detected solar facilities. Objectively, the deep learning model achieved highly competitive performance, including a mean accuracy of 96.87% and a Jaccard index (intersection over union of classified pixels) of 95.5%. Subjectively, it detected the spaces between panels, producing segmentation output at a sub-farm level that was better than human labeling. Finally, the detected areas and predicted generation capacities were validated against publicly available data to within an average error of 4.5%. Deep learning applied specifically to the detection and mapping of solar farms is an active area of research, and this capacity evaluation pipeline is one of the first of its kind. We also share an original dataset of overhead solar farm satellite imagery comprising 23,000 images (256 × 256 pixels each) and the corresponding labels on which the machine learning model was trained. [ABSTRACT FROM AUTHOR]
- Published: 2023
- Full Text: View/download PDF
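The paper's capacity model is not specified in the abstract; a common area-based heuristic for a rough estimate looks like this (the 150 W/m2 default is an assumed typical panel power density, not a value from the paper):

```python
def estimate_capacity_mw(panel_pixels, ground_res_m, power_density_w_m2=150.0):
    """Rough capacity estimate: detected panel area times an assumed
    power density (W per m^2 of panel), converted to megawatts."""
    area_m2 = panel_pixels * ground_res_m ** 2
    return area_m2 * power_density_w_m2 / 1e6  # watts to megawatts
```

The segmentation output feeds `panel_pixels`; `ground_res_m` is the imagery's ground sampling distance, so the same farm yields the same area estimate regardless of zoom level.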
14. AutoML-Based Neural Architecture Search for Object Recognition in Satellite Imagery.
- Author: Gudzius, Povilas, Kurasova, Olga, Darulis, Vytenis, and Filatovas, Ernestas
- Subjects: *REMOTE-sensing images, *OBJECT recognition (Computer vision), *REMOTE sensing, *ARTIFICIAL satellite launching
- Abstract
Advancements in optical satellite hardware and lower costs for satellite launches have raised demand for geospatial intelligence. Object recognition in multi-spectral satellite imagery involves dataset properties unique to the problem: perspective distortion, resolution variability, data spectrality, and other features make it difficult for any single human-designed neural network to perform well across dispersed scenery, varying data quality, and different objects. UNET, MACU, and other manually designed network architectures deliver high accuracy and prediction speed for large objects; however, once trained on different datasets, performance drops, requiring manual recalibration or further configuration testing to adjust the architecture. AutoML-based techniques can address these issues. In this paper, we focus on Neural Architecture Search (NAS), which can obtain a well-performing network configuration without manual intervention. First, we conducted detailed testing of the four top-performing neural networks for object recognition in satellite imagery (FastFCN, DeepLabv3, UNET, and MACU) to compare their performance. We then applied and further developed a NAS technique for the best-performing manually designed network, MACU, by optimizing a search space at the cellular level of the network. Several NAS-MACU versions were explored and evaluated. Our AutoML process generated a NAS-MACU network that outperformed MACU, especially in low-information-intensity environments. The experimental investigation was performed on our annotated, updated, and publicly available satellite imagery dataset. The NAS procedure has the capability to be applied across various datasets and object recognition problems within the remote sensing research field. [ABSTRACT FROM AUTHOR]
- Published: 2023
- Full Text: View/download PDF
15. Identifying Critical Infrastructure in Imagery Data Using Explainable Convolutional Neural Networks.
- Author: Elliott, Shiloh N., Shields, Ashley J. B., Klaehn, Elizabeth M., and Tien, Iris
- Subjects: *INFRASTRUCTURE (Economics), *CONVOLUTIONAL neural networks, *WATER treatment plants, *PETROLEUM shipping terminals, *REMOTE-sensing images, *DAMS, *LANDSAT satellites
- Abstract
To date, no method utilizing satellite imagery exists for detailing the locations and functions of critical infrastructure across the United States, which makes responding to natural disasters and other events challenging given complex infrastructural interdependencies. This paper presents a repeatable, transferable, and explainable method for critical infrastructure analysis, implemented as a robust model for critical infrastructure detection in satellite imagery. The model consists of a DenseNet-161 convolutional neural network pretrained on the ImageNet database and further trained on a custom dataset containing nine infrastructure classes. The resulting analysis achieved an overall accuracy of 90%, with the highest accuracies for airports (97%), hydroelectric dams (96%), solar farms (94%), potable water tanks (93%), hospitals (93%), and substations (91%). The infrastructure types with relatively low accuracy, petroleum terminals (86%), water treatment plants (78%), and natural gas generation (78%), are likely affected by visual commonality between similar infrastructure components. Local interpretable model-agnostic explanations (LIME) was integrated into the modeling pipeline to establish user trust in critical infrastructure applications. The results demonstrate the effectiveness of a convolutional neural network approach for critical infrastructure identification, with higher than 90% accuracy in identifying six of the critical infrastructure facility types. [ABSTRACT FROM AUTHOR]
- Published: 2022
- Full Text: View/download PDF
16. A Remote-Sensing Scene-Image Classification Method Based on Deep Multiple-Instance Learning with a Residual Dense Attention ConvNet.
- Author: Wang, Xinyu, Xu, Haixia, Yuan, Liming, Dai, Wei, and Wen, Xianbin
- Subjects: *DEEP learning, *REMOTE-sensing images, *CONVOLUTIONAL neural networks, *FEATURE extraction, *REMOTE sensing, *OBJECT tracking (Computer vision)
- Abstract
The spatial distribution of remote-sensing scene images is highly complex, so extracting local key semantic information and discriminative features is the key to accurate classification. However, most existing convolutional neural network (CNN) models tend toward global feature representations and lose shallow features; moreover, when a network is too deep, vanishing gradients and overfitting tend to occur. To solve these problems, this paper proposes MILRDA, a lightweight multi-instance CNN model for remote sensing scene classification. In the instance extraction and classifier part, the constructed residual dense attention block (RDAB) extracts more discriminative features while retaining shallow features. The extracted features are then transformed into instance-level vectors, and the proposed channel-attention-based multi-instance pooling highlights the local information associated with bag-level labels while suppressing the weights of useless objects and backgrounds. Finally, the network is trained with the cross-entropy loss function to output the final predictions. Experimental results on four public datasets show that the proposed method achieves results comparable to other state-of-the-art methods, and visualization of feature maps shows that MILRDA finds more effective features. [ABSTRACT FROM AUTHOR]
- Published: 2022
- Full Text: View/download PDF
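Attention-based multi-instance pooling of the kind described above forms the bag representation as a weighted sum of instance vectors; a generic pure-Python sketch (not MILRDA's exact channel-attention module):

```python
import math

def attention_mil_pooling(instances, attention_scores):
    """Bag vector = softmax(attention)-weighted sum of instance vectors,
    so informative instances dominate the bag-level representation."""
    m = max(attention_scores)
    exp = [math.exp(s - m) for s in attention_scores]
    z = sum(exp)
    weights = [e / z for e in exp]
    dim = len(instances[0])
    bag = [sum(w * inst[d] for w, inst in zip(weights, instances))
           for d in range(dim)]
    return bag, weights
```

In a trained model the attention scores themselves come from a small learned network over each instance vector; here they are passed in directly to keep the sketch self-contained.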
17. End-to-End Prediction of Lightning Events from Geostationary Satellite Images.
- Author: Brodehl, Sebastian, Müller, Richard, Schömer, Elmar, Spichtinger, Peter, and Wand, Michael
- Subjects: *REMOTE-sensing images, *ARTIFICIAL neural networks, *GEOSTATIONARY satellites, *THUNDERSTORMS, *INFRARED imaging, *CONVOLUTIONAL neural networks, *OPTICAL flow
- Abstract
While thunderstorms can pose severe risks to property and life, forecasting them remains challenging, even at short lead times, as they often arise in meta-stable atmospheric conditions. In this paper, we examine how well short-term (up to 180 min) forecasts can be made using exclusively multi-spectral satellite images and past lightning events as data. We employ representation learning based on deep convolutional neural networks in an end-to-end fashion. A crucial problem here is handling the imbalance between the positive and negative classes appropriately in order to obtain predictive results, which many previous machine-learning-based approaches do not address. The resulting network outperforms previous methods based on physically derived features and on optical flow (similar to operational prediction models) and generalizes across different years. A closer examination of classifier performance over time and under masking of input data indicates that the learned model draws most of its information from structures in the visible spectrum, with infrared imaging sustaining some classification performance during the night. [ABSTRACT FROM AUTHOR]
- Published: 2022
- Full Text: View/download PDF
18. Estimation of Ground PM2.5 Concentrations in Pakistan Using Convolutional Neural Network and Multi-Pollutant Satellite Images.
- Author: Ahmed, Maqsood, Xiao, Zemin, and Shen, Yonglin
- Subjects: *AIR pollutants, *REMOTE-sensing images, *CONVOLUTIONAL neural networks, *STANDARD deviations, *FORMALDEHYDE, *PARTICULATE matter
- Abstract
During the last few decades, worsening air quality has been observed in many cities around the world. Accurate prediction of air pollutants, particularly particulate matter 2.5 (PM2.5), is extremely important for environmental management. This paper presents P-CNN, a convolutional neural network model that uses satellite images of seven pollutant-related variables (aerosol index (AER AI), methane (CH4), carbon monoxide (CO), formaldehyde (HCHO), nitrogen dioxide (NO2), ozone (O3), and sulfur dioxide (SO2)) as auxiliary variables to estimate daily average PM2.5 concentrations. The study estimates daily average PM2.5 concentrations in several cities of Pakistan (Islamabad, Lahore, Peshawar, and Karachi) from satellite images, using a dataset of 2562 images from May 2019 to April 2020. We compare AlexNet, VGG16, ResNet50, and P-CNN on every dataset, checking accuracy with mean absolute error (MAE), root mean square error (RMSE), and mean absolute percentage error (MAPE). The results show that P-CNN estimates PM2.5 concentrations from satellite images more accurately than the other approaches, yielding a robust satellite-image-based model for PM2.5 estimation. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
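The three error metrics named in the abstract above (MAE, RMSE, MAPE) have standard definitions that are easy to state precisely. A minimal numpy sketch with hypothetical PM2.5 values (the data here is made up for illustration):

```python
import numpy as np

def mae(y, yhat):
    # Mean absolute error.
    return float(np.mean(np.abs(y - yhat)))

def rmse(y, yhat):
    # Root mean square error: penalizes large errors more heavily than MAE.
    return float(np.sqrt(np.mean((y - yhat) ** 2)))

def mape(y, yhat):
    # Mean absolute percentage error; assumes no zero observations.
    return float(np.mean(np.abs((y - yhat) / y)) * 100)

y_obs = np.array([40.0, 50.0, 80.0])   # hypothetical observed PM2.5 (µg/m³)
y_est = np.array([44.0, 45.0, 88.0])   # hypothetical model estimates
```

Reporting all three together, as the study does, is useful because RMSE is scale-sensitive while MAPE is relative, so they catch different failure modes.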
19. Gross Floor Area Estimation from Monocular Optical Image Using the NoS R-CNN.
- Author
-
Ji, Chao and Tang, Hong
- Subjects
- *
OPTICAL images , *MONOCULARS , *CONVOLUTIONAL neural networks , *SIGNAL convolution , *CONSTRUCTION cost estimates , *REMOTE-sensing images - Abstract
Gross floor area is defined as the product of the number of building stories and the base area. Gross floor area acquisition is the core problem in estimating floor area ratio, which is an important indicator for many geographical analyses. High data acquisition costs or inherent defects of existing gross floor area acquisition methods limit their application over wide areas. In this paper we propose three instance-wise gross floor area estimation methods, with varying degrees of end-to-end learning, from monocular optical images based on the NoS R-CNN, a deep convolutional neural network that estimates the number of building stories. To the best of our knowledge, this is the first attempt to estimate instance-wise gross floor area from monocular optical satellite images. To compare the performance of the three proposed methods, experiments were carried out on our dataset from nine cities in China, and the results were analyzed in detail in order to explore the reasons for the performance gap between the different methods. The results show that there is an inverse relationship between model performance and the degree of end-to-end learning for the base area estimation task and the gross floor area estimation task. The quantitative and qualitative evaluations of the proposed methods indicate that their performance for accurate GFA estimation is promising for potential applications using large-scale remote sensing images. The proposed methods provide a new perspective for gross floor area/floor area ratio estimation and downstream tasks such as population estimation, living conditions assessment, etc. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
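The definitions the abstract above relies on (gross floor area = number of stories × base area, with floor area ratio derived from it) can be written out directly. A tiny illustrative sketch with made-up numbers:

```python
def gross_floor_area(stories, base_area_m2):
    # GFA as defined in the abstract: number of stories times base area.
    return stories * base_area_m2

def floor_area_ratio(buildings, plot_area_m2):
    # FAR: total GFA of every building on the plot divided by the plot area.
    total = sum(gross_floor_area(s, a) for s, a in buildings)
    return total / plot_area_m2

# Two hypothetical buildings as (stories, base area in m²) on an 8000 m² plot.
far = floor_area_ratio([(5, 200.0), (10, 300.0)], plot_area_m2=8000.0)
```

This makes concrete why story-count estimation (the NoS in NoS R-CNN) is the bottleneck: base area is visible in the image, but GFA and FAR both scale linearly with the predicted number of stories.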
20. Remote Sensing Scene Classification Based on Convolutional Neural Networks Pre-Trained Using Attention-Guided Sparse Filters.
- Author
-
Chen, Jingbo, Wang, Chengyi, Ma, Zhong, Chen, Jiansheng, He, Dongxu, and Ackland, Stephen
- Subjects
- *
LAND use , *SIGNAL convolution , *ARTIFICIAL neural networks , *IMAGE processing , *REMOTE-sensing images - Abstract
Semantic-level land-use scene classification is a challenging problem, in which deep learning methods, e.g., convolutional neural networks (CNNs), have shown remarkable capacity. However, a lack of sufficient labeled images has proved a hindrance to increasing the land-use scene classification accuracy of CNNs. To address this problem, this paper proposes a CNN pre-training method guided by a human visual attention mechanism. Specifically, a computational visual attention model is used to automatically extract salient regions in unlabeled images. Then, sparse filters are adopted to learn features from these salient regions, with the learnt parameters used to initialize the convolutional layers of the CNN. Finally, the CNN is further fine-tuned on labeled images. Experiments are performed on the UCMerced and AID datasets, which show that when combined with a demonstrative CNN, our method can achieve 2.24% higher accuracy than a plain CNN, and can obtain an overall accuracy of 92.43% when combined with AlexNet. The results indicate that the proposed method can effectively improve CNN performance using easy-to-access unlabeled images, and will thus enhance the performance of land-use scene classification, especially when a large-scale labeled dataset is unavailable. [ABSTRACT FROM AUTHOR]
- Published
- 2018
- Full Text
- View/download PDF
21. Learning Low Dimensional Convolutional Neural Networks for High-Resolution Remote Sensing Image Retrieval.
- Author
-
Weixun Zhou, Shawn Newsam, Congmin Li, and Zhenfeng Shao
- Subjects
- *
IMAGING systems in meteorology , *ARTIFICIAL neural networks , *HIGH resolution imaging , *REMOTE-sensing images , *IMAGE retrieval , *MACHINE learning - Abstract
Learning powerful feature representations for image retrieval has always been a challenging task in the field of remote sensing. Traditional methods focus on extracting low-level hand-crafted features which are not only time-consuming but also tend to achieve unsatisfactory performance due to the complexity of remote sensing images. In this paper, we investigate how to extract deep feature representations based on convolutional neural networks (CNNs) for high-resolution remote sensing image retrieval (HRRSIR). To this end, several effective schemes are proposed to generate powerful feature representations for HRRSIR. In the first scheme, a CNN pre-trained on a different problem is treated as a feature extractor since there are no sufficiently-sized remote sensing datasets to train a CNN from scratch. In the second scheme, we investigate learning features that are specific to our problem by first fine-tuning the pre-trained CNN on a remote sensing dataset and then proposing a novel CNN architecture based on convolutional layers and a three-layer perceptron. The novel CNN has fewer parameters than the pre-trained and fine-tuned CNNs and can learn low dimensional features from limited labelled images. The schemes are evaluated on several challenging, publicly available datasets. The results indicate that the proposed schemes, particularly the novel CNN, achieve state-of-the-art performance. [ABSTRACT FROM AUTHOR]
- Published
- 2017
- Full Text
- View/download PDF
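Once deep features have been extracted by any of the schemes the abstract above describes, the retrieval step itself is typically a nearest-neighbour search over descriptors. A minimal numpy sketch of cosine-similarity ranking (illustrative only; the paper's exact pipeline may differ, and the data here is synthetic):

```python
import numpy as np

def retrieve(query, database):
    # Rank database descriptors by cosine similarity to the query descriptor;
    # returns indices ordered from most to least similar.
    q = query / np.linalg.norm(query)
    db = database / np.linalg.norm(database, axis=1, keepdims=True)
    scores = db @ q
    return np.argsort(-scores)

rng = np.random.default_rng(0)
db = rng.normal(size=(100, 32))                 # 100 images, 32-D deep features
query = db[42] + 0.01 * rng.normal(size=32)     # near-duplicate of image 42
ranking = retrieve(query, db)
```

Low-dimensional features, as the abstract argues, matter precisely here: both the memory cost of `database` and the cost of the dot products scale with the descriptor dimension.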
22. Landslide Detection from Open Satellite Imagery Using Distant Domain Transfer Learning.
- Author
-
Qin, Shengwu, Guo, Xu, Sun, Jingbo, Qiao, Shuangshuang, Zhang, Lingshuai, Yao, Jingyu, Cheng, Qiushi, and Zhang, Yanqing
- Subjects
- *
LANDSLIDES , *REMOTE-sensing images , *CONVOLUTIONAL neural networks - Abstract
Using convolutional neural network (CNN) methods and satellite images for landslide identification and classification is an efficient and popular approach in geological hazard investigations. However, traditional CNNs have two disadvantages: (1) insufficient training images from the study area and (2) uneven distribution of the training set and validation set. In this paper, we introduce distant domain transfer learning (DDTL) methods for landslide detection and classification. We first introduce scene classification satellite imagery into the landslide detection task. In addition, in order to more effectively extract information from satellite images, we innovatively add an attention mechanism to DDTL (AM-DDTL). In this paper, the Longgang study area, a district in Shenzhen City, Guangdong Province, has only 177 samples as the landslide target domain. We examine the effect of DDTL by comparing three methods: a conventional CNN, a pretrained model, and DDTL. We also compare different attention mechanisms based on the DDTL. The experimental results show that the DDTL method has better detection performance than the normal CNN, and the AM-DDTL models achieve 94% classification accuracy, which is 7% higher than the conventional DDTL method. The requirements for the detection and classification of potential landslides in different disaster zones can be met by applying the AM-DDTL algorithm, which outperforms traditional CNN methods. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
23. Technical Solution Discussion for Key Challenges of Operational Convolutional Neural Network-Based Building-Damage Assessment from Satellite Imagery: Perspective from Benchmark xBD Dataset.
- Author
-
Su, Jinhua, Bai, Yanbing, Wang, Xingrui, Lu, Dong, Zhao, Bo, Yang, Hanfang, Mas, Erick, and Koshimura, Shunichi
- Subjects
- *
REMOTE-sensing images , *CONVOLUTIONAL neural networks , *ARTIFICIAL satellites , *BUILDING protection - Abstract
Earth Observation satellite imaging helps with building damage diagnosis during a disaster. Several models have been put forward on the xBD dataset, which can be divided into two levels: the building level and the pixel level. Models from both levels have evolved into several versions, which are reviewed in this paper. Four key challenges hinder researchers from moving forward on this task, and this paper tries to give technical solutions. First, metrics at different levels cannot be compared directly. We put forward a fairer metric and give a method to convert between metrics of the two levels. Second, drone images may be another important source, but drone data may contain only a post-disaster image. This paper shows and compares methods of direct detection and of generation. Third, class imbalance is a typical feature of the xBD dataset and leads to poor F1 scores for minor damage and major damage. This paper provides four specific data resampling strategies, namely Main-Label Over-Sampling (MLOS), Discrimination After Cropping (DAC), Dilation of Area with Minority (DAM) and the Synthetic Minority Over-Sampling Technique (SMOTE), as well as cost-sensitive re-weighting schemes. Fourth, faster prediction meets the need of real-time situations. This paper recommends three specific methods: feature-map subtraction, parameter sharing, and knowledge distillation. Finally, we developed our AI-driven Damage Diagnose Platform (ADDP). This paper introduces the structure of ADDP and its technical details. Customized settings, interface previews, and satellite image upload and download are the major services our platform provides. [ABSTRACT FROM AUTHOR]
- Published
- 2020
- Full Text
- View/download PDF
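Of the four resampling strategies listed above, SMOTE is the only generic one; its core idea is synthesizing minority samples by interpolating between existing ones. A minimal numpy sketch of that core idea (not the xBD-specific variants, which the abstract describes only by name):

```python
import numpy as np

def smote(X_min, n_new, k=3, seed=0):
    # Minimal SMOTE sketch: each synthetic minority sample is a linear
    # interpolation between a random minority point and one of its k
    # nearest minority-class neighbours.
    rng = np.random.default_rng(seed)
    synth = []
    for _ in range(n_new):
        i = rng.integers(len(X_min))
        dists = np.linalg.norm(X_min - X_min[i], axis=1)
        nbrs = np.argsort(dists)[1:k + 1]      # skip the point itself
        j = rng.choice(nbrs)
        lam = rng.random()                     # interpolation factor in [0, 1)
        synth.append(X_min[i] + lam * (X_min[j] - X_min[i]))
    return np.array(synth)

minor = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]])
new_samples = smote(minor, n_new=10)
```

Because every synthetic point lies on a segment between two real minority points, the augmented set stays inside the minority class's convex hull, which is what distinguishes SMOTE from naive duplication.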
24. Super-Resolution of Sentinel-2 Images Using Convolutional Neural Networks and Real Ground Truth Data.
- Author
-
Galar, Mikel, Sesma, Rubén, Ayala, Christian, Albizua, Lourdes, and Aranda, Carlos
- Subjects
- *
CONVOLUTIONAL neural networks , *HIGH resolution imaging , *MULTISPECTRAL imaging , *IMAGE analysis , *REMOTE-sensing images - Abstract
Earth observation data is becoming more accessible and affordable thanks to the Copernicus programme and its Sentinel missions. Every location worldwide can be freely monitored approximately every 5 days using the multi-spectral images provided by Sentinel-2. The spatial resolution of these images for the RGBN (RGB + Near-infrared) bands is 10 m, which is more than enough for many tasks but falls short for many others. For this reason, if their spatial resolution could be enhanced without additional costs, any posterior analyses based on these images would benefit. Previous works have mainly focused on increasing the resolution of the lower-resolution bands of Sentinel-2 (20 m and 60 m) to 10 m resolution. In those cases, super-resolution is supported by bands captured at finer resolutions (RGBN at 10 m). On the contrary, this paper focuses on the problem of increasing the spatial resolution of the 10 m bands to either 5 m or 2.5 m resolution, without having additional information available. This problem is known as single-image super-resolution. For standard images, deep learning techniques have become the de facto standard for learning the mapping from lower- to higher-resolution images due to their learning capacity. However, super-resolution models learned for standard images do not work well with satellite images, and hence a specific model for this problem needs to be learned. The main challenge that this paper aims to solve is how to train a super-resolution model for Sentinel-2 images when no ground truth exists (Sentinel-2 images at 5 m or 2.5 m). Our proposal consists of using a reference satellite with high spectral-band similarity to Sentinel-2, but with higher spatial resolution, to create image pairs at both the source and target resolutions. This way, we can train a state-of-the-art convolutional neural network to recover details not present in the original RGBN bands. 
An exhaustive experimental study is carried out to validate our proposal, including a comparison with the most widespread strategy for super-resolving Sentinel-2, which consists of learning a model to super-resolve from an under-sampled version at either 40 m or 20 m to the original 10 m resolution and then applying this model to super-resolve from 10 m to 5 m or 2.5 m. Finally, we also show that the spectral radiometry of the native bands is maintained when super-resolving images, in such a way that they can be used for any subsequent processing as if they were images acquired by Sentinel-2. [ABSTRACT FROM AUTHOR]
- Published
- 2020
- Full Text
- View/download PDF
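The baseline strategy described above, learning from under-sampled/original pairs, depends on generating a coarser version of each image. A minimal numpy sketch of producing such a (low-res, high-res) training pair by block averaging (the averaging degradation model is an assumption for illustration; the paper's actual degradation may differ):

```python
import numpy as np

def average_downsample(img, factor=2):
    # Simulate a coarser sensor by average-pooling non-overlapping
    # factor x factor blocks, yielding a (low-res, high-res) training pair.
    h, w = img.shape
    h, w = h - h % factor, w - w % factor          # crop to a multiple of factor
    return (img[:h, :w]
            .reshape(h // factor, factor, w // factor, factor)
            .mean(axis=(1, 3)))

hr = np.arange(16, dtype=float).reshape(4, 4)      # stand-in 4 x 4 "10 m" patch
lr = average_downsample(hr)                        # its 2 x 2 "20 m" counterpart
```

Note that block averaging preserves the mean radiometry of the patch, which relates to the paper's point about maintaining spectral radiometry after super-resolution.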
25. Active Fire Detection from Landsat-8 Imagery Using Deep Multiple Kernel Learning.
- Author
-
Rostami, Amirhossein, Shah-Hosseini, Reza, Asgari, Shabnam, Zarei, Arastou, Aghdami-Nia, Mohammad, and Homayouni, Saeid
- Subjects
- *
FIRE detectors , *CONVOLUTIONAL neural networks , *REMOTE-sensing images , *REMOTE sensing , *DEEP learning - Abstract
Active fires are devastating natural disasters that cause socio-economic damage across the globe. The detection and mapping of these disasters require efficient tools, scientific methods, and reliable observations. Satellite images have been widely used for active fire detection (AFD) during the past years due to their nearly global coverage. However, accurate AFD and mapping in satellite imagery is still a challenging task in the remote sensing community, which mainly uses traditional methods. Deep learning (DL) methods have recently yielded outstanding results in remote sensing applications. Nevertheless, less attention has been given to them for AFD in satellite imagery. This study presents a deep convolutional neural network (CNN), "MultiScale-Net", for AFD in Landsat-8 datasets at the pixel level. The proposed network has two main characteristics: (1) several convolution kernels with multiple sizes, and (2) dilated convolution layers (DCLs) with various dilation rates. Moreover, this paper suggests an innovative Active Fire Index (AFI) for AFD. AFI was added to the network inputs, which consist of the SWIR2, SWIR1, and Blue bands, to improve the performance of the MultiScale-Net. In an ablation analysis, three different scenarios were designed for the multi-size kernels, dilation rates, and input variables individually, resulting in 27 distinct models. The quantitative results indicated that the model with AFI-SWIR2-SWIR1-Blue as the input variables, using multiple kernels of sizes 3 × 3, 5 × 5, and 7 × 7 simultaneously, and a dilation rate of 2, achieved the highest F1-score and IoU of 91.62% and 84.54%, respectively. Stacking AFI with the three Landsat-8 bands led to fewer false negative (FN) pixels. Furthermore, our qualitative assessment revealed that these models could detect single fire pixels detached from the large fire zones by taking advantage of the multi-size kernels. 
Overall, the MultiScale-Net met expectations in detecting fires of varying sizes and shapes over challenging test samples. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
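The dilated convolution layers (DCLs) highlighted above enlarge the receptive field without adding parameters by spacing the kernel taps apart. A minimal 1-D numpy sketch of the mechanism (illustrative, not the network's 2-D layers):

```python
import numpy as np

def dilated_conv1d(x, kernel, rate):
    # 'Valid' 1-D convolution with holes: kernel taps are spaced `rate`
    # samples apart, so the receptive field grows with the dilation rate
    # while the parameter count (len(kernel)) stays fixed.
    k = len(kernel)
    span = (k - 1) * rate + 1                  # effective receptive field
    out = [sum(kernel[j] * x[i + j * rate] for j in range(k))
           for i in range(len(x) - span + 1)]
    return np.array(out), span

signal = np.arange(8.0)
out, span = dilated_conv1d(signal, kernel=[1.0, 1.0, 1.0], rate=2)
```

A 3-tap kernel at rate 2 covers 5 samples; stacking layers with growing rates is what lets MultiScale-Net-style architectures see both small detached fire pixels and large fire zones.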
26. Semantic Segmentation and Edge Detection—Approach to Road Detection in Very High Resolution Satellite Images.
- Author
-
Ghandorh, Hamza, Boulila, Wadii, Masood, Sharjeel, Koubaa, Anis, Ahmed, Fawad, and Ahmad, Jawad
- Subjects
- *
HIGH resolution imaging , *TRAFFIC monitoring , *REMOTE-sensing images , *URBAN planning , *LANDSAT satellites - Abstract
Road detection technology plays an essential role in a variety of applications, such as urban planning, map updating, traffic monitoring and automatic vehicle navigation. Recently, there has been much development in detecting roads in high-resolution (HR) satellite images based on semantic segmentation. However, the objects being segmented in such images are of small size, and not all the information in the images is equally important when making a decision. This paper proposes a novel approach to road detection based on semantic segmentation and edge detection. Our approach aims to combine these two techniques to improve road detection; it produces sharp-pixel segmentation maps, using the segmented masks to generate road edges. In addition, some well-known architectures, such as SegNet, use multi-scale features without refinement; thus, using attention blocks in the encoder to predict fine segmentation masks results in finer edges. A combination of the weighted cross-entropy loss and the focal Tversky loss is also used as the loss function to deal with the highly imbalanced dataset. We conducted various experiments on two real-world datasets, covering the three largest regions in Saudi Arabia, and Massachusetts. The results demonstrated that the proposed method of encoding HR feature maps effectively predicts sharp segmentation masks to facilitate accurate edge detection, even against a harsh and complicated background. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
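The focal Tversky loss mentioned above is well suited to thin, under-represented structures such as road pixels, because it weighs false negatives and false positives asymmetrically. A minimal numpy sketch (the default `alpha`/`beta`/`gamma` values here are illustrative assumptions, not the paper's settings):

```python
import numpy as np

def focal_tversky_loss(y_true, y_pred, alpha=0.7, beta=0.3, gamma=0.75, eps=1e-7):
    # The Tversky index generalizes Dice: alpha scales false negatives and
    # beta scales false positives, so alpha > beta penalizes missed road
    # pixels harder. The focal exponent gamma emphasizes hard examples.
    tp = np.sum(y_true * y_pred)
    fn = np.sum(y_true * (1.0 - y_pred))
    fp = np.sum((1.0 - y_true) * y_pred)
    tversky = (tp + eps) / (tp + alpha * fn + beta * fp + eps)
    return float((1.0 - tversky) ** gamma)

mask = np.array([0, 0, 1, 1, 0, 0], dtype=float)
perfect = focal_tversky_loss(mask, mask)                                   # ~0
worse = focal_tversky_loss(mask, np.array([0, 0, 1, 0, 0, 1], dtype=float))
```

With alpha = beta = 0.5 and gamma = 1 this reduces to the ordinary Dice loss, which makes the two extra knobs easy to reason about.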
27. FCAU-Net for the Semantic Segmentation of Fine-Resolution Remotely Sensed Images.
- Author
-
Niu, Xuerui, Zeng, Qiaolin, Luo, Xiaobo, and Chen, Liangfu
- Subjects
- *
CONVOLUTIONAL neural networks , *URBAN planning , *IMAGE processing , *REMOTE-sensing images , *LAND cover , *IMAGE segmentation , *IMAGE fusion - Abstract
The semantic segmentation of fine-resolution remotely sensed images is an urgent issue in satellite image processing. Solving this problem can help overcome various obstacles in urban planning, land cover classification, and environmental protection, paving the way for scene-level landscape pattern analysis and decision making. Encoder-decoder structures based on attention mechanisms have been frequently used for fine-resolution image segmentation. In this paper, we incorporate a coordinate attention (CA) mechanism, adopt an asymmetric convolution block (ACB), and design a refinement fusion block (RFB), forming a network named the fusion coordinate and asymmetry-based U-Net (FCAU-Net). Furthermore, we propose a novel convolutional neural network (CNN) architecture to fully capture long-term dependencies and fine-grained details in fine-resolution remotely sensed imagery. This approach has the following advantages: (1) the CA mechanism embeds position information into a channel attention mechanism to enhance the feature representations produced by the network while effectively capturing position information and channel relationships; (2) the ACB enhances the feature representation ability of the standard convolution layer and captures and refines the feature information in each layer of the encoder; and (3) the RFB effectively integrates low-level spatial information and high-level abstract features to eliminate background noise when extracting feature information, reduces the fitting residuals of the fused features, and improves the ability of the network to capture information flows. Extensive experiments conducted on two public datasets (ZY-3 and DeepGlobe) demonstrate the effectiveness of the FCAU-Net. The proposed FCAU-Net outperforms U-Net, Attention U-Net, the pyramid scene parsing network (PSPNet), DeepLab v3+, the multistage attention residual U-Net (MAResU-Net), MACU-Net, and the Transformer U-Net (TransUNet). 
Specifically, the FCAU-Net achieves a 97.97% (95.05%) pixel accuracy (PA), a 98.53% (91.27%) mean PA (mPA), a 95.17% (85.54%) mean intersection over union (mIoU), and a 96.07% (90.74%) frequency-weighted IoU (FWIoU) on the ZY-3 (DeepGlobe) dataset. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
28. BiFDANet: Unsupervised Bidirectional Domain Adaptation for Semantic Segmentation of Remote Sensing Images.
- Author
-
Cai, Yuxiang, Yang, Yingchun, Zheng, Qiyi, Shen, Zhengwei, Shang, Yongheng, Yin, Jianwei, and Shi, Zhongtian
- Subjects
- *
REMOTE sensing , *REMOTE-sensing images , *IMAGE segmentation , *DEEP learning , *GENERATIVE adversarial networks , *CONVOLUTIONAL neural networks - Abstract
When segmenting massive amounts of remote sensing images collected from different satellites or geographic locations (cities), pre-trained deep learning models cannot always output satisfactory predictions. To deal with this issue, domain adaptation has been widely utilized to enhance the generalization abilities of segmentation models. Most of the existing domain adaptation methods, which are based on image-to-image translation, first transfer the source images to pseudo-target images and then adapt the classifier from the source domain to the target domain. However, these unidirectional methods suffer from two limitations: (1) they do not consider the inverse procedure and so cannot fully take advantage of the information from the other domain, which is also beneficial, as confirmed by our experiments; (2) these methods may fail in cases where transferring the source images to the pseudo-target images is difficult. In this paper, in order to solve these problems, we propose a novel framework, BiFDANet, for unsupervised bidirectional domain adaptation in the semantic segmentation of remote sensing images. It optimizes the segmentation models in two opposite directions. In the source-to-target direction, BiFDANet learns to transfer the source images to pseudo-target images and adapts the classifier to the target domain. In the opposite direction, BiFDANet transfers the target images to pseudo-source images and optimizes the source classifier. At the test stage, we make the best use of the source classifier and the target classifier, which complement each other through a simple linear combination method, further improving the performance of our BiFDANet. Furthermore, we propose a new bidirectional semantic consistency loss for our BiFDANet to maintain semantic consistency during the bidirectional image-to-image translation process. 
The experiments on two datasets including satellite images and aerial images demonstrate the superiority of our method against existing unidirectional methods. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
29. DSMNN-Net: A Deep Siamese Morphological Neural Network Model for Burned Area Mapping Using Multispectral Sentinel-2 and Hyperspectral PRISMA Images.
- Author
-
Seydi, Seyd Teymoor, Hasanlou, Mahdi, and Chanussot, Jocelyn
- Subjects
- *
ARTIFICIAL neural networks , *REMOTE-sensing images , *FOREST fires , *FIRE detectors , *NATURAL disasters , *REMOTE sensing , *CONVOLUTIONAL neural networks , *WILDFIRE prevention - Abstract
Wildfires are one of the most destructive natural disasters that can affect our environment, with significant effects also on wildlife. Recently, climate change and human activities have resulted in higher frequencies of wildfires throughout the world. Timely and accurate detection of burned areas can help in making decisions for their management. Remote sensing satellite imagery can play a key role in mapping burned areas due to its wide coverage, high-resolution data collection, and short capture times. However, although many studies have reported on burned area mapping based on remote sensing imagery in recent decades, accurate burned area mapping remains a major challenge due to the complexity of the background and the diversity of the burned areas. This paper presents a novel framework for burned area mapping based on a Deep Siamese Morphological Neural Network (DSMNN-Net) and heterogeneous datasets. The DSMNN-Net framework is based on change detection, proposing a pre/post-fire method that is compatible with heterogeneous remote sensing datasets. The proposed network combines multiscale convolution layers and morphological layers (erosion and dilation) to generate deep features. To evaluate the performance of the proposed method, two case study areas in Australian forests were selected. The framework can better detect burned areas compared to other state-of-the-art burned area mapping procedures, achieving an overall accuracy of >98% and a kappa coefficient of >0.9 using multispectral Sentinel-2 and hyperspectral PRISMA image datasets. The analyses of the two datasets illustrate that the DSMNN-Net is sufficiently valid and robust for burned area mapping, especially for complex areas. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
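The grayscale morphological operations (erosion and dilation) that DSMNN-Net combines with convolutions are, at their core, sliding-window minimum and maximum filters. A minimal numpy sketch of that core operation (illustrative only; the network's morphological layers are trainable, which this sketch is not):

```python
import numpy as np

def _window_filter(img, size, op):
    # Apply op (min or max) over a size x size sliding window, stride 1,
    # with edge padding so the output matches the input shape.
    pad = size // 2
    p = np.pad(img, pad, mode='edge')
    out = np.empty_like(img, dtype=float)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            out[i, j] = op(p[i:i + size, j:j + size])
    return out

def dilate(img, size=3):
    # Grayscale dilation: windowed maximum expands bright regions.
    return _window_filter(img, size, np.max)

def erode(img, size=3):
    # Grayscale erosion: windowed minimum shrinks bright regions.
    return _window_filter(img, size, np.min)

spot = np.zeros((5, 5))
spot[2, 2] = 1.0        # a single bright pixel, e.g. an isolated detection
```

Dilation grows the single pixel into a 3 × 3 patch while erosion removes it entirely, which is why erosion/dilation pairs are useful for suppressing speckle-like noise in burned-area masks.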
30. Application of Deep Learning Architectures for Satellite Image Time Series Prediction: A Review.
- Author
-
Moskolaï, Waytehad Rose, Abdou, Wahabou, Dipanda, Albert, and Kolyang
- Subjects
- *
DEEP learning , *REMOTE-sensing images , *TIME series analysis , *CONVOLUTIONAL neural networks , *MACHINE learning , *ARTIFICIAL intelligence - Abstract
Satellite image time series (SITS) is a sequence of satellite images that record a given area at several consecutive times. The aim of such sequences is to use not only spatial information but also the temporal dimension of the data, which is used for multiple real-world applications, such as classification, segmentation, anomaly detection, and prediction. Several traditional machine learning algorithms have been developed and successfully applied to time series for predictions. However, these methods have limitations in some situations, thus deep learning (DL) techniques have been introduced to achieve the best performance. Reviews of machine learning and DL methods for time series prediction problems have been conducted in previous studies. However, to the best of our knowledge, none of these surveys have addressed the specific case of works using DL techniques and satellite images as datasets for predictions. Therefore, this paper concentrates on the DL applications for SITS prediction, giving an overview of the main elements used to design and evaluate the predictive models, namely the architectures, data, optimization functions, and evaluation metrics. The reviewed DL-based models are divided into three categories, namely recurrent neural network-based models, hybrid models, and feed-forward-based models (convolutional neural networks and multi-layer perceptron). The main characteristics of satellite images and the major existing applications in the field of SITS prediction are also presented in this article. These applications include weather forecasting, precipitation nowcasting, spatio-temporal analysis, and missing data reconstruction. Finally, current limitations and proposed workable solutions related to the use of DL for SITS prediction are also highlighted. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
31. LLFE: A Novel Learning Local Features Extraction for UAV Navigation Based on Infrared Aerial Image and Satellite Reference Image Matching.
- Author
-
Zhang, Xupei, He, Zhanzhuang, Ma, Zhong, Wang, Zhongxi, and Wang, Li
- Subjects
- *
FEATURE extraction , *INFRARED imaging , *AERONAUTICAL navigation , *REMOTE-sensing images , *IMAGE registration , *CONVOLUTIONAL neural networks , *DRONE aircraft - Abstract
Local feature extraction is a crucial technology for image-matching navigation of an unmanned aerial vehicle (UAV), where the aim is to accurately and robustly match a real-time image and a geo-referenced image to obtain position update information for the UAV. However, it is a challenging task due to inconsistent image capture conditions, which lead to extreme appearance changes, especially given the different imaging principles of infrared and RGB images. In addition, the sparsity and labeling complexity of existing public datasets hinder the development of learning-based methods in this research area. This paper proposes a novel learned local feature extraction method, which uses local features extracted by a deep neural network to find corresponding features between the satellite RGB reference image and the real-time infrared image. First, we propose a single convolutional neural network that simultaneously extracts dense local features and their corresponding descriptors. This network combines the advantages of a high-repeatability local feature detector and high-reliability local feature descriptors to match the reference image and real-time image under extreme appearance changes. Second, to make full use of the sparse dataset, an iterative training scheme is proposed to automatically generate high-quality corresponding features for algorithm training. During this scheme, dense correspondences are automatically extracted, and geometric constraints are added to continuously improve their quality. With these improvements, the proposed method achieves state-of-the-art performance for infrared aerial (UAV-captured) images and satellite reference images, showing 4–6% improvements in precision, recall, and F1-score compared to the other methods. Moreover, applied experimental results show its potential and effectiveness for localization in UAV navigation and trajectory reconstruction applications. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
32. A New Lightweight Convolutional Neural Network for Multi-Scale Land Surface Water Extraction from GaoFen-1D Satellite Images.
- Author
-
Duan, Yueming, Zhang, Wenyi, Huang, Peng, He, Guojin, and Guo, Hongxiang
- Subjects
- *
CONVOLUTIONAL neural networks , *REMOTE-sensing images , *WATER efficiency , *BODIES of water , *REMOTE sensing - Abstract
Mapping land surface water automatically and accurately is closely related to human activity, biological reproduction, and the ecological environment. High spatial resolution remote sensing image (HSRRSI) data provide extensive details of land surface water and give reliable data support for the accurate extraction of land surface water information. The convolutional neural network (CNN), widely applied in semantic segmentation, provides an automatic extraction method for land surface water information. This paper proposes a new lightweight CNN named the Lightweight Multi-Scale Land Surface Water Extraction Network (LMSWENet) to extract land surface water information based on GaoFen-1D satellite data of Wuhan, Hubei Province, China. To verify the superiority of LMSWENet, we compared its efficiency and water extraction accuracy with four mainstream CNNs (DeepLabV3+, FCN, PSPNet, and UNet) using quantitative and visual comparisons. Furthermore, we used LMSWENet to extract land surface water information for Wuhan on a large scale and produced the land surface water map of Wuhan for 2020 (LSWMWH-2020) with 2 m spatial resolution. Random and equidistant validation points verified the mapping accuracy of LSWMWH-2020. The results are summarized as follows: (1) Compared with the other four CNNs, LMSWENet has a lightweight structure, significantly reducing algorithm complexity and training time. (2) LMSWENet performs well in extracting various types of water bodies and suppressing noise because it introduces channel and spatial attention mechanisms and combines features from multiple scales. The land surface water extraction results demonstrate that the performance of LMSWENet exceeds that of the other four CNNs. (3) LMSWENet can meet the requirements of high-precision mapping on a large scale. LSWMWH-2020 clearly shows the significant lakes, river networks, and small ponds in Wuhan with high mapping accuracy. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
33. Remote Sensing Image Scene Classification Based on Global Self-Attention Module.
- Author
-
Li, Qingwen, Yan, Dongmei, and Wu, Wanrong
- Subjects
- *
REMOTE sensing , *REMOTE-sensing images , *CONVOLUTIONAL neural networks , *OPTICAL remote sensing , *DEEP learning , *CLASSIFICATION , *TEXT recognition - Abstract
The complexity of scene images makes research on remote-sensing image scene classification challenging. With the wide application of deep learning in recent years, many remote-sensing scene classification methods using convolutional neural networks (CNNs) have emerged. Current CNNs usually output global information by integrating the depth features extracted from the convolutional layers through the fully connected layer; however, the global information extracted this way is not comprehensive. This paper proposes an improved remote-sensing image scene classification method based on a global self-attention module to address this problem. The global information is derived from the depth features extracted by the CNN. In order to better express the semantic information of the remote-sensing image, a multi-head self-attention module is introduced for global information augmentation. Meanwhile, a local perception unit is utilized to improve the self-attention module's representation capabilities for local objects. The proposed method's effectiveness is validated through comparative experiments with various training ratios and different scales on public datasets (UC Merced, AID, and NWPU-RESISC45). The precision of our proposed model is significantly improved compared to other methods for remote-sensing image scene classification. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
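The multi-head self-attention module at the core of the method above reduces, per head, to scaled dot-product attention. A minimal single-head sketch in pure Python; the real module adds learned query/key/value projections, multiple heads, and the local perception unit:

```python
import math

def softmax(xs):
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def attention(queries, keys, values):
    """Scaled dot-product attention over lists of feature vectors."""
    d = len(keys[0])
    out = []
    for q in queries:
        # similarity of this query to every key, scaled by sqrt(d)
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in keys]
        w = softmax(scores)
        # output is a convex combination of the value vectors
        out.append([sum(wi * v[j] for wi, v in zip(w, values))
                    for j in range(len(values[0]))])
    return out
```

Because the weights come from a softmax, each output row is a convex combination of the values, which is what lets every position aggregate global information from all others.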
34. A 3D Reconstruction Framework of Buildings Using Single Off-Nadir Satellite Image.
- Author
-
Zhao, Chunhui, Zhang, Chi, Yan, Yiming, and Su, Nan
- Subjects
- *
REMOTE-sensing images , *BUILDING repair , *SPATIAL resolution , *PROBLEM solving , *CONVOLUTIONAL neural networks - Abstract
A novel framework for 3D reconstruction of buildings based on a single off-nadir satellite image is proposed in this paper. Compared with traditional reconstruction methods that use multiple images in remote sensing, recovering 3D information from a single image reduces the demands of reconstruction tasks from the perspective of input data. It solves the problem that multiple images suitable for traditional reconstruction methods cannot be acquired in some regions where remote sensing resources are scarce. However, it is difficult to reconstruct a 3D model containing a complete shape and accurate scale from a single image. The geometric constraints are not sufficient, as the view angle, building size, and spatial resolution differ among remote sensing images. To solve this problem, the proposed reconstruction framework consists of two convolutional neural networks: Scale-Occupancy-Network (Scale-ONet) and a model scale optimization network (Optim-Net). Through reconstruction from the single off-nadir satellite image, Scale-ONet can generate water-tight mesh models with the exact shape and rough scale of buildings. Meanwhile, Optim-Net can reduce the scale error of these mesh models. Finally, the complete reconstructed scene is recovered by Model-Image matching. Profiting from the well-designed networks, our framework is robust to different input images with different view angles, building sizes, and spatial resolutions. Experimental results show that an ideal reconstruction accuracy can be obtained for both the model shape and the scale of buildings. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
35. Satellite Image Classification Using a Hierarchical Ensemble Learning and Correlation Coefficient-Based Gravitational Search Algorithm.
- Author
-
Thiagarajan, Kowsalya, Manapakkam Anandan, Mukunthan, Stateczny, Andrzej, Bidare Divakarachari, Parameshachari, and Kivudujogappa Lingappa, Hemalatha
- Subjects
- *
REMOTE-sensing images , *FEATURE extraction , *CONVOLUTIONAL neural networks , *FEATURE selection , *SUPPORT vector machines - Abstract
Satellite image classification is widely used in various real-time applications, such as the military, geospatial surveys, surveillance, and environmental monitoring. Therefore, effective classification of satellite images is required to improve classification accuracy. In this paper, the combination of a Hierarchical Framework and Ensemble Learning (HFEL) and optimal feature selection is proposed for the precise identification of satellite images. The HFEL uses three different types of Convolutional Neural Networks (CNN), namely AlexNet, LeNet-5, and a residual network (ResNet), to extract the appropriate features from images within the hierarchical framework. Additionally, the optimal features are selected from the feature set using the Correlation Coefficient-Based Gravitational Search Algorithm (CCGSA). Further, a Multi Support Vector Machine (MSVM) is used to classify the satellite images using features extracted from the fully connected layers of the CNNs and features selected by the CCGSA. Hence, the combination of HFEL and CCGSA is used to obtain precise classification over different datasets such as the SAT-4, SAT-6, and EuroSAT datasets. The performance of the proposed HFEL–CCGSA is analyzed in terms of accuracy, precision, and recall. The experimental results show that the HFEL–CCGSA method provides effective classification of satellite images. The classification accuracy of the HFEL–CCGSA method is 99.99%, which is higher than that of AlexNet, LeNet-5, and ResNet. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
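The gravitational-search metaheuristic in the entry above is too large for a snippet, but its fitness signal, the correlation coefficient, is simple. Below is a pure-Python Pearson score with a hypothetical `rank_features` helper that keeps the features most correlated with the labels; treat it as an illustration of the scoring idea, not CCGSA itself.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def rank_features(features, labels, k):
    """Keep indices of the k features most correlated (in magnitude) with the labels.

    features is a list of feature columns, one value per sample.
    """
    scored = [(abs(pearson(col, labels)), i) for i, col in enumerate(features)]
    scored.sort(reverse=True)
    return [i for _, i in scored[:k]]
```

A perfectly correlated or anti-correlated feature scores |r| = 1 and is selected ahead of weakly correlated ones.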
36. High-Resolution Boundary Refined Convolutional Neural Network for Automatic Agricultural Greenhouses Extraction from GaoFen-2 Satellite Imageries.
- Author
-
Zhang, Xiaoping, Cheng, Bo, Chen, Jinfen, and Liang, Chenbin
- Subjects
- *
CONVOLUTIONAL neural networks , *REMOTE-sensing images , *SPATIAL resolution , *REMOTE sensing , *GREENHOUSES , *LANDSAT satellites - Abstract
Agricultural greenhouses (AGs) are an important component of modern facility agriculture, and accurately mapping and dynamically monitoring their distribution are necessary for agricultural scientific management and planning. Semantic segmentation can be adopted for AG extraction from remote sensing images. However, the feature maps obtained by traditional deep convolutional neural network (DCNN)-based segmentation algorithms blur spatial details, and insufficient attention is usually paid to contextual representation. Meanwhile, the maintenance of the original morphological characteristics, especially the boundaries, is still a challenge for precise identification of AGs. To alleviate these problems, this paper proposes a novel network called high-resolution boundary refined network (HBRNet). In this method, we design a new backbone with multiple paths based on HRNetV2, aiming to preserve high spatial resolution and improve feature extraction capability, in which the Pyramid Cross Channel Attention (PCCA) module is embedded into residual blocks to strengthen the interaction of multiscale information. Moreover, the Spatial Enhancement (SE) module is employed to integrate the contextual information of different scales. In addition, we introduce the Spatial Gradient Variation (SGV) unit in the Boundary Refined (BR) module to couple the segmentation task and the boundary learning task, so that they can share latent high-level semantics and interact with each other, and combine this with a joint loss to refine the boundary. In our study, GaoFen-2 remote sensing images of Shouguang City, Shandong Province, China are selected to build the AG dataset. The experimental results show that HBRNet achieves a significant improvement in segmentation performance, reaching an IoU score of 94.89%, implying that this approach has advantages and potential for precise identification of AGs. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
37. Deep Learning Approaches to Earth Observation Change Detection.
- Author
-
Di Pilato, Antonio, Taggio, Nicolò, Pompili, Alexis, Iacobellis, Michele, Di Florio, Adriano, Passarelli, Davide, and Samarelli, Sergio
- Subjects
- *
DEEP learning , *CONVOLUTIONAL neural networks , *LAND cover , *REMOTE-sensing images , *REMOTE sensing - Abstract
The interest in change detection in the field of remote sensing has increased in the last few years. Searching for changes in satellite images has many useful applications, ranging from land cover and land use analysis to anomaly detection. In particular, urban change detection provides an efficient tool to study urban spread and growth through several years of observation. At the same time, change detection is often a computationally challenging and time-consuming task; therefore, a standard approach with manual detection of the elements of interest by experts in the domain of Earth Observation needs to be replaced by innovative methods that can guarantee optimal results with unquestionable value and within reasonable time. In this paper, we present two different approaches to change detection (semantic segmentation and classification) that both exploit convolutional neural networks to address these particular needs, which can be further refined and used in post-processing workflows for a large variety of applications. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
38. A Deep Learning Approach to an Enhanced Building Footprint and Road Detection in High-Resolution Satellite Imagery.
- Author
-
Ayala, Christian, Sesma, Rubén, Aranda, Carlos, and Galar, Mikel
- Subjects
- *
DEEP learning , *REMOTE-sensing images , *URBAN growth , *ROAD construction , *REMOTE sensing , *GEOSTATIONARY satellites , *VECTOR data , *LANDSAT satellites - Abstract
The detection of building footprints and road networks has many useful applications, including the monitoring of urban development and real-time navigation. Taking into account that a great deal of human attention is required by these remote sensing tasks, a lot of effort has been made to automate them. However, the vast majority of approaches rely on very high-resolution satellite imagery (<2.5 m), whose costs are not yet affordable for maintaining up-to-date maps. Working with the limited spatial resolution provided by high-resolution satellite imagery such as Sentinel-1 and Sentinel-2 (10 m) makes it hard to detect buildings and roads, since these labels may coexist within the same pixel. This paper focuses on this problem and presents a novel methodology capable of detecting buildings and roads of sub-pixel width by increasing the resolution of the output masks. The methodology consists of fusing Sentinel-1 and Sentinel-2 data (at 10 m) together with OpenStreetMap to train deep learning models for building and road detection at 2.5 m. This becomes possible thanks to the use of OpenStreetMap vector data, which can be rasterized to any desired resolution. Accordingly, a few simple yet effective modifications of the U-Net architecture are proposed to not only semantically segment the input image, but also to learn how to enhance the resolution of the output masks. As a result, the generated mappings quadruple the input spatial resolution, closing the gap between satellite and aerial imagery for building and road detection. To properly evaluate the generalization capabilities of the proposed methodology, a dataset composed of 44 cities across Spain has been considered and divided into training and testing cities. Both quantitative and qualitative results show that high-resolution satellite imagery can be used for sub-pixel-width building and road detection when the proper methodology is followed. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
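The methodology above hinges on the fact that OpenStreetMap vector data can be rasterized to any desired resolution to produce 2.5 m training masks. A minimal sketch of that step, burning a polygon onto a grid with even-odd ray casting over cell centers; the function names are illustrative, and real pipelines would use GDAL or rasterio instead.

```python
def point_in_polygon(x, y, poly):
    """Even-odd ray-casting test; poly is a list of (x, y) vertices."""
    inside = False
    n = len(poly)
    for i in range(n):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge straddles the horizontal ray
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x_cross > x:
                inside = not inside
    return inside

def rasterize(poly, width, height, cell):
    """Burn a polygon into a width x height binary grid with the given cell size.

    A cell is 1 when its center falls inside the polygon; shrinking `cell`
    rasterizes the same geometry at a finer resolution.
    """
    return [[1 if point_in_polygon((c + 0.5) * cell, (r + 0.5) * cell, poly) else 0
             for c in range(width)] for r in range(height)]
```

Halving `cell` (e.g., 10 m to 2.5 m in two steps) quadruples the linear resolution of the mask while the vector geometry stays exact.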
39. Precise Extraction of Buildings from High-Resolution Remote-Sensing Images Based on Semantic Edges and Segmentation.
- Author
-
Xia, Liegang, Zhang, Junxia, Zhang, Xiongbo, Yang, Haiping, and Xu, Meixia
- Subjects
- *
REMOTE-sensing images , *CONVOLUTIONAL neural networks , *PROBLEM solving , *EDGES (Geometry) , *REMOTE sensing , *IMAGE segmentation - Abstract
Building extraction is a basic task in the field of remote sensing, and it has been a popular research topic in the past decade. However, the shape of the semantic polygon generated by semantic segmentation is irregular and does not match the actual building boundary, while the building boundaries generated by semantic edge detection have difficulty ensuring continuity and integrity. Due to these problems, the results cannot be directly applied in many drawing tasks and engineering applications. In this paper, we propose a novel convolutional neural network (CNN) model based on multitask learning, Dense D-LinkNet (DDLNet), which adopts full-scale skip connections and an edge guidance module to ensure the effective combination of low-level and high-level information. DDLNet adapts well to both semantic segmentation tasks and edge detection tasks. Moreover, we propose a universal postprocessing method that integrates semantic edges and semantic polygons. It can solve the aforementioned problems and more accurately locate buildings, especially building boundaries. The experimental results show that DDLNet achieves great improvements compared with other edge detection and semantic segmentation networks. Our postprocessing method is effective and universal. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
40. SFRS-Net: A Cloud-Detection Method Based on Deep Convolutional Neural Networks for GF-1 Remote-Sensing Images.
- Author
-
Li, Xiaolong, Zheng, Hong, Han, Chuanzhao, Zheng, Wentao, Chen, Hao, Jing, Ying, and Dong, Kaihan
- Subjects
- *
REMOTE-sensing images , *CONVOLUTIONAL neural networks , *OPTICAL remote sensing , *OPTICAL images - Abstract
Clouds constitute a major obstacle to the application of optical remote-sensing images, as they destroy the continuity of the ground information in the images and reduce their utilization rate. Therefore, cloud detection has become an important preprocessing step for optical remote-sensing image applications. Because the cloud features used in current cloud-detection methods are mostly manually designed and the information in remote-sensing images is complex, the accuracy and generalization of current cloud-detection methods are unsatisfactory. As cloud detection aims to extract cloud regions from the background, it can be regarded as a semantic segmentation problem. A cloud-detection method based on deep convolutional neural networks (DCNN), a spatial folding–unfolding remote-sensing network (SFRS-Net), is introduced in this paper, and the reason for the inaccuracy of DCNNs during cloud region segmentation and the concept of space folding/unfolding are presented. The backbone network of the proposed method adopts an encoder–decoder structure, in which the pooling operation in the encoder is replaced by a folding operation, and the upsampling operation in the decoder is replaced by an unfolding operation. As a result, the accuracy of cloud detection is improved while generalization is guaranteed. In the experiment, multispectral data of the GaoFen-1 (GF-1) satellite were collected to form a dataset, and the overall accuracy (OA) of this method reaches 96.98%, a satisfactory result. This study aims to develop a method that is suitable for cloud detection and can complement other cloud-detection methods, providing a reference for researchers interested in cloud detection of remote-sensing images. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
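The entry above replaces pooling with a folding operation and upsampling with unfolding. A common lossless realization of that idea is space-to-depth and its inverse, sketched below; whether SFRS-Net uses exactly this rearrangement is an assumption, but it shows why folding, unlike pooling, discards no spatial information:

```python
def fold(grid, f=2):
    """Space-to-depth: group each f x f block of pixels into one multi-valued cell."""
    h, w = len(grid), len(grid[0])
    return [[[grid[r * f + i][c * f + j] for i in range(f) for j in range(f)]
             for c in range(w // f)] for r in range(h // f)]

def unfold(cells, f=2):
    """Depth-to-space: scatter each cell's values back to an f x f pixel block."""
    h, w = len(cells) * f, len(cells[0]) * f
    out = [[0] * w for _ in range(h)]
    for r, row in enumerate(cells):
        for c, cell in enumerate(row):
            for k, v in enumerate(cell):
                out[r * f + k // f][c * f + k % f] = v
    return out
```

`unfold(fold(grid))` reproduces the input exactly, whereas max- or average-pooling followed by upsampling cannot; that reversibility is the appeal of folding in an encoder–decoder.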
41. An Efficient Decision Support System for Flood Inundation Management Using Intermittent Remote-Sensing Data.
- Author
-
Sun, Hai, Dai, Xiaoyi, Shou, Wenchi, Wang, Jun, and Ruan, Xuejing
- Subjects
- *
DECISION support systems , *NUMERICAL solutions to partial differential equations , *FLOOD damage , *REMOTE-sensing images , *CONVOLUTIONAL neural networks , *PONDS - Abstract
Timely acquisition of spatial flood distribution is an essential basis for flood-disaster monitoring and management. Remote-sensing data have been widely used in water-body surveys. However, due to cloudy weather and complex geomorphic environments, the inability to receive remote-sensing images throughout the day results in missing data and prevents a dynamic, continuous record of the flood inundation process. To fully and effectively use remote-sensing data, we developed a new decision support system for integrated flood inundation management based on limited and intermittent remote-sensing data. Firstly, we established a new multi-scale water-extraction convolutional neural network named DEU-Net to extract water from remote-sensing images automatically. A specific dataset training method was created for typical region types to separate the water body from confusing surface features more accurately. Secondly, we built a waterfront contour active tracking model to implicitly describe the flood movement interface. In this way, the flooding process was converted into the numerical solution of the partial differential equation of the boundary function. An upwind difference scheme in space and an Euler difference scheme in time were used to perform the numerical solution. Finally, we established seven indicators that consider regional characteristics and flood-inundation attributes to evaluate flood-disaster losses. A cloud model using the entropy weight method was introduced to account for uncertainties in the various parameters. In the end, a decision support system realizing flood-loss risk visualization was developed using the ArcGIS application programming interface (API). To verify the effectiveness of the model constructed in this paper, we evaluated its performance through comparative experiments at both laboratory and actual scales.
The results were as follows: (1) The DEU-Net method had a better capability to accurately extract various water bodies, such as urban water bodies, open-air ponds, plateau lakes etc., than the other comparison methods. (2) The simulation results of the active tracking model had good temporal and spatial consistency with the image extraction results and actual statistical data compared with the synthetic observation data. (3) The application results showed that the system has high computational efficiency and noticeable visualization effects. The research results may provide a scientific basis for the emergency-response decision-making of flood disasters, especially in data-sparse regions. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
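The boundary-function PDE in the entry above is discretized with an upwind difference in space and an Euler difference in time. In one dimension, for the advection equation u_t + c·u_x = 0 with c > 0, that combination looks like this; a textbook sketch on a periodic grid, not the system's actual solver:

```python
def advect(u, c, dx, dt, steps):
    """First-order upwind in space, explicit Euler in time, for u_t + c*u_x = 0 (c > 0)."""
    u = list(u)
    r = c * dt / dx  # CFL number; the scheme is stable for r <= 1
    for _ in range(steps):
        # upwind: each cell takes its gradient from the cell behind it (periodic boundary)
        u = [u[i] - r * (u[i] - u[i - 1]) for i in range(len(u))]
    return u
```

With r = 1 the profile shifts exactly one cell per step, and for any r the periodic telescoping sum conserves total mass, which is what makes the scheme attractive for tracking an advancing waterfront.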
42. An Efficient Approach Based on Privacy-Preserving Deep Learning for Satellite Image Classification.
- Author
-
Alkhelaiwi, Munirah, Boulila, Wadii, Ahmad, Jawad, Koubaa, Anis, and Driss, Maha
- Subjects
- *
REMOTE-sensing images , *DEEP learning , *CONVOLUTIONAL neural networks , *BIODIVERSITY monitoring , *CLASSIFICATION , *TELECOMMUNICATION satellites - Abstract
Satellite images have drawn increasing interest from a wide variety of users, including business and government, ever since their increased usage in important fields ranging from weather, forestry, and agriculture to surface changes and biodiversity monitoring. Recent updates in the field have also introduced various deep learning (DL) architectures to satellite imagery as a means of extracting useful information. However, this new approach comes with its own issues, including the fact that many users utilize ready-made cloud services (both public and private) in order to take advantage of built-in DL algorithms and thus avoid the complexity of developing their own DL architectures. This, in turn, presents new challenges in protecting data against unauthorized access, mining, and usage of sensitive information extracted from that data. Therefore, new privacy concerns regarding sensitive data in satellite images have arisen. This research proposes an efficient approach that takes advantage of privacy-preserving deep learning (PPDL)-based techniques to address privacy concerns regarding data from satellite images when applying public DL models. In this paper, we propose a partially homomorphic encryption scheme (a Paillier scheme), which enables the processing of confidential information without exposure of the underlying data. Our method achieves robust results when applied to a custom convolutional neural network (CNN) as well as to existing transfer learning methods. The proposed encryption scheme also allows training CNN models on encrypted data directly, which incurs lower computational overhead. Our experiments have been performed on a real-world dataset covering several regions across Saudi Arabia. The results demonstrate that our CNN-based models were able to retain data utility while maintaining data privacy.
Security parameters such as correlation coefficient (−0.004), entropy (7.95), energy (0.01), contrast (10.57), number of pixel change rate (4.86), unified average change intensity (33.66), and more are in favor of our proposed encryption scheme. To the best of our knowledge, this research is also one of the first studies that applies PPDL-based techniques to satellite image data in any capacity. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
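The Paillier scheme named above is additively homomorphic: multiplying two ciphertexts modulo n² yields a ciphertext of the sum of the plaintexts, so a server can aggregate encrypted values without seeing them. A toy sketch with deliberately tiny, insecure primes (real deployments use 2048-bit moduli and beyond):

```python
import math
import random

def keygen(p=61, q=53):
    """Toy Paillier keypair; p and q are far too small for real use."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)  # valid because we fix the generator g = n + 1
    return (n,), (n, lam, mu)

def encrypt(pub, m):
    (n,) = pub
    n2 = n * n
    r = random.randrange(1, n)          # fresh randomness per ciphertext
    while math.gcd(r, n) != 1:
        r = random.randrange(1, n)
    return pow(n + 1, m, n2) * pow(r, n, n2) % n2

def decrypt(priv, c):
    n, lam, mu = priv
    return (pow(c, lam, n * n) - 1) // n * mu % n
```

Decrypting the product `encrypt(pub, 12) * encrypt(pub, 30) % n**2` returns 42, which is the property PPDL pipelines exploit for inference over encrypted features.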
43. Comparing Solo Versus Ensemble Convolutional Neural Networks for Wetland Classification Using Multi-Spectral Satellite Imagery.
- Author
-
Jamali, Ali, Mahdianpari, Masoud, Brisco, Brian, Granger, Jean, Mohammadimanesh, Fariba, and Salehi, Bahram
- Subjects
- *
CONVOLUTIONAL neural networks , *LAND cover , *REMOTE-sensing images , *DEEP learning , *WETLANDS , *CLIMATE change mitigation , *MACHINE learning - Abstract
Wetlands are important ecosystems that are linked to climate change mitigation. As 25% of global wetlands are located in Canada, accurate and up-to-date wetland classification is of high importance, nationally and internationally. The advent of deep learning techniques has revolutionized the current use of machine learning algorithms to classify complex environments, specifically in remote sensing. In this paper, we explore the potential and possible limitations of ensemble deep learning techniques for complex wetland classification and discuss the potential and limitations of various solo convolutional neural networks (CNNs), including DenseNet, GoogLeNet, ShuffleNet, MobileNet, Xception, Inception-ResNet, ResNet18, and ResNet101, in three different study areas located in Newfoundland and Labrador, Canada (i.e., Avalon, Gros Morne, and Grand Falls). Moreover, to improve the classification accuracy of the wetland classes of bog, fen, marsh, swamp, and shallow water, the results of the three best CNNs in each study area are fused using three supervised classifiers, random forest (RF), bagged tree (BTree), and Bayesian optimized tree (BOT), and one unsupervised majority-voting classifier. The results suggest that the ensemble models, in particular BTree, have a valuable role to play in the classification of the wetland classes of bog, fen, marsh, swamp, and shallow water. The ensemble CNNs show an improvement of 9.63–19.04% in terms of mean producer's accuracy compared to the solo CNNs in recognizing wetland classes in the three study areas. This research indicates promising potential for integrating ensemble-based learning and deep learning for operational large-area land cover mapping, particularly complex wetland type classification. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
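The unsupervised fusion step in the entry above can be as simple as per-sample majority voting over the class labels predicted by the three best CNNs; the supervised RF/BTree/BOT fusers are separate trained models. A minimal sketch:

```python
from collections import Counter

def majority_vote(predictions):
    """Fuse per-model label sequences sample-by-sample.

    predictions is a list of label lists, one list per model; ties are broken
    by the order in which labels first appear among the votes.
    """
    fused = []
    for votes in zip(*predictions):          # one tuple of votes per sample
        fused.append(Counter(votes).most_common(1)[0][0])
    return fused
```

Three models voting ["bog", "bog", "fen"] on a sample fuse to "bog"; a label only needs a plurality, not a strict majority.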
44. Vision Transformers for Remote Sensing Image Classification.
- Author
-
Bazi, Yakoub, Bashmal, Laila, Rahhal, Mohamad M. Al, Dayil, Reham Al, Ajlan, Naif Al, and Wang, Qi
- Subjects
- *
REMOTE sensing , *REMOTE-sensing images , *CONVOLUTIONAL neural networks , *NATURAL language processing , *CLASSIFICATION , *VISION , *OPTICAL remote sensing - Abstract
In this paper, we propose a remote-sensing scene-classification method based on vision transformers. These types of networks, which are now recognized as state-of-the-art models in natural language processing, do not rely on convolution layers as in standard convolutional neural networks (CNNs). Instead, they use multihead attention mechanisms as the main building block to derive long-range contextual relations between pixels in images. In a first step, the images under analysis are divided into patches, then converted to a sequence by flattening and embedding. To keep information about the position, a position embedding is added to each patch. The resulting sequence is then fed to several multihead attention layers to generate the final representation. At the classification stage, the first token of the sequence is fed to a softmax classification layer. To boost the classification performance, we explore several data augmentation strategies to generate additional data for training. Moreover, we show experimentally that we can compress the network by pruning half of the layers while keeping competitive classification accuracies. Experimental results conducted on different remote-sensing image datasets demonstrate the promising capability of the model compared to state-of-the-art methods. Specifically, the Vision Transformer obtains an average classification accuracy of 98.49%, 95.86%, 95.56%, and 93.83% on the Merced, AID, Optimal31, and NWPU datasets, respectively, while the compressed version obtained by removing half of the multihead attention layers yields 97.90%, 94.27%, 95.30%, and 93.05%, respectively. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
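The first step described above, dividing the image into patches and flattening them into a sequence, can be sketched as follows; the real model then linearly embeds each patch, prepends a class token, and adds learned position embeddings before the attention layers:

```python
def patchify(image, p):
    """Split an H x W image (list of rows) into flattened p x p patches, row-major.

    Assumes H and W are divisible by p; yields (H // p) * (W // p) patches,
    each a flat list of p * p pixel values.
    """
    h, w = len(image), len(image[0])
    patches = []
    for r in range(0, h, p):
        for c in range(0, w, p):
            patches.append([image[r + i][c + j]
                            for i in range(p) for j in range(p)])
    return patches
```

A 224 x 224 image with p = 16 would yield a 196-token sequence, which is why the position embedding is needed: flattening alone destroys the 2D layout.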
45. Reduced-Complexity End-to-End Variational Autoencoder for on Board Satellite Image Compression.
- Author
-
Alves de Oliveira, Vinicius, Chabert, Marie, Oberlin, Thomas, Poulliat, Charly, Bruno, Mickael, Latry, Christophe, Carlavan, Mikael, Henrot, Simon, Falzon, Frederic, Camarero, Roberto, and Lukin, Vladimir
- Subjects
- *
IMAGE compression , *REMOTE-sensing images , *VIDEO coding , *CONVOLUTIONAL neural networks , *COMPUTATIONAL complexity , *IMAGE representation - Abstract
Recently, convolutional neural networks have been successfully applied to lossy image compression. End-to-end optimized autoencoders, possibly variational, are able to dramatically outperform traditional transform coding schemes in terms of rate-distortion trade-off; however, this comes at the cost of a higher computational complexity. An intensive training step on huge databases allows autoencoders to learn jointly the image representation and its probability distribution, possibly using a non-parametric density model or a hyperprior auxiliary autoencoder to eliminate the need for prior knowledge. However, in the context of on-board satellite compression, time and memory complexities are subject to strong constraints. The aim of this paper is to design a complexity-reduced variational autoencoder that meets these constraints while maintaining performance. Apart from a network dimension reduction that systematically targets each parameter of the analysis and synthesis transforms, we propose a simplified entropy model that preserves adaptability to the input image. Indeed, a statistical analysis performed on satellite images shows that the Laplacian distribution fits most features of their representation. A complex non-parametric distribution fitting or a cumbersome hyperprior auxiliary autoencoder can thus be replaced by a simple parametric estimation. The proposed complexity-reduced autoencoder outperforms the Consultative Committee for Space Data Systems standard (CCSDS 122.0-B) while maintaining competitive performance, in terms of rate-distortion trade-off, in comparison with state-of-the-art learned image compression schemes. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
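The simplified entropy model above rests on the observation that a Laplacian fits the latent features, so the bit cost of a quantized latent can be read off a closed-form CDF instead of a learned density. A sketch of that rate estimate; the paper's exact parameterization of the location μ and scale b is not given here, so the defaults are placeholders:

```python
import math

def laplace_cdf(x, mu=0.0, b=1.0):
    """CDF of the Laplace(mu, b) distribution."""
    z = (x - mu) / b
    return 0.5 * math.exp(z) if z < 0 else 1.0 - 0.5 * math.exp(-z)

def rate_bits(x, mu=0.0, b=1.0):
    """Estimated bits to entropy-code an integer latent x.

    Integrates the Laplacian over the quantization bin [x - 0.5, x + 0.5]
    and returns the ideal code length -log2(p).
    """
    p = laplace_cdf(x + 0.5, mu, b) - laplace_cdf(x - 0.5, mu, b)
    return -math.log2(p)
```

Latents near the mode are cheap (about 1.35 bits at x = 0 with the defaults) while outliers cost more, which is exactly the behavior an arithmetic coder driven by this model exploits.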
46. ℱ³-Net: Feature Fusion and Filtration Network for Object Detection in Optical Remote Sensing Images.
- Author
-
Ye, Xinhai, Xiong, Fengchao, Lu, Jianfeng, Zhou, Jun, and Qian, Yuntao
- Subjects
- *
REMOTE-sensing images , *OPTICAL remote sensing , *FILTERS & filtration , *CONVOLUTIONAL neural networks , *REMOTE sensing - Abstract
Object detection in remote sensing (RS) images is a challenging task due to the difficulties of small size, varied appearance, and complex background. Although many methods have been developed to address this problem, many of them cannot fully exploit multilevel context information or handle cluttered background in RS images. To this end, in this paper, we propose a feature fusion and filtration network (F³-Net) to improve object detection in RS images, which has a higher capacity for combining context information at multiple scales while suppressing interference from the background. Specifically, F³-Net leverages a feature adaptation block with a residual structure to adjust the backbone network in an end-to-end manner, better considering the characteristics of RS images. Afterward, the network learns the context information of the object at multiple scales by hierarchically fusing the feature maps from different layers. In order to suppress interference from cluttered background, the fused feature is then projected into a low-dimensional subspace by an additional feature filtration module. As a result, more relevant and accurate context information is extracted for further detection. Extensive experiments on the DOTA, NWPU VHR-10, and UCAS AOD datasets demonstrate that the proposed detector achieves very promising detection performance. [ABSTRACT FROM AUTHOR]
- Published
- 2020
- Full Text
- View/download PDF
47. Multi-Label Remote Sensing Image Scene Classification by Combining a Convolutional Neural Network and a Graph Neural Network.
- Author
-
Li, Yansheng, Chen, Ruixian, Zhang, Yongjun, Zhang, Mi, and Chen, Ling
- Subjects
- *
CONVOLUTIONAL neural networks , *REMOTE-sensing images , *MULTILAYERS , *DEEP learning , *REMOTE sensing , *LABELS , *SYSTEM integration - Abstract
As one of the fundamental tasks in remote sensing (RS) image understanding, multi-label remote sensing image scene classification (MLRSSC) is attracting increasing research interest. Human beings can easily perform MLRSSC by examining the visual elements contained in the scene and the spatio-topological relationships of these visual elements. However, most existing methods are limited to perceiving visual elements while disregarding their spatio-topological relationships. With this consideration, this paper proposes a novel deep learning-based MLRSSC framework that combines a convolutional neural network (CNN) and a graph neural network (GNN), termed MLRSSC-CNN-GNN. Specifically, the CNN is employed to learn the perception ability of visual elements in the scene and generate high-level appearance features. Based on the trained CNN, one scene graph is constructed for each scene, where the nodes of the graph are represented by superpixel regions of the scene. To fully mine the spatio-topological relationships of the scene graph, a multi-layer-integration graph attention network (GAT) model is proposed to address MLRSSC, where the GAT is one of the latest developments in GNNs. Extensive experiments on two public MLRSSC datasets show that the proposed MLRSSC-CNN-GNN obtains superior performance compared with state-of-the-art methods. [ABSTRACT FROM AUTHOR]
- Published
- 2020
- Full Text
- View/download PDF
48. Object-Oriented Open-Pit Mine Mapping Using Gaofen-2 Satellite Image and Convolutional Neural Network, for the Yuzhou City, China.
- Author
-
Chen, Tao, Hu, Naixun, Niu, Ruiqing, Zhen, Na, and Plaza, Antonio
- Subjects
- *
CONVOLUTIONAL neural networks , *REMOTE-sensing images , *LAND use mapping , *MINES & mineral resources , *ENVIRONMENTAL management , *STRIP mining - Abstract
Our society's growing need for mineral resources brings with it the associated risk of degrading our natural environment as well as impacting neighboring communities. To better manage this risk, especially for open-pit mine (OM) operations, new earth observation tools are required for more accurate baseline mapping and subsequent monitoring. The purpose of this paper is to propose an object-oriented open-pit mine mapping (OOMM) framework from Gaofen-2 (GF-2) high-spatial-resolution satellite images (HSRSIs), based on convolutional neural networks (CNNs). To better present the different land use categories (LUCs) in the OM area, a minimum heterogeneity criterion-based multi-scale segmentation method was used, while a mean area ratio method was applied to optimize the segmentation scale of each LUC. After image segmentation, three object-feature domains were obtained from the GF-2 HSRSI: spectral, texture, and geometric features. Then, the gradient boosting decision tree and the Pearson correlation coefficient were used as an object feature information reduction (FIR) method to recognize the distinguishing features that describe open-pit mines (OMs). Finally, the CNN was used with the combined significant features to map the OM. In total, 105 OM sites were extracted from the interpretation of GF-2 HSRSIs, and the boundary of each OM was validated by field work and used as input to evaluate the open-pit mine mapping (OMM) accuracy.
The results revealed that: (1) the FIR tool had a positive impact on effective OMM; (2) splitting the segmented objects into a training/testing set (70% of the objects) and a validation set (the remaining 30%), then combining the selected feature subsets for training, achieved an overall accuracy (OA) of 90.13% and a Kappa coefficient (KC) of 0.88 over the whole dataset; (3) compared with a state-of-the-art method, the support vector machine (SVM), the proposed framework outperformed SVM in OMM by more than 7.28% in OA, 8.64% in KC, 6.15% in producer's accuracy of OM, and 9.31% in user's accuracy of OM. To the best of our knowledge, this is the first time OM information has been mapped by integrating multi-scale segmentation of HSRSI with a CNN. The proposed framework not only provides reliable technical support for the scientific management and environmental monitoring of open-pit mining areas, but is also general enough to be applicable to other kinds of land use mapping in mining areas using HSR images. [ABSTRACT FROM AUTHOR]
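The FIR step pairs GBDT feature importances with Pearson correlations. One plausible reading of that combination, sketched minimally below, is a greedy filter that keeps features in importance order and drops near-duplicates; the importances, threshold, and toy data are invented for illustration and are not the authors' exact procedure:

```python
import numpy as np

def reduce_features(X, importance, corr_thresh=0.9):
    """Greedy redundancy filter: keep features in descending
    importance order, dropping any feature whose absolute Pearson
    correlation with an already-kept feature exceeds the threshold."""
    order = np.argsort(importance)[::-1]
    corr = np.corrcoef(X, rowvar=False)
    kept = []
    for idx in order:
        if all(abs(corr[idx, k]) < corr_thresh for k in kept):
            kept.append(int(idx))
    return sorted(kept)

# feature 1 is a near-duplicate of feature 0 and should be dropped
rng = np.random.default_rng(1)
base = rng.normal(size=(100, 3))
X = np.column_stack([base[:, 0],
                     base[:, 0] + 0.01 * rng.normal(size=100),
                     base[:, 1],
                     base[:, 2]])
imp = np.array([0.40, 0.35, 0.15, 0.10])   # e.g. GBDT importances
print(reduce_features(X, imp))             # [0, 2, 3]
```

In the paper's pipeline the retained spectral, texture, and geometric features would then feed the CNN classifier.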
- Published
- 2020
- Full Text
- View/download PDF
49. Farmland Parcel Mapping in Mountain Areas Using Time-Series SAR Data and VHR Optical Images.
- Author
-
Liu, Wei, Wang, Jian, Luo, Jiancheng, Wu, Zhifeng, Chen, Jingdong, Zhou, Yanan, Sun, Yingwei, Shen, Zhanfeng, Xu, Nan, and Yang, Yingpin
- Subjects
- *
OPTICAL images , *SYNTHETIC aperture radar , *CONVOLUTIONAL neural networks , *REMOTE-sensing images , *INSPECTION & review - Abstract
Accurate, timely, and reliable farmland mapping is a prerequisite for agricultural management and environmental assessment in mountainous areas. However, in these areas, high spatial heterogeneity and diversified planting structures together generate many small farmland parcels with irregular shapes that are difficult to delineate accurately. In addition, the absence of optical data caused by the cloudy and rainy climate impedes the use of time-series optical data to distinguish farmland from other land use types. Automatic delineation of farmland parcels in mountain areas is therefore still a very difficult task. This paper proposes an innovative precise farmland parcel extraction approach supported by very high resolution (VHR) optical imagery and time-series synthetic aperture radar (SAR) data. Firstly, Google satellite imagery with a spatial resolution of 0.55 m was used to delineate the boundaries of ground parcel objects in mountainous areas with a hierarchical extraction scheme. This scheme divides farmland into four types based on the morphological features presented in optical imagery and designs a different extraction model for each farmland type. The potential farmland parcel distribution map is then obtained by the layered recombination of these four farmland types. Subsequently, the time profile of each parcel in this map was constructed from five radar variables of the Sentinel-1A dataset, and a time-series classification method was used to distinguish farmland parcels from other types. An experiment was carried out in the north of Guiyang City, Guizhou Province, Southwest China. The results show that the producer's accuracy of farmland parcels obtained with the hierarchical scheme increases by 7.39% to 96.38% compared with that obtained without it, and the time-series classification method achieves an accuracy of 80.83%, yielding a final overall accuracy of 96.05% for the farmland parcel maps.
In addition, visual inspection shows that the method better suppresses background noise in mountainous areas, and the extracted farmland parcels are closer to the actual distribution of ground farmland. [ABSTRACT FROM AUTHOR]
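The abstract does not specify which time-series classifier is applied to the per-parcel radar profiles. As a minimal stand-in to show the idea, the sketch below assigns each parcel's temporal profile the label of its nearest labeled reference profile (1-NN, Euclidean distance); the profiles, labels, and variable choice are all invented:

```python
import numpy as np

def classify_profiles(profiles, ref_profiles, ref_labels):
    """1-nearest-neighbour time-series classification: assign each
    parcel's temporal profile the label of the closest reference
    profile (Euclidean distance over the time axis)."""
    out = []
    for p in profiles:
        d = np.linalg.norm(ref_profiles - p, axis=1)
        out.append(ref_labels[int(np.argmin(d))])
    return out

# toy profiles: 6 time steps of one radar variable (e.g. VH backscatter)
refs = np.array([[1, 2, 3, 3, 2, 1],    # "farmland": seasonal curve
                 [2, 2, 2, 2, 2, 2]])   # "other": flat curve
labels = ["farmland", "other"]
parcels = np.array([[1.1, 2.2, 2.9, 3.1, 1.8, 0.9],
                    [2.1, 1.9, 2.0, 2.2, 1.9, 2.0]])
print(classify_profiles(parcels, refs, labels))  # ['farmland', 'other']
```

In the paper's setting each parcel would carry five Sentinel-1A radar variables per date rather than the single toy variable used here.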
- Published
- 2020
- Full Text
- View/download PDF
50. Impairing Land Registry: Social, Demographic, and Economic Determinants of Forest Classification Errors.
- Author
-
Adamiak, Maciej, Biczkowski, Mirosław, Leśniewska-Napierała, Katarzyna, Nalej, Marta, and Napierała, Tomasz
- Subjects
- *
FOREST protection , *LAND use planning , *METROPOLITAN areas , *CONVOLUTIONAL neural networks , *REMOTE-sensing images , *CLINICAL trial registries - Abstract
This paper investigates the social, demographic, and economic factors determining differences between forest identification based on remote sensing techniques and the land registry. The Database of Topographic Objects and Sentinel-2 satellite imagery from 2018 were used to train a supervised machine learning model for forest detection. Results aggregated to communes (NUTS-5 units) were compared to land registry data delivered in the Local Data Bank by Statistics Poland. The differences identified between the above-mentioned sources were defined as land-registry errors. Then, geographically weighted regression was applied to explain the spatially varying impact of the investigated error determinants: urbanization processes, civic society development, education, land ownership, and the culture and quality of spatial planning. The research area covers the entirety of Poland. It was confirmed that in less developed areas, local development policy stimulating urbanization processes does not respect land use planning principles, including the accuracy of the land registry. A high education level of society leads to protective measures against further overestimation of forest cover in the land registry in substantially urbanized areas. Finally, higher coverage by valid local spatial development plans stimulates protection against forest classification errors in the land registry. [ABSTRACT FROM AUTHOR]
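Geographically weighted regression fits a separate weighted least-squares model at each location, with observations down-weighted by distance. A minimal sketch of that local fit, on synthetic data where the predictor's effect grows from west to east (the kernel, bandwidth, and data are all illustrative, not the paper's specification):

```python
import numpy as np

def gwr_local_coeffs(coords, X, y, point, bandwidth):
    """Local weighted least squares at one location, the core of GWR:
    observations are down-weighted by a Gaussian kernel of their
    distance to `point`, so coefficients can vary over space."""
    d = np.linalg.norm(coords - point, axis=1)
    w = np.exp(-0.5 * (d / bandwidth) ** 2)     # Gaussian kernel weights
    Xd = np.column_stack([np.ones(len(y)), X])  # add intercept column
    beta = np.linalg.solve(Xd.T @ (Xd * w[:, None]), Xd.T @ (w * y))
    return beta                                 # [intercept, slope, ...]

# synthetic surface: the predictor's local effect grows eastward
rng = np.random.default_rng(2)
coords = rng.uniform(0, 10, size=(200, 2))
x = rng.normal(size=200)
y = (0.5 + 0.2 * coords[:, 0]) * x + rng.normal(scale=0.1, size=200)
west = gwr_local_coeffs(coords, x, y, np.array([1.0, 5.0]), bandwidth=2.0)
east = gwr_local_coeffs(coords, x, y, np.array([9.0, 5.0]), bandwidth=2.0)
print(west[1] < east[1])   # True: the locally estimated slope increases eastward
```

Mapping the local coefficients at every commune centroid is what lets the paper describe determinants whose impact varies over Poland.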
- Published
- 2020
- Full Text
- View/download PDF