27 results on '"Swin-Unet"'
Search Results
2. EDH-STNet: An Evaporation Duct Height Spatiotemporal Prediction Model Based on Swin-Unet Integrating Multiple Environmental Information Sources.
- Author
- Ji, Hanjie, Guo, Lixin, Zhang, Jinpeng, Wei, Yiwen, Guo, Xiangming, and Zhang, Yusheng
- Subjects
- *MACHINE learning, *RADIO technology, *PREDICTION models, *INFORMATION resources, *FORECASTING
- Abstract
Given the significant spatial non-uniformity of marine evaporation ducts, accurately predicting the regional distribution of evaporation duct height (EDH) is crucial for ensuring the stable operation of radio systems. While machine-learning-based EDH prediction models have been extensively developed, they fail to provide the EDH distribution over large-scale regions in practical applications. To address this limitation, we have developed a novel spatiotemporal prediction model for EDH that integrates multiple environmental information sources, termed the EDH Spatiotemporal Network (EDH-STNet). This model is based on the Swin-Unet architecture, employing an Encoder–Decoder framework that utilizes consecutive Swin-Transformers. This design effectively captures complex spatial correlations and temporal characteristics. The EDH-STNet model also incorporates nonlinear relationships between various hydrometeorological parameters (HMPs) and EDH. In contrast to existing models, it introduces multiple HMPs to enhance these relationships. By adopting a data-driven approach that integrates these HMPs as prior information, the accuracy and reliability of spatiotemporal predictions are significantly improved. Comprehensive testing and evaluation demonstrate that the EDH-STNet model, which merges an advanced deep learning algorithm with multiple HMPs, yields accurate predictions of EDH for both immediate and future timeframes. This development offers a novel solution to ensure the stable operation of radio systems. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
3. Green Tomato Segmentation Model Based on Optimized Swin-Unet Algorithm Under Facility Environments.
- Author
- Ru Jiang, Huichuan Duan, Jingyu Yan, and Weikuan Jia
- Subjects
- *COMPUTER vision, *AGRICULTURE, *FEATURE extraction, *AGRICULTURAL equipment, *FRUIT, *TOMATOES
- Abstract
In facility-based agricultural environments, accurately identifying green tomatoes presents a significant challenge for machine vision systems due to the color similarity between green fruits and background branches and leaves as well as the overlapping occlusion between fruits. To solve this problem, this study constructs and optimizes the Attention Gate (AG) module using Swin-Unet as the baseline model, so that the model can focus on the features related to green tomatoes, suppress irrelevant regions in the background, and effectively enhance the representation of target features. Additionally, in order to optimize the edge smoothing of green tomato segmentation, this study further introduces an Atrous Spatial Pyramid Pooling (ASPP) module, which significantly improves the segmentation accuracy by expanding the receptive field and enhancing the multi-scale feature extraction capability. Experimental results on the specially constructed green tomato dataset show that the model achieves 97.5%, 92.4% and 85.9% for Pixel Accuracy (PA), Dice similarity coefficient (Dice) and Intersection over Union (IOU), respectively. The new model outperforms several existing semantic segmentation models in several key metrics, proving its effectiveness in complex facility environments. This research not only addresses the technical difficulties in recognizing green fruits, but also provides solid technical support for the development and application of intelligent agricultural equipment. The model can be applied to segmentation and recognition of other types of fruits to meet the accuracy and efficiency requirements of green fruit recognition in smart agricultural equipment, which has a broad application prospect. [ABSTRACT FROM AUTHOR]
- Published
- 2024
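Note on the metrics in the record above: the three scores it reports (PA, Dice, IoU) can be illustrated with a minimal sketch for binary masks. The function name `segmentation_metrics` and the flat 0/1 list representation are assumptions made for illustration only; the record does not show the authors' evaluation code.

```python
def segmentation_metrics(pred, target):
    """Return (pixel accuracy, Dice, IoU) for two equal-length binary masks.

    Hypothetical helper: masks are flat lists of 0/1 values.
    """
    if len(pred) != len(target) or not pred:
        raise ValueError("masks must be non-empty and of equal length")

    # Count the four confusion-matrix cells over all pixels.
    tp = sum(1 for p, t in zip(pred, target) if p == 1 and t == 1)
    tn = sum(1 for p, t in zip(pred, target) if p == 0 and t == 0)
    fp = sum(1 for p, t in zip(pred, target) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(pred, target) if p == 0 and t == 1)

    pixel_accuracy = (tp + tn) / len(pred)
    # Convention assumed here: two empty masks count as a perfect match.
    dice = 2 * tp / (2 * tp + fp + fn) if (tp + fp + fn) else 1.0
    iou = tp / (tp + fp + fn) if (tp + fp + fn) else 1.0
    return pixel_accuracy, dice, iou


pa, dice, iou = segmentation_metrics([1, 1, 0, 0, 1, 0], [1, 0, 0, 0, 1, 1])
```

Because Dice = 2·IoU / (1 + IoU) for a single mask pair, Dice is never lower than IoU on the same prediction, which is consistent with the ordering of the reported values.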
4. A Classification and Segmentation Model for Diamond Abrasive Grains Based on Improved Swin-Unet-SAM.
- Author
- Lin, Yanfen, Fan, Tinghao, and Fang, Congfu
- Subjects
- INDUSTRIAL diamonds, TRANSFORMER models, INDUSTRIAL capacity, MACHINE learning, DIAMONDS, DEEP learning
- Abstract
The detection of abrasive grain images in diamond tools serves as the foundation for assessing the overall condition of the tools, encompassing crucial aspects of diamond abrasive grains like the quantity, size, morphology, and distribution. Given the intricate background textures and reflective characteristics exhibited by diamond images, diamond detection and segmentation pose a significant challenge. Recently, numerous defect detection methods based on machine learning and deep learning have emerged. However, several issues persist, such as detection accuracy and the interference caused by intricate background textures. The present work demonstrates an efficient classification and segmentation network algorithm that combines Swin-Unet with SAM (Segment Anything Model) to alleviate the existing problems. Specifically, four embedding structures were devised to bridge the two models for iterative training. The transformer blocks within the Swin-Unet model were enhanced to facilitate classification and coarse segmentation, and the mask structure in SAM was refined to enable fine segmentation. The experimental results show that under a small sample dataset with complex background textures, the average index values of ACC (accuracy), SE (Sensitivity), and DSC (Dice Similarity Coefficient) for the classification and segmentation of diamond abrasive grains reached 98.7%, 92.5%, and 85.9%, respectively. Compared with the model before improvement, its ACC, SE and DSC increased by 1.2%, 15.9%, and 7.6%, respectively. The test results, based on four different datasets, consistently indicated that this model has excellent segmentation performance and robustness and has great application potential in the industrial field. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
5. 2D magnetotelluric imaging method based on visionary self-attention mechanism and data science.
- Author
- Luo, Yaohua, Li, Jiachen, Wang, Xuben, Zong, Junjie, and Tang, Haoyu
- Subjects
- *MAGNETOTELLURIC prospecting, *UNDERGROUND construction, *ELECTROMAGNETIC fields, *DEEP learning, *DATA science
- Abstract
2D magnetotelluric (MT) imaging detects underground structures by measuring electromagnetic fields. This study tackles two issues in the field: traditional methods' limitations due to insufficient forward modeling data, and the challenge of multiple solutions in complex scenarios. We introduce an enhanced 2D MT imaging approach with a novel self-attention mechanism, involving: 1. Generating diverse geophysical models and responses to increase data variety and volume. 2. Creating a Swin–Unet-based 2D MT Imaging network with self-attention for better modeling and relation capture, incorporating a MT sample generator using real data to lessen large-scale supervised training dependence, and refining the loss function for optimal validation. This method also includes eliminating MT background response to boost training efficiency and reduce training time. 3. Applying a transverse electric/transverse magnetic method for comprehensive 2D MT data response. Tests show that our method greatly improves 2D MT imaging's accuracy and efficiency, with excellent generalization. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
6. The InterVision Framework: An Enhanced Fine-Tuning Deep Learning Strategy for Auto-Segmentation in Head and Neck.
- Author
- Choi, Byongsu, Beltran, Chris J., Yoo, Sang Kyun, Kwon, Na Hye, Kim, Jin Sung, and Park, Justin Chunjoo
- Subjects
- *MACHINE learning, *RADIOTHERAPY treatment planning, *DEEP learning, *IMAGE registration, *TRANSFORMER models
- Abstract
Simple Summary: The InterVision framework employs advanced deep learning techniques to interpolate or create intermediate images between existing ones using deformable vectors, thereby capturing specific patient characteristics, such as unique anatomical features and variations in organ shape, size, and position. These characteristics are vital for personalizing treatment plans in radiotherapy, as they allow for the use of pre-planning information, which is available before the treatment begins, ensuring a tailored and precise approach to each patient's care. The training process involves two steps: first, generating a general model using a comprehensive dataset, and second, fine-tuning this general model with additional data produced by the InterVision framework. By incorporating the dataset generated through the InterVision framework, we were able to create a more personalized model, surpassing the level of customization achieved by previous fine-tuning approaches. The performance of these models is evaluated using the volumetric dice similarity coefficient (VDSC) and the Hausdorff distance 95% (HD95%) across 18 anatomical structures in 20 test patients. A total of 18 anatomical structures were selected based on prior treatments that involved the most organs, and 20 test patients were chosen according to the availability that has a re-planning CT and manual contours within the total dataset. This framework is especially valuable for accurately predicting complex organs and targets that present significant challenges for traditional deep learning algorithms, particularly due to the intricate contours and the variability in organ shapes across different patients.
Adaptive radiotherapy (ART) workflows are increasingly adopted to achieve dose escalation and tissue sparing under dynamic anatomical conditions. However, recontouring and time constraints hinder the implementation of real-time ART workflows. Various auto-segmentation methods, including deformable image registration, atlas-based segmentation, and deep learning-based segmentation (DLS), have been developed to address these challenges. Despite the potential of DLS methods, clinical implementation remains difficult due to the need for large, high-quality datasets to ensure model generalizability. This study introduces an InterVision framework for segmentation. The InterVision framework can interpolate or create intermediate visuals between existing images to generate specific patient characteristics. The InterVision model is trained in two steps: (1) generating a general model using the dataset, and (2) tuning the general model using the dataset generated from the InterVision framework. The InterVision framework generates intermediate images between existing patient image slides using deformable vectors, effectively capturing unique patient characteristics. By creating a more comprehensive dataset that reflects these individual characteristics, the InterVision model demonstrates the ability to produce more accurate contours compared to general models. Models are evaluated using the volumetric dice similarity coefficient (VDSC) and the Hausdorff distance 95% (HD95%) for 18 structures in 20 test patients. As a result, the Dice score was 0.81 ± 0.05 for the general model, 0.82 ± 0.04 for the general fine-tuning model, and 0.85 ± 0.03 for the InterVision model. The Hausdorff distance was 3.06 ± 1.13 for the general model, 2.81 ± 0.77 for the general fine-tuning model, and 2.52 ± 0.50 for the InterVision model. The InterVision model showed the best performance compared to the general model. The InterVision framework presents a versatile approach adaptable to various tasks where prior information is accessible, such as in ART settings. This capability is particularly valuable for accurately predicting complex organs and targets that pose challenges for traditional deep learning algorithms. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
7. A Multi-Organ Segmentation Network Based on Densely Connected RL-Unet.
- Author
- Zhang, Qirui, Xu, Bing, Liu, Hu, Zhang, Yu, and Yu, Zhiqiang
- Subjects
- CONVOLUTIONAL neural networks, FEATURE selection, TRANSFORMER models, IMAGE segmentation, CONTEXTUAL learning
- Abstract
The convolutional neural network (CNN) has been widely applied in medical image segmentation due to its outstanding nonlinear expression ability. However, applications of CNN are often limited by the receptive field, preventing it from modeling global dependencies. The recently proposed transformer architecture, which uses a self-attention mechanism to model global context relationships, has achieved promising results. Swin-Unet is a Unet-like simple transformer semantic segmentation network that combines the dominant feature of both the transformer and Unet. Even so, Swin-Unet has some limitations, such as only learning single-scale contextual features, and it lacks inductive bias and effective multi-scale feature selection for processing local information. To solve these problems, the Residual Local induction bias-Unet (RL-Unet) algorithm is proposed in this paper. First, the algorithm introduces a local induction bias module into the RLSwin-Transformer module and changes the multi-layer perceptron (MLP) into a residual multi-layer perceptron (Res-MLP) module to model local and remote dependencies more effectively and reduce feature loss. Second, a new densely connected double up-sampling module is designed, which can further integrate multi-scale features and improve the segmentation accuracy of the target region. Third, a novel loss function is proposed that can significantly enhance the performance of multiple scales segmentation and the segmentation results for small targets. Finally, experiments were conducted using four datasets: Synapse, BraTS2021, ACDC, and BUSI. The results show that the performance of RL-Unet is better than that of Unet, Swin-Unet, R2U-Net, Attention-Unet, and other algorithms. Compared with them, RL-Unet produces a significantly lower Hausdorff Distance at 95% threshold (HD95) and comparable Dice Similarity Coefficient (DSC) results. Additionally, it exhibits higher accuracy in segmenting small targets. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
8. Integrated pixel-level crack detection and quantification using an ensemble of advanced U-Net architectures
- Author
- Rakshitha R, Srinath S, N Vinay Kumar, Rashmi S, and Poornima B V
- Subjects
- Crack segmentation, Crack quantification, Deep learning, U-Net, TransUNet, Swin-UNet, Technology
- Abstract
Automated pavement crack detection faces significant challenges due to the complex shapes of crack patterns, their similarity to non-crack textures, and varying environmental conditions such as lighting and noise. Traditional methods often struggle to adapt, leading to inconsistent and less accurate results in real-world scenarios. This study introduces a hybrid framework that combines convolutional and transformer-based architectures, leveraging their strengths to achieve reliable crack segmentation and pixel-level quantification. The framework incorporates state-of-the-art deep learning models, including U-Net, Attention U-Net, Residual Attention U-Net (RAUNet), TransUNet, and Swin-Unet. U-Net variants, enhanced with attention mechanisms and residual connections, improve feature extraction and gradient flow, enabling precise delineation of crack boundaries. Transformer-based models like TransUNet and Swin-Unet use self-attention mechanisms to capture both local and global spatial relationships, enhancing robustness across diverse crack patterns. A key contribution of this study is the evaluation of loss functions, including Binary Cross-Entropy (BCE) Loss, Dice Loss, and Binary Focal Loss. Binary Focal Loss proved particularly effective in addressing class imbalance across four benchmark datasets. To further improve segmentation performance, two ensemble strategies were applied: stochastic reordering using logical operations (AND, OR, and averaging) and a weighted average ensemble optimized through grid search. The weighted average ensemble demonstrated superior performance, achieving mean Intersection over Union (mIoU) scores of 0.73, 0.70, 0.78, and 0.86 on the CFD, AgileRN, Crack500, and DeepCrack datasets, respectively. In addition to segmentation, this study developed a method for accurately quantifying crack length and width. By using Euclidean distance along skeletal paths, the algorithm minimized error rates in length and width estimation. This framework provides a scalable and efficient solution for automated pavement crack analysis. It addresses critical challenges in accuracy, adaptability, and reliability under diverse operational conditions, marking significant progress in crack detection technology.
- Published
- 2025
- Full Text
- View/download PDF
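Note on the ensemble in the record above: a weighted average ensemble tuned by grid search can be sketched roughly as follows. Everything here is an illustrative assumption rather than the authors' procedure: the function names, the flat-list probability maps, the 0.5 threshold, the grid resolution `steps`, and the use of foreground IoU instead of the mIoU the abstract reports.

```python
from itertools import product


def iou(pred, target):
    """Foreground IoU of two flat binary masks (1.0 if both are empty)."""
    inter = sum(1 for p, t in zip(pred, target) if p and t)
    union = sum(1 for p, t in zip(pred, target) if p or t)
    return inter / union if union else 1.0


def weighted_average(prob_maps, weights):
    """Pixel-wise weighted mean of several models' probability maps."""
    return [
        sum(w * m[i] for w, m in zip(weights, prob_maps))
        for i in range(len(prob_maps[0]))
    ]


def grid_search_weights(prob_maps, target, steps=10, threshold=0.5):
    """Try every weight vector on a 1/steps grid that sums to 1 and keep
    the one whose thresholded fusion scores the highest IoU on `target`."""
    best_weights, best_score = None, -1.0
    for combo in product(range(steps + 1), repeat=len(prob_maps)):
        if sum(combo) != steps:
            continue  # keep only weight vectors that sum to 1
        weights = [c / steps for c in combo]
        fused = weighted_average(prob_maps, weights)
        pred = [1 if v >= threshold else 0 for v in fused]
        score = iou(pred, target)
        if score > best_score:
            best_weights, best_score = weights, score
    return best_weights, best_score


# Two hypothetical models' outputs on a four-pixel validation mask.
maps = [[0.9, 0.2, 0.8, 0.1], [0.4, 0.7, 0.3, 0.6]]
weights, score = grid_search_weights(maps, [1, 0, 1, 0])
```

The exhaustive grid grows combinatorially with the number of models, so a coarse `steps` value is the practical choice when fusing five networks as in this study.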
9. Detection and 3D visualization of abnormal regions in brain MRI images with the Billow AISA service portal
- Author
- Lê Minh Lợi, Trần Nguyễn Minh Thư, Nguyễn Thiện Hùng, Hồ Quốc An, and Phạm Nguyên Khang
- Subjects
- 3D image visualization, Abnormal region detection, Swin-Unet, Science
- Abstract
Timely tumor detection, which helps physicians diagnose and treat patients effectively, is essential given that hospitals are constantly overloaded. The Slicer application can reconstruct 2D images of lesion regions into 3D volume data, giving physicians a more intuitive view for diagnosis and treatment. However, Slicer does not yet support automatic detection of abnormal regions, and it requires a sufficiently powerful computer to run these models. In this study, the Billow AISA extension for Slicer is proposed to build a service portal for analysis and prediction from user-supplied image data. The analysis and prediction function tested in this study is the detection of abnormal regions in brain MRI images with the Swin-Unet model. Experimental results on a dataset collected from Can Tho University of Medicine and Pharmacy Hospital show the feasibility and effectiveness of the Billow AISA model.
- Published
- 2024
- Full Text
- View/download PDF
10. A Siamese Swin-Unet for image change detection
- Author
- Yizhuo Tang, Zhengtao Cao, Ningbo Guo, and Mingyong Jiang
- Subjects
- Change detection, Remote sensing, Swin transformer, Swin-Unet, Siamesenet, Medicine, Science
- Abstract
Change detection in remote sensing image processing is both difficult and important. It is used extensively in a variety of sectors, including land resource planning, monitoring and forecasting of agricultural plant health, and monitoring and assessment of natural disasters. Remote sensing images provide a large amount of long-term, full-coverage data for Earth environmental monitoring. Much progress has been made thanks to the rapid development of deep learning, but the majority of deep learning-based change detection techniques currently in use rely on the well-known convolutional neural network (CNN). Because of the locality of the convolution operation, a CNN is unable to capture the interplay between global and distant semantic information. Some studies have employed the Vision Transformer as a backbone in the remote sensing field. Inspired by these studies, in this paper we propose a network named Siam-Swin-Unet, a Siamese pure Transformer with a U-shaped structure for remote sensing image change detection. Swin Transformer is a hierarchical vision transformer with shifted windows that can extract global features. To learn local and global semantic feature information, the dual-time images are fed into Siam-Swin-Unet, which is composed of Swin Transformer, Unet, Siamesenet, and two feature fusion modules. Because Unet and Siamesenet are effective for change detection, we applied them to the model. The feature fusion module is designed to fuse dual-time image features, and our experiments confirm that it is efficient and has a low computational cost. Our network achieved an F1 score of 94.67 on the CDD dataset (season varying).
- Published
- 2024
- Full Text
- View/download PDF
11. EDH-STNet: An Evaporation Duct Height Spatiotemporal Prediction Model Based on Swin-Unet Integrating Multiple Environmental Information Sources
- Author
- Hanjie Ji, Lixin Guo, Jinpeng Zhang, Yiwen Wei, Xiangming Guo, and Yusheng Zhang
- Subjects
- evaporation duct height, Swin-Unet, environmental information, hydrometeorological parameters (HMPs), spatiotemporal prediction, Science
- Abstract
Given the significant spatial non-uniformity of marine evaporation ducts, accurately predicting the regional distribution of evaporation duct height (EDH) is crucial for ensuring the stable operation of radio systems. While machine-learning-based EDH prediction models have been extensively developed, they fail to provide the EDH distribution over large-scale regions in practical applications. To address this limitation, we have developed a novel spatiotemporal prediction model for EDH that integrates multiple environmental information sources, termed the EDH Spatiotemporal Network (EDH-STNet). This model is based on the Swin-Unet architecture, employing an Encoder–Decoder framework that utilizes consecutive Swin-Transformers. This design effectively captures complex spatial correlations and temporal characteristics. The EDH-STNet model also incorporates nonlinear relationships between various hydrometeorological parameters (HMPs) and EDH. In contrast to existing models, it introduces multiple HMPs to enhance these relationships. By adopting a data-driven approach that integrates these HMPs as prior information, the accuracy and reliability of spatiotemporal predictions are significantly improved. Comprehensive testing and evaluation demonstrate that the EDH-STNet model, which merges an advanced deep learning algorithm with multiple HMPs, yields accurate predictions of EDH for both immediate and future timeframes. This development offers a novel solution to ensure the stable operation of radio systems.
- Published
- 2024
- Full Text
- View/download PDF
12. A Siamese Swin-Unet for image change detection
- Author
- Tang, Yizhuo, Cao, Zhengtao, Guo, Ningbo, and Jiang, Mingyong
- Published
- 2024
- Full Text
- View/download PDF
13. A Spinal MRI Image Segmentation Method Based on Improved Swin-UNet.
- Author
- Cao, Jie, Fan, Jiacheng, Chen, Chin-Ling, Wu, Zhenyu, Jiang, Qingxuan, and Li, Shikai
- Abstract
As the number of patients increases, physicians are dealing with more and more cases of degenerative spine pathologies on a daily basis. To reduce the workload of healthcare professionals, we propose a modified Swin-UNet network model. Firstly, the Swin Transformer Blocks are improved using a residual post-normalization and scaling cosine attention mechanism, which makes the training process of the model more stable and improves the accuracy. Secondly, we use the log-space continuous position biasing method instead of the bicubic interpolation position biasing method. This method solves the problem of performance loss caused by the large difference between the resolution of the pretraining image and the resolution of the spine image. Finally, we introduce a segmentation smooth module (SSM) at the decoder stage. The SSM effectively reduces redundancy, and enhances the segmentation edge processing to improve the model’s segmentation accuracy. To validate the proposed method, we conducted experiments on a real dataset provided by hospitals. The average segmentation accuracy is no less than 95%. The experimental results demonstrate the superiority of the proposed method over the original model and other models of the same type in segmenting the spinous processes of the vertebrae and the posterior arch of the spine. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
14. A Method for Image-Based Interpretation of the Pulverized Coal Cloud in the Blast Furnace Tuyeres.
- Author
- Zhou, Guanwei, Saxén, Henrik, Mattila, Olli, and Yu, Yaowei
- Subjects
- PULVERIZED coal, TRANSFORMER models, IMAGE analysis, BLAST furnaces, IMAGE processing, COAL, IMAGE segmentation
- Abstract
The conditions in the combustion zones, i.e., the raceways, are crucial for the operation of the blast furnace. In recent years, advancements in tuyere cameras and image processing and interpretation techniques have provided a better means by which to obtain information from this region of the furnace. In this study, a comprehensive approach is proposed to visually monitor the status of the pulverized coal cloud at the tuyeres based on a carefully designed processing strategy. Firstly, tuyere images are preprocessed to remove noise and enhance image quality, applying the adaptive Otsu algorithm to detect the edges of the coal cloud, enabling precise delineation of the pulverized coal region. Next, a Swin–Unet model, which combines the strengths of Swin Transformer and U-Net architecture, is employed for accurate segmentation of the coal cloud area. The extracted pulverized coal cloud features are analyzed using RGB super-pixel weighting, which takes into account the variations in color within the cloud region. It is demonstrated that the pulverized coal injection rate shows a correlation with the state of the cloud detected based on the images. The effectiveness of this visual monitoring method is validated using real-world data obtained from a blast furnace of SSAB Europe. The experimental results align with earlier research findings and practical operational experience. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
15. Mpox Virus Image Segmentation Based on Multiscale Expansion Convolution
- Author
- Jian-Fei Ma, Peng-Fei He, Cheng-Lin Li, and Rong Nie
- Subjects
- Mpox virus, Swin-Unet, expansion convolution, triplet attention, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
- Abstract
A novel strategy for segmenting mpox virus images is proposed to address the challenge of distinguishing lesion areas from muscle tissue and other regions of infection. This strategy leverages a multiscale inflated convolutional feature fusion and attentional Swin-Unet approach. In this method, a multi-scale extended convolution module is employed in the coding stage of the Swin-Unet network to enhance complementary features while preserving different features at different scales. Additionally, a triple attention module is integrated into the downsampling process to address the issue of inter-channel independence. Finally, the Swin Transformer Block is utilized to modulate the network segmentation performance by adjusting the iteration count in the encoding and decoding regions. Experimental results on a self-constructed mpox dataset demonstrate that the proposed network achieves a pixel segmentation accuracy of 90.4% and an average intersection-over-union ratio of 80.3%. These values represent improvements of 8.6% and 14.6%, respectively, compared to the original Swin-Unet. This enhancement provides valuable support for the ancillary diagnosis of mpox.
- Published
- 2024
- Full Text
- View/download PDF
16. A Multi-Organ Segmentation Network Based on Densely Connected RL-Unet
- Author
- Qirui Zhang, Bing Xu, Hu Liu, Yu Zhang, and Zhiqiang Yu
- Subjects
- Swin-Unet, local induction bias, Res-MLP, feature fusion, Technology, Engineering (General). Civil engineering (General), TA1-2040, Biology (General), QH301-705.5, Physics, QC1-999, Chemistry, QD1-999
- Abstract
The convolutional neural network (CNN) has been widely applied in medical image segmentation due to its outstanding nonlinear expression ability. However, applications of CNN are often limited by the receptive field, preventing it from modeling global dependencies. The recently proposed transformer architecture, which uses a self-attention mechanism to model global context relationships, has achieved promising results. Swin-Unet is a Unet-like simple transformer semantic segmentation network that combines the dominant feature of both the transformer and Unet. Even so, Swin-Unet has some limitations, such as only learning single-scale contextual features, and it lacks inductive bias and effective multi-scale feature selection for processing local information. To solve these problems, the Residual Local induction bias-Unet (RL-Unet) algorithm is proposed in this paper. First, the algorithm introduces a local induction bias module into the RLSwin-Transformer module and changes the multi-layer perceptron (MLP) into a residual multi-layer perceptron (Res-MLP) module to model local and remote dependencies more effectively and reduce feature loss. Second, a new densely connected double up-sampling module is designed, which can further integrate multi-scale features and improve the segmentation accuracy of the target region. Third, a novel loss function is proposed that can significantly enhance the performance of multiple scales segmentation and the segmentation results for small targets. Finally, experiments were conducted using four datasets: Synapse, BraTS2021, ACDC, and BUSI. The results show that the performance of RL-Unet is better than that of Unet, Swin-Unet, R2U-Net, Attention-Unet, and other algorithms. Compared with them, RL-Unet produces a significantly lower Hausdorff Distance at 95% threshold (HD95) and comparable Dice Similarity Coefficient (DSC) results. Additionally, it exhibits higher accuracy in segmenting small targets.
- Published
- 2024
- Full Text
- View/download PDF
17. SE-SWIN UNET FOR IMAGE SEGMENTATION OF MAJOR MAIZE FOLIAR DISEASES
- Author
- Yujie Yang, Congsheng Wang, Qing Zhao, Guoqiang Li, and Hecang Zang
- Subjects
- maize leaf diseases, image segmentation, Swin-Unet, Swin transformer, SENet, Agriculture (General), S1-972
- Abstract
Maize yields are important for human food security, and the issue of how to quickly and accurately segment areas of maize disease is an important one in the field of smart agriculture. To address the problem of irregular and multi-area clustering of regions of maize leaf lesions, which can lead to inaccurate segmentation, this paper proposes an improved Swin-Unet model called squeeze-and-excitation Swin-Unet (SE-Swin Unet). Our model applies Swin Transformer modules and skip connection structures for global and local learning. At each skip connection, a SENet module is incorporated to focus on global target features through channel-wise attention, with the aims of highlighting significant regions of disease on maize leaves and suppressing irrelevant background areas. The improved loss function in SE-Swin Unet is based on a combination of the binary cross entropy and Dice loss functions, which form the semantic segmentation model. Compared to other traditional convolutional neural networks on the same dataset, SE-Swin Unet achieves higher mean results for the intersection over union, accuracy, and F1-score, with values of 84.61%, 92.98%, and 89.91%, respectively. The SE-Swin Unet model proposed in this paper is therefore better able to extract information on maize leaf disease, and can provide a reference for the realisation of the complex task of corn leaf disease segmentation.
- Published
- 2024
- Full Text
- View/download PDF
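Note on the loss in the record above: the abstract describes a loss built from binary cross entropy and Dice loss but does not say how the two are combined. The sketch below assumes a simple weighted sum; the function name `bce_dice_loss`, the `bce_weight` parameter, and the smoothing constant `eps` are hypothetical choices for illustration, not the authors' formulation.

```python
import math


def bce_dice_loss(probs, targets, bce_weight=0.5, eps=1e-7):
    """Weighted sum of binary cross entropy and soft Dice loss.

    probs:   flat list of predicted foreground probabilities in [0, 1]
    targets: flat list of 0/1 ground-truth labels
    The equal weighting is an assumption, not taken from the paper.
    """
    # Clip probabilities so log() never receives exactly 0 or 1.
    clipped = [min(max(p, eps), 1.0 - eps) for p in probs]
    bce = -sum(
        t * math.log(p) + (1 - t) * math.log(1.0 - p)
        for p, t in zip(clipped, targets)
    ) / len(probs)

    # Soft Dice uses the raw probabilities rather than thresholded masks.
    intersection = sum(p * t for p, t in zip(probs, targets))
    dice = (2.0 * intersection + eps) / (sum(probs) + sum(targets) + eps)

    return bce_weight * bce + (1.0 - bce_weight) * (1.0 - dice)


good = bce_dice_loss([1.0, 0.0], [1, 0])   # near-perfect prediction
bad = bce_dice_loss([0.5, 0.5], [1, 0])    # uninformative prediction
```

Pairing the two terms is a common design: cross entropy gives a per-pixel gradient signal, while the Dice term scores overlap on the foreground region and so is less dominated by the background when lesions are small.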
18. Monitoring Mesoscale Convective System Using Swin-Unet Network Based on Daytime True Color Composite Images of Fengyun-4B.
- Author
- Xiang, Ruxuanyi, Xie, Tao, Bai, Shuying, Zhang, Xuehong, Li, Jian, Wang, Minghua, and Wang, Chao
- Subjects
- *MESOSCALE convective complexes, *CONVOLUTIONAL neural networks, *METEOROLOGICAL satellites, *GEOSTATIONARY satellites, *COLOR
- Abstract
The monitoring of mesoscale convective systems (MCS) is typically based on satellite infrared data. Currently, there is limited research on the identification of MCS using true color composite cloud imagery. In this study, an MCS dataset was created based on the true color composite cloud imagery from the Fengyun-4B geostationary meteorological satellite. An MCS true color composite cloud imagery identification model was developed based on the Swin-Unet network. The MCS dataset was categorized into continental MCS and oceanic MCS, and the model's performance in identifying these two different types of MCS was examined. Experimental results indicated that the model achieved a recall rate of 83.3% in identifying continental MCS and 86.1% in identifying oceanic MCS, with a better performance in monitoring oceanic MCS. These results suggest that using true color composite cloud imagery for MCS monitoring is feasible, and the Swin-Unet network outperforms traditional convolutional neural networks. Meanwhile, we find that the frequency and distribution range of oceanic MCS is larger than that of continental MCS, and the area is larger and some parts of it are stronger. This study provides a novel approach for satellite remote-sensing-based MCS monitoring. [ABSTRACT FROM AUTHOR]
- Published
- 2023
19. Enhancing Workplace Safety: PPE_Swin—A Robust Swin Transformer Approach for Automated Personal Protective Equipment Detection.
- Author
-
Riaz, Mudassar, He, Jianbiao, Xie, Kai, Alsagri, Hatoon S., Moqurrab, Syed Atif, Alhakbani, Haya Abdullah A., and Obidallah, Waeal J.
- Subjects
TRANSFORMER models ,CONVOLUTIONAL neural networks ,PERSONAL protective equipment ,CONSTRUCTION industry accidents ,INDUSTRIAL safety ,FEATURE extraction ,BUILDING sites - Abstract
Accidents occur in the construction industry as a result of non-compliance with personal protective equipment (PPE). As a result of diverse environments, it is difficult to detect PPE automatically. Traditional image detection models like convolutional neural network (CNN) and vision transformer (ViT) struggle to capture both local and global features in construction safety. This study introduces a new approach for automating the detection of personal protective equipment (PPE) in the construction industry, called PPE_Swin. By combining global and local feature extraction using the self-attention mechanism based on Swin-Unet, we address challenges related to accurate segmentation, robustness to image variations, and generalization across different environments. In order to train and evaluate our system, we have compiled a new dataset, which provides more reliable and accurate detection of personal protective equipment (PPE) in diverse construction scenarios. Our approach achieves a remarkable 97% accuracy in detecting workers with and without PPE, surpassing existing state-of-the-art methods. This research presents an effective solution for enhancing worker safety on construction sites by automating PPE compliance detection. [ABSTRACT FROM AUTHOR]
- Published
- 2023
20. Cloud and Cloud Shadow Detection of GF-1 Images Based on the Swin-UNet Method.
- Author
-
Tan, Yuhao, Zhang, Wenhao, Yang, Xiufeng, Liu, Qiyue, Mi, Xiaofei, Li, Juan, Yang, Jian, and Gu, Xingfa
- Subjects
- *
CONVOLUTIONAL neural networks , *TRANSFORMER models , *ARTIFICIAL neural networks , *REMOTE sensing , *FEATURE extraction , *OPTICAL remote sensing - Abstract
Cloud and cloud shadow detection in remote sensing images is an important preprocessing technique for quantitative analysis and large-scale mapping. To solve the problems of cloud and cloud shadow detection based on Convolutional Neural Network models, such as rough edges and insufficient overall accuracy, cloud and cloud shadow segmentation based on Swin-UNet was studied in the wide field of view (WFV) images of GaoFen-1 (GF-1). The Swin Transformer blocks help the model capture long-distance features and obtain deeper feature information in the network. This study selects a public GF1_WHU cloud and cloud shadow detection dataset for preprocessing and data optimization and conducts comparative experiments in different models. The results show that the algorithm performs well on vegetation, water, buildings, barren and other types. The average accuracy of cloud detection is 98.01%, the recall is 96.84% and the F1-score is 95.48%. The corresponding results of cloud shadow detection are 84.64%, 83.12% and 97.55%. In general, compared to U-Net, PSPNet and DeepLabV3+, this model performs better in cloud and cloud shadow detection, with clearer detection boundaries and a higher accuracy in complex surface conditions. This proves that Swin-UNet has great feature extraction capability in moderate and high-resolution remote sensing images. [ABSTRACT FROM AUTHOR]
- Published
- 2023
21. HCPSNet: heterogeneous cross-pseudo-supervision network with confidence evaluation for semi-supervised medical image segmentation.
- Author
-
Duan, Xianhua, Jin, Chaoqiang, and Shu, Xin
- Subjects
- *
DIAGNOSTIC imaging , *IMAGE segmentation , *CONFIDENCE - Abstract
Medical image segmentation technology can effectively help doctors to diagnose, but there is too little annotated data, which limits the development of fully supervised medical image segmentation methods. This dilemma leads to urgent research on semi-supervised medical image segmentation methods. To cope with this dilemma, we propose a semi-supervised dual flow network, which is called the Heterogeneous Cross-pseudo-supervision Network (HCPSNet). In the HCPSNet, Unet and Swin-Unet are combined for cross-learning, and a shifted patch tokenization (SPT) module is embedded into Swin-Unet to increase the spatial information contained in the feature maps. Besides, a confidence evaluation (CE) module is presented to improve the performance of the model. The experimental results on three medical clinical datasets, LA2018, BraTs2019, and ACDC, show that our method can achieve good segmentation results with limited labeled samples. The mean dice of our proposed network on ACDC with seven cases' samples is 86.17%, about 3% higher than other models. [ABSTRACT FROM AUTHOR]
- Published
- 2023
22. A Novel Lightweight Swin-Unet Network for Semantic Segmentation of COVID-19 Lesion in CT Images
- Author
-
Zhi-Jun Gao, Yi He, and Yi Li
- Subjects
COVID-19 ,CT image ,semantic segmentation ,Swin-Unet ,ResMLP ,Electrical engineering. Electronics. Nuclear engineering ,TK1-9971 - Abstract
The Corona Virus Disease 2019 (COVID-19) is highly infectious, has spread worldwide, caused a global pandemic, and seriously endangered human health and life. The most effective methods for halting and stopping the transmission of the Corona Virus include early detection, quarantine, and successful treatment. Because chest computed tomography (CT) exhibits significant imaging characteristics of COVID-19 lesions, it can be used to diagnose COVID-19. To address the inaccuracies caused by uneven gray distribution, irregular regions, and multi-scale, multi-region segmentation in COVID-19 CT images, this paper proposed a novel Swin-Unet network to improve the accuracy of multi-scale lesion segmentation in COVID-19 CT images. First, in the double-layer Swin Transformer blocks of the Swin-Unet, a residual multi-layer perceptron (ResMLP) module was introduced to replace the multi-layer perceptron (MLP) module and reduce the loss of features during the transmission process, thereby improving the segmentation precision of multi-scale lesion areas. Second, the uncertain region inpainting module (URIM) was added after Linear Projection, which can refine the uncertain regions in the segmentation feature map, thereby improving the segmentation accuracy of different lesion regions. Third, a new loss function DF was designed. It can effectively improve the small target segmentation effect and thus improve the multi-scale segmentation result. Finally, the proposed method was compared to other methods on the public dataset. The Dice, Precision, Recall, and IOU of the proposed method are 0.812, 0.780, 0.848, and 0.683, respectively, which are better than the other models. Moreover, our model has fewer parameters and faster inference speed. The proposed method achieves excellent segmentation results for multi-scale and multi-region lesions, and it will be more beneficial in aiding COVID-19 diagnosis and treatment.
- Published
- 2023
23. Subject-Sensitive Hash Algorithm for Integrity Authentication of High-Resolution Remote Sensing Images.
- Author
-
DING Kaimeng, XU Nan, LYU Dong, XU Qin, and MA Ji
- Subjects
IMAGE compression ,ARTIFICIAL neural networks ,ALGORITHMS ,DIGITAL watermarking - Abstract
As a new type of integrity authentication technology, subject-sensitive Hash overcomes the shortcomings of existing technologies, and can realize subject-sensitive authentication of HRRS images. However, existing subject-sensitive Hash algorithms still have deficiencies in terms of robustness. This paper proposes a new subject-sensitive Hash algorithm for integrity authentication of HRRS images. The proposed algorithm consists of two steps: the first step is to use the trained Swin-Unet to extract the subject-sensitive features of the HRRS images; the second step is to compress and encode the extracted subject-sensitive features to obtain subject-sensitive Hash sequences of HRRS images. The experimental results show that, compared with subject-sensitive Hash algorithms based on existing deep neural network models such as Attention U-net and AAU-Net, the robustness of the proposed algorithm to JPEG compression and watermark embedding is improved. The tampering sensitivity of this algorithm is basically consistent with that of existing algorithms, and it has a high tampering sensitivity to subject-related tampering. The security of the proposed algorithm is the same as that of the existing algorithms. [ABSTRACT FROM AUTHOR]
- Published
- 2023
24. Transformer-Based Deep Learning Network for Tooth Segmentation on Panoramic Radiographs.
- Author
-
Sheng, Chen, Wang, Lin, Huang, Zhenhuan, Wang, Tian, Guo, Yalin, Hou, Wenjie, Xu, Laiqing, Wang, Jiazhu, and Yan, Xue
- Abstract
Panoramic radiographs can assist dentists in quickly evaluating patients' overall oral health status. The accurate detection and localization of tooth tissue on panoramic radiographs is the first step to identify pathology, and also plays a key role in an automatic diagnosis system. However, the evaluation of panoramic radiographs depends on the clinical experience and knowledge of the dentist, and the interpretation of panoramic radiographs might lead to misdiagnosis. Therefore, it is of great significance to use artificial intelligence to segment teeth on panoramic radiographs. In this study, SWin-Unet, the transformer-based U-shaped encoder-decoder architecture with skip-connections, is introduced to perform panoramic radiograph segmentation. To evaluate the tooth segmentation performance of SWin-Unet, the PLAGH-BH dataset is introduced for the research purpose. The performance is evaluated by F1 score, mean intersection over union (IoU), and accuracy (Acc). Compared with U-Net, Link-Net and FPN baselines, SWin-Unet performs much better on the PLAGH-BH tooth segmentation dataset. These results indicate that SWin-Unet is more feasible for panoramic radiograph segmentation, and is valuable for potential clinical application. [ABSTRACT FROM AUTHOR]
- Published
- 2023
25. Monitoring Mesoscale Convective System Using Swin-Unet Network Based on Daytime True Color Composite Images of Fengyun-4B
- Author
-
Ruxuanyi Xiang, Tao Xie, Shuying Bai, Xuehong Zhang, Jian Li, Minghua Wang, and Chao Wang
- Subjects
satellite observation ,mesoscale convective system ,Swin-Unet ,transformer ,Science - Abstract
The monitoring of mesoscale convective systems (MCS) is typically based on satellite infrared data. Currently, there is limited research on the identification of MCS using true color composite cloud imagery. In this study, an MCS dataset was created based on the true color composite cloud imagery from the Fengyun-4B geostationary meteorological satellite. An MCS true color composite cloud imagery identification model was developed based on the Swin-Unet network. The MCS dataset was categorized into continental MCS and oceanic MCS, and the model’s performance in identifying these two different types of MCS was examined. Experimental results indicated that the model achieved a recall rate of 83.3% in identifying continental MCS and 86.1% in identifying oceanic MCS, with a better performance in monitoring oceanic MCS. These results suggest that using true color composite cloud imagery for MCS monitoring is feasible, and the Swin-Unet network outperforms traditional convolutional neural networks. We also find that oceanic MCS occur more frequently and over a wider range than continental MCS, cover a larger area, and are stronger in some parts. This study provides a novel approach for satellite remote-sensing-based MCS monitoring.
- Published
- 2023
26. Cloud and Cloud Shadow Detection of GF-1 Images Based on the Swin-UNet Method
- Author
-
Yuhao Tan, Wenhao Zhang, Xiufeng Yang, Qiyue Liu, Xiaofei Mi, Juan Li, Jian Yang, and Xingfa Gu
- Subjects
GF-1 ,cloud and cloud shadow detection ,Swin-UNet ,Swin Transformer ,Meteorology. Climatology ,QC851-999 - Abstract
Cloud and cloud shadow detection in remote sensing images is an important preprocessing technique for quantitative analysis and large-scale mapping. To solve the problems of cloud and cloud shadow detection based on Convolutional Neural Network models, such as rough edges and insufficient overall accuracy, cloud and cloud shadow segmentation based on Swin-UNet was studied in the wide field of view (WFV) images of GaoFen-1 (GF-1). The Swin Transformer blocks help the model capture long-distance features and obtain deeper feature information in the network. This study selects a public GF1_WHU cloud and cloud shadow detection dataset for preprocessing and data optimization and conducts comparative experiments in different models. The results show that the algorithm performs well on vegetation, water, buildings, barren and other types. The average accuracy of cloud detection is 98.01%, the recall is 96.84% and the F1-score is 95.48%. The corresponding results of cloud shadow detection are 84.64%, 83.12% and 97.55%. In general, compared to U-Net, PSPNet and DeepLabV3+, this model performs better in cloud and cloud shadow detection, with clearer detection boundaries and a higher accuracy in complex surface conditions. This proves that Swin-UNet has great feature extraction capability in moderate and high-resolution remote sensing images.
- Published
- 2023
27. Research on Semantic Segmentation Method of Macular Edema in Retinal OCT Images Based on Improved Swin-Unet.
- Author
-
Gao, Zhijun and Chen, Lun
- Subjects
MACULAR edema ,IMAGE segmentation ,OPTICAL coherence tomography ,RETINAL imaging ,DIABETIC retinopathy - Abstract
Optical coherence tomography (OCT), as a new type of tomography technology, is non-invasive, offers real-time imaging and high sensitivity, and is currently an important medical imaging tool to assist ophthalmologists in the screening, diagnosis, and follow-up treatment of patients with macular disease. Diabetic macular edema (DME) occurs in irregular areas and forms multi-scale, multi-region clusters, which leads to inaccurate segmentation of the edema area. To solve this problem, an improved Swin-Unet network model was proposed for automatic semantic segmentation of macular edema lesion areas in OCT images. Firstly, in the deep bottleneck of the Swin-Unet network, the Resnet network layer was used to increase the extraction of pairs of sub-feature images. Secondly, the Swin Transformer block and skip connection structure were used for global and local learning, and the regions after semantic segmentation were morphologically smoothed and post-processed. Finally, the proposed method was evaluated on the macular edema patient dataset made publicly available by Duke University, and was compared with previous segmentation methods. The experimental results show that the proposed method can not only improve the overall semantic segmentation accuracy of retinal macular edema, but also further improve the semantic segmentation of multi-scale and multi-region edema regions. [ABSTRACT FROM AUTHOR]
- Published
- 2022