6,232 results for "u-net"
Search Results
2. Sleep arousal detection for monitoring of sleep disorders using one-dimensional convolutional neural network-based U-Net and bio-signals
- Author
- Mishra, Priya and Swetapadma, Aleena
- Published
- 2024
- Full Text
- View/download PDF
3. Liver tumor segmentation using G-Unet and the impact of preprocessing and postprocessing methods.
- Author
- D J, Deepak and B S, Sunil Kumar
- Subjects
- CONVOLUTIONAL neural networks, LIVER tumors, COMPUTED tomography, THERAPEUTICS, LIVER
- Abstract
Accurate liver and lesion segmentation plays a crucial role in the clinical assessment and therapeutic planning of hepatic diseases. Automated segmentation of the liver and its lesions is a crucial undertaking with the potential to facilitate the early detection of malignancies and help medical professionals manage patients' treatment requirements effectively. This research presents the Generalized U-Net (G-Unet), a hybrid model designed for segmentation tasks. G-Unet can incorporate other models, such as convolutional neural networks (CNN), residual networks (ResNet), and densely connected convolutional networks (DenseNet), into the general U-Net framework. The model, in three distinct configurations, was assessed on the LiTS dataset. The results indicate that G-Unet segments with a high level of accuracy: configured with the DenseNet architecture, it achieved a global Dice score of 72.9% for liver tumor segmentation, comparable to existing state-of-the-art methodologies. The study also showcases the influence of different preprocessing and postprocessing techniques on segmentation accuracy. Hounsfield Unit (HU) windowing and histogram equalization as preprocessing, together with conditional random fields as postprocessing, yielded a notable improvement of 3.35% in tumor segmentation accuracy. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
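The HU windowing and histogram equalization preprocessing mentioned in the G-Unet abstract can be sketched in a few lines of numpy. This is a minimal sketch: the window bounds [-100, 400] HU are a common liver preset assumed here, not values taken from the paper.

```python
import numpy as np

def hu_window(ct, hu_min=-100.0, hu_max=400.0):
    """Clip a CT volume to a Hounsfield Unit window and rescale to [0, 1].

    The default bounds are a common liver preset (an assumption, not taken
    from the abstract above).
    """
    ct = np.clip(ct.astype(np.float32), hu_min, hu_max)
    return (ct - hu_min) / (hu_max - hu_min)

def histogram_equalize(img, bins=256):
    """Global histogram equalization of an image already scaled to [0, 1]."""
    hist, edges = np.histogram(img.ravel(), bins=bins, range=(0.0, 1.0))
    cdf = np.cumsum(hist).astype(np.float32)
    cdf /= cdf[-1]                      # normalize CDF to [0, 1]
    return np.interp(img.ravel(), edges[:-1], cdf).reshape(img.shape)
```

A slice would then be preprocessed as `histogram_equalize(hu_window(raw_slice))` before being fed to the network.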
4. Multi-scale input layers and dense decoder aggregation network for COVID-19 lesion segmentation from CT scans.
- Author
- Lan, Xiaoke and Jin, Wenbing
- Subjects
- COMPUTED tomography, DEEP learning, COVID-19, DIAGNOSTIC imaging, STATISTICAL correlation
- Abstract
Accurate segmentation of COVID-19 lesions from medical images is essential for achieving precise diagnosis and developing effective treatment strategies. Unfortunately, this task presents significant challenges, owing to the complex and diverse characteristics of opaque areas, subtle differences between infected and healthy tissue, and the presence of noise in CT images. To address these difficulties, this paper designs a new deep-learning architecture (named MD-Net) based on multi-scale input layers and a dense decoder aggregation network for COVID-19 lesion segmentation. In our framework, the U-shaped structure serves as the cornerstone to facilitate complex hierarchical representations essential for accurate segmentation. Then, by introducing the multi-scale input layers (MIL), the network can effectively analyze both fine-grained details and contextual information in the original image. Furthermore, we introduce an SE-Conv module in the encoder network, which can enhance the ability to identify relevant information while simultaneously suppressing the transmission of extraneous or non-lesion information. Additionally, we design a dense decoder aggregation (DDA) module to integrate feature distributions and important COVID-19 lesion information from adjacent encoder layers. Finally, we conducted a comprehensive quantitative analysis and comparison on two publicly available datasets, namely Vid-QU-EX and QaTa-COV19-v2, to assess the robustness and versatility of MD-Net in segmenting COVID-19 lesions. The experimental results show that the proposed MD-Net has superior performance compared to its competitors, exhibiting higher scores on the Dice score, Matthews correlation coefficient (Mcc), and Jaccard index. In addition, we also conducted ablation studies on the Vid-QU-EX dataset to evaluate the contribution of each key component within the proposed architecture. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
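The SE-Conv module above builds on squeeze-and-excitation channel recalibration. A minimal numpy sketch of the SE step follows; the random weights are stand-ins, and MD-Net's actual module also involves convolutions not shown here.

```python
import numpy as np

def squeeze_excite(feat, w1, w2):
    """Squeeze-and-excitation channel reweighting on a (C, H, W) feature map.

    feat : (C, H, W) activations
    w1   : (C//r, C) reduction weights, w2 : (C, C//r) expansion weights
    (Illustrative weights; the abstract does not publish MD-Net's.)
    """
    z = feat.mean(axis=(1, 2))                 # squeeze: global average pool -> (C,)
    s = np.maximum(w1 @ z, 0.0)                # excitation: FC + ReLU -> (C//r,)
    s = 1.0 / (1.0 + np.exp(-(w2 @ s)))        # FC + sigmoid -> per-channel gates (C,)
    return feat * s[:, None, None]             # recalibrate channels

rng = np.random.default_rng(0)
C, r = 8, 2
feat = rng.standard_normal((C, 16, 16))
out = squeeze_excite(feat,
                     rng.standard_normal((C // r, C)) * 0.1,
                     rng.standard_normal((C, C // r)) * 0.1)
```

Because every gate lies in (0, 1), the block can only attenuate channels, which is how irrelevant (non-lesion) information gets suppressed.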
5. Retinal blood vessel segmentation using a deep learning method based on modified U-NET model.
- Author
- Sanjeewani, Yadav, Arun Kumar, Akbar, Mohd, Kumar, Mohit, and Yadav, Divakar
- Abstract
Retinal blood vessel segmentation is important for the detection of several highly prevalent, vision-threatening diseases such as diabetic retinopathy. Automatic retinal blood vessel segmentation is crucial to overcome the limitations of diagnosis by doctors. In recent times, deep learning-based methods have achieved great success in automatically segmenting retinal blood vessels from images. In this paper, a U-Net-based architecture is proposed to segment the retinal blood vessels from fundus images of the eye. Three pre-processing algorithms are proposed to further enhance the performance of the method. In experimental evaluation on the publicly available DRIVE dataset, the proposed method achieves an average accuracy (Acc) of 0.9577, sensitivity (Se) of 0.7436, specificity (Sp) of 0.9838, and F1-score of 0.7931, outperforming recent state-of-the-art approaches in the literature. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
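The Acc/Se/Sp/F1 figures quoted above all derive from pixel-wise confusion counts between the predicted and ground-truth vessel masks; a minimal sketch of how such metrics are computed:

```python
import numpy as np

def vessel_metrics(pred, gt):
    """Pixel-wise Acc, Se, Sp, and F1 for binary vessel masks (0/1 arrays)."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    tp = np.sum(pred & gt)       # vessel pixels correctly found
    tn = np.sum(~pred & ~gt)     # background pixels correctly kept
    fp = np.sum(pred & ~gt)      # background flagged as vessel
    fn = np.sum(~pred & gt)      # vessel pixels missed
    acc = (tp + tn) / (tp + tn + fp + fn)
    se = tp / (tp + fn)          # sensitivity (recall on vessels)
    sp = tn / (tn + fp)          # specificity (recall on background)
    f1 = 2 * tp / (2 * tp + fp + fn)
    return acc, se, sp, f1
```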
6. Ground subsidence prediction with high precision: a novel spatiotemporal prediction model with Interferometric Synthetic Aperture Radar technology.
- Author
- Tao, Qiuxiang, Xiao, Yixin, Hu, Leyin, Liu, Ruixiang, and Li, Xuepeng
- Subjects
- SYNTHETIC aperture radar, MINE subsidences, STANDARD deviations, RECURRENT neural networks, MINES & mineral resources
- Abstract
As the extraction of mineral resources intensifies, ground subsidence in mining areas has escalated, posing substantial challenges to sustainable development and operational safety. This subsidence, resulting directly from mining activities, significantly compromises the safety of nearby residents by damaging residential structures and infrastructure. Thus, developing precise and dependable methods for predicting ground subsidence is crucial. This study introduces an innovative Cabs-Unet model, which enhances the U-Net architecture by integrating a Convolutional Block Attention Module (CBAM) and Depthwise Separable Convolutions (DSC). This model aims to predict the spatiotemporal dynamics of the Interferometric Synthetic Aperture Radar (InSAR) time series. Employing Small Baseline Subset Interferometric Synthetic Aperture Radar (SBAS InSAR) technology, we gathered and validated data on ground subsidence at the Pengzhuang coal mine from May 2017 to November 2021, covering 130 scenes, with its accuracy corroborated by levelling survey results. An empirical evaluation of the Cabs-Unet model in two distinct subsidence zones demonstrated superior performance over conventional methods like Convolutional Long Short-Term Memory (ConvLSTM) and Predictive Recurrent Neural Network (PredRNN), with Root Mean Square Error (RMSE) values of 1.44 and 1.70, respectively. These findings highlight the model’s efficacy in accurately predicting spatiotemporal InSAR ground subsidence. Further predictive analysis using InSAR data indicated an expected increase in subsidence, projecting cumulative declines of −457 mm in Area A and −1278 mm in Area B by 17 July 2022. Our model proves effective in assessing subsidence, promptly detecting potential risks and facilitating the rapid implementation of risk mitigation strategies. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
7. Lightweight decoder U-net crack segmentation network based on depthwise separable convolution.
- Author
- Yu, Yongbo, Zhang, Yage, Yu, Junyang, and Yue, Jianwei
- Abstract
Cracks are a common type of damage found on the surfaces of concrete buildings and roads. Accurately identifying the width and direction of these cracks is critical for maintaining and evaluating such structures. However, challenges such as irregular crack shapes and complex background interference persist in the crack identification task. To address these challenges, we propose a semantic segmentation network for cracks (DSU-Net) based on U-Net. A lightweight decoder is built through depthwise separable convolution to reduce model complexity and better retain the high-level features extracted by the encoder. Three modules are designed to improve the performance of the model. First, a feature enhancement module (DCM) that combines CBAM and squeeze excitation (cSE) is constructed to further enhance and optimize the intermediate features extracted by the encoder. Second, a neighboring layer information fusion module (NIF) is constructed to enrich the semantic information of extracted features. Finally, a feature refinement module (FRM) is constructed using multi-layer convolutional skip connections to make the final refinement of the features extracted by the model. Experiments were conducted using three datasets (DeepCrack, Crack500, and CCSS), and nine models were used for comparative experiments. Across the three datasets, DSU-Net improved on the second-best models by an average of 1.29% in MIoU and 1.89% in F1. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
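The parameter savings that motivate a depthwise separable decoder can be verified by counting weights: a standard k x k convolution couples every input channel to every output channel, while the separable version filters each channel independently and then mixes with a 1 x 1 convolution. The 256-channel, 3 x 3 layer below is an illustrative size, not one taken from DSU-Net.

```python
def conv_params(c_in, c_out, k):
    """Weight count of a standard k x k convolution (biases ignored)."""
    return k * k * c_in * c_out

def ds_conv_params(c_in, c_out, k):
    """Depthwise separable: one k x k filter per input channel, then a 1 x 1
    pointwise convolution to mix channels."""
    return k * k * c_in + c_in * c_out

std = conv_params(256, 256, 3)      # 589,824 weights
ds = ds_conv_params(256, 256, 3)    # 2,304 + 65,536 = 67,840 weights
```

For this layer the separable form uses roughly 8.7x fewer weights, which is where the "lightweight" decoder's complexity reduction comes from.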
8. Reasoning cartographic knowledge in deep learning-based map generalization with explainable AI.
- Author
- Fu, Cheng, Zhou, Zhiyong, Xin, Yanan, and Weibel, Robert
- Subjects
- ARTIFICIAL neural networks, VISUAL analytics, ARTIFICIAL intelligence, GENERALIZATION, VISUALIZATION
- Abstract
Cartographic map generalization involves complex rules, and full automation has still not been achieved despite many efforts over the past few decades. Pioneering studies show that some map generalization tasks can be partially automated by deep neural networks (DNNs). However, previous studies still use DNNs as black-box models. We argue that integrating explainable AI (XAI) into a DL-based map generalization process can give more insight for developing and refining DNNs by understanding exactly what cartographic knowledge is learned. Following an XAI framework for an empirical case study, visual analytics and quantitative experiments were applied to explain the importance of input features for the predictions of a pre-trained ResU-Net model. This case study finds that the XAI-based visualization results can easily be interpreted by human experts. With the proposed XAI workflow, we further find that the DNN pays more attention to building boundaries than to the interior parts of buildings. We thus suggest that boundary intersection over union is a better metric than the commonly used intersection over union for evaluating raster-based map generalization results. Overall, this study shows the necessity and feasibility of integrating XAI into future DL-based map generalization development frameworks. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
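Boundary intersection over union, suggested above as an evaluation metric, compares the one-pixel boundaries of the two masks rather than their full areas, so interior pixels stop dominating the score. A minimal numpy sketch using 4-neighbourhood erosion (real implementations often dilate the boundary by a pixel tolerance first):

```python
import numpy as np

def boundary(mask):
    """Pixels of a binary mask whose 4-neighbourhood leaves the mask
    (the 1-pixel inner boundary)."""
    m = mask.astype(bool)
    eroded = m.copy()
    for shift, axis in [(1, 0), (-1, 0), (1, 1), (-1, 1)]:
        rolled = np.roll(m, shift, axis=axis)
        # treat pixels beyond the image border as background
        if axis == 0:
            rolled[0 if shift == 1 else -1, :] = False
        else:
            rolled[:, 0 if shift == 1 else -1] = False
        eroded &= rolled
    return m & ~eroded

def boundary_iou(pred, gt):
    """IoU restricted to the boundary pixels of each mask."""
    bp, bg = boundary(pred), boundary(gt)
    union = np.sum(bp | bg)
    return np.sum(bp & bg) / union if union else 1.0
```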
9. Spreading anomaly semantic segmentation and 3D reconstruction of binder jet additive manufacturing powder bed images.
- Author
- Gourley, Alexander, Kaufman, Jonathan, Aman, Bashu, Schwalbach, Edwin, Beuth, Jack, Rueschhoff, Lisa, and Reeja-Jayan, B.
- Abstract
Variability in the inherently dynamic nature of additive manufacturing introduces imperfections that hinder the commercialization of new materials. Binder jetting produces ceramic and metallic parts, but low green densities and spreading anomalies reduce the predictability and processability of resulting geometries. In situ feedback presents a method for robust evaluation of spreading anomalies, reducing the number of builds required to refine processing parameters in a multivariate space. In this study, we report layer-wise powder bed semantic segmentation for the first time with a visually light ceramic powder, alumina (Al2O3), leveraging image analysis software to rapidly segment optical images acquired during the additive manufacturing process. Using preexisting image analysis tools allowed rapid analysis of 316 stainless steel and alumina powders with small data sets by providing an accessible framework for implementing neural networks. Models were trained on five build layers for each material to classify base powder, parts, streaking, short spreading, and bumps from recoater friction, with testing categorical accuracies greater than 90%. Lower model performance accompanied the more subtle spreading features of the white alumina compared to the darker steel. Applying the models to new builds demonstrated repeatability, and trends in classified pixels reflected corrections made to processing parameters. Through the development of robust analysis techniques and feedback for new materials, parameters can be corrected as builds progress. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
10. In-Vehicle Environment Noise Speech Enhancement Using Lightweight Wave-U-Net.
- Author
- Kang, Byung Ha, Park, Hyun Jun, Lee, Sung Hee, Choi, Yeon Kyu, Lee, Myoung Ok, and Han, Sung Won
- Subjects
- CONVOLUTIONAL neural networks, SPEECH enhancement, SPEECH perception, DEEP learning, NETWORK performance
- Abstract
With the rapid advancement of AI technology, speech recognition has also advanced quickly. In recent years, speech-related technologies have been widely implemented in the automotive industry. However, in-vehicle environment noise inhibits the recognition rate, resulting in poor speech recognition performance. Numerous speech enhancement methods have been proposed to mitigate this performance degradation. Filter-based methodologies have been used to remove vehicle environment noise; however, they remove only limited noise. In addition, there are limits to the size of models that can be mounted inside a vehicle. Therefore, making the model lighter while increasing speech quality in a vehicle environment is essential. This study proposes a Wave-U-Net with depthwise-separable convolution to overcome these limitations. We built various convolutional blocks using the Wave-U-Net model as a baseline to analyze the results, and we designed the network by adding a squeeze-and-excitation network to improve performance without significantly increasing the parameter count. The experimental results show, via spectrogram visualization, how much noise is removed, and that the proposed model outperforms conventional methods in eliminating noise. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
11. Prediction of carcass rib eye area by ultrasound images in sheep using computer vision.
- Author
- Júnior, Francisco Albir Lima, Filho, Luiz Antônio Silva Figueiredo, de Sousa Júnior, Antônio, Silva, Romuere Rodrigues Veloso e., Barbosa, Bruna Lima, de Brito Vieira, Rafaela, Rocha, Artur Oliveira, de Moura Oliveira, Tiago, and Sarmento, José Lindenberg Rocha
- Subjects
- ULTRASONIC imaging, COMPUTER vision, RANDOM forest algorithms, SHEEP, AREA measurement
- Abstract
The present research created a tool to measure ultrasound images of the rib eye area in sheep. One hundred twenty-one ultrasound images of sheep were captured, and the regions of interest were segmented using the U-Net algorithm. The metrics adopted to evaluate the automatic segmentations were the Dice score and intersection over union. Finally, a regression analysis was performed using the AdaBoost Regressor and Random Forest Regressor algorithms, and the fit of the models was evaluated using the mean squared residuals, mean absolute error, and coefficient of determination. The Dice metric was 0.94 and the intersection over union 0.89, demonstrating a high similarity between the actual and predicted values (both metrics range from 0 to 1). The mean squared residuals, mean absolute error, and coefficient of determination of the regressor models indicated the best fit for the Random Forest Regressor. The U-Net algorithm efficiently segmented ultrasound images of the Longissimus dorsi muscle, with greater precision than the measurements performed by the specialist. This efficient segmentation allowed the standardization of rib eye area measurements and, consequently, the phenotyping of beef sheep on a large scale. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
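The Dice score and intersection over union reported above are both overlap ratios between the predicted and reference masks, and each determines the other (IoU = Dice / (2 - Dice)):

```python
import numpy as np

def dice_and_iou(pred, gt):
    """Overlap metrics for binary masks; both lie in [0, 1]."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    inter = np.sum(pred & gt)
    dice = 2 * inter / (pred.sum() + gt.sum())   # 2|A∩B| / (|A|+|B|)
    iou = inter / np.sum(pred | gt)              # |A∩B| / |A∪B|
    return dice, iou
```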
12. DABT-U-Net: Dual Attentive BConvLSTM U-Net with Transformers and Collaborative Patch-based Approach for Accurate Retinal Vessel Segmentation.
- Author
- Jalali, Y., Fateh, M., and Rezvani, M.
- Subjects
- RETINAL blood vessels, EYE diseases, IMAGE segmentation, ACCURACY, EARLY diagnosis
- Abstract
The segmentation of retinal vessels is vital for the timely diagnosis and treatment of various eye diseases. However, due to inherent characteristics of retinal vessels in fundus images, such as changes in thickness, direction, and complexity of vessels, as well as imbalanced contrast between background and vessels, segmenting retinal vessels continues to pose significant challenges. Moreover, despite advancements in CNN-based methods, challenges such as insufficient extraction of structural information, complexity, overfitting, preference for local information, and poor performance in noisy conditions persist. To address these drawbacks, we propose a novel modified U-Net named DABT-U-Net. Our method enhances discriminative capability by introducing Hierarchical Dilated Convolution (HDC), Dual Attentive BConvLSTM, and Multi-Head Self-Attention (MHSA) blocks. Additionally, we adopt a collaborative patch-based training approach to mitigate data scarcity and overfitting. Evaluation on the DRIVE and STARE datasets shows that DABT-U-Net achieves superior accuracy, sensitivity, and F1 score compared to existing methods, demonstrating its effectiveness in retinal vessel segmentation. Specifically, it improves accuracy, sensitivity, and F1 score by 0.32%, 0.61%, and 0.14%, respectively, on the DRIVE dataset, and by 0.07%, 0.83%, and 0.14% on the STARE dataset, compared to the next-best approach. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
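Patch-based training of the kind mentioned above starts by splitting each fundus image into overlapping patches, multiplying the effective training set. A minimal sliding-window sketch; the 48-pixel patch and 50% overlap are illustrative choices, not sizes stated for DABT-U-Net.

```python
import numpy as np

def extract_patches(img, patch=48, stride=24):
    """Slide a window over an image, yielding overlapping training patches.

    With stride < patch, neighbouring patches overlap, so a single image
    produces many (partially correlated) training samples.
    """
    h, w = img.shape[:2]
    patches = [img[y:y + patch, x:x + patch]
               for y in range(0, h - patch + 1, stride)
               for x in range(0, w - patch + 1, stride)]
    return np.stack(patches)

img = np.zeros((96, 96))
p = extract_patches(img)   # 3 x 3 grid of 48 x 48 patches
```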
13. Wound Tissue Segmentation and Classification Using U-Net and Random Forest.
- Author
- Arjun, V. S., Chandrasekhar, Leena, and Jaseena, K. U.
- Subjects
- RANDOM forest algorithms, CHRONIC wounds & injuries, TISSUE wounds, DIGITAL image processing, NURSE practitioners, WOUND healing
- Abstract
Analysing wound tissue is a crucial research field for assessing the progression of wound healing. Wounds exhibit certain attributes concerning colour and texture, although these features can vary among different wound images. Research in this field serves multiple purposes, including confirming the presence of chronic wounds, identifying infected wounds, determining the origin of the wound and addressing other factors that classify and characterise various types of wounds. Wounds pose a substantial health concern. Currently, clinicians and nurses mainly evaluate the healing status of wounds by visual examination. This paper presents an outline of digital image processing and traditional machine learning methods for the tissue analysis of chronic wound images. Here, we propose a novel wound tissue analysis system that consists of wound image pre-processing, wound area segmentation and wound analysis by tissue segmentation. The wound area is extracted using a simple U-Net segmentation model. Granulation, slough and necrotic tissues are the three primary forms of wound tissue. The k-means clustering technique is employed to assign labels to tissues. Within the wound boundary, tissue classification is performed using the Random Forest classification algorithm. Both the segmentation (U-Net) and classification (Random Forest) models are trained; segmentation achieves 99% accuracy and classification 99.21% accuracy. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
14. Detection of pulmonary nodules in chest radiographs: novel cost function for effective network training with purely synthesized datasets.
- Author
- Hanaoka, Shouhei, Nomura, Yukihiro, Yoshikawa, Takeharu, Nakao, Takahiro, Takenaga, Tomomi, Matsuzaki, Hirotaka, Yamamichi, Nobutake, and Abe, Osamu
- Abstract
Purpose: Many large radiographic datasets of lung nodules are available, but the small and hard-to-detect nodules are rarely validated by computed tomography. Such difficult nodules are crucial for training nodule detection methods. This lack of difficult nodules for training can be addressed by artificial nodule synthesis algorithms, which can create artificially embedded nodules. This study aimed to develop and evaluate a novel cost function for training networks to detect such lesions. Embedding artificial lesions in healthy medical images is effective when positive cases are insufficient for network training. Although this approach provides both positive (lesion-embedded) images and the corresponding negative (lesion-free) images, no known methods effectively use these pairs for training. This paper presents a novel cost function for segmentation-based detection networks when positive–negative pairs are available. Methods: Based on the classic U-Net, new terms were added to the original Dice loss for reducing false positives and the contrastive learning of diseased regions in the image pairs. The experimental network was trained and evaluated, respectively, on 131,072 fully synthesized pairs of images simulating lung cancer and real chest X-ray images from the Japanese Society of Radiological Technology dataset. Results: The proposed method outperformed RetinaNet and a single-shot multibox detector. The sensitivities were 0.688 and 0.507 when the number of false positives per image was 0.2, respectively, with and without fine-tuning under the leave-one-case-out setting. Conclusion: To our knowledge, this is the first study in which a method for detecting pulmonary nodules in chest X-ray images was evaluated on a real clinical dataset after being trained on fully synthesized images. The synthesized dataset is available at https://zenodo.org/records/10648433. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
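The cost function described above augments Dice loss with terms that penalize false positives and contrast lesion-embedded against lesion-free pairs. A sketch of a soft Dice loss with an added false-positive penalty follows; the weighting and the exact form of the extra term are assumptions for illustration, and the contrastive pair term is omitted entirely.

```python
import numpy as np

def dice_fp_loss(prob, target, fp_weight=0.5, eps=1e-7):
    """Soft Dice loss plus an explicit false-positive penalty.

    prob   : sigmoid outputs in [0, 1]
    target : binary lesion mask
    The fp term and its weight are illustrative, not the paper's published
    formulation; the contrastive term over positive-negative image pairs
    is not reproduced here.
    """
    inter = np.sum(prob * target)
    dice = (2 * inter + eps) / (np.sum(prob) + np.sum(target) + eps)
    # mean predicted probability on background pixels = soft FP rate
    fp = np.sum(prob * (1 - target)) / (np.sum(1 - target) + eps)
    return (1 - dice) + fp_weight * fp
```

The extra term matters when negatives are plentiful: an all-lesion prediction keeps a decent Dice term but is punished for flooding the background.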
15. A framework for 3D radiotherapy dose prediction using the deep learning approach.
- Author
- Lam Thanh Hien, Ha Manh Toan, and Do Nang Toan
- Subjects
- CONVOLUTIONAL neural networks, CANCER radiotherapy, COMPUTED tomography, RADIATION doses, DEATH rate
- Abstract
Cancer is known as a dangerous disease with a very high death rate. Many cancer treatment methods have been studied and applied around the world; one of the main methods uses radiation beams to kill cancer cells. This method, known as radiotherapy, requires experts with a high level of skill and experience. Our work focuses on the 3D dose prediction problem in radiotherapy by proposing a framework aimed at creating a medical intelligent system for this problem. To do that, we created a convolutional neural network based on ResNet and U-Net to generate the predicted radiation dose. To improve the quality of the training phase, we also applied data processing techniques based on the characteristics of 3D computed tomography (CT) data. The experiment used the dataset of patients treated with radiotherapy from the OpenKBP competition. The results achieved good evaluation metrics, first on the Dose score and second on the dose-volume histogram (DVH) score. From the training result, we built a medical system supporting 3D dose prediction and visualizing the result as slices in heatmap form. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
16. A time-frequency fusion model for multi-channel speech enhancement.
- Author
- Zeng, Xiao, Xu, Shiyun, and Wang, Mingjiang
- Subjects
- ARTIFICIAL neural networks, SPEECH enhancement, FEATURE extraction, TIMEKEEPING
- Abstract
Multi-channel speech enhancement plays a critical role in numerous speech-related applications. Several previous works explicitly utilize deep neural networks (DNNs) to exploit tempo-spectral signal characteristics, which often leads to excellent performance. In this work, we present a time-frequency fusion model, namely TFFM, for multi-channel speech enhancement. We utilize three cascaded U-Nets to capture three types of high-resolution features, aiming to investigate their individual contributions. To be specific, the first U-Net keeps the time dimension and performs feature extraction along the frequency dimension for the high-resolution spectral features with global temporal information, the second U-Net keeps the frequency dimension and extracts features along the time dimension for the high-resolution temporal features with global spectral information, and the third U-Net downsamples and upsamples along both the frequency and time dimensions for the high-resolution tempo-spectral features. These three cascaded U-Nets are designed to aggregate local and global features, thereby effectively handling the tempo-spectral information of speech signals. The proposed TFFM in this work outperforms state-of-the-art baselines. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
17. Retinex decomposition based low‐light image enhancement by integrating Swin transformer and U‐Net‐like architecture.
- Author
- Wang, Zexin, Qingge, Letu, Pan, Qingyi, and Yang, Pei
- Subjects
- TRANSFORMER models, IMAGE intensifiers, VISUAL perception, REFLECTANCE, TEST methods
- Abstract
Low‐light images are captured in environments with minimal lighting, such as nighttime or underwater conditions. These images often suffer from issues like low brightness, poor contrast, lack of detail, and overall darkness, significantly impairing human visual perception and subsequent high‐level visual tasks. Enhancing low‐light images holds great practical significance. Among the various existing methods for Low‐Light Image Enhancement (LLIE), those based on the Retinex theory have gained significant attention. However, despite considerable efforts in prior research, the challenge of Retinex decomposition remains unresolved. In this study, an LLIE network based on the Retinex theory is proposed, which addresses these challenges by integrating attention mechanisms and a U‐Net‐like architecture. The proposed model comprises three modules: the Decomposition module (DECM), the Reflectance Recovery module (REFM), and the Illumination Enhancement module (ILEM). Its objective is to decompose low‐light images based on the Retinex theory and enhance the decomposed reflectance and illumination maps using attention mechanisms and a U‐Net‐like architecture. We conducted extensive experiments on several widely used public datasets. The qualitative results demonstrate that the approach produces enhanced images with superior visual quality compared to the existing methods on all test datasets, especially for some extremely dark images. Furthermore, the quantitative evaluation results based on metrics PSNR, SSIM, LPIPS, BRISQUE, and MUSIQ show the proposed model achieves superior performance, with PSNR and BRISQUE significantly outperforming the baseline approaches, where (PSNR, mean BRISQUE) values of the proposed method and the second best results are (17.14, 17.72) and (16.44, 19.65). Additionally, further experimental results such as ablation studies indicate the effectiveness of the proposed model. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
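The Retinex decomposition I = R · L that the DECM module above learns can be illustrated with the classic hand-crafted single-scale variant, where a blur stands in for the smooth illumination map L and the reflectance R is recovered in the log domain. The box blur and its kernel size are illustrative stand-ins, not part of the paper's learned decomposition.

```python
import numpy as np

def box_blur(img, k=15):
    """Separable box blur used as a crude smooth-illumination estimate."""
    kernel = np.ones(k) / k
    out = np.apply_along_axis(lambda r: np.convolve(r, kernel, mode='same'), 1, img)
    return np.apply_along_axis(lambda c: np.convolve(c, kernel, mode='same'), 0, out)

def retinex_decompose(img, eps=1e-6):
    """Single-scale Retinex: I = R * L, so log R = log I - log L.

    A learned decomposition module would replace this hand-crafted split;
    here the blur merely approximates the illumination map L.
    """
    illumination = box_blur(img) + eps
    reflectance = np.log(img + eps) - np.log(illumination)
    return reflectance, illumination
```

Enhancement methods in this family then brighten the illumination map and recombine it with the (detail-carrying) reflectance.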
18. RDAG U-Net: An Advanced AI Model for Efficient and Accurate CT Scan Analysis of SARS-CoV-2 Pneumonia Lesions.
- Author
- Lee, Chih-Hui, Pan, Cheng-Tang, Lee, Ming-Chan, Wang, Chih-Hsuan, Chang, Chun-Yung, and Shiue, Yow-Ling
- Subjects
- IMAGE analysis, ARTIFICIAL intelligence, LUNG diseases, COMPUTED tomography, RESPIRATORY infections
- Abstract
Background/Objective: This study aims to utilize advanced artificial intelligence (AI) image recognition technologies to establish a robust system for identifying features in lung computed tomography (CT) scans, thereby detecting respiratory infections such as SARS-CoV-2 pneumonia. Specifically, the research focuses on developing a new model called Residual-Dense-Attention Gates U-Net (RDAG U-Net) to improve accuracy and efficiency in identification. Methods: This study employed Attention U-Net, Attention Res U-Net, and the newly developed RDAG U-Net model. RDAG U-Net extends the U-Net architecture by incorporating ResBlock and DenseBlock modules in the encoder to retain training parameters and reduce computation time. The training dataset includes 3,520 CT scans from an open database, augmented to 10,560 samples through data augmentation techniques. The research also focused on optimizing convolutional architectures, image preprocessing, interpolation methods, data management, and extensive fine-tuning of training parameters and neural network modules. Results: The RDAG U-Net model achieved an outstanding accuracy of 93.29% in identifying pulmonary lesions, with a 45% reduction in computation time compared to other models. The study demonstrated that RDAG U-Net performed stably during training and exhibited good generalization capability, as evaluated by loss values, model-predicted lesion annotations, and validation-epoch curves. Furthermore, using ITK-Snap to convert 2D predictions into 3D lung and lesion segmentation models, the results delineated lesion contours, enhancing interpretability. Conclusion: The RDAG U-Net model showed significant improvements in accuracy and efficiency in the analysis of CT images for SARS-CoV-2 pneumonia, achieving 93.29% recognition accuracy and reducing computation time by 45% compared to other models. These results indicate the potential of the RDAG U-Net model in clinical applications, as it can accelerate the detection of pulmonary lesions and effectively enhance diagnostic accuracy. Additionally, the 2D and 3D visualization results allow physicians to better understand the morphology and distribution of lesions, strengthening decision support capabilities and providing valuable tools for medical diagnosis and treatment planning. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
19. Snow Cover Extraction from Landsat 8 OLI Based on Deep Learning with Cross-Scale Edge-Aware and Attention Mechanism.
- Author
- Yu, Zehao, Gong, Hanying, Zhang, Shiqiang, and Wang, Wei
- Subjects
- WATER management, OPTICAL remote sensing, LANDSAT satellites, REMOTE sensing, DEEP learning, SNOW cover
- Abstract
Snow cover distribution is of great significance for climate change and water resource management. Current deep learning-based methods for extracting snow cover from remote sensing images face challenges such as insufficient local detail awareness and inadequate utilization of global semantic information. In this study, a snow cover extraction algorithm integrating cross-scale edge perception and an attention mechanism on the U-net model architecture is proposed. The cross-scale edge perception module replaces the original skip connections of U-net, enhancing low-level image features by introducing edge detection at the shallow feature scale, and enhancing detail perception via branch separation and feature fusion at the deep feature scale. Meanwhile, parallel channel and spatial attention mechanisms are introduced in the encoding stage to adaptively enhance the model's attention to key features and improve the utilization of global semantic information. The method was evaluated on the publicly available CSWV_S6 optical remote sensing dataset, and its accuracy of 98.14% indicates significant advantages over existing methods. Snow extraction from Landsat 8 OLI images of the upper reaches of the Irtysh River achieved satisfactory accuracies of 95.57% (using bands two, three, and four) and 96.65% (using bands two, three, four, and six), indicating strong potential for automated snow cover extraction over larger areas. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
20. A Comparison of Local and Global Strategies for Exploiting Field Inversion on Separated Flows at Low Reynolds Number.
- Author
-
Muscarà, Luca, Cisternino, Marco, Ferrero, Andrea, Iob, Andrea, and Larocca, Francesco
- Subjects
REYNOLDS number ,MACHINE learning ,PROBLEM solving ,AEROFOILS ,FORECASTING - Abstract
The prediction of separated flows at low Reynolds numbers is crucial for several applications in aerospace and energy fields. Reynolds-averaged Navier–Stokes (RANS) equations are widely used, but their accuracy is limited in the presence of transition or separation. In this work, two different strategies for improving RANS simulations by means of field inversion are discussed. Both strategies require solving an optimization problem to identify a correction field by minimizing the error on some measurable data. The obtained correction field is then exploited in one of two alternative ways. The first strategy aims to identify a relation that expresses the local correction field as a function of some local flow features. However, this regression can be difficult or even impossible because the relation between the assumed input variables and the local correction may not be a function. For this reason, an alternative is proposed: a U-Net model is trained on pairs of original and corrected RANS results. In this way, it is possible to perform a prediction with the original RANS model and then correct it by means of the U-Net. The methodologies are evaluated and compared on the flow around the NACA0021 and SD7003 airfoils. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
21. LV-YOLO: logistic vehicle speed detection and counting using deep learning based YOLO network.
- Author
-
Rani, N. Gopika, Priya, N. Hema, Ahilan, A., and Muthukumaran, N.
- Abstract
In the era of smart cities and advancing transportation technologies, predicting logistics vehicles and vehicle speed is pivotal to enhancing traffic management, safety, and overall transportation efficiency. Accurate prediction of vehicles and their speed is critical to the interests of both road users and traffic authorities, yet accurately predicting the vehicle speed and logistics vehicle for a single trip is a difficult task; unforeseen accidents can occur, increasing fatalities. To overcome these issues, a novel Logistic Vehicle speed detection using YOLO (LV-YOLO) method has been introduced to detect logistics vehicles and their speed using the YOLO network. The proposed framework is divided into three layers: image acquisition, segmentation, and detection. In the image acquisition layer, a CCTV camera captures highway traffic video, which is converted into frames. In the segmentation layer, each video frame is segmented using U-Net, which isolates the vehicles in the frame. The detection layer performs truck detection and speed detection using LV-YOLO on the segmented frames, based on the Boxy Vehicle dataset. The simulated results show that the LV-YOLO technique maintains an excellent mAP of 99.42%. LV-YOLO improves the overall mAP by 1.72%, 5.42%, and 0.82% over the Simple Vehicle Counting System, Real-Time Detection, and Advance YOLOv3 Model for vehicle detection, and by 4.81% and 2.63% over the Deep Learning and CAN protocol and 1D-CNN speed estimation models for speed prediction, respectively. [ABSTRACT FROM AUTHOR]
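The speed-detection step in such a pipeline reduces to tracking a detected vehicle's centroid across frames and converting pixel displacement into physical speed. A minimal sketch (our own illustration; the frame rate and metres-per-pixel scale are assumed calibration values, not figures from the paper):

```python
def track_speed(centroids, fps, metres_per_pixel):
    """Estimate the speed (km/h) of one tracked vehicle from per-frame
    bounding-box centroids (x, y) in pixels over consecutive frames."""
    if len(centroids) < 2:
        return 0.0
    # total pixel path length along the track
    dist_px = 0.0
    for (x0, y0), (x1, y1) in zip(centroids, centroids[1:]):
        dist_px += ((x1 - x0) ** 2 + (y1 - y0) ** 2) ** 0.5
    seconds = (len(centroids) - 1) / fps
    metres_per_second = dist_px * metres_per_pixel / seconds
    return metres_per_second * 3.6  # m/s -> km/h
```

For example, a centroid moving 10 px per frame at 25 fps with a 0.05 m/px calibration corresponds to 45 km/h.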
- Published
- 2024
- Full Text
- View/download PDF
22. DUFormer: dual-channel image splicing detection based on anchor-shaped U-Net and stepwise transformer for power systems.
- Author
-
Tian, Xiuxia, Zhao, Jianren, and Wen, Longfang
- Abstract
The safe operation of intelligent power systems relies on the authenticity and integrity of image data. However, splicing-based image tampering, a common form of image forgery, poses severe challenges to the security monitoring of power systems. Addressing the limitations of traditional image splicing detection techniques in power system applications, this paper introduces DUFormer, a dual-channel image splicing detection model that combines an anchor-shaped U-Net and a stepwise Transformer. The model explores image features through the stepwise Transformer and precisely locates small-sized tampered areas using the anchor-shaped U-Net, enhancing the recognition capability for tampering of various scales. Tests on the substation splicing forgery dataset (SSFD), which contains 1192 tampered images of power systems, show that DUFormer achieved a 32.76% improvement in intersection over union, a 29.77% improvement in F1 score, and a 0.05 reduction in mean absolute error relative to the second-best performing model. Additionally, evaluations on multiple public datasets confirm that DUFormer surpasses existing detection technologies on various performance metrics, especially exhibiting outstanding performance at the level of detail. This paper also examines the model's robustness against JPEG compression to ensure its effectiveness in real-world applications. This research not only improves the pixel-level detection accuracy of power image splicing but also lays a solid foundation for the development of future security monitoring technologies for intelligent power systems. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
23. Tsnet: a two-stage network for image dehazing with multi-scale fusion and adaptive learning.
- Author
-
Gong, Xiaolin, Zheng, Zehan, and Du, Heyuan
- Abstract
Image dehazing has been a popular topic of research for a long time. Previous deep learning-based image dehazing methods have failed to achieve satisfactory dehazing effects on both synthetic and real-world datasets, exhibiting poor generalization. Moreover, single-stage networks often produce output images with many regions of artifacts and color distortion. To address these issues, this paper proposes a two-stage image dehazing network called TSNet, mainly consisting of a multi-scale fusion module (MSFM) and an adaptive learning module (ALM). Together, the MSFM and ALM enhance the generalization of TSNet. The MSFM obtains large receptive fields at multiple scales and integrates features at different frequencies to reduce the differences between inputs and learning objectives. The ALM actively learns regions of interest in images and restores texture details more effectively. Additionally, TSNet is designed as a two-stage network: the first-stage network performs image dehazing, and the second-stage network corrects issues such as artifacts and color distortion present in the first-stage results. We also change the learning objective from ground-truth images to opposite fog maps, which improves the learning efficiency of TSNet. Extensive experiments demonstrate that TSNet exhibits superior dehazing performance on both synthetic and real-world datasets compared to previous state-of-the-art methods. The related code is released at https://github.com/zzhlovexuexi/TSNet. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
24. Optic Nerve Sheath Ultrasound Image Segmentation Based on CBC-YOLOv5s.
- Author
-
Chu, Yonghua, Xu, Jinyang, Wu, Chunshuang, Ye, Jianping, Zhang, Jucheng, Shen, Lei, Wang, Huaxia, and Yao, Yudong
- Subjects
OPTIC nerve ,IMAGE segmentation ,ULTRASONIC imaging ,TRANSFORMER models ,FEATURE extraction ,DEEP learning - Abstract
The diameter of the optic nerve sheath is an important indicator for assessing the intracranial pressure in critically ill patients. The methods for measuring the optic nerve sheath diameter are generally divided into invasive and non-invasive methods. Compared to the invasive methods, the non-invasive methods are safer and have thus gained popularity. Among the non-invasive methods, using deep learning to process the ultrasound images of the eyes of critically ill patients and promptly output the diameter of the optic nerve sheath offers significant advantages. This paper proposes a CBC-YOLOv5s optic nerve sheath ultrasound image segmentation method that integrates both local and global features. First, it introduces the CBC-Backbone feature extraction network, which consists of dual-layer C3 Swin-Transformer (C3STR) and dual-layer Bottleneck Transformer (BoT3) modules. The C3STR backbone's multi-layer convolution and residual connections focus on the local features of the optic nerve sheath, while the Window Transformer Attention (WTA) mechanism in the C3STR module and the Multi-Head Self-Attention (MHSA) in the BoT3 module enhance the model's understanding of the global features of the optic nerve sheath. The extracted local and global features are fully integrated in the Spatial Pyramid Pooling Fusion (SPPF) module. Additionally, the CBC-Neck feature pyramid is proposed, which includes a single-layer C3STR module and three-layer CReToNeXt (CRTN) module. During upsampling feature fusion, the C3STR module is used to enhance the local and global awareness of the fused features. During downsampling feature fusion, the CRTN module's multi-level residual design helps the network to better capture the global features of the optic nerve sheath within the fused features. 
The introduction of these modules achieves a thorough integration of the local and global features, enabling the model to efficiently and accurately identify the optic nerve sheath boundaries, even when the ocular ultrasound images are blurry or the boundaries are unclear. The Z2HOSPITAL-5000 dataset collected from Zhejiang University Second Hospital was used for the experiments. Compared to the widely used YOLOv5s and U-Net algorithms, the proposed method shows improved performance on the blurry test set. Specifically, the proposed method achieves precision, recall, and Intersection over Union (IoU) values that are 4.1%, 2.1%, and 4.5% higher than those of YOLOv5s. Compared to U-Net, the precision, recall, and IoU are improved by 9.2%, 21%, and 19.7%, respectively. [ABSTRACT FROM AUTHOR]
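The precision, recall, and IoU figures quoted above are pixel-wise overlaps between predicted and ground-truth masks. A minimal sketch of how such metrics are computed on flattened binary masks (our own illustration, not code from the paper):

```python
def mask_metrics(pred, truth):
    """Pixel-wise precision, recall and IoU for flat binary masks (0/1 lists)."""
    tp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 1)
    fp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 1)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    iou = tp / (tp + fp + fn) if tp + fp + fn else 0.0
    return precision, recall, iou
```

IoU penalizes both false positives and false negatives in a single ratio, which is why it is the stricter of the three numbers.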
- Published
- 2024
- Full Text
- View/download PDF
25. U-DeepONet: U-Net enhanced deep operator network for geologic carbon sequestration.
- Author
-
Diab, Waleed and Al Kobaisi, Mohammed
- Subjects
- *
ARTIFICIAL neural networks , *GEOLOGICAL carbon sequestration , *POROUS materials , *TWO-phase flow , *SCIENCE education , *SCIENTIFIC computing - Abstract
Learning operators with deep neural networks is an emerging paradigm for scientific computing. Deep Operator Network (DeepONet) is a modular operator learning framework that allows for flexibility in choosing the kind of neural network to be used in the trunk and/or branch of the DeepONet. This is beneficial, as it has been shown many times that different types of problems require different kinds of network architectures for effective learning. In this work, we design an efficient neural operator based on the DeepONet architecture. We introduce U-Net enhanced DeepONet (U-DeepONet) for learning the solution operator of highly complex CO2-water two-phase flow in heterogeneous porous media. The U-DeepONet is more accurate in predicting gas saturation and pressure buildup than the state-of-the-art U-Net based Fourier Neural Operator (U-FNO) and the Fourier-enhanced Multiple-Input Operator (Fourier-MIONet) trained on the same dataset. Moreover, our U-DeepONet is significantly more efficient in training time than both the U-FNO (more than 18 times faster) and the Fourier-MIONet (more than 5 times faster), while consuming less computational resources. We also show that the U-DeepONet is more data efficient and better at generalization than both the U-FNO and the Fourier-MIONet. [ABSTRACT FROM AUTHOR]
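The core DeepONet combination that U-DeepONet builds on is a dot product between a branch-net embedding of the sampled input function and a trunk-net embedding of the query coordinate. A toy sketch with single linear layers standing in for real networks (our own illustration; the weights are arbitrary, and U-DeepONet replaces the branch with a U-Net):

```python
def linear(x, W, b):
    """y = W x + b on plain lists; W is a list of rows."""
    return [sum(wi * xi for wi, xi in zip(row, x)) + bi
            for row, bi in zip(W, b)]

def deeponet(u_samples, y, branch, trunk):
    """G(u)(y) ~ <branch(u_samples), trunk(y)>: the DeepONet output is the
    inner product of the branch embedding (of the input function sampled
    at fixed sensors) and the trunk embedding (of the query point y)."""
    b_out = linear(u_samples, *branch)
    t_out = linear(y, *trunk)
    return sum(p * q for p, q in zip(b_out, t_out))
```

In practice both nets are deep and trained jointly; the inner-product structure is what makes the framework modular enough to swap in a U-Net branch.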
- Published
- 2024
- Full Text
- View/download PDF
26. A Segmentation-Based Automated Corneal Ulcer Grading System for Ocular Staining Images Using Deep Learning and Hough Circle Transform.
- Author
-
Manawongsakul, Dulyawat and Patanukhom, Karn
- Subjects
- *
CONVOLUTIONAL neural networks , *HOUGH transforms , *IMAGE processing , *IMAGE segmentation , *DEEP learning ,CORNEAL ulcer - Abstract
Corneal ulcer is a prevalent ocular condition that requires ophthalmologists to diagnose, assess, and monitor symptoms. During examination, ophthalmologists must identify the corneal ulcer area and evaluate its severity by manually comparing ocular staining images with severity indices. However, manual assessment is time-consuming and may give inconsistent results: variations can occur with repeated evaluations of the same images or with grading among different evaluators. To address this problem, we propose an automated corneal ulcer grading system for ocular staining images based on deep learning techniques and the Hough Circle Transform. The algorithm is structured into two components, for cornea segmentation and corneal ulcer segmentation. Initially, we apply a deep learning method combined with the Hough Circle Transform to segment cornea areas. Subsequently, we develop the corneal ulcer segmentation model using deep learning methods; in this phase, the predicted cornea areas are used as masks when training the corneal ulcer segmentation models. Finally, the algorithm uses the results of these two components to determine two outputs: (1) the percentage of the ulcerated area on the cornea, and (2) the severity degree of the corneal ulcer based on the Type–Grade (TG) grading standard. These methodologies aim to enhance diagnostic efficiency in two key respects: (1) ensuring consistency by delivering uniform and dependable results, and (2) enhancing robustness by effectively handling variations in eye size. In this research, our proposed method is evaluated using the SUSTech-SYSU public dataset, achieving an Intersection over Union of 89.23% for cornea segmentation and 82.94% for corneal ulcer segmentation, along with a Mean Absolute Error of 2.51% for determining the percentage of the ulcerated area on the cornea and an Accuracy of 86.15% for severity grading. [ABSTRACT FROM AUTHOR]
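The Hough Circle Transform used for cornea localization lets every edge pixel vote for candidate circle centres. A simplified fixed-radius, pure-Python sketch of the voting scheme (our own illustration; production pipelines typically call an optimized implementation such as OpenCV's `cv2.HoughCircles`, which also searches over radii):

```python
import math
from collections import Counter

def hough_circle_centres(edge_points, radius, n_angles=72):
    """Accumulate votes for circle centres at a known radius: each edge
    point votes for every centre exactly `radius` away from it."""
    votes = Counter()
    for (x, y) in edge_points:
        for k in range(n_angles):
            theta = 2 * math.pi * k / n_angles
            cx = round(x - radius * math.cos(theta))
            cy = round(y - radius * math.sin(theta))
            votes[(cx, cy)] += 1
    return votes

# synthetic edge points on a circle of radius 10 centred at (30, 40)
pts = [(30 + round(10 * math.cos(t)), 40 + round(10 * math.sin(t)))
       for t in [2 * math.pi * k / 36 for k in range(36)]]
best = hough_circle_centres(pts, 10).most_common(1)[0][0]
```

The accumulator cell with the most votes recovers the true centre (up to rounding), which is what makes the transform robust to the blur and partial occlusion common in staining images.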
- Published
- 2024
- Full Text
- View/download PDF
27. A Comparative Analysis of U-Net and Vision Transformer Architectures in Semi-Supervised Prostate Zonal Segmentation.
- Author
-
Huang, Guantian, Xia, Bixuan, Zhuang, Haoming, Yan, Bohan, Wei, Cheng, Qi, Shouliang, Qian, Wei, and He, Dianning
- Subjects
- *
TRANSFORMER models , *DIAGNOSTIC imaging , *AUTODIDACTICISM , *TIME-varying networks , *ENTROPY - Abstract
The precise segmentation of different regions of the prostate is crucial in the diagnosis and treatment of prostate-related diseases. However, the scarcity of labeled prostate data poses a challenge for the accurate segmentation of its different regions. We perform the segmentation of different regions of the prostate using U-Net- and Vision Transformer (ViT)-based architectures. We use five semi-supervised learning methods (entropy minimization, cross pseudo-supervision, mean teacher, uncertainty-aware mean teacher (UAMT), and interpolation consistency training (ICT)) and compare the results with the state-of-the-art prostate semi-supervised segmentation network, uncertainty-aware temporal self-learning (UATS). The UAMT method improves prostate segmentation accuracy and provides stable segmentation of the prostate regions. ICT plays a more stable role in the regional segmentation results, providing strong support for the medical image segmentation task and demonstrating the robustness of U-Net for medical image segmentation. UATS remains more applicable to the U-Net backbone and has a very significant effect on the positive prediction rate. However, the performance of ViT in combination with semi-supervision still requires further optimization. This comparative analysis applies various semi-supervised learning methods to prostate zonal segmentation; it guides future prostate segmentation developments and offers insights into utilizing limited labeled data in medical imaging. [ABSTRACT FROM AUTHOR]
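Of the semi-supervised methods compared, entropy minimization is the simplest to state: it adds a loss term equal to the entropy of the model's softmax predictions on unlabeled data, pushing the network toward confident outputs away from dense decision regions. A minimal sketch of that term (our own illustration, not code from the paper):

```python
import math

def softmax(logits):
    m = max(logits)  # subtract max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]

def entropy_min_loss(batch_logits):
    """Mean Shannon entropy of softmax predictions: the quantity that
    entropy minimization drives toward zero on unlabeled images."""
    total = 0.0
    for logits in batch_logits:
        p = softmax(logits)
        total += -sum(pi * math.log(pi + 1e-12) for pi in p)
    return total / len(batch_logits)
```

A uniform prediction over two classes has entropy ln 2, while a confident one is near zero, so minimizing this term sharpens the decision boundary on unlabeled data.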
- Published
- 2024
- Full Text
- View/download PDF
28. Scale- and Resolution-Adapted Shaded Relief Generation Using U-Net.
- Author
-
Farmakis-Serebryakova, Marianna, Heitzler, Magnus, and Hurni, Lorenz
- Subjects
- *
DIGITAL elevation models , *WEB-based user interfaces , *WEB design , *MACHINE learning , *EQUITABLE remedies (Law) , *HISTOGRAMS - Abstract
On many maps, relief shading is one of the most significant graphical elements. Modern relief shading techniques include neural networks. To generate such shading automatically at an arbitrary scale, one needs to consider how the resolution of the input digital elevation model (DEM) relates to the neural network process and the maps used for training. Currently, there is no clear guidance on which DEM resolution to use to generate relief shading at specific scales. To address this gap, we trained U-Net models on swisstopo manual relief shadings of Switzerland at four different scales, using four different resolutions of the SwissALTI3D DEM. An interactive web application designed for this study allows users to outline a random area and compare brightness histograms between predictions and manual relief shadings. The results showed that DEM resolution and output scale influence the appearance of the relief shading, following an overall scale-to-resolution ratio. We present guidelines for generating relief shading with neural networks for arbitrary areas and scales. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
29. Enhancement of Semantic Segmentation by Image‐Level Fine‐Tuning to Overcome Image Pattern Imbalance in HRCT of Diffuse Infiltrative Lung Diseases.
- Author
-
Ham, Sungwon, Park, Beomhee, Yun, Jihye, Lee, Sang Min, Seo, Joon Beom, and Kim, Namkug
- Subjects
- *
CRYPTOGENIC organizing pneumonia , *IDIOPATHIC pulmonary fibrosis , *INTERSTITIAL lung diseases , *PULMONARY fibrosis , *LUNG diseases - Abstract
Diagnosing diffuse infiltrative lung diseases (DILD) using high‐resolution computed tomography (HRCT) is challenging, even for expert radiologists, due to the complex and variable image patterns. Moreover, the imbalances among the six key DILD‐related patterns—normal, ground‐glass opacity, reticular opacity, honeycombing, emphysema, and consolidation—further complicate accurate segmentation and diagnosis. This study presents an enhanced U‐Net‐based segmentation technique aimed at addressing these challenges. The primary contribution of our work is the fine‐tuning of the U‐Net model using image‐level labels from 92 HRCT images that include various types of DILDs, such as cryptogenic organizing pneumonia, usual interstitial pneumonia, and nonspecific interstitial pneumonia. This approach helps to correct the imbalance among image patterns, improving the model's ability to accurately differentiate between them. By employing semantic lung segmentation and patch‐level machine learning, the fine‐tuned model demonstrated improved agreement with radiologists' evaluations compared to conventional methods. This suggests a significant enhancement in both segmentation accuracy and inter‐observer consistency. In conclusion, the fine‐tuned U‐Net model offers a more reliable tool for HRCT image segmentation, making it a valuable imaging biomarker for guiding treatment decisions in patients with DILD. By addressing the issue of pattern imbalances, our model significantly improves the accuracy of DILD diagnosis, which is crucial for effective patient care. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
30. STU3Net: An Improved U‐Net With Swin Transformer Fusion for Thyroid Nodule Segmentation.
- Author
-
Deng, Xiangyu, Dang, Zhiyan, and Pan, Lihao
- Subjects
- *
TRANSFORMER models , *THYROID nodules , *CONTRAST-enhanced ultrasound , *ULTRASONIC imaging , *IMAGE reconstruction - Abstract
Thyroid nodules are a common endocrine system disorder for which accurate ultrasound image segmentation is important for evaluation and diagnosis, as well as a critical step in computer-aided diagnostic systems. However, the accuracy and consistency of segmentation remain a challenging task due to the scattering noise, low contrast, and low resolution of ultrasound images. We therefore propose a deep learning-based CAD (computer-aided diagnosis) method, STU3Net, aimed at the automatic segmentation of thyroid nodules. The method employs a modified Swin Transformer combined with a CNN encoder, which is capable of extracting morphological features and edge details of thyroid nodules in ultrasound images. When decoding the features for image reconstruction, we introduce a modified three-layer U-Net with cross-layer connectivity to further enhance the reconstruction. This cross-layer connectivity improves the network's ability to capture and represent image feature information by creating skip connections between different layers and merging the detailed information of the shallow network with the abstract information of the deeper network. Through comparison experiments with current mainstream deep learning methods on the TN3K and BUSI datasets, we validate the superiority of STU3Net in thyroid nodule segmentation. The experimental results show that STU3Net outperforms most mainstream models on the TN3K dataset, with Dice and IoU reaching 0.8368 and 0.7416, respectively, significantly better than other methods. The method demonstrates excellent performance on these datasets and provides radiologists with an effective auxiliary tool to accurately detect thyroid nodules in ultrasound images. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
31. ADT‐UNet: An Innovative Algorithm for Glioma Segmentation in MR Images.
- Author
-
Zhipeng, Liu, Jiawei, Wu, Ye, Jing, Bian, Xuefeng, Qiwei, Wu, Li, Rui, and Zhu, Yinxing
- Subjects
- *
TRANSFORMER models , *IMAGE segmentation , *MAGNETIC resonance imaging , *GLIOMAS , *MEDICAL technology , *DEEP learning - Abstract
The precise delineation of glioma tumors is of paramount importance for surgical and radiotherapy planning. Presently, the primary drawbacks associated with the manual segmentation approach are its laboriousness and inefficiency. To tackle these challenges, a deep learning-based automatic segmentation technique was introduced to enhance the efficiency of the segmentation process. We proposed ADT-UNet, an innovative algorithm for segmenting glioma tumors in MR images. ADT-UNet leveraged attention-dense blocks and a Transformer as its foundational elements. It extended the U-Net framework by incorporating a dense connection structure and an attention mechanism, and a Transformer structure was introduced at the end of the encoder. Furthermore, a novel attention-guided multi-scale feature fusion module was integrated into the decoder. To enhance network stability during training, a loss function was devised that combines Dice loss and binary cross-entropy loss, effectively guiding the network optimization process. On the test set, the DSC was 0.933, the IOU was 0.878, the PPV was 0.942, and the Sen was 0.938. Ablation experiments conclusively demonstrated that the inclusion of all three proposed modules led to enhanced segmentation accuracy within the model, with the most favorable outcomes observed when all three modules were employed simultaneously. The proposed methodology exhibited substantial competitiveness across various evaluation indices, with the three additional modules synergistically complementing each other to collectively enhance the segmentation accuracy of the model. Consequently, it is anticipated that this method will serve as a robust tool for assisting clinicians in auxiliary diagnosis and contribute to the advancement of medical intelligence technology. [ABSTRACT FROM AUTHOR]
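The combined Dice plus binary cross-entropy objective mentioned above can be sketched as follows (our own illustration; the equal 0.5/0.5 weighting and the smoothing constant are assumptions, not values from the paper):

```python
import math

def dice_bce_loss(pred, target, smooth=1.0, w_dice=0.5):
    """Weighted sum of soft Dice loss and binary cross-entropy over flat
    probability/label lists: the Dice term handles class imbalance, the
    BCE term gives smooth per-pixel gradients."""
    inter = sum(p * t for p, t in zip(pred, target))
    dice = (2 * inter + smooth) / (sum(pred) + sum(target) + smooth)
    eps = 1e-7  # avoid log(0)
    bce = -sum(t * math.log(p + eps) + (1 - t) * math.log(1 - p + eps)
               for p, t in zip(pred, target)) / len(pred)
    return w_dice * (1 - dice) + (1 - w_dice) * bce
```

A perfect prediction drives both terms to (numerically) zero, while a fully wrong one is dominated by the large BCE penalty.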
- Published
- 2024
- Full Text
- View/download PDF
32. A Plasma Discrimination Model Based on Color-Changing Ink and an Improved U-Net.
- Author
-
张瀚文, 曹维娟, 罗刚银, 江 浩, 邱 香, 许 杰, 史蓉蓉, and 郑 然
- Subjects
- *
ESTIMATION bias , *IMAGE segmentation , *PATIENT safety , *MACHINE learning , *INK - Abstract
Objective: Due to inconsistent subjective assessment criteria and excessively long calculation responses, there is a high risk of mistakenly discarding suspected hemolytic plasma and inappropriately discarding suspected non-hemolytic plasma during plasma preparation, posing significant risks to patient safety and leading to waste. This study aims to resolve these problems. Methods: A thresholding method that integrates deep learning with color-changing ink concepts was developed. Employing an enhanced U-Net architecture for image segmentation, the study introduces an advanced attention mechanism, batch normalization, and a padding module to tackle issues such as mean estimation bias, computational inefficiency, and limited receptive field sizes in spatial mapping relationships. The model was validated and compared using a self-collected sample dataset. Results: This study employed the color-changing ink boundary method for classification, enhancing the computational efficiency of plasma discrimination and reducing discrimination time while ensuring the accuracy of plasma sample identification. The accuracy of the experimental results is 99.52%. Conclusion: The results indicate that the plasma discrimination accuracy of this model is superior to that of other discrimination models, and it is expected to be applied in clinical practice. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
33. A Hybrid Method for Multiple Sclerosis Lesion Segmentation Using Wavelet and Dense U-Net.
- Author
-
Alijamaat, Ali, Mirhosseini, Seyed Mohsen, and Aliakbari, Reyhaneh
- Subjects
- *
CENTRAL nervous system , *MULTIPLE sclerosis , *IMAGE processing , *WHITE matter (Nerve tissue) , *WAVELET transforms - Abstract
Multiple Sclerosis (MS) is one of the debilitating disorders of the central nervous system. This disease causes lesions in the white matter of the brain tissue and can also lead to many physical and psychological disorders affecting movement, vision, and memory. Segmenting lesions in MRI images to determine their number and size is one of the diagnostic problems facing specialists, and automated diagnostic tools can aid professionals. Traditional image processing and deep learning methods are used to automate lesion segmentation, and the U-Net is one of the most widely used deep learning architectures for MS lesion segmentation. In the U-Net, images are used in the Fourier domain, which does not include all of their features. Our proposed method combines the Haar wavelet transform with a DenseNet-based U-Net. This makes local features and lesions of different sizes more prominent and leads to higher-quality segmentation. In the experiments, the proposed method achieved a better Dice value than the compared methods. [ABSTRACT FROM AUTHOR]
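A single level of the 2D Haar wavelet transform splits an image into an approximation (LL) sub-band and horizontal, vertical, and diagonal detail sub-bands (LH, HL, HH), the kind of decomposition such a method can feed into the network alongside the raw image. A minimal sketch (our own illustration, assuming orthonormal normalization; libraries such as PyWavelets provide this as `pywt.dwt2(img, 'haar')`):

```python
def haar2d(img):
    """One level of the 2D Haar DWT (orthonormal) on an even-sized
    list-of-lists image. Returns (LL, LH, HL, HH) sub-bands, each
    half the size of the input in both dimensions."""
    h, w = len(img), len(img[0])
    LL, LH, HL, HH = [], [], [], []
    for i in range(0, h, 2):
        ll_row, lh_row, hl_row, hh_row = [], [], [], []
        for j in range(0, w, 2):
            a, b = img[i][j], img[i][j + 1]
            c, d = img[i + 1][j], img[i + 1][j + 1]
            ll_row.append((a + b + c + d) / 2)  # approximation
            lh_row.append((a - b + c - d) / 2)  # horizontal detail
            hl_row.append((a + b - c - d) / 2)  # vertical detail
            hh_row.append((a - b - c + d) / 2)  # diagonal detail
        LL.append(ll_row); LH.append(lh_row)
        HL.append(hl_row); HH.append(hh_row)
    return LL, LH, HL, HH
```

The detail sub-bands respond to local intensity changes, which is what makes small lesions more prominent to a downstream network.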
- Published
- 2024
- Full Text
- View/download PDF
34. A Multi-Scale Liver Tumor Segmentation Method Based on Residual and Hybrid Attention Enhanced Network with Contextual Integration.
- Author
-
Sun, Liyan, Jiang, Linqing, Wang, Mingcong, Wang, Zhenyan, and Xin, Yi
- Subjects
- *
FEATURE extraction , *LIVER tumors , *PARALLEL processing , *LIVER cancer , *DEATH rate - Abstract
Liver cancer is one of the malignancies with high mortality rates worldwide, and its timely detection and accurate diagnosis are crucial for improving patient prognosis. To address the limitations of traditional image segmentation techniques and the U-Net network in capturing fine image features, this study proposes an improved model based on the U-Net architecture, named RHEU-Net. By replacing traditional convolution modules in the encoder and decoder with improved residual modules, the network's feature extraction capabilities and gradient stability are enhanced. A Hybrid Gated Attention (HGA) module is integrated before the skip connections, enabling the parallel processing of channel and spatial attentions, optimizing the feature fusion strategy, and effectively replenishing image details. A Multi-Scale Feature Enhancement (MSFE) layer is introduced at the bottleneck, utilizing multi-scale feature extraction technology to further enhance the expression of receptive fields and contextual information, improving the overall feature representation effect. Testing on the LiTS2017 dataset demonstrated that RHEU-Net achieved Dice scores of 95.72% for liver segmentation and 70.19% for tumor segmentation. These results validate the effectiveness of RHEU-Net and underscore its potential for clinical application. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
35. U-NET: A Supervised Approach for Monaural Source Separation.
- Author
-
Basir, Samiul, Hossain, Md. Nahid, Hosen, Md. Shakhawat, Ali, Md. Sadek, Riaz, Zainab, and Islam, Md. Shohidul
- Subjects
- *
CONVOLUTIONAL neural networks , *FOURIER transforms , *DEEP learning - Abstract
Separating speech is a challenging area of research, especially when trying to separate the desired source from a mixture. Deep learning has arisen as a promising solution, surpassing traditional methods. While prior research has mainly focused on the magnitude, log-magnitude, or a combination of the magnitude and phase portions, a new approach using the Short-Time Fourier Transform (STFT) and a deep convolutional neural network named U-NET has been proposed. This method, unlike others, considers both the real and imaginary components for decomposition. During the training stage, the mixed time-domain signal is transformed into a frequency-domain signal using the STFT, producing a mixed complex spectrogram. The spectrogram's real and imaginary parts are then separated and combined into a single matrix, which is fed through U-NET to extract the source components. The same process is repeated at testing: the concatenated matrix for the mixed test signal is passed through the saved model to generate two enhanced concatenated matrices, one for each source. These matrices are then transformed back into time-domain signals with the inverse STFT after extracting the magnitude and phase. The proposed approach has been evaluated on the GRID audiovisual corpus, with results showing improved quality and intelligibility compared to existing methods, as demonstrated by objective measurement metrics. [ABSTRACT FROM AUTHOR]
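The real/imaginary stacking described above can be sketched in a few lines: frame the signal, transform each frame, and concatenate the real and imaginary planes into one real-valued matrix. (Our own illustration using a naive unwindowed DFT; a real system would use a windowed FFT such as `scipy.signal.stft`.)

```python
import cmath

def dft(frame):
    """Naive O(n^2) discrete Fourier transform of one frame."""
    n = len(frame)
    return [sum(frame[t] * cmath.exp(-2j * cmath.pi * k * t / n)
                for t in range(n)) for k in range(n)]

def stft_real_imag(signal, frame_len, hop):
    """Frame the signal, DFT each frame, then stack [real; imag] parts
    into one real-valued matrix: the kind of network input described
    above (no windowing or overlap-add, for brevity)."""
    frames = [signal[i:i + frame_len]
              for i in range(0, len(signal) - frame_len + 1, hop)]
    spec = [dft(f) for f in frames]
    real = [[z.real for z in row] for row in spec]
    imag = [[z.imag for z in row] for row in spec]
    return real + imag  # concatenated along the "channel" axis
```

Keeping both planes lets the network implicitly model phase, which magnitude-only approaches discard.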
- Published
- 2024
- Full Text
- View/download PDF
36. SRU-Net: a novel spatiotemporal attention network for sclera segmentation and recognition.
- Author
-
Mashayekhbakhsh, Tara, Meshgini, Saeed, Rezaii, Tohid Yousefi, and Makouei, Somayeh
- Abstract
Segmenting sclera images for effective recognition under non-cooperative conditions poses a significant challenge due to the prevalent noise. While U-Net-based methods have shown success, their limitations in accurately segmenting objects with varying shapes necessitate innovative approaches. This paper introduces the spatiotemporal residual encoding and decoding network (SRU-Net), featuring multi-spatiotemporal feature integration (Ms-FI) modules and attention-pool mechanisms to enhance segmentation accuracy and robustness. Ms-FI modules within SRU-Net's encoders and decoders identify salient feature regions and prune responses, while attention-pool modules improve segmentation robustness. To assess the proposed SRU-Net, we conducted experiments using six datasets, employing precision, recall, and F1-score metrics. The experimental results demonstrate the superiority of SRU-Net over state-of-the-art methods. Specifically, SRU-Net achieves F1-score values of 94.58%, 98.31%, 98.49%, 97.52%, 95.3%, 97.47%, and 93.11% for MSD, MASD, SVBPI, MASD+MSD, UBIRIS.v1, UBIRIS.v2, and MICHE, respectively. Recognition performance was further evaluated on the six datasets using metrics such as AUC, EER, VER@0.1%FAR, and VER@1%FAR. The proposed pipeline, comprising SRU-Net and an autoencoder (AE), outperforms previous research on all datasets. Particularly noteworthy is the comparison of EER, where SRU-Net + AE exhibits the best recognition results, achieving an EER of 9.42%, 3.81%, and 5.73% for the MSD, MASD, and MICHE datasets, respectively. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
37. MSU-Net: the multi-scale supervised U-Net for image splicing forgery localization.
- Author
-
Yu, Hao, Su, Lichao, Dai, Chenwei, and Wang, Jinli
- Abstract
Image splicing forgery, that is, copying some parts of an image into another image, is one of the frequently used tampering methods in image forgery. As a research hotspot in recent years, deep learning has been used in image forgery detection. However, current deep learning methods have two drawbacks: first, their feature fusion is overly simple; second, they rely only on a single cross-entropy loss as the loss function, leading to models prone to overfitting. To address these issues, an image splicing forgery localization method based on a multi-scale supervised U-shaped network, named MSU-Net, is proposed in this paper. First, a triple-stream feature extraction module is designed, which combines the noise view and edge information of the input image to extract semantic-related and semantic-agnostic features. Second, a feature hierarchical fusion mechanism is proposed that introduces a channel attention mechanism layer by layer to perceive multi-level manipulation trajectories, avoiding the loss of information in semantic-related and semantic-agnostic shallow features during the convolution process. Finally, a strategy for multi-scale supervision is developed, a boundary artifact localization module is designed to compute the edge loss, and a contrastive learning module is introduced to compute the contrastive loss. Through extensive experiments on several public datasets, MSU-Net demonstrates high accuracy in localizing tampered regions and outperforms state-of-the-art methods. Additional attack experiments show that MSU-Net exhibits good robustness against Gaussian blur, Gaussian noise, and JPEG compression attacks. Besides, MSU-Net is superior in terms of model complexity and localization speed. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
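As a rough illustration of combining a segmentation cross-entropy term with an edge (boundary) loss, as MSU-Net's multi-scale supervision does, here is a minimal NumPy sketch. The boundary extraction, weighting, and function names are assumptions, and the paper's contrastive term is omitted:

```python
import numpy as np

def bce(pred, target, eps=1e-7):
    """Binary cross-entropy over a probability map."""
    pred = np.clip(pred, eps, 1 - eps)
    return -np.mean(target * np.log(pred) + (1 - target) * np.log(1 - pred))

def edge_map(mask):
    """Crude boundary extraction via finite differences."""
    gy = np.abs(np.diff(mask, axis=0, prepend=mask[:1]))
    gx = np.abs(np.diff(mask, axis=1, prepend=mask[:, :1]))
    return np.clip(gx + gy, 0, 1)

def total_loss(pred, target, w_edge=0.5):
    """Cross-entropy on the mask plus a weighted edge-consistency term."""
    return bce(pred, target) + w_edge * bce(edge_map(pred), edge_map(target))
```

A perfect prediction drives both terms toward zero, while a uniform (uninformative) prediction is penalized by both the region and boundary terms.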
38. Titanium Alloy Weld Time-of-Flight Diffraction Image Denoising Based on a Wavelet Feature Fusion Deep-Learning Model.
- Author
-
Zhi, Zelin, Jiang, Hongquan, Yang, Deyan, Yue, Kun, Gao, Jianmin, Cheng, Zhixiang, Xu, Yongjun, Geng, Qiang, and Zhou, Wei
- Subjects
- *
IMAGE denoising , *WELDED joints , *WELDING , *NONDESTRUCTIVE testing , *IMAGE fusion , *TITANIUM alloys - Abstract
Images of titanium alloy welds detected by time-of-flight diffraction (TOFD) suffer from large noise signals and many interference streaks around defects, all of which seriously limit the accuracy and effectiveness of defect recognition. Existing image denoising methods lack knowledge of the noise characteristics of titanium alloy weld TOFD images and the preprocessing experience of technicians in the field. In addition, the parameters of preprocessing methods are difficult to select and depend heavily on the skill of the technician, resulting in low efficiency and poor consistency. To address these problems, we propose a denoising method for TOFD images of titanium alloy welds that combines wavelet band features with deep-learning theory. First, based on the wavelet preprocessing method and the experience of nondestructive testing (NDT) technicians, we constructed an image-pair dataset consisting of the original TOFD images and the desired target images, capturing the engineers' preprocessing knowledge. Second, we constructed a multiband wavelet feature fusion U-Net image denoising model (WU-Net) and designed a loss function under three constraints: image consistency, image texture consistency, and structural similarity. This model learns end-to-end adaptive denoising for TOFD images of titanium alloy welds. Third, we illustrated and validated the effectiveness of TOFD image preprocessing for titanium alloy welds. The results showed that the proposed method effectively eliminates TOFD image noise and improves the accuracy of defect recognition. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
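The three-constraint loss described for WU-Net (image consistency, texture consistency, structural similarity) can be approximated in a few lines. This is a simplified sketch with a global single-window SSIM and assumed weights, not the paper's exact formulation:

```python
import numpy as np

def ssim_global(x, y, c1=1e-4, c2=9e-4):
    """Single-window SSIM over the whole image (simplified)."""
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + c1) * (2 * cov + c2)) / \
           ((mx**2 + my**2 + c1) * (vx + vy + c2))

def grad_l1(x, y):
    """L1 distance between image gradients, a crude texture term."""
    gx = np.abs(np.diff(x, axis=1) - np.diff(y, axis=1)).mean()
    gy = np.abs(np.diff(x, axis=0) - np.diff(y, axis=0)).mean()
    return gx + gy

def denoise_loss(pred, target, w=(1.0, 1.0, 1.0)):
    l_img = np.abs(pred - target).mean()       # image consistency
    l_tex = grad_l1(pred, target)              # texture consistency
    l_ssim = 1.0 - ssim_global(pred, target)   # structural similarity
    return w[0] * l_img + w[1] * l_tex + w[2] * l_ssim
```

In practice, SSIM is usually computed over local windows and the weights tuned on a validation set; the global form above only conveys the structure of the objective.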
39. CST-UNet: Cross Swin Transformer Enhanced U-Net with Masked Bottleneck for Single-Channel Speech Enhancement.
- Author
-
Zhang, Zipeng, Chen, Wei, Guo, Weiwei, Liu, Yiming, Yang, Jianhua, and Liu, Houguang
- Subjects
- *
SPEECH enhancement , *TRANSFORMER models , *COMPUTATIONAL complexity , *CORPORA , *DEEP learning - Abstract
Speech enhancement performance has improved significantly with the introduction of deep learning models, especially methods based on the Long Short-Term Memory architecture. However, these methods face challenges such as high computational complexity and redundancy in input features. To address these issues, we propose a U-Net-based approach that utilizes an encoder/decoder to extract more concise features, thereby improving single-channel speech enhancement performance and reducing computational complexity. The proposed method includes a Cross-Swin-Transformer block and a masked bottleneck module, which down-samples features while preserving detailed representations through skip connections and carefully designed blocks. The bottleneck module extracts coarse representations of hidden features as masks. We evaluated our method against other U-Net-based approaches on the VCTK and DNS corpora using the CBAK, eSTOI, PESQ, STOI, and SI-SDR metrics. The results demonstrate that the proposed method achieves promising performance while significantly reducing computational complexity. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
40. Real-time anomaly detection for 'Remote' bus stop surveillance using unsupervised conditional generative adversarial networks.
- Author
-
Xi, Beihao and Chen, Qingkui
- Subjects
- *
GENERATIVE adversarial networks , *PUBLIC safety , *BUS stops , *IMAGE segmentation , *ROAD safety measures , *INTRUSION detection systems (Computer security) , *VIDEO surveillance - Abstract
In response to the imbalance between normal and abnormal samples in existing anomaly detection datasets, as well as the complexity in defining anomalies, we introduce a new dataset named Remote Stop to provide data support for existing algorithms. Concurrently, we propose an unsupervised video anomaly detection method based on conditional generative adversarial networks. Our approach trains the model to learn the distribution of normal video data, enabling it to identify anomalous events. The incorporation of a spatial attention mechanism enhances the model's performance in detecting abnormal behaviors in video frames while maintaining high processing efficiency. Moreover, unlike other methods that assess the entire image, our approach uses overlapping image blocks to determine anomalies, enhancing the accuracy and robustness of the model in image segmentation. These innovations not only address the issues of scarce samples and high-cost labeling but also provide new perspectives and tools for video anomaly detection in the field of public safety. The effectiveness of the model was validated on the Avenue and Ped2 datasets and applied to our newly created dataset (Remote Stop), achieving an AUC of 84.3% and processing 61 video frames per second. This enables efficient sequential processing of large-scale video data, offering positive contributions to enhancing public road safety by providing early warnings and enabling timely preventive measures. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
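The overlapping-block decision rule described above can be sketched as follows, scoring a per-pixel reconstruction-error map over overlapping patches rather than judging the whole frame at once. The patch size, stride, and thresholding are illustrative assumptions:

```python
import numpy as np

def patch_anomaly_scores(err_map, patch=8, stride=4):
    """Mean reconstruction error over overlapping blocks of the error map."""
    H, W = err_map.shape
    scores = []
    for i in range(0, H - patch + 1, stride):
        for j in range(0, W - patch + 1, stride):
            scores.append(err_map[i:i + patch, j:j + patch].mean())
    return np.array(scores)

def frame_is_anomalous(err_map, thresh, patch=8, stride=4):
    # a frame is flagged if any single block exceeds the threshold,
    # so a small localized anomaly is not diluted by the rest of the frame
    return patch_anomaly_scores(err_map, patch, stride).max() > thresh
```

The block-wise maximum is what makes the approach robust: a compact anomaly that would vanish in a whole-image average still dominates at least one patch.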
41. Automated shoreline extraction process for unmanned vehicles via U-net with heuristic algorithm.
- Author
-
Prokop, Katarzyna, Połap, Dawid, Włodarczyk-Sielicka, Marta, Połap, Karolina, Jaszcz, Antoni, and Stateczny, Andrzej
- Subjects
HEURISTIC algorithms ,DATABASES ,GEOGRAPHIC boundaries ,IMAGE processing ,REAL estate development - Abstract
Detecting the shoreline is an important task with many potential uses. The shoreline allows cropping an image into two separate areas that represent the water and the shore. It is particularly interesting because the images can be used to analyze pollution, land development, or even waterfront erosion. Unfortunately, automatic shoreline detection is a complex problem due to numerous physical and atmospheric issues. In this paper, we present a solution based on a U-net convolutional network that is trained for shoreline detection on a dedicated database. The database is automatically generated by applying image processing techniques and a heuristic algorithm. Using heuristics, optimal values of the mask generation parameters are determined. Consequently, the solution allows for the automation of generating a set of masks by analyzing the boundary line and the efficiency of the segmentation network. The proposed solution allows for the analysis of the coastline, where potential obstacles and even occurring waves can be quickly detected. To evaluate the proposed solution, tests were carried out in real conditions, which showed the effectiveness of the model. In addition, tests were carried out on a publicly available database, yielding higher results than existing methods. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
42. A 3D U-NET BASED ON EARLY FUSION MODEL: IMPROVEMENT, COMPARATIVE ANALYSIS WITH STATE-OF-THE-ART MODELS AND FINE-TUNING.
- Author
-
KAYHAN, Beyza and UYMAZ, Sait Ali
- Subjects
DIAGNOSTIC imaging ,ARTIFICIAL intelligence ,DIGITAL technology ,TECHNOLOGICAL innovations ,ORGANS (Anatomy) - Abstract
Multi-organ segmentation is the process of identifying and separating multiple organs in medical images. This segmentation allows for the detection of structural abnormalities by examining the morphological structure of organs. Performing this process quickly and precisely has become increasingly important. In recent years, researchers have used various technologies for the automatic segmentation of multiple organs. In this study, improvements were made to increase the multi-organ segmentation performance of a 3D U-Net-based fusion model combining HSV and grayscale color spaces, and the model was compared with state-of-the-art models. Training and testing were performed on the MICCAI 2015 dataset published at Vanderbilt University, which contains 3D abdominal CT images in NIfTI format. The model's performance was evaluated using the Dice similarity coefficient. In the tests, the liver showed the highest Dice score. Considering the average Dice score across all organs and comparing it with other models, the fusion approach yields promising results. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
43. Ocean Currents Velocity Hindcast and Forecast Bias Correction Using a Deep-Learning Approach.
- Author
-
Muhamed Ali, Ali, Zhuang, Hanqi, Huang, Yu, Ibrahim, Ali K., Altaher, Ali Salem, and Chérubin, Laurent M.
- Subjects
OCEAN currents ,OCEAN dynamics ,NUMERICAL calculations ,CURRENT transformers (Instrument transformer) ,DEEP learning - Abstract
Today's prediction of ocean dynamics relies on numerical models. However, numerical models are often unable to accurately model and predict real ocean dynamics, leaving unfulfilled a range of services that require reliable predictions at various temporal and spatial scales. Indeed, a numerical model cannot fully resolve all the physical processes in the ocean for various reasons, including biases in the initial field and calculation errors in the numerical solution of the model. Thus, bias-correction methods have become crucial to improving the dynamical accuracy of numerical model predictions. In this study, we present a machine-learning-based three-dimensional velocity bias correction method derived from historical observations that applies to both hindcasts and forecasts. Our approach is based on the modification of an existing deep learning model, called U-Net, originally designed for image segmentation in the biomedical field. U-Net was modified to create a Transform Model that retains the temporal and spatial evolution of the differences between the model and observations, producing a correction in the form of regression weights that evolves spatially and temporally with the model both forward and backward in time, beyond the observation period. Using daily ocean current observations from a 2.5-year current meter array deployment, we show that significant bias corrections can be conducted up to 50 days pre- or post-observations. Using a 3-year-long virtual array, valid bias corrections can be conducted for up to one year. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
44. A Deep Learning Strategy for the Retrieval of Sea Wave Spectra from Marine Radar Data.
- Author
-
Ludeno, Giovanni, Esposito, Giuseppe, Lugni, Claudio, Soldovieri, Francesco, and Gennarelli, Gianluca
- Subjects
CONVOLUTIONAL neural networks ,OCEAN waves ,TRANSFER functions ,FAST Fourier transforms ,DEEP learning - Abstract
In the context of sea state monitoring, reconstructing the wave field and estimating the sea state parameters from radar data is a challenging problem. To reach this goal, this paper proposes a fully data-driven, deep learning approach based on a convolutional neural network. The network takes as input the radar image spectrum and outputs the sea wave directional spectrum. After a 2D fast Fourier transform, the wave elevation field is reconstructed, and accordingly, the sea state parameters are estimated. The reconstruction strategy presented herein is tested using numerical data generated from a synthetic sea wave simulator, considering the spectral properties of the Joint North Sea Wave Observation Project model. A performance analysis of the proposed deep-learning estimation strategy is carried out, along with a comparison to the classical modulation transfer function approach. The results demonstrate that the proposed approach is effective in reconstructing the directional wave spectrum across different sea states. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
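The final reconstruction step (directional spectrum to wave elevation field via a 2D inverse FFT) can be sketched as below. The random-phase synthesis and grid conventions are assumptions, and for simplicity the real part is taken rather than enforcing Hermitian symmetry:

```python
import numpy as np

def wave_field_from_spectrum(S, seed=0):
    """Synthesize a wave elevation field from a directional spectrum.

    S is a non-negative spectral density on a wavenumber grid; random
    phases are drawn for each spectral component, and the real part of
    the inverse 2D FFT is kept as the elevation field.
    """
    rng = np.random.default_rng(seed)
    phases = rng.uniform(0.0, 2.0 * np.pi, S.shape)
    A = np.sqrt(S) * np.exp(1j * phases)  # complex spectral amplitudes
    eta = np.fft.ifft2(A)
    return eta.real
```

A production implementation would mirror the spectrum to guarantee a real-valued field and carry physical wavenumber scaling; the sketch only shows how the elevation field follows from the estimated spectrum.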
45. A Multidimensional Framework Incorporating 2D U-Net and 3D Attention U-Net for the Segmentation of Organs from 3D Fluorodeoxyglucose-Positron Emission Tomography Images.
- Author
-
Vezakis, Andreas, Vezakis, Ioannis, Vagenas, Theodoros P., Kakkos, Ioannis, and Matsopoulos, George K.
- Subjects
CONVOLUTIONAL neural networks ,ANATOMICAL planes ,POSITRON emission tomography ,HEART ventricles ,DEEP learning - Abstract
Accurate analysis of Fluorodeoxyglucose (FDG)-Positron Emission Tomography (PET) images is crucial for the diagnosis, treatment assessment, and monitoring of patients suffering from various cancer types. FDG-PET images provide valuable insights by revealing regions where FDG, a glucose analog, accumulates within the body. While regions of high FDG uptake include suspicious tumor lesions, FDG also accumulates in non-tumor-specific regions and organs. Identifying these regions is crucial for excluding them from certain measurements, or calculating useful parameters, for example, the mean standardized uptake value (SUV) to assess the metabolic activity of the liver. Manual organ delineation from FDG-PET by clinicians demands significant effort and time, which is often not feasible in real clinical workflows with high patient loads. For this reason, this study focuses on automatically identifying key organs with high FDG uptake, namely the brain, left cardiac ventricle, kidneys, liver, and bladder. To this end, an ensemble approach is adopted, where a three-dimensional Attention U-Net (3D AU-Net) is employed for robust three-dimensional analysis, while a two-dimensional U-Net (2D U-Net) is utilized for analysis in the coronal plane. The 3D AU-Net demonstrates highly detailed organ segmentations, but also includes many false positive regions. In contrast, 2D U-Net achieves higher reliability with minimal false positive regions, but lacks the 3D details. Experiments conducted on a subset of the public AutoPET dataset with 60 PET scans demonstrate that the proposed ensemble model achieves high accuracy in segmenting the required organs, surpassing current state-of-the-art techniques, and supporting the potential utilization of the proposed methodology in accelerating and enhancing the clinical workflow of cancer patients. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
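The liver SUV computation mentioned above reduces, once the organ is segmented, to averaging PET voxel values inside the binary mask. A minimal sketch (the function name is assumed):

```python
import numpy as np

def mean_suv(pet, mask):
    """Mean standardized uptake value inside a binary organ mask."""
    mask = np.asarray(mask).astype(bool)
    if not mask.any():
        raise ValueError("empty organ mask")
    # index the PET volume with the mask and average the selected voxels
    return float(pet[mask].mean())
```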
46. Deep Learning-Based Workflow for Bone Segmentation and 3D Modeling in Cone-Beam CT Orthopedic Imaging.
- Author
-
Tiribilli, Eleonora and Bocchi, Leonardo
- Subjects
CONVOLUTIONAL neural networks ,CONE beam computed tomography ,COMPUTED tomography ,GRAPH algorithms ,USER interfaces - Abstract
In this study, a deep learning-based workflow designed for the segmentation and 3D modeling of bones in cone beam computed tomography (CBCT) orthopedic imaging is presented. This workflow uses a convolutional neural network (CNN), specifically a U-Net architecture, to perform precise bone segmentation even in challenging anatomical regions such as limbs, joints, and extremities, where bone boundaries are less distinct and densities are highly variable. The effectiveness of the proposed workflow was evaluated by comparing the generated 3D models against those obtained through other segmentation methods, including SegNet, binary thresholding, and graph cut algorithms. The accuracy of these models was quantitatively assessed using the Jaccard index, the Dice coefficient, and the Hausdorff distance metrics. The results indicate that the U-Net-based segmentation consistently outperforms other techniques, producing more accurate and reliable 3D bone models. The user interface developed for this workflow facilitates intuitive visualization and manipulation of the 3D models, enhancing the usability and effectiveness of the segmentation process in both clinical and research settings. The findings suggest that the proposed deep learning-based workflow holds significant potential for improving the accuracy of bone segmentation and the quality of 3D models derived from CBCT scans, contributing to better diagnostic and pre-surgical planning outcomes in orthopedic practice. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
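The Jaccard index and Dice coefficient used to assess the 3D models are related by Dice = 2J / (1 + J), so either determines the other. A minimal sketch of both on binary masks (SciPy's `directed_hausdorff` can supply the third metric from boundary point sets):

```python
import numpy as np

def jaccard(a, b):
    """Intersection over union of two binary masks."""
    a, b = a.astype(bool), b.astype(bool)
    inter = np.logical_and(a, b).sum()
    union = np.logical_or(a, b).sum()
    return inter / union if union else 1.0

def dice(a, b):
    """Dice coefficient, derived from Jaccard: Dice = 2J / (1 + J)."""
    j = jaccard(a, b)
    return 2 * j / (1 + j)
```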
47. Guided-YNet: Saliency Feature-Guided Interactive Feature Enhancement Lung Tumor Segmentation Network.
- Author
-
Zhou, Tao, Pan, Yunfeng, Lu, Huiling, Dang, Pei, Guo, Yujie, and Wang, Yaxing
- Subjects
POSITRON emission tomography ,COMPUTER-aided diagnosis ,IMAGE segmentation ,DIAGNOSTIC imaging ,LUNG tumors - Abstract
Multimodal lung tumor medical images, such as Positron Emission Tomography (PET), Computed Tomography (CT), and PET-CT, can provide anatomical and functional information for the same lesion. How to utilize this anatomical and functional information effectively and improve network segmentation performance is a key question. To address it, the Saliency Feature-Guided Interactive Feature Enhancement Lung Tumor Segmentation Network (Guided-YNet) is proposed in this paper. Firstly, a double-encoder single-decoder U-Net is used as the backbone of this model; a single-encoder single-decoder U-Net generates a saliency-guided feature from the PET image and transmits it into the skip connections of the backbone, exploiting the high sensitivity of PET images to tumors to guide the network to accurately locate lesions. Secondly, a Cross Scale Feature Enhancement Module (CSFEM) is designed to extract multi-scale fusion features after downsampling. Thirdly, a Cross-Layer Interactive Feature Enhancement Module (CIFEM) is designed in the encoder to enhance the spatial position information and semantic information. Finally, a Cross-Dimension Cross-Layer Feature Enhancement Module (CCFEM) is proposed in the decoder, which effectively extracts multimodal image features through global attention and multi-dimension local attention. The proposed method is verified on lung multimodal medical image datasets, and the results show that the Mean Intersection over Union (MIoU), Accuracy (Acc), Dice Similarity Coefficient (Dice), Volumetric overlap error (Voe), and Relative volume difference (Rvd) of the proposed method on lung lesion segmentation are 87.27%, 93.08%, 97.77%, 95.92%, 89.28%, and 88.68%, respectively. It is of great significance for computer-aided diagnosis. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
48. Cell nuclei image segmentation using U-Net and DeepLabV3+ with transfer learning and regularization.
- Author
-
Koishiyeva, Dina, Sydybayeva, Madina, Belginova, Saule, Yeskendirova, Damelya, Azamatova, Zhanerke, Kalpebayev, Azamat, and Beketova, Gulzhanat
- Subjects
MACHINE learning ,COMPUTER vision ,CELL nuclei ,FEATURE extraction ,IMAGE segmentation - Abstract
Semantic nuclei segmentation is a challenging area of computer vision. Accurate nuclei segmentation can help clinicians diagnose many diseases, such as cancer, by providing automatic tissue analysis. Deep learning algorithms allow automatic feature extraction from medical images; however, hematoxylin and eosin (H&E)-stained images are challenging due to variability in staining and textures. Using pre-trained models in deep learning speeds up development and improves performance. This paper compares the DeepLabV3+ and U-Net deep learning methods with the pre-trained models ResNet-50 and EfficientNetB4 embedded in their architectures. In addition, different regularization and dropout parameters are applied to prevent overfitting. The experiment was conducted on the PanNuke dataset, consisting of nearly 8,000 histological images with annotated nuclei. As a result, the ResNet50-based DeepLabV3+ model with L2 regularization of 0.02 and dropout of 0.7 showed the best efficiency, with a dice coefficient (DCS) of 0.8356, intersection over union (IOU) of 0.7280, and loss of 0.3212 on the test set. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
49. Change Detection for Forest Ecosystems Using Remote Sensing Images with Siamese Attention U-Net.
- Author
-
Hewarathna, Ashen Iranga, Hamlin, Luke, Charles, Joseph, Vigneshwaran, Palanisamy, George, Romiyal, Thuseethan, Selvarajah, Wimalasooriya, Chathrie, and Shanmugam, Bharanidharan
- Subjects
FOREST monitoring ,CARBON sequestration ,DEEP learning ,REMOTE sensing ,LAND cover ,LANDSCAPE assessment - Abstract
Forest ecosystems are critical components of Earth's biodiversity and play vital roles in climate regulation and carbon sequestration. They face increasing threats from deforestation, wildfires, and other anthropogenic activities. Timely detection and monitoring of changes in forest landscapes pose significant challenges for government agencies. To address these challenges, we propose a novel pipeline that refines the U-Net design, employing two different schemata of early fusion networks and a Siamese network architecture capable of processing RGB images, specifically designed to identify high-risk areas in forest ecosystems through change detection across different time frames at the same location. It annotates ground truth change maps in such time frames using an encoder–decoder approach with the help of an enhanced feature learning and attention mechanism. Our proposed pipeline, integrated with ResNeSt blocks and SE attention techniques, achieved impressive results on our newly created forest cover change dataset. The evaluation metrics reveal a Dice score of 39.03%, a kappa score of 35.13%, an F1-score of 42.84%, and an overall accuracy of 94.37%. Notably, our approach significantly outperformed multitasking model approaches on the ONERA dataset, boasting a precision of 53.32%, a Dice score of 59.97%, and an overall accuracy of 97.82%. Furthermore, it surpassed multitasking models on the HRSCD dataset, even without utilizing land cover maps, achieving a Dice score of 44.62%, a kappa score of 11.97%, and an overall accuracy of 98.44%. Although the proposed model had a lower F1-score than other methods, other performance metrics highlight its effectiveness in timely detection and forest landscape monitoring, advancing deep learning techniques in this field. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
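Cohen's kappa, reported alongside the Dice and F1 scores above, corrects raw pixel accuracy for the agreement expected by chance, which matters when change pixels are rare. A minimal sketch for binary change maps:

```python
import numpy as np

def cohens_kappa(pred, truth):
    """Cohen's kappa between two binary change maps (flattened)."""
    p = np.asarray(pred).ravel().astype(bool)
    t = np.asarray(truth).ravel().astype(bool)
    po = np.mean(p == t)                       # observed agreement
    pe = (p.mean() * t.mean()                  # chance agreement from
          + (1 - p.mean()) * (1 - t.mean()))   # the class marginals
    return (po - pe) / (1 - pe)
```

A kappa near zero means the map is no better than chance given the class proportions, even if the overall accuracy looks high.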
50. Evaluating the Impact of Filtering Techniques on Deep Learning-Based Brain Tumour Segmentation.
- Author
-
Rosa, Sofia, Vasconcelos, Verónica, and Caridade, Pedro J. S. B.
- Subjects
CONTRAST-enhanced magnetic resonance imaging ,GREENHOUSE gases ,BRAIN tumors ,CONVOLUTIONAL neural networks ,SYMPTOMS - Abstract
Gliomas are a common and aggressive kind of brain tumour that is difficult to diagnose due to their infiltrative development, variable clinical presentation, and complex behaviour, making them an important focus in neuro-oncology. Segmentation of brain tumour images is critical for improving diagnosis, prognosis, and treatment options. Manually segmenting brain tumours is time-consuming and challenging. Automatic segmentation algorithms can significantly improve the accuracy and efficiency of tumour identification, thus improving treatment planning and outcomes. Deep learning-based tumour segmentation has shown significant advances in the last few years. This study evaluates the impact of four denoising filters, namely median, Gaussian, anisotropic diffusion, and bilateral, on tumour detection and segmentation. The U-Net architecture is applied for the segmentation of 3064 contrast-enhanced magnetic resonance images from 233 patients diagnosed with meningiomas, gliomas, and pituitary tumours. The results of this work demonstrate that bilateral filtering yields superior outcomes, proving to be a robust and computationally efficient approach in brain tumour segmentation. This method reduces the processing time by 12 epochs, which in turn contributes to lowering greenhouse gas emissions by optimizing computational resources and minimizing energy consumption. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
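Of the four denoising filters compared in the study, median and Gaussian are available directly in `scipy.ndimage`, while the bilateral filter (an edge-preserving smoother that weights neighbours by both spatial distance and intensity similarity) is not; a naive sketch of it is shown below. The parameter values are assumptions, and the anisotropic diffusion filter is omitted:

```python
import numpy as np
from scipy.ndimage import gaussian_filter, median_filter

def bilateral_filter(img, sigma_s=1.5, sigma_r=0.2, radius=2):
    """Naive bilateral filter: spatial Gaussian times intensity-similarity Gaussian."""
    ys, xs = np.mgrid[-radius:radius + 1, -radius:radius + 1]
    spatial = np.exp(-(xs**2 + ys**2) / (2 * sigma_s**2))
    pad = np.pad(img, radius, mode="reflect")
    out = np.empty_like(img)
    H, W = img.shape
    for i in range(H):
        for j in range(W):
            patch = pad[i:i + 2 * radius + 1, j:j + 2 * radius + 1]
            # down-weight neighbours whose intensity differs from the centre
            range_w = np.exp(-((patch - img[i, j])**2) / (2 * sigma_r**2))
            w = spatial * range_w
            out[i, j] = (w * patch).sum() / w.sum()
    return out
```

This pixel-loop version is only for illustration; practical implementations (e.g. OpenCV's) are vectorized or approximated for speed.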