Descriptor: "Attention mechanisms" / Topic: artificial neural networks - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Attention mechanisms"' showing total 28 results



1. Research on multidimensional EEG emotion recognition using SKM and Transformer.

Author: 梁卓, 李鴻燕, 徐庆, and 陈彬
Subjects: CONVOLUTIONAL neural networks, ARTIFICIAL neural networks, EMOTION recognition, FEATURE extraction, SELF-expression, DEEP learning, RECURRENT neural networks
Abstract: Copyright of Journal of Chongqing University of Technology (Natural Science) is the property of Chongqing University of Technology and its content may not be copied or emailed to multiple sites or posted to a listserv without the copyright holder's express written permission. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)
Published: 2024
Full Text: View/download PDF

2. Research on citrus segmentation algorithm based on complex environment.

Author: Jia Jun Zhang, Peng Chao Zhang, Jun Lin Huang, Kai Yue, and Zhi Miao Guo
Subjects: ALGORITHMS, DEEP learning, PERFORMANCE evaluation, GRAIN refinement, ARTIFICIAL neural networks
Abstract: Aiming to address the low efficiency of current deep learning algorithms for segmenting citrus in complex environments, this paper proposes a study on citrus segmentation algorithms based on a multi-scale attention mechanism. The DeepLab V3+ network model was utilized as the primary framework and enhanced to suit the characteristics of the citrus dataset. In this paper, we introduce a more sophisticated multi-scale attention mechanism to enhance the neural network's capacity to perceive information at different scales, thus improving the model's performance in handling complex scenes and multi-scale objects. The DeepLab V3+ network addresses the challenges of low segmentation accuracy and inadequate refinement of segmentation edges when segmenting citrus in complex scenes, and the experimental results demonstrate that the improved algorithm in this paper achieves 96.8% in the performance index of mIoU and 98.4% in the performance index of MPA, which improves the segmentation effectiveness to a significant degree. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF
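
Note: The two indices reported in the abstract above, mean intersection over union and mean pixel accuracy (MPA), are commonly derived from a per-class confusion matrix. The following is a minimal sketch of that computation, not the authors' evaluation code; the function name `segmentation_scores` and the sample pixel counts are hypothetical and chosen only for illustration.

```python
def segmentation_scores(confusion):
    """Return (mean IoU, mean pixel accuracy) for a square confusion matrix.

    confusion[i][j] is the number of pixels whose true class is i and
    whose predicted class is j. Classes with no pixels are skipped.
    """
    n = len(confusion)
    ious, accs = [], []
    for c in range(n):
        tp = confusion[c][c]                          # correctly labelled pixels of class c
        true_c = sum(confusion[c])                    # pixels whose true class is c
        pred_c = sum(confusion[r][c] for r in range(n))  # pixels predicted as class c
        union = true_c + pred_c - tp
        if union > 0:
            ious.append(tp / union)                   # per-class intersection over union
        if true_c > 0:
            accs.append(tp / true_c)                  # per-class pixel accuracy
    return sum(ious) / len(ious), sum(accs) / len(accs)


# Hypothetical two-class example (background vs. fruit), counts invented.
confusion = [[50, 10],
             [5, 35]]
miou, mpa = segmentation_scores(confusion)
print(f"mIoU={miou:.4f}  MPA={mpa:.4f}")
```

Both scores average over classes rather than over pixels, so a small class counts as much as a large one.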

3. Prototype Learning for Medical Time Series Classification via Human–Machine Collaboration.

Author: Xie, Jia, Wang, Zhu, Yu, Zhiwen, Ding, Yasan, and Guo, Bin
Subjects: *ARTIFICIAL neural networks, *CONVOLUTIONAL neural networks, *ATRIAL fibrillation, *PROTOTYPES, *CLASSIFICATION
Abstract: Deep neural networks must address the dual challenge of delivering high-accuracy predictions and providing user-friendly explanations. While deep models are widely used in the field of time series modeling, deciphering the core principles that govern the models' outputs remains a significant challenge. This is crucial for fostering the development of trusted models and facilitating domain expert validation, thereby empowering users and domain experts to utilize them confidently in high-risk decision-making contexts (e.g., decision-support systems in healthcare). In this work, we put forward a deep prototype learning model that supports interpretable and manipulable modeling and classification of medical time series (i.e., ECG signal). Specifically, we first optimize the representation of single heartbeat data by employing a bidirectional long short-term memory and attention mechanism, and then construct prototypes during the training phase. The final classification outcomes (i.e., normal sinus rhythm, atrial fibrillation, and other rhythm) are determined by comparing the input with the obtained prototypes. Moreover, the proposed model presents a human–machine collaboration mechanism, allowing domain experts to refine the prototypes by integrating their expertise to further enhance the model's performance (contrary to the human-in-the-loop paradigm, where humans primarily act as supervisors or correctors, intervening when required, our approach focuses on a human–machine collaboration, wherein both parties engage as partners, enabling more fluid and integrated interactions). 
The experimental outcomes presented herein delineate that, within the realm of binary classification tasks—specifically distinguishing between normal sinus rhythm and atrial fibrillation—our proposed model, albeit registering marginally lower performance in comparison to certain established baseline models such as Convolutional Neural Networks (CNNs) and bidirectional long short-term memory with attention mechanisms (Bi-LSTMAttns), evidently surpasses other contemporary state-of-the-art prototype baseline models. Moreover, it demonstrates significantly enhanced performance relative to these prototype baseline models in the context of triple classification tasks, which encompass normal sinus rhythm, atrial fibrillation, and other rhythm classifications. The proposed model manifests a commendable prediction accuracy of 0.8414, coupled with macro precision, recall, and F1-score metrics of 0.8449, 0.8224, and 0.8235, respectively, achieving both high classification accuracy as well as good interpretability. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF
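
Note: The abstract above reports accuracy together with macro precision, recall, and F1-score for a three-class task. The sketch below shows one common convention for macro averaging, in which each metric is computed per class and then averaged with equal weight. It is illustrative only and is not the authors' code; the function name `macro_scores`, the class labels, and the sample predictions are hypothetical.

```python
def macro_scores(y_true, y_pred, labels):
    """Return (accuracy, macro precision, macro recall, macro F1).

    Macro averaging here means: compute each metric per class,
    then take the unweighted mean over the classes in `labels`.
    """
    pairs = list(zip(y_true, y_pred))
    precisions, recalls, f1s = [], [], []
    for label in labels:
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)
    n = len(labels)
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    return accuracy, sum(precisions) / n, sum(recalls) / n, sum(f1s) / n


# Hypothetical labels: N = normal sinus rhythm, AF = atrial fibrillation, O = other.
y_true = ["N", "N", "AF", "AF", "O", "O"]
y_pred = ["N", "AF", "AF", "AF", "O", "N"]
acc, macro_p, macro_r, macro_f1 = macro_scores(y_true, y_pred, ["N", "AF", "O"])
print(f"acc={acc:.4f} P={macro_p:.4f} R={macro_r:.4f} F1={macro_f1:.4f}")
```

Under this convention the macro F1 is the mean of the per-class F1 values, which generally differs from the harmonic mean of macro precision and macro recall.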

4. Open set classification of Hyperspectral images with energy models.

Author: Alwadei, Bashair, Mekhalfi, Mohamed Lamine, Bazi, Yakoub, Al Rahhal, Mohamad M., and Zuair, Mansour
Subjects: *IMAGE recognition (Computer vision), *ARTIFICIAL neural networks, *SPECTRAL imaging
Abstract: Hyperspectral images are rich in spectral data, lending themselves well to pixel-level image classification tasks. Previous studies primarily focus on closed-set classification within the realm of hyperspectral image classification. However, real-world scenarios present the challenge of dealing with object classes not encountered during the training phase, a scenario known as open-set classification, which has garnered less attention compared to the closed-set paradigm. In this paper, we propose a methodology anchored on ConvMixer for tackling open-set classification by utilizing energy-based models. We incorporate a Selective Kernel Attention (SKA) to capture the notion that different feature maps usually correspond to different objects in deep neural networks. Our experimental validation, conducted on two datasets, specifically the WHU-Hi-HanChuan and WHU-Hi-HongHu datasets, showcases promising outcomes of the introduced method. [ABSTRACT FROM AUTHOR]
Published: 2023
Full Text: View/download PDF

5. When attention is not enough to unveil a text's author profile: Enhancing a transformer with a wide branch.

Author: López-Santillán, Roberto, González, Luis C., Montes-y-Gómez, Manuel, and López-Monroy, A. Pastor
Subjects: *TRANSFORMER models, *SOCIAL media, *ARTIFICIAL neural networks, *DEEP learning, *NATURAL language processing, *MACHINE learning, *SPANISH language
Abstract: Author profiling (AP) is a highly relevant natural language processing (NLP) problem; it deals with predicting features of authors such as gender, age and personality traits. It is done by analyzing texts written by the authors themselves; take for instance documents such as books, articles, and more recently posts in social media platforms. In the present study, we focus on the latter, which is a scenario with a number of applications in marketing, security, health and others. Surprisingly, given the achievements of deep learning (DL) strategies on other NLP tasks, for AP DL architectures regularly underperform, left behind by classical machine learning (ML) approaches. In this study we show how a deep learning architecture based on transformers offers competitive results by exploiting a joint-intermediate fusion strategy called the Wide & Deep Transformer (WD-T). Our methodology implements a fusion of contextualized word vector representations and handcrafted features, by using a self-attention mechanism and a novel encoding technique that incorporates stylistic, topic, and personal information from authors. This allows for the creation of more accurate, fine-grained predictions. Our approach attained competitive performance against top-quartile results from the 2017–2019 editions at the Plagiarism analysis, Authorship identification, and Near-duplicate detection forum (PAN) in English and Spanish languages for gender and language variety predictions, and the Kaggle Myers–Briggs-type indicator (MBTI) dataset for personality forecasting. Our proposal consistently surpasses all other deep learning methods in PAN collections by as much as 2.4%, and up to 3.4% in the MBTI dataset. These results suggest that this DL strategy effectively addresses and improves upon the limitations of previous techniques and paves the way for new avenues of inquiry. [ABSTRACT FROM AUTHOR]
Published: 2023
Full Text: View/download PDF

6. RMAFF-PSN: A Residual Multi-Scale Attention Feature Fusion Photometric Stereo Network.

Author: Luo, Kai, Ju, Yakun, Qi, Lin, Wang, Kaixuan, and Dong, Junyu
Subjects: PHOTOMETRIC stereo, ARTIFICIAL neural networks, FEATURE extraction, STEREO vision (Computer science), SURFACE geometry, GEOMETRIC surfaces
Abstract: Predicting accurate normal maps of objects from two-dimensional images in regions of complex structure and spatial material variations is challenging using photometric stereo methods due to the influence of surface reflection properties caused by variations in object geometry and surface materials. To address this issue, we propose a photometric stereo network called a RMAFF-PSN that uses residual multiscale attentional feature fusion to handle the "difficult" regions of the object. Unlike previous approaches that only use stacked convolutional layers to extract deep features from the input image, our method integrates feature information from different resolution stages and scales of the image. This approach preserves more physical information, such as texture and geometry of the object in complex regions, through shallow-deep stage feature extraction, double branching enhancement, and attention optimization. To test the network structure under real-world conditions, we propose a new real dataset called Simple PS data, which contains multiple objects with varying structures and materials. Experimental results on a publicly available benchmark dataset demonstrate that our method outperforms most existing calibrated photometric stereo methods for the same number of input images, especially in the case of highly non-convex object structures. Our method also obtains good results under sparse lighting conditions. [ABSTRACT FROM AUTHOR]
Published: 2023
Full Text: View/download PDF

7. Multimodal deep collaborative filtering recommendation based on dual attention.

Author: Yin, Pei, Ji, Dandan, Yan, Han, Gan, Hongcheng, and Zhang, Jinxian
Subjects: *ARTIFICIAL neural networks, *ATTENTION
Abstract: Current collaborative filtering algorithms struggle to quantify the interaction between user and item features, which makes it difficult to accurately identify user preferences. Therefore, a multimodal deep collaborative filtering recommendation model based on dual attention for crowdfunding platforms is proposed. The model first uses the dual attention mechanism to quantify investor preferences, then uses deep neural networks to learn the nonlinear interaction of item features, and then combines the collaborative filtering mechanism to model investor preferences and item features to predict the recommendation list. Meanwhile, in terms of features, a large amount of auxiliary information is used to construct a richer feature system through multimodal fusion as a way to alleviate the cold start problem and improve the prediction accuracy. The effect of hyper-parameters on the experimental performance of the real crowdfunding dataset Indiegogo is explored and baseline experiments are designed for comparison. The experimental results show that the proposed model achieves the best recommendation results on the Indiegogo dataset compared to other baseline models. [ABSTRACT FROM AUTHOR]
Published: 2023
Full Text: View/download PDF

8. Attentional Extractive Summarization †.

Author: González, José Ángel, Segarra, Encarna, García-Granada, Fernando, Sanchis, Emilio, and Hurtado, Lluís-F.
Subjects: ARTIFICIAL neural networks, DATA mining, COMPUTER systems
Abstract: In this work, a general theoretical framework for extractive summarization is proposed—the Attentional Extractive Summarization framework. Although abstractive approaches are generally used in text summarization today, extractive methods can be especially suitable for some applications, and they can help with other tasks such as Text Classification, Question Answering, and Information Extraction. The proposed approach is based on the interpretation of the attention mechanisms of hierarchical neural networks, which compute document-level representations of documents and summaries from sentence-level representations, which, in turn, are computed from word-level representations. The models proposed under this framework are able to automatically learn relationships among document and summary sentences, without requiring Oracle systems to compute the reference labels for each sentence before the training phase. These relationships are obtained as a result of a binary classification process, the goal of which is to distinguish correct summaries for documents. Two different systems, formalized under the proposed framework, were evaluated on the CNN/DailyMail and the NewsRoom corpora, which are some of the reference corpora in the most relevant works on text summarization. The results obtained during the evaluation support the adequacy of our proposal and suggest that there is still room for the improvement of our attentional framework. [ABSTRACT FROM AUTHOR]
Published: 2023
Full Text: View/download PDF

9. Breast Cancer Detection in Thermography Using Convolutional Neural Networks (CNNs) with Deep Attention Mechanisms.

Author: Alshehri, Alia and AlSaeed, Duaa
Subjects: CONVOLUTIONAL neural networks, ARTIFICIAL neural networks, COMPUTER-aided diagnosis, BREAST, THERMOGRAPHY, EARLY detection of cancer, MAMMOGRAMS, BREAST cancer
Abstract: Featured Application: Medical diagnosis and computer-aided diagnosis systems. Breast cancer is one of the most common types of cancer among women. Accurate diagnosis at an early stage can reduce the mortality associated with this disease. Governments and health organizations stress the importance of early detection of breast cancer as it is related to an increase in the number of available treatment options and increased survival. Early detection gives patients the best chance of receiving effective treatment. Different types of images and imaging modalities are used in the detection and diagnosis of breast cancer. One of the imaging types is "infrared thermal" breast imaging, where a screening instrument is used to measure the temperature distribution of breast tissue. Although it has not been used often, compared to mammograms, it showed promising results when used for early detection. It also has many advantages as it is non-invasive, safe, painless, and inexpensive. The literature has indicated that the use of thermal images with deep neural networks improves the accuracy of early diagnosis of breast malformation. Therefore, in this paper, we aim to investigate to what extent convolutional neural networks (CNNs) with attention mechanisms (AMs) can provide satisfactory detection results in thermal breast cancer images. We present a model for breast cancer detection based on deep neural networks with AMs using thermal images from the Database for Research Mastology with Infrared Image (DMR-IR). The model will be evaluated in terms of accuracy, sensitivity and specificity, and will be compared against state-of-the-art breast cancer detection methods. The AMs with the CNN model achieved encouraging test accuracy rates of 99.46%, 99.37%, and 99.30% on the breast thermal dataset. The test accuracy of CNNs without AMs was 92.32%, whereas CNNs with AMs achieved an improvement in accuracy of 7%. 
Moreover, the proposed models outperformed previous models that were reviewed in the literature. [ABSTRACT FROM AUTHOR]
Published: 2022
Full Text: View/download PDF

10. Classify breast cancer pathological tissue images using multi-scale bar convolution pooling structure with patch attention.

Author: Guo, Dongen, Lin, Yuyao, Ji, Kangyi, Han, Linbo, Liao, Yongbo, Shen, Zhen, Feng, Jiangfan, and Tang, Man
Subjects: ARTIFICIAL neural networks, CONVOLUTIONAL neural networks, CANCER diagnosis, BREAST cancer, PATHOLOGISTS
Abstract: Pathological diagnosis plays a crucial role in the diagnosis and treatment of breast cancer. It is of profound clinical significance to construct a neural network model that can automatically classify breast cancer pathological tissue images (BCPTI) to assist pathologists in making accurate diagnoses. It is worth noting that although many convolutional neural network models have shown promising results in the recognition of BCPTI, they often fail to take full advantage of the elongated class pathological features present in BCPTI. Based on this problem, we propose a new feature extraction architecture to increase the performance of the model, which can extract rectangular features and elongated features in BCPTI through multi-scale strip convolution and pooling. In addition, we also propose a novel attention mechanism, which increases the weight values of key features from both channel and spatial aspects. More importantly, to alleviate the problem of the weak ability of convolutional neural networks to extract global features, in terms of spatial attention, we divide the image into nine patches and input them into the multi-layer perceptron to form weights to increase the global feature expression ability of the model. We modified the above innovative solution to the DenseNet model and reduced the batch normalization layer and activation layer in the original model to maintain the feature diversity of the model. Finally, the binary classification accuracy in the BreakHis dataset reaches 99.88%, and the eight-class classification accuracy reaches 97.62%. • This paper proposes a novel CNN for classification in the BreakHis dataset with competitive performance. • An attention mechanism using chunking to aggregate local features is proposed. • A multi-scale bar convolution and pool multi-branch structure is proposed. • We verify and compare the performance of the proposed models and modules from multiple perspectives. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

11. One-Shot Retail Product Identification Based on Improved Siamese Neural Networks.

Author: Wang, Chunchieh, Huang, Chengwei, Zhu, Xiaoming, and Zhao, Liye
Subjects: *ARTIFICIAL neural networks, *DIGITAL transformation, *RETAIL stores, *CUSTOMER experience
Abstract: Conventional retail stores are undergoing digital transformation, and in a typical smart retail store, automatic recognition of retail products is essential for customer experience in the checkout stage. In this paper, we propose an improved Siamese neural network to identify the product from one-shot learning. First, a spatial channel dual attention mechanism is proposed to improve the network architecture. Second, a binary cross-entropy loss function with a distance penalty is adopted to replace the conventional contrastive loss function. The proposed network can better model the details of the products. The experimental results are achieved on two publicly available databases. The results show that the proposed method outperforms the conventional methods, and it can solve the problem of insufficient data in the training stage. Smart retail stores can change the SKUs (Stock Keeping Units) conveniently without collecting a large amount of training samples. [ABSTRACT FROM AUTHOR]
Published: 2022
Full Text: View/download PDF

12. Information agenda as an analogue of attention in sociomorphic neuronal networks.

Author: Andreyuk, Denis, Petrunin, Yuri, Shuranova, Ann, and Ushakov, Vadim
Subjects: NEURAL circuitry, SOCIAL groups, BIOLOGICAL systems, INFORMATION modeling, ARTIFICIAL neural networks
Abstract: Information processing and decision-making performed by a social group can be modelled as a neuronal network activity. Sociomorphic neuronal networks (SNN) are in several ways different from neuromorphic networks; however, the majority of their critical features are similar. This paper describes a mechanism of SNN which functions in the same way as attention does in biological intelligence systems. News agenda serves as an initiating factor for this mechanism. Structured news items are analysed within a group in terms of their relation to the group's structure of values. While current values can differ slightly between the group members, core values are constant for the whole group. The suggested approach provides a basis for developing new tools for modelling information processing in social groups as neuronal network contours. [ABSTRACT FROM AUTHOR]
Published: 2022
Full Text: View/download PDF

13. Feature fusion network based on siamese structure for change detection.

Author: Wang, Gaihua, Dai, Yingying, Zhang, Tianlun, Lin, Jinheng, and Chen, Lei
Subjects: *ARTIFICIAL neural networks, *URBAN growth, *REMOTE sensing, *PROBLEM solving
Abstract: Remote sensing image change detection analyzes the change information of two images from the same area at different times. It has wide applications in urban expansion, forest detection, and natural disaster monitoring. In this paper, a Feature Fusion Network is proposed to solve the problems of slow change detection speed and low accuracy. The MobileNetV3 block is adopted to efficiently extract features and a self-attention module is applied to investigate the relationship between heterogeneous feature maps (image features and concatenated features). The method is tested on the SZTAKI and LEVIR-CD data sets. With a percentage correct classification of 98.43%, it is better than other comparative networks, and its space complexity is reduced by about 50%. The experimental results show that it has better performance and can improve the accuracy or speed of change detection. [ABSTRACT FROM AUTHOR]
Published: 2022
Full Text: View/download PDF

14. Improving performance of deep learning models for 3D point cloud semantic segmentation via attention mechanisms.

Author: Vanian, Vazgen, Zamanakos, Georgios, and Pratikakis, Ioannis
Subjects: *DEEP learning, *POINT cloud, *ARTIFICIAL neural networks, *COMPUTER vision, *AUTONOMOUS vehicles
Abstract: 3D semantic segmentation is a key element for a variety of applications in robotics and autonomous vehicles. For such applications, 3D data are usually acquired by LiDAR sensors resulting in a point cloud, which is a set of points characterized by its unstructured form and inherent sparsity. For the task of 3D semantic segmentation where the corresponding point clouds should be labeled with semantics, the current tendency is the use of deep learning neural network architectures for effective representation learning. On the other hand, various 2D and 3D computer vision tasks have used attention mechanisms which result in an effective re-weighting of the already learned features. In this work, we aim to investigate the role of attention mechanisms for the task of 3D semantic segmentation for autonomous driving, by identifying the significance of different attention mechanisms when adopted in existing deep learning networks. Our study is further supported by extensive experimentation on two standard datasets for autonomous driving, namely Street3D and SemanticKITTI, that permit drawing conclusions at both a quantitative and qualitative level. Our experimental findings show that there is a clear advantage when attention mechanisms have been adopted, resulting in superior performance. In particular, we show that the adoption of a Point Transformer in a SPVCNN network results in an architecture which outperforms the state of the art on the Street3D dataset. • Improving DL models for 3D point cloud semantic segmentation via attention mechanisms. • Evaluation study of attention modules for 3D semantic segmentation in LiDAR data. • Development and implementation of attention enhanced networks. • Evaluation of attention mechanisms in two datasets for autonomous driving. • Pros and cons of attention mechanisms, given their performance in the two datasets. • Fruitful discussion aiming to identify key features to be either adopted or avoided. 
[ABSTRACT FROM AUTHOR]
Published: 2022
Full Text: View/download PDF

15. Combining dense elements with attention mechanisms for 3D radiotherapy dose prediction on head and neck cancers.

Author: Cros, Samuel, Bouttier, Hugo, Nguyen‐Tan, Phuc Felix, Vorontsov, Eugene, and Kadoury, Samuel
Subjects: HEAD & neck cancer, ARTIFICIAL neural networks, RADIOTHERAPY, MACHINE learning, CONVOLUTIONAL neural networks
Abstract: Purpose: External radiation therapy planning is a highly complex and tedious process as it involves treating large target volumes, prescribing several levels of doses, as well as avoiding irradiating critical structures such as organs at risk close to the tumor target. This requires highly trained dosimetrists and physicists to generate a personalized plan and adapt it as treatment evolves, thus affecting the overall tumor control and patient outcomes. Our aim is to achieve accurate dose predictions for head and neck (H&N) cancer patients on a challenging in‐house dataset that reflects realistic variability and to further compare and validate the method on a public dataset. Methods: We propose a three‐dimensional (3D) deep neural network that combines a hierarchically dense architecture with an attention U‐net (HDA U‐net). We investigate a domain knowledge objective, incorporating a weighted mean squared error (MSE) with a dose‐volume histogram (DVH) loss function. The proposed HDA U‐net using the MSE‐DVH loss function is compared with two state‐of‐the‐art U‐net variants on two radiotherapy datasets of H&N cases. These include reference dose plans, computed tomography (CT) information, organs at risk (OARs), and planning target volume (PTV) delineations. All models were evaluated using coverage, homogeneity, and conformity metrics as well as mean dose error and DVH curves. Results: Overall, the proposed architecture outperformed the comparative state‐of‐the‐art methods, reaching 0.95 (0.98) on D95 coverage, 1.06 (1.07) on the maximum dose value, 0.10 (0.08) on homogeneity, 0.53 (0.79) on conformity index, and attaining the lowest mean dose error on PTVs of 1.7% (1.4%) for the in‐house (public) dataset. The improvements are statistically significant (p < 0.05) for the homogeneity and maximum dose value compared with the closest baseline. All models offer a near real‐time prediction, measured between 0.43 and 0.88 s per volume. 
Conclusion: The proposed method achieved similar performance on both realistic in‐house data and public data compared to the attention U‐net with a DVH loss, and outperformed other methods such as HD U‐net and HDA U‐net with standard MSE losses. The use of the DVH objective for training showed consistent improvements to the baselines on most metrics, supporting its added benefit in H&N cancer cases. The quick prediction time of the proposed method allows for real‐time applications, providing physicians a method to generate an objective end goal for the dosimetrist to use as reference for planning. This could considerably reduce the number of iterations between the two expert physicians thus reducing the overall treatment planning time. [ABSTRACT FROM AUTHOR]
Published: 2022
Full Text: View/download PDF

16. A cross-channel multi-scale gated fusion network for recognizing construction and demolition waste from high-resolution remote sensing images.

Author: Zhang, Chaoqun, Zhou, Lei, Du, Mingyi, Yang, Kun, and Luo, Ting
Subjects: *CONSTRUCTION & demolition debris, *DEEP learning, *REMOTE sensing, *IMAGE recognition (Computer vision), *ARTIFICIAL neural networks, *MACHINE learning, *ENVIRONMENTAL quality
Abstract: Timely and accurate surveying of construction and demolition waste (C&DW) distribution is of significance for managing C&DW and enhancing the quality of the urban environment, especially in a rapidly urbanizing country such as China. Automatic C&DW recognition using high-resolution remote sensing images is an extremely important surveying method. Existing C&DW recognition methods can only obtain a rough spatial distribution of C&DW. Mapping precise C&DW distribution on large-scale remote sensing images is still challenging. In this paper, to solve the problem of precise C&DW recognition, a novel deep learning algorithm based on high-resolution remote sensing images is proposed, which we refer to as the cross-channel multi-scale gated fusion network (CCMGNet). CCMGNet extracts depth features of C&DW from RGB and NIR bands by using two independent encoding streams. A gated fusion layer and a multi-scale attention module are designed to effectively fuse the features from two encoding streams. We manually labelled the sample set used for training and testing on 26 remote sensing images captured by the Gaofen-2 (GF-2) satellite. These images were taken in nine typical cities in China over the time period from 2019 to 2021. The results showed that the proposed method is effective in recognizing C&DW, realizing 89.42% and 84.52% in precision and IoU, and is superior to other state-of-the-art deep learning algorithms and existing C&DW recognition methods. The effectiveness of all significant components in CCMGNet was confirmed by ablation experiments. The proposed method was applied to extract C&DW regions in large-scale remote sensing images, with an example of a GF-2 image of Beijing. The high extraction efficiency and satisfactory visual effect demonstrate the potential of the proposed method for surveying the spatial distribution of C&DW. The novel method will play a crucial role in automatically surveying the spatial distribution and size of C&DW across the country. 
[ABSTRACT FROM AUTHOR]
Published: 2022
Full Text: View/download PDF

17. Multi-Label Fundus Image Classification Using Attention Mechanisms and Feature Fusion.

Author: Li, Zhenwei, Xu, Mengying, Yang, Xiaoli, and Han, Yanqi
Subjects: ARTIFICIAL neural networks, BINOCULAR vision, CLASSIFICATION algorithms, NOSOLOGY, NETWORK performance, VISION disorders
Abstract: Fundus diseases can cause irreversible vision loss in both eyes if not diagnosed and treated immediately. Due to the complexity of fundus diseases, the probability of fundus images containing two or more diseases is extremely high, while existing deep learning-based fundus image classification algorithms have low diagnostic accuracy in multi-labeled fundus images. In this paper, a multi-label classification of fundus disease with binocular fundus images is presented, using a neural network algorithm model based on attention mechanisms and feature fusion. The algorithm highlights detailed features in binocular fundus images, and then feeds them into a ResNet50 network with attention mechanisms to extract fundus image lesion features. The model obtains global features of binocular images through feature fusion and uses Softmax to classify multi-label fundus images. The ODIR binocular fundus image dataset was used to evaluate the network classification performance and conduct ablation experiments. The model's backend is the Tensorflow framework. Through experiments on the test images, this method achieved accuracy, precision, recall, and F1 values of 94.23%, 99.09%, 99.23%, and 99.16%, respectively. [ABSTRACT FROM AUTHOR]
Published: 2022
Full Text: View/download PDF

18. TUMbRAIN: A transformer with a unified mobile residual attention inverted network for diagnosing brain tumors from magnetic resonance scans.

Author: Montalbo, Francis Jesmar P.
Subjects: *ARTIFICIAL neural networks, *CONVOLUTIONAL neural networks, *TRANSFORMER models, *MACHINE learning, *ARTIFICIAL intelligence, *DEEP learning
Abstract: Diagnosing tumors in Magnetic Resonance Imaging (MRI) brain scans is challenging and can lead to errors, even for radiologists. Deep learning, mainly through deep convolutional neural networks, has assisted in automating the diagnosis of these scans. However, there is still room for improvement. Researchers have shown that transformer models hold promise but often remain underutilized due to their need for large amounts of data and complexity compared to traditional neural networks. This paper introduces a new hybrid model called TUMbRAIN (Transformer with a Unified Mobile Residual Attention Inverted Network), which combines a lightweight transformer with a deep convolutional neural network to address these issues. TUMbRAIN incorporates innovative components designed for this purpose, such as the expanded inverted residual block and the unified self-attention mechanism. The results demonstrate that TUMbRAIN outperforms many existing state-of-the-art neural network models, achieving an impressive overall accuracy of 97.94 % with only 1.04 million parameters. These results suggest that hybrid transformer models like TUMbRAIN could significantly advance the automated diagnosis of brain tumors from MRI scans. The study also offers new insights into effectively integrating transformers into traditional neural network architectures, resulting in a cost-effective and accurate deep learning solution for medical imaging. By incorporating these advanced components, TUMbRAIN enhances support for radiological practice through improved diagnostic accuracy and efficiency. [ABSTRACT FROM AUTHOR]
Published: 2025
Full Text: View/download PDF

19. AFLEMP: Attention-based Federated Learning for Emotion recognition using Multi-modal Physiological data.

Author: Gahlan, Neha and Sethia, Divyashikha
Subjects: FEDERATED learning, AFFECTIVE computing, EMOTION recognition, ARTIFICIAL neural networks, AFFECTIVE neuroscience, DATA privacy, TRANSFORMER models
Abstract: Automated emotion recognition systems utilizing physiological signals are essential for affective computing and intelligent interaction. Combining multiple physiological signals is more precise and effective in accurately assessing a person's emotional state. These automated emotion recognition systems using conventional machine learning techniques require complete access to the physiological data for emotion state classification, compromising sensitive data privacy. Federated Learning (FL) resolves this issue by preserving the user's privacy and sensitive physiological data while recognizing emotions. However, existing FL methods have limitations in handling data heterogeneity in the physiological data and do not measure communication efficiency and scalability. In response to these challenges, this paper proposes a novel framework called AFLEMP (Attention-based Federated Learning for Emotion recognition using Multi-modal Physiological data) integrating an attention mechanism-based Transformer with an Artificial Neural Network (ANN) model. The framework reduces two types of data heterogeneity: (1) Variation Heterogeneity (VH) in multi-modal EEG, GSR, and ECG physiological signal data using attention mechanisms and (2) Imbalanced Data Heterogeneity (IDH) in the FL environment using scaled weighted federated averaging. This paper validates the proposed AFLEMP framework on two publicly available emotion datasets, AMIGOS and DREAMER, achieving an average accuracy of 88.30% and 84.10%, respectively. The proposed AFLEMP framework proves robust, scalable, and efficient in communication. AFLEMP is the first FL framework proposed for emotion recognition using multi-modal physiological signals while reducing data heterogeneity and outperforming existing FL methods.
• AFLEMP for Automated Emotion Recognition using Multi-Modal Physiological Signals: EEG, ECG and GSR.
• Federated Learning for preserving sensitive physiological data during emotion recognition.
• Addressing variation heterogeneity (VH) and imbalanced data heterogeneity (IDH) in FL environment.
• Melds different Attention Mechanisms, Transformer with Artificial Neural Network (ANN).
[ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

20. Azimuth-Sensitive Object Detection of High-Resolution SAR Images in Complex Scenes by Using a Spatial Orientation Attention Enhancement Network.

Author: Ge, Ji, Wang, Chao, Zhang, Bo, Xu, Changgui, and Wen, Xiaoyang
Subjects: *OBJECT recognition (Computer vision), *SPATIAL orientation, *PYRAMIDS, *SYNTHETIC aperture radar, *FEATURE extraction, *ARTIFICIAL neural networks
Abstract: The scattering features of objects in synthetic aperture radar (SAR) imagery are highly sensitive to different azimuth angles, and detecting azimuth-sensitive objects in complex scenes becomes a challenging task. To address this issue, we propose a novel framework called the spatial orientation attention enhancement network (SOAEN) by using aircraft detection in complex scenes of SAR imagery as a case study. Taking YOLOX as the basic framework, this framework introduces the inverted pyramid ConvMixer network (IPCN), the spatial-orientation-enhanced path aggregation feature pyramid network (SOEPAFPN), and the anchor-free decoupled head (AFDH) to achieve performance improvement. A spatial orientation attention module is proposed and introduced into the path aggregation feature pyramid network to form a new structure, the SOEPAFPN, for capturing feature transformations in different directions, highlighting object features and suppressing background effects; the IPCN is adapted to replace the backbone network of YOLOX for enhancing the multiscale feature extraction capability and reducing the computational complexity, while the AFDH is used to decouple object localization and classification to improve the efficiency and accuracy of object localization and classification. The experimental results of the multiple real complex scenes on Gaofen-3 1 m images show that the proposed method achieves the highest detection accuracy, with an average detection rate of 91.22% compared with the YOLO series networks. [ABSTRACT FROM AUTHOR]
Published: 2022
Full Text: View/download PDF

21. 基于迁移学习的讽刺检测 [Sarcasm detection based on transfer learning].

Author: 李垒昂, 马鸿超, and 周清雷
Subjects: *ARTIFICIAL neural networks, *SUPERVISED learning, *SENTIMENT analysis, *TASK analysis, *TAGS (Metadata), *SARCASM, *KNOWLEDGE transfer
Abstract: Accurate sarcasm detection is crucial for sentiment analysis and other tasks. Traditional approaches rely heavily on discrete handcrafted features. Existing studies mostly formulate sarcasm detection as a standard supervised learning text categorization task, relying on explicit expressions for detecting context incongruity. But supervised learning requires a lot of data, and collecting and tagging these data is difficult. Because data for the target task are limited, sarcasm detection performance may be low. Therefore, this paper treats sarcasm detection as a transfer learning task, combining supervised learning on sarcasm-labeled text with knowledge transfer from external analytical resources. It improves the neural network model by transferring resource knowledge to raise detection performance on the target task. Experimental results on publicly available datasets show that the proposed sarcasm detection model based on transfer learning is superior to existing advanced sarcasm detection models. [ABSTRACT FROM AUTHOR]
Published: 2021
Full Text: View/download PDF

22. A novel fine-grained rumor detection algorithm with attention mechanism.

Author: Zhang, Ke, Cao, Jianjun, and Pi, Dechang
Subjects: *ARTIFICIAL neural networks, *SOCIAL media, *RUMOR, *USER-generated content
Abstract: Rumors circulating on social media platforms have consistently represented a substantial threat to societal security and stability. Both academia and the industry have dedicated heightened focus to addressing the issue of rumor detection. Recent research has made significant progress in using deep neural networks to model the textual content and propagation structure of rumors. However, these methods model rumor-related features at a coarse-grained level and do not take full advantage of the various contextual information associated with rumors. In this paper, we propose a new model called Hybrid Rumor Detection Model with Co-Attention Mechanisms (CoAHRD), which utilizes the original tweet content, social context information, and user information for rumor detection. First, we use the Fine-Grained Feature Learning (FGFL) algorithm to extract fine-grained features from the tweets. Based on this, we use the Graph Convolutional Network (GCN) to learn the multi-relation graph propagation structure features of rumors. Next, we combine FGFL features with temporal encoding information to model the temporal structure of rumors. Then, we introduce a User Feature-Based Co-Attention Network (UFCoAN) to learn the tendency of different users to spread rumors. Finally, we fuse the above features through a fully connected layer and perform rumor detection. Extensive experiments on two publicly available datasets, PHEME and Twitter15, show that our method outperforms the current mainstream methods. In particular, in terms of accuracy, our model improves by 0.9% and 1.6% over the best baseline method on the Twitter15 and PHEME datasets, respectively.
• BERTweet and saliency learning for fine-grained modelling of rumour text.
• Using attention mechanisms to learn the temporal structure of rumours.
• As a complementary task to rumour detection, segmentation of users into different groups using a user attention network.
[ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

23. Decoding Visual Motions from EEG Using Attention-Based RNN.

Author: Yang, Dongxu, Liu, Yadong, Zhou, Zongtan, Yu, Yang, and Liang, Xinbin
Subjects: ELECTROENCEPHALOGRAPHY, BRAIN-computer interfaces, CONVOLUTIONAL neural networks, RECURRENT neural networks, VISUAL perception, ARTIFICIAL neural networks
Abstract: The main objective of this paper is to use deep neural networks to decode the electroencephalography (EEG) signals evoked when individuals perceive four types of motion stimuli (contraction, expansion, rotation, and translation). Methods for single-trial and multi-trial EEG classification are both investigated in this study. Attention mechanisms and a variant of recurrent neural networks (RNNs) are incorporated as the decoding model. Attention mechanisms emphasize task-related responses and reduce redundant information of EEG, whereas RNN learns feature representations for classification from the processed EEG data. To promote generalization of the decoding model, a novel online data augmentation method that randomly averages EEG sequences to generate artificial signals is proposed for single-trial EEG. For our dataset, the data augmentation method improves the accuracy of our model (based on RNN) and two benchmark models (based on convolutional neural networks) by 5.60%, 3.92%, and 3.02%, respectively. The attention-based RNN reaches mean accuracies of 67.18% for single-trial EEG decoding with data augmentation. When performing multi-trial EEG classification, the amount of training data decreases linearly after averaging, which may result in poor generalization. To address this deficiency, we devised three schemes to randomly combine data for network training. Accordingly, the results indicate that the proposed strategies effectively prevent overfitting and improve the correct classification rate compared with averaging EEG fixedly (by up to 19.20%). The highest accuracy of the three strategies for multi-trial EEG classification achieves 82.92%. The decoding performance for the methods proposed in this work indicates they have application potential in the brain–computer interface (BCI) system based on visual motion perception. [ABSTRACT FROM AUTHOR]
Published: 2020
Full Text: View/download PDF

24. Deep Attentive Features for Prostate Segmentation in 3D Transrectal Ultrasound.

Author: Wang, Yi, Ni, Dong, Dou, Haoran, Hu, Xiaowei, Zhu, Lei, Yang, Xin, Xu, Ming, Qin, Jing, Heng, Pheng-Ann, and Wang, Tianfu
Subjects: *ARTIFICIAL neural networks, *PROSTATE, *ENDORECTAL ultrasonography, *IMAGE segmentation, *EXOCRINE glands
Abstract: Automatic prostate segmentation in transrectal ultrasound (TRUS) images is of essential importance for image-guided prostate interventions and treatment planning. However, developing such automatic solutions remains very challenging due to the missing/ambiguous boundary and inhomogeneous intensity distribution of the prostate in TRUS, as well as the large variability in prostate shapes. This paper develops a novel 3D deep neural network equipped with attention modules for better prostate segmentation in TRUS by fully exploiting the complementary information encoded in different layers of the convolutional neural network (CNN). Our attention module utilizes the attention mechanism to selectively leverage the multi-level features integrated from different layers to refine the features at each individual layer, suppressing the non-prostate noise at shallow layers of the CNN and increasing more prostate details into features at deep layers. Experimental results on challenging 3D TRUS volumes show that our method attains satisfactory segmentation performance. The proposed attention mechanism is a general strategy to aggregate multi-level deep features and has the potential to be used for other medical image segmentation tasks. The code is publicly available at https://github.com/wulalago/DAF3D. [ABSTRACT FROM AUTHOR]
Published: 2019
Full Text: View/download PDF

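25. Contribuciones a la comprensión lectora: mecanismos de atención y alineamiento entre n-gramas para similitud e inferencia interpretable [Contributions to reading comprehension: attention mechanisms and n-gram alignment for interpretable similarity and inference].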

Author: Lopez-Gazpio, Iñigo
Subjects: PROGRAMMING languages, COMPUTER systems, ARTIFICIAL neural networks, RESEMBLANCE (Philosophy)
Abstract: Copyright of Procesamiento del Lenguaje Natural is the property of Sociedad Espanola para el Procesamiento del Lenguaje Natural and its content may not be copied or emailed to multiple sites or posted to a listserv without the copyright holder's express written permission. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)
Published: 2019
Full Text: View/download PDF

26. Attention‐guided evolutionary attack with elastic‐net regularization on face recognition.

Author: Hu, Cong, Li, Yuanbo, Feng, Zhenhua, and Wu, Xiaojun
Subjects: *FACE perception, *ARTIFICIAL neural networks, *HUMAN facial recognition software, *COVARIANCE matrices, *CONVOLUTIONAL neural networks
Abstract:
• Propose a novel decision-based black-box adversarial attack method.
• Employ the attention mechanism to improve evolutionary attack.
• Generate more indistinguishable perturbations with limited queries.
In recent years, face recognition has achieved promising results along with the development of advanced Deep Neural Networks (DNNs). The existing face recognition systems are vulnerable to adversarial examples, which brings potential security risks. Evolutionary Attack (EA) has been successfully used to fool face recognition by inducing a minimum perturbation to a face image with few queries. However, EA employs the global information of face images but ignores their local characteristics. In addition, restricting the ℓ2-norm of adversarial perturbations hinders the diversity of adversarial perturbations. To solve the above problems, we propose Attention-guided Evolutionary Attack with Elastic-Net Regularization (ERAEA) for attacking face recognition. ERAEA extracts local facial characteristics by attention mechanism, effectively improving the attack effect and image perception quality. In particular, ERAEA adopts an attention mechanism to guide evolutionary direction, which operates on the covariance matrix as it contains crucial information about the evolutionary path. Furthermore, we design an adaptive elastic-net regularization to diversify the adversarial perturbation, accelerating the optimization performance. Extensive experiments obtained on three benchmarks demonstrate that our proposed method achieves better perturbation norm than the state-of-the-art methods with limited queries on face recognition and generates adversarial face images with higher perceptual quality. Besides, ERAEA requires fewer queries to achieve a fixed adversarial perturbation norm. [ABSTRACT FROM AUTHOR]
Published: 2023
Full Text: View/download PDF

27. Age and gender recognition in the wild with deep attention.

Author: Rodríguez, Pau, Cucurull, Guillem, Gonfaus, Josep M., Roca, F. Xavier, and Gonzàlez, Jordi
Subjects: *HUMAN facial recognition software, *DEFORMATIONS (Mechanics), *ARTIFICIAL neural networks, *IMAGE recognition (Computer vision), *FEATURE extraction
Abstract: Face analysis in images in the wild still poses a challenge for automatic age and gender recognition tasks, mainly due to their high variability in resolution, deformation, and occlusion. Although performance has greatly increased thanks to Convolutional Neural Networks (CNNs), it is still far from optimal when compared to other image recognition tasks, mainly because of the high sensitivity of CNNs to facial variations. In this paper, inspired by biology and the recent success of attention mechanisms on visual question answering and fine-grained recognition, we propose a novel feedforward attention mechanism that is able to discover the most informative and reliable parts of a given face for improving age and gender classification. In particular, given a downsampled facial image, the proposed model is trained based on a novel end-to-end learning framework to extract the most discriminative patches from the original high-resolution image. Experimental validation on the standard Adience, Images of Groups, and MORPH II benchmarks shows that including attention mechanisms enhances the performance of CNNs in terms of robustness and accuracy. [ABSTRACT FROM AUTHOR]
Published: 2017
Full Text: View/download PDF

28. HROM: Learning High-Resolution Representation and Object-Aware Masks for Visual Object Tracking.

Author: Zhang, Dawei, Zheng, Zhonglong, Wang, Tianxiang, and He, Yiran
Subjects: *OBJECT tracking (Computer vision), *ARTIFICIAL neural networks, *MASKS, *MATHEMATICAL convolutions, *INFORMATION sharing
Abstract: Siamese network-based trackers treat tracking as feature cross-correlation between the target template and the search region. Therefore, feature representation plays an important role in constructing a high-performance tracker. However, all existing Siamese networks extract the deep but low-resolution features of the entire patch, which is not robust enough to estimate the target bounding box accurately. In this work, to address this issue, we propose a novel high-resolution Siamese network, which connects the high-to-low resolution convolution streams in parallel and repeatedly exchanges information across resolutions to maintain high-resolution representations. The resulting representation is semantically richer and spatially more precise by a simple yet effective multi-scale feature fusion strategy. Moreover, we exploit attention mechanisms to learn object-aware masks for adaptive feature refinement, and use deformable convolution to handle complex geometric transformations. This makes the target more discriminative against distractors and background. Without bells and whistles, extensive experiments on popular tracking benchmarks including OTB100, UAV123, VOT2018 and LaSOT demonstrate that the proposed tracker achieves state-of-the-art performance and runs in real time, confirming its efficiency and effectiveness. [ABSTRACT FROM AUTHOR]
Published: 2020
Full Text: View/download PDF
