1,583 results on '"self-attention mechanism"'
Search Results
2. Multi-view knowledge graph convolutional networks for recommendation
- Author
-
Wang, Xiaofeng, Zhang, Zengjie, Shen, Guodong, Lai, Shuaiming, Chen, Yuntao, and Zhu, Shuailei
- Published
- 2025
- Full Text
- View/download PDF
3. Mineral prospectivity prediction based on the dynamic relation model Atten-GCN: A case study of gold prospecting in the Yingfengjie area, Shaanxi province (northern China)
- Author
-
Rui, Wang, Linfu, Xue, Yongsheng, Li, Jianbang, Wang, Qun, Yan, and Xiangjin, Ran
- Published
- 2025
- Full Text
- View/download PDF
4. Adversarial relationship graph learning soft sensor via negative information exclusion
- Author
-
Jia, Mingwei, Yang, Chao, Pan, Zhouxin, Liu, Qiang, and Liu, Yi
- Published
- 2025
- Full Text
- View/download PDF
5. The short-term wind power prediction based on a multi-layer stacked model of BO-CNN-BiGRU-SA
- Author
-
Chen, Wen, Huang, Hongquan, Ma, Xingke, Xu, Xinhang, Guan, Yi, Wei, Guorui, Xiong, Lin, Zhong, Chenglin, Chen, Dejie, and Wu, Zhonglin
- Published
- 2025
- Full Text
- View/download PDF
6. Prediction of wind power ramp events via a self-attention based deep learning approach
- Author
-
Li, Jie, Meng, Fanxi, Zhang, Zichen, and Zhang, Yipu
- Published
- 2024
- Full Text
- View/download PDF
7. Wfold: A new method for predicting RNA secondary structure with deep learning
- Author
-
Yuan, Yongna, Yang, Enjie, and Zhang, Ruisheng
- Published
- 2024
- Full Text
- View/download PDF
8. Spatial and channel enhanced self-attention network for efficient single image super-resolution
- Author
-
Song, Xiaogang, Tan, Yuping, Pang, Xinchao, Zhang, Lei, Lu, Xiaofeng, and Hei, Xinhong
- Published
- 2025
- Full Text
- View/download PDF
9. A hybrid CNN-Transformer model for identification of wheat varieties and growth stages using high-throughput phenotyping
- Author
-
Jeon, Yu-Jin, Hong, Min Jeong, Ko, Chan Seop, Park, So Jin, Lee, Hyein, Lee, Won-Gyeong, and Jung, Dae-Hyun
- Published
- 2025
- Full Text
- View/download PDF
10. A multi-view GNN-based network representation learning framework for recommendation systems
- Author
-
Amara, Amina, Hadj Taieb, Mohamed Ali, and Ben Aouicha, Mohamed
- Published
- 2025
- Full Text
- View/download PDF
11. A convolutional attention network for multi-task classification of stamp ink based on visible and near-infrared spectral information
- Author
-
Xie, Zujie, Yu, Ziru, Duan, Xingyu, Han, Xingzhou, Qin, Da, Cui, Wei, and Yu, Xiangyang
- Published
- 2025
- Full Text
- View/download PDF
12. A novel multi-sensor local and global feature fusion architecture based on multi-sensor sparse Transformer for intelligent fault diagnosis
- Author
-
Yang, Zhenkun, Li, Gang, Xue, Gui, He, Bin, Song, Yue, and Li, Xin
- Published
- 2025
- Full Text
- View/download PDF
13. Optimal cooperative scheduling strategy of energy storage and electric vehicle based on residential building integrated photovoltaic
- Author
-
Yan, Xiuying and He, Xuxin
- Published
- 2024
- Full Text
- View/download PDF
14. Urban rail transit passenger flow prediction with ResCNN-GRU based on self-attention mechanism
- Author
-
Ma, Changxi, Zhang, Bowen, Li, Shukai, and Lu, Youpeng
- Published
- 2024
- Full Text
- View/download PDF
15. Investigation of Gated-CNN and Self-Attention Mechanism for Historical Handwritten Text Recognition
- Author
-
Li, Jizhang, Ahmed, Sarfraz, Nazmul Huda, Md, Goos, Gerhard, Series Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Huda, M. Nazmul, editor, Wang, Mingfeng, editor, and Kalganova, Tatiana, editor
- Published
- 2025
- Full Text
- View/download PDF
16. DeBiFormer: Vision Transformer with Deformable Agent Bi-level Routing Attention
- Author
-
BaoLong, NguyenHuu, Zhang, Chenyu, Shi, Yuzhi, Hirakawa, Tsubasa, Yamashita, Takayoshi, Matsui, Tohgoroh, and Fujiyoshi, Hironobu; Cho, Minsu, editor, Laptev, Ivan, editor, Tran, Du, editor, Yao, Angela, editor, and Zha, Hongbin, editor
- Published
- 2025
- Full Text
- View/download PDF
17. Enhanced 3D Dense U-Net with Two Independent Teachers for Infant Brain Image Segmentation
- Author
-
Khaled, Afifa, and Elazab, Ahmed; Antonacopoulos, Apostolos, editor, Chaudhuri, Subhasis, editor, Chellappa, Rama, editor, Liu, Cheng-Lin, editor, Bhattacharya, Saumik, editor, and Pal, Umapada, editor
- Published
- 2025
- Full Text
- View/download PDF
18. Neighborhood Difference-Enhanced Graph Neural Network Based on Hypergraph for Social Bot Detection
- Author
-
Shi, Shuhao, Li, Yan, Liu, Zihao, Chen, Chen, Chen, Jian, and Yan, Bin; Lin, Zhouchen, editor, Cheng, Ming-Ming, editor, He, Ran, editor, Ubul, Kurban, editor, Silamu, Wushouer, editor, Zha, Hongbin, editor, Zhou, Jie, editor, and Liu, Cheng-Lin, editor
- Published
- 2025
- Full Text
- View/download PDF
19. Trajectory Prediction of Unmanned Surface Vehicle Based on Improved Transformer
- Author
-
Cheng, Zhipeng, Yu, Jian, Chen, Junyu, Ren, Jihuan, Wu, Xiang, Rannenberg, Kai, Editor-in-Chief, Soares Barbosa, Luís, Editorial Board Member, Carette, Jacques, Editorial Board Member, Tatnall, Arthur, Editorial Board Member, Neuhold, Erich J., Editorial Board Member, Stiller, Burkhard, Editorial Board Member, Stettner, Lukasz, Editorial Board Member, Pries-Heje, Jan, Editorial Board Member, Kreps, David, Editorial Board Member, Rettberg, Achim, Editorial Board Member, Furnell, Steven, Editorial Board Member, Mercier-Laurent, Eunika, Editorial Board Member, Winckler, Marco, Editorial Board Member, Malaka, Rainer, Editorial Board Member, Shi, Zhongzhi, editor, Witbrock, Michael, editor, and Tian, Qi, editor
- Published
- 2025
- Full Text
- View/download PDF
20. Named Entity Recognition of Belt Conveyor Faults Based on ALBERT-BiLSTM-SAM-CRF
- Author
-
Zhu, Qi, Cao, Jingjing, Xu, Zhangyi, Ghosh, Ashish, Editorial Board Member, Zhou, Lizhu, Editorial Board Member, Zhang, Haijun, editor, Li, Xianxian, editor, Hao, Tianyong, editor, Meng, Weizhi, editor, Wu, Zhou, editor, and He, Qian, editor
- Published
- 2025
- Full Text
- View/download PDF
21. Echo lite voice fusion network: advancing underwater acoustic voiceprint recognition with lightweight neural architectures
- Author
-
Wu, Jiaqi, Guan, Donghai, and Yuan, Weiwei
- Abstract
Underwater acoustic voiceprint recognition, serving as a key technology in the field of biometric identification, presents a wide range of application prospects, especially in areas such as marine resource development, underwater communication, and underwater safety monitoring. Conventional acoustic voiceprint recognition methods exhibit limitations in underwater environments, prompting the need for a lightweight neural network approach to optimally address underwater acoustic voiceprint recognition tasks. This paper introduces a novel lightweight voicing recognition model, the Echo Lite Voice Fusion Network (ELVFN), which incorporates depthwise separable convolution and self-attention mechanism, and significantly improves voicing recognition performance by optimizing acoustic feature extraction technology and hierarchical feature fusion strategy. Concurrently, the computational complexity and parameter quantity of the model are substantially reduced. Comparative analyses with existing acoustic voiceprint recognition models corroborate the superior performance of our model across multiple underwater acoustic datasets. Experimental results demonstrate that ELVFN outperforms in various evaluation metrics, notably in terms of processing efficiency and recognition accuracy. Finally, we discuss the application potential and future development directions of the model, providing an efficient solution for underwater acoustic voiceprint recognition in resource-constrained environments. [ABSTRACT FROM AUTHOR]
- Published
- 2025
- Full Text
- View/download PDF
22. A singular spectrum analysis-enhanced BiTCN-selfattention model for runoff prediction.
- Author
-
Wang, Wen-chuan, Ye, Feng-rui, Wang, Yi-yang, and Gu, Miao
- Abstract
To tackle the difficulties posed by the nonlinear and nonstationary characteristics of runoff sequences in hydrological prediction, this paper provides a novel forecasting method for runoff prediction by constructing an SSA-BiTCN-SelfAttention time series prediction model. The model consists of Singular Spectrum Analysis (SSA), a Bi-directional Temporal Convolutional Network (BiTCN), and a self-attention mechanism (SelfAttention). First, the runoff sequence is decomposed and reconstructed by SSA; the reconstructed sequence removes noise and reveals the periodicity, trend, and other information in the runoff data, which facilitates learning by the subsequent model. Next, the BiTCN model is trained bidirectionally on the new sequence and combined with the self-attention mechanism to fully explore dependencies within the long sequence and further improve model performance. To verify the effectiveness of the model, this paper uses multi-year measured runoff data from Jiayuguan Hydrological Station, Yingluxia Hydrological Station, and Manwan Hydrological Station for training and testing. It selects three evaluation metrics, RMSE, MAE, and R², and analyzes the performance of the SSA-BiTCN-SelfAttention model by comparing it with five models: LSTM, TCN, BiTCN, CNN-LSTM, and BiTCN-SelfAttention. The results show that the SSA-BiTCN-SelfAttention model has the smallest prediction error and the highest accuracy. Compared with the single TCN model, the model improves by about 58.36%, 46.43%, and 38.27% in RMSE, 63.89%, 57.89%, and 61.88% in MAE, and 10.9%, 3.7%, and 1.8% in R². The proposed singular spectrum analysis method can be used for trend and periodicity analysis of runoff data, providing an important basis for hydrological management. The prediction results of the proposed model are the closest to the true values, indicating its strong hydrological prediction ability. It not only provides a new method for runoff prediction but also provides important data references for the rational utilization and scientific planning of water resources. [ABSTRACT FROM AUTHOR]
- Published
- 2025
- Full Text
- View/download PDF
23. Enhancing Motor Imagery Classification with Residual Graph Convolutional Networks and Multi-Feature Fusion.
- Author
-
Xu, Fangzhou, Shi, Weiyou, Lv, Chengyan, Sun, Yuan, Guo, Shuai, Feng, Chao, Zhang, Yang, Jung, Tzyy-Ping, and Leng, Jiancai
- Subjects
- LARGE-scale brain networks, MOTOR imagery (Cognition), PEARSON correlation (Statistics), BRAIN damage, STROKE rehabilitation, ELECTROENCEPHALOGRAPHY, BRAIN waves
- Abstract
Stroke, an abrupt cerebrovascular ailment resulting in brain tissue damage, has prompted the adoption of motor imagery (MI)-based brain–computer interface (BCI) systems in stroke rehabilitation. However, analyzing electroencephalogram (EEG) signals from stroke patients poses challenges. To address the issues of low accuracy and efficiency in EEG classification, particularly involving MI, the study proposes a residual graph convolutional network (M-ResGCN) framework based on the modified S-transform (MST), and introduces the self-attention mechanism into the residual graph convolutional network (ResGCN). This study uses MST to extract EEG time-frequency domain features, derives spatial EEG features by calculating the absolute Pearson correlation coefficient (aPcc) between channels, and devises a method to construct the adjacency matrix of the brain network using aPcc to measure the strength of the connection between channels. Experimental results involving 16 stroke patients and 16 healthy subjects demonstrate significant improvements in classification quality and robustness across tests and subjects. The highest classification accuracy reached 94.91%, with a Kappa coefficient of 0.8918. The average accuracy and F1 score from 10 repetitions of 10-fold cross-validation are 94.38% and 94.36%, respectively. By validating the feasibility and applicability of brain networks constructed using the aPcc in EEG signal analysis and feature encoding, it was established that the aPcc effectively reflects overall brain activity. The proposed method presents a novel approach to exploring channel relationships in MI-EEG and improving classification performance. It holds promise for real-time applications in MI-based BCI systems. [ABSTRACT FROM AUTHOR]
- Published
- 2025
- Full Text
- View/download PDF
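Note on entry 23: the abstract describes building the brain-network adjacency matrix from the absolute Pearson correlation coefficient (aPcc) between EEG channels. A minimal sketch of that idea follows, using only the Python standard library. The function names, the toy three-channel signal, and the choice to keep the diagonal are illustrative assumptions and are not taken from the paper; the authors may threshold or normalize the matrix differently.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (assumes neither is constant)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)


def apcc_adjacency(channels):
    """Hypothetical helper: adjacency[i][j] = |Pearson(channel i, channel j)|."""
    n = len(channels)
    return [
        [abs(pearson(channels[i], channels[j])) for j in range(n)]
        for i in range(n)
    ]


# Toy data (made up): three "channels" with four samples each.
eeg = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 4.0, 6.0, 8.0],  # perfectly correlated with channel 0
    [4.0, 3.0, 2.0, 1.0],  # perfectly anti-correlated with channel 0
]
adjacency = apcc_adjacency(eeg)
```

Taking the absolute value makes strongly anti-correlated channels count as strongly connected, which is why channels 0 and 2 in the toy data receive the same edge weight as channels 0 and 1.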
24. Optimization of a vehicle re-identification model based on Vision Transformer.
- Author
-
张 震, 张亚斌, and 田鸿朋
- Abstract
Copyright of Journal of Zhengzhou University (Natural Science Edition) is the property of Journal of Zhengzhou University (Natural Science Edition) Editorial Office and its content may not be copied or emailed to multiple sites or posted to a listserv without the copyright holder's express written permission. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)
- Published
- 2025
- Full Text
- View/download PDF
25. Split-and-recombine and vision transformer based 3D human pose estimation.
- Author
-
Lu, Xinyi, Xu, Fan, Hu, Shuiyi, Yu, Tianqi, and Hu, Jianling
- Abstract
Regression of 3D human pose from monocular images faces many challenges, especially for rare poses and occlusions. To solve these problems, we propose SR-ViT, a novel approach based on Split-and-Recombine and Visual Transformer for 3D human pose estimation. Our method first feeds the 2D joint coordinates of multi-frame images into the 3D feature extractor to obtain the 3D features of each frame. After feature fusion with the position embedding information, the global correlation between all frames is modeled by the Transformer encoder, and the final 3D pose output is obtained with a regression head, which achieves the estimation of the 3D pose of the center frame from consecutive multi-frame images and effectively solves the joint occlusion problem. By improving the structure of the 3D feature extractor and the design of the loss function, the prediction performance of rare poses is improved. The model performance is also enhanced by improving the self-attention mechanism in both global and local aspects. Our method has been evaluated on two benchmark datasets, namely, Human3.6M and MPI-INF-3DHP. Experimental results show that our method outperforms the benchmark methods on both datasets. [ABSTRACT FROM AUTHOR]
- Published
- 2025
- Full Text
- View/download PDF
26. A multimodal travel route recommendation system leveraging visual Transformers and self-attention mechanisms.
- Author
-
Juan, Zhang, Zhang, Jing, and Gao, Ming
- Subjects
- LONG short-term memory, DEEP learning, IMAGE fusion, TRANSFORMER models, TOURISM
- Abstract
Introduction: With the rapid development of the tourism industry, the demand for accurate and personalized travel route recommendations has significantly increased. However, traditional methods often fail to effectively integrate visual and sequential information, leading to recommendations that are both less accurate and less personalized. Methods: This paper introduces SelfAM-Vtrans, a novel algorithm that leverages multimodal data—combining visual Transformers, LSTMs, and self-attention mechanisms—to enhance the accuracy and personalization of travel route recommendations. SelfAM-Vtrans integrates visual and sequential information by employing a visual Transformer to extract features from travel images, thereby capturing spatial relationships within them. Concurrently, a Long Short-Term Memory (LSTM) network encodes sequential data to capture the temporal dependencies within travel sequences. To effectively merge these two modalities, a self-attention mechanism fuses the visual features and sequential encodings, thoroughly accounting for their interdependencies. Based on this fused representation, a classification or regression model is trained using real travel datasets to recommend optimal travel routes. Results and discussion: The algorithm was rigorously evaluated through experiments conducted on real-world travel datasets, and its performance was benchmarked against other route recommendation methods. The results demonstrate that SelfAM-Vtrans significantly outperforms traditional approaches in terms of both recommendation accuracy and personalization. By comprehensively incorporating both visual and sequential data, this method offers travelers more tailored and precise route suggestions, thereby enriching the overall travel experience. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
27. Evaluating the effectiveness of self-attention mechanism in tuberculosis time series forecasting.
- Author
-
Lv, Zhihong, Sun, Rui, Liu, Xin, Wang, Shuo, Guo, Xiaowei, Lv, Yuan, Yao, Min, and Zhou, Junhua
- Subjects
- LONG short-term memory, STANDARD deviations, BOX-Jenkins forecasting, MOVING average process, COGNITIVE psychology
- Abstract
Background: With the increasing impact of tuberculosis on public health, accurately predicting future tuberculosis cases is crucial for optimizing the allocation of health resources and medical services. This study applies a self-attention mechanism to predict the number of tuberculosis cases, aiming to evaluate its effectiveness in forecasting. Methods: Monthly tuberculosis case data from Changde City between 2010 and 2021 were used to construct a self-attention model, a long short-term memory (LSTM) model, and an autoregressive integrated moving average (ARIMA) model. The performance of these models was evaluated using three metrics: root mean square error (RMSE), mean absolute error (MAE), and mean absolute percentage error (MAPE). Results: The self-attention model outperformed the other models in terms of prediction accuracy. On the test set, the RMSE of the self-attention model was approximately 7.41% lower than that of the LSTM model, MAE was reduced by about 10.99%, and MAPE was reduced by approximately 9.87%. Compared to the ARIMA model, RMSE was reduced by about 28.86%, MAE by about 32.22%, and MAPE by approximately 29.89%. Conclusion: The self-attention model can effectively improve the prediction accuracy of tuberculosis cases, providing guidance for health departments in optimizing the allocation of health resources and medical services. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
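Note on entry 27: the abstract compares models by RMSE, MAE, and MAPE and reports each gain as a percentage reduction relative to a baseline. The sketch below shows how such figures are typically computed. The series and the two sets of predictions are made-up toy values, not the Changde City data, and the helper names are assumptions for illustration.

```python
import math


def rmse(actual, predicted):
    """Root mean square error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mape(actual, predicted):
    """Mean absolute percentage error, in percent (assumes no actual value is zero)."""
    return 100.0 * sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted)) / len(actual)


def relative_reduction(baseline_error, candidate_error):
    """Percentage by which the candidate's error is lower than the baseline's."""
    return 100.0 * (baseline_error - candidate_error) / baseline_error


# Toy monthly case counts and two hypothetical models' forecasts.
actual = [100.0, 200.0, 300.0, 400.0]
candidate = [110.0, 190.0, 310.0, 390.0]  # every forecast is off by 10
baseline = [120.0, 180.0, 320.0, 380.0]   # every forecast is off by 20

rmse_gain = relative_reduction(rmse(actual, baseline), rmse(actual, candidate))
mae_gain = relative_reduction(mae(actual, baseline), mae(actual, candidate))
mape_gain = relative_reduction(mape(actual, baseline), mape(actual, candidate))
```

With these toy numbers the candidate's RMSE is 10 against the baseline's 20, so `rmse_gain` is 50.0; a statement such as "RMSE was approximately 7.41% lower than that of the LSTM model" is the same ratio evaluated on the study's test set.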
28. Satellite Image Time-Series Classification with Inception-Enhanced Temporal Attention Encoder.
- Author
-
Zhang, Zheng, Zhang, Weixiong, Meng, Yu, Zhao, Zhitao, Tang, Ping, and Li, Hongyi
- Subjects
- IMAGE recognition (Computer vision), REMOTE-sensing images, TRANSFORMER models, TIME series analysis, CLASSIFICATION
- Abstract
In this study, we propose a one-branch IncepTAE network to extract local and global hybrid temporal attention simultaneously and congruously for fine-grained satellite image time series (SITS) classification. Transformer and the temporal self-attention mechanism have been the research focus of SITS classification in recent years. However, its effectiveness seems to diminish in the scenario of fine-grained classification among similar categories, for example, different crop types. Theoretically, most of the existing methods focus on only one type of temporal attention, either global attention or local attention, but actually, both of them are required to achieve fine-grained classification. Even though some works adopt two-branch architecture to extract hybrid attention, they usually lack congruity between different types of temporal attention and hinder the expected discriminating ability. Compared with the existing methods, IncepTAE exhibits multiple methodological novelties. Firstly, we insert average/maximum pooling layers into the calculation of multi-head attention to extract hybrid temporal attention. Secondly, IncepTAE adopts one-branch architecture, which reinforces the interaction and congruity of different temporal information. Thirdly, the proposed IncepTAE is more lightweight due to the use of group convolutions. IncepTAE achieves 95.65% and 97.84% overall accuracy on two challenging datasets, TimeSen2Crop and Ghana. The comparative results with existing state-of-the-art methods demonstrate that IncepTAE is able to achieve superior classification performance and faster inference speed, which is conducive to the large-area application of SITS classification. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
29. A Dual-Path Computational Ghost Imaging Method Based on Convolutional Neural Networks.
- Author
-
Wang, Hexiao, Wu, Jianan, Wang, Mingcong, and Xia, Yu
- Abstract
Ghost imaging is a technique for indirectly reconstructing images by utilizing the second-order or higher-order correlation properties of the light field, which exhibits a robust ability to resist interference. On the premise of ensuring the quality of the image, effectively broadening the imaging range can improve the practicality of the technology. In this paper, a dual-path computational ghost imaging method based on convolutional neural networks is proposed. By using the dual-path detection structure, a wider range of target image information can be obtained, and the imaging range can be expanded. In this paper, for the first time, we try to use the two-channel probe as the input of the convolutional neural network and successfully reconstruct the target image. In addition, the network model incorporates a self-attention mechanism, which can dynamically adjust the network focus and further improve the reconstruction efficiency. Simulation results show that the method is effective. The method in this paper can effectively broaden the imaging range and provide a new idea for the practical application of ghost imaging technology. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
30. Computer vision-driven forest wildfire and smoke recognition via IoT drone cameras.
- Author
-
Wang, Yupeng, Wang, Yongli, Xu, Can, Wang, Xiaoli, and Zhang, Yong
- Subjects
- TRANSFORMER models, FOREST fires, COMPUTER vision, FOREST monitoring, INTERNET of things, WILDFIRE prevention
- Abstract
Forest wildfires often lead to significant casualties and economic losses, making early detection crucial for prevention and control. Internet of Things connected cameras mounted on drones provide wide monitoring coverage and flexibility, while computer vision technology enhances the accuracy and response time of forest wildfire monitoring. However, the small-scale nature of early wildfire targets and the complexity of the forest environment pose significant challenges to identifying fires accurately and promptly. To address challenges such as high false-positive rates and inefficiency in existing methods, we propose a Forest Wildfire and Smoke Recognition Network termed FWSRNet. Firstly, we adopt Vision Transformer, which has shown superior performance in recent traditional classification tasks, as the backbone network. Secondly, to enhance the extraction of subtle differential features, we introduce a self-attention mechanism to guide the network in selecting discriminative image patches and calculating their relationships. Next, we employ a contrastive feature learning strategy to eliminate redundant information, making the model more discriminative. Finally, we construct a target loss function for model prediction. Under various proportions of training and testing dataset allocations, the model exhibits recognition accuracies of 94.82%, 95.05%, 94.90%, and 94.80% for forest fires. The average accuracy of 94.89% surpasses five comparative models, demonstrating the potential of this method in IoT-enhanced aerial forest fire recognition. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
31. Mandarin Recognition Based on Self-Attention Mechanism with Deep Convolutional Neural Network (DCNN)-Gated Recurrent Unit (GRU).
- Author
-
Chen, Xun, Wang, Chengqi, Hu, Chao, and Wang, Qin
- Subjects
- ARTIFICIAL neural networks, CONVOLUTIONAL neural networks, SPEECH perception, RECURRENT neural networks, ARTIFICIAL intelligence, DEEP learning
- Abstract
Speech recognition technology is an important branch of artificial intelligence, aiming to transform human speech into computer-readable text. However, speech recognition still faces many challenges, such as noise interference and differences in accent and speech rate. This paper explores a deep learning-based speech recognition method to improve the accuracy and robustness of speech recognition. Firstly, this paper introduces the basic principles of speech recognition and existing mainstream technologies, and then focuses on the deep learning-based speech recognition method. Through comparative experiments, it is found that the self-attention mechanism performs best in speech recognition tasks. In order to further improve speech recognition performance, this paper proposes a deep learning model based on the self-attention mechanism with DCNN-GRU. The model attends dynamically to the input speech by introducing the self-attention mechanism into a neural network model in place of an RNN, together with a deep convolutional neural network, which improves the robustness and recognition accuracy of the model. The experiments use the 170 h Chinese dataset AISHELL-1. Compared with the deep convolutional neural network, the deep learning model based on the self-attention mechanism with DCNN-GRU achieves a reduction of at least 6% in CER. Compared with a bidirectional gated recurrent neural network, it achieves a reduction of 0.7% in CER. Finally, the factors affecting the CER are analyzed on a test set. The experimental results show that this model exhibits good performance in various noise environments and accent conditions. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
32. Improved genetic algorithm based on rule optimization strategy for fibre allocation.
- Author
-
Tan, Feng, Yuan, Zhipeng, Zhang, Yong, Tang, Sheng, Guo, Feng, and Zhang, Shuai
- Subjects
- GREEDY algorithms, MANUFACTURING industries, CHROMOSOMES, FIBERS, ALGORITHMS, GENETIC algorithms
- Abstract
In modern manufacturing, companies must improve production efficiency in order to adapt to changes in the general environment. To this end, this article introduces an improved genetic algorithm based on rule selection to tackle the nondeterministic polynomial hard problem stemming from inventory fibre resources and fibre selection principles in optical cable production. The algorithm aims to maximize inventory score and minimize fibre segmentation rate. It employs a permutation encoding approach to link the genetic algorithm with fibre allocation solutions and applies a self-attention mechanism to determine subset solution weight within each solution. To boost the recombination of favourable gene segments from different chromosomes, a rule optimization strategy is integrated into the crossover operation based on the weights. These operations enhance the algorithm's global search capability and convergence speed. A feasibility repair strategy is then used to inspect and rectify chromosomes, preventing the generation of illegal solutions. The legitimate mutation operation, founded on weight optimization rules, effectively reduces the algorithm's running time by avoiding illegal solutions. By leveraging actual production data from an optical cable manufacturer for simulation, the experimental results confirm the effectiveness of the improved genetic algorithm in addressing the fibre allocation problem. Comparative simulations with the unimproved genetic algorithm and a stepwise greedy algorithm underscore the superiority of the improved genetic algorithm in resolving the fibre allocation problem. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
33. Aero-Engine Fault Detection with an LSTM Auto-Encoder Combined with a Self-Attention Mechanism.
- Author
-
Du, Wenyou, Zhang, Jingyi, Meng, Guanglei, and Zhang, Haoran
- Subjects
- LONG short-term memory, SUPPORT vector machines, VALUE engineering, COMPARATIVE studies
- Abstract
The safe operation of aero-engines is crucial for ensuring flight safety, and effective fault detection methods are fundamental to achieving this objective. In this paper, we propose a novel approach that integrates an auto-encoder with long short-term memory (LSTM) networks and a self-attention mechanism for the anomaly detection of aero-engine time-series data. The dataset utilized in this study was simulated from real data and injected with fault information. A fault detection model is developed utilizing normal data samples for training and faulty data samples for testing. The LSTM auto-encoder processes the time-series data through an encoder–decoder architecture, extracting latent representations and reconstructing the original inputs. Furthermore, the self-attention mechanism captures long-range dependencies and significant features within the sequences, thereby enhancing the detection accuracy of the model. Comparative analyses with the traditional LSTM auto-encoder, as well as one-class support vector machines (OC-SVM) and isolation forests (IF), reveal that the experimental results substantiate the feasibility and effectiveness of the proposed method, highlighting its potential value in engineering applications. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
34. A state recognition method for transmission-line ice-melting disconnect switches based on improved YOLOv7.
- Author
-
高绪杰, 李泽滔, 曾华荣, 杨旗, and 张露松
- Abstract
Copyright of Journal of Computer Engineering & Applications is the property of Beijing Journal of Computer Engineering & Applications Journal Co Ltd. (Copyright applies to all Abstracts.)
- Published
- 2024
- Full Text
- View/download PDF
35. Graph convolutional networks with the self-attention mechanism for adaptive influence maximization in social networks.
- Author
-
Tang, Jianxin, Song, Shihui, Du, Qian, Yao, Yabing, and Qu, Jitao
- Subjects
- SOCIAL networks, SOCIAL influence, BUDGET, RESEARCH personnel, SEEDS
- Abstract
The influence maximization problem, which has drawn a great deal of attention from researchers, aims to identify a subset of influential spreaders that can maximize the expected influence spread in social networks. Existing works on the problem primarily concentrate on developing non-adaptive policies, where all seeds are ignited at the very beginning of the diffusion after the identification. However, in non-adaptive policies, budget redundancy can occur as a result of some seeds being naturally infected by other active seeds during the diffusion process. In this paper, adaptive seeding policies are investigated for the intractable adaptive influence maximization problem. Based on a deep learning model, a novel approach named graph convolutional networks with self-attention mechanism (ATGCN) is proposed to address adaptive influence maximization as a regression task. A controlling parameter is introduced for the adaptive seeding model to make a tradeoff between the spreading delay and influence coverage. The proposed approach leverages the self-attention mechanism to dynamically and efficiently assign importance weights to node representations, capturing the node influence feature information relevant to the adaptive influence maximization problem. Finally, intensive experimental findings on six real-world social networks demonstrate the superiority of the adaptive seeding policy over the state-of-the-art baseline methods for the conventional influence maximization problem. Meanwhile, the proposed adaptive seeding policy ATGCN improves the influence spread rate by up to 7% in comparison to the existing state-of-the-art greedy-based adaptive seeding policy. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
36. Research on a Passenger Flow Prediction Model Based on BWO-TCLS-Self-Attention.
- Author
-
Liu, Sheng, Du, Lang, Cao, Ting, and Zhang, Tong
- Subjects
LONG short-term memory ,UNDERGROUND areas ,CIVIL defense ,PREDICTION models ,DATA modeling - Abstract
In recent years, with the rapid development of the global demand and scale for deep underground space utilization, deep space has gradually transitioned from single-purpose uses such as underground transportation, civil defense, and commerce to a comprehensive, livable, and disaster-resistant underground ecosystem. This shift has brought increasing attention to the safety of personnel flow in deep spaces. In addressing challenges in deep space passenger flow prediction, such as irregular flow patterns, surges in extreme conditions, large data dimensions, and redundant features complicating the model, this paper proposes a deep space passenger flow prediction model that integrates a Temporal Convolutional Network (TCN) and Long Short-Term Memory (LSTM) network. The model first employs a dual-layer LSTM network structure with a Dropout layer to capture complex temporal dynamics while preventing overfitting. Then, a Self-Attention mechanism and TCN network are introduced to reduce redundant feature data and enhance the model's performance and speed. Finally, the Beluga Whale Optimization (BWO) algorithm is used to optimize hyperparameters, further improving the prediction accuracy of the network. Experimental results demonstrate that the BWO-TCLS-Self-Attention model proposed in this paper achieves an R² value of 96.94%, with MAE and RMSE values of 118.464 and 218.118, respectively. Compared with some mainstream prediction models, the R² value has increased, while both MAE and RMSE values have decreased, indicating its ability to accurately predict passenger flow in deep underground spaces. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
37. Multiple instance learning method based on convolutional neural network and self-attention for early cancer detection.
- Author
-
Liu, Junjiang, Zhou, Shusen, Zang, Mujun, Liu, Chanjuan, Liu, Tong, and Wang, Qingjun
- Subjects
*CONVOLUTIONAL neural networks , *T cell receptors , *EARLY detection of cancer , *THYROID cancer , *MICA - Abstract
Early cancer detection using T-cell receptor sequencing (TCR-seq) and multiple instance learning methods has shown significant effectiveness. We introduce a multiple instance learning method based on convolutional neural networks and self-attention (MICA). First, MICA preprocesses TCR-seq using word vectors and then extracts features using convolutional neural networks. Second, MICA uses an enhanced self-attention mechanism to extract relational features of instances. Finally, MICA can extract the crucial TCR-seq. After cross-validation, MICA achieves an area under the curve (AUC) of 0.911 and 0.946 on the lung and thyroid cancer datasets, which are 7.1% and 2.1% higher than other methods, respectively. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
38. Autonomous Extraction Technology for Aquaculture Ponds in Complex Geological Environments Based on Multispectral Feature Fusion of Medium-Resolution Remote Sensing Imagery.
- Author
-
Liang, Zunxun, Wang, Fangxiong, Zhu, Jianfeng, Li, Peng, Xie, Fuding, and Zhao, Yifei
- Subjects
*COASTAL zone management , *ECONOMIC security , *REMOTE sensing , *ENVIRONMENTAL degradation , *DEEP learning - Abstract
Coastal aquaculture plays a crucial role in global food security and the economic development of coastal regions, but it also causes environmental degradation in coastal ecosystems. Therefore, the automation, accurate extraction, and monitoring of coastal aquaculture areas are crucial for the scientific management of coastal ecological zones. This study proposes a novel deep learning- and attention-based median adaptive fusion U-Net (MAFU-Net) procedure aimed at precisely extracting individually separable aquaculture ponds (ISAPs) from medium-resolution remote sensing imagery. Initially, this study analyzes the spectral differences between aquaculture ponds and interfering objects such as saltwater fields in four typical aquaculture areas along the coast of Liaoning Province, China. It innovatively introduces a difference index for saltwater field aquaculture zones (DIAS) and integrates this index as a new band into remote sensing imagery to increase the expressiveness of features. A median augmented adaptive fusion module (MEA-FM), which adaptively selects channel receptive fields at various scales, integrates the information between channels, and captures multiscale spatial information to achieve improved extraction accuracy, is subsequently designed. Experimental and comparative results reveal that the proposed MAFU-Net method achieves an F1 score of 90.67% and an intersection over union (IoU) of 83.93% on the CHN-LN4-ISAPS-9 dataset, outperforming advanced methods such as U-Net, DeepLabV3+, SegNet, PSPNet, SKNet, UPS-Net, and SegFormer. This study's results provide accurate data support for the scientific management of aquaculture areas, and the proposed MAFU-Net method provides an effective method for semantic segmentation tasks based on medium-resolution remote sensing images. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
39. Robotic Grasping Detection Algorithm Based on 3D Vision Dual-Stream Encoding Strategy.
- Author
-
Lei, Minglin, Wang, Pandong, Lei, Hua, Ma, Jieyun, Wu, Wei, and Hao, Yongtao
- Subjects
TRANSFORMER models ,COMPUTER vision ,LONG-distance relationships ,FEATURE extraction ,GEOMETRIC shapes ,PREHENSION (Physiology) ,POSE estimation (Computer vision) - Abstract
The automatic generation of stable robotic grasping postures is crucial for the application of computer vision algorithms in real-world settings. This task becomes especially challenging in complex environments, where accurately identifying the geometric shapes and spatial relationships between objects is essential. To enhance the capture of object pose information in 3D visual scenes, we propose a planar robotic grasping detection algorithm named SU-Grasp, which simultaneously focuses on local regions and long-distance relationships. Built upon a U-shaped network, SU-Grasp introduces a novel dual-stream encoding strategy using the Swin Transformer combined with spatial semantic enhancement. Compared to existing baseline methods, our algorithm achieves superior performance across public datasets, simulation tests, and real-world scenarios, highlighting its robust understanding of complex spatial environments. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
40. Smoke Recognition Method Based on CNN and Transformer Feature Fusion.
- Author
-
付 燕, 杨 旭, and 叶 鸥
- Abstract
Currently, many smoke recognition algorithms suffer from high false alarm rates, partly due to the fact that most existing convolutional neural networks (CNNs) mainly focus on local information in smoke images during feature extraction, neglecting the global features of smoke images. This bias towards local information processing can easily lead to misjudgments when dealing with variable and complex smoke images. To address this issue, it is necessary to capture the global features of smoke images more accurately, thereby improving the accuracy of smoke recognition algorithms. Therefore, this paper proposes a dual-branch smoke recognition method, TCF-Net, which combines the Inception and Transformer structures. This model is improved to enrich feature diversity while reducing channel redundancy. Additionally, the self-attention mechanism from Transformer is introduced, combining its ability to learn global context information with CNNs' capacity to learn local relative position information. During feature extraction, a feature coupling unit (FCU) is embedded to enable continuous interaction between the local features and global information in both branches, maximizing the retention of both local and global information and enhancing the performance of the algorithm. The proposed algorithm can classify video frames into three states: black smoke, white smoke, and no smoke. Experimental results show that the improved network can better extract smoke features, reducing the false alarm rate while increasing the accuracy to 97.8%, confirming the excellent performance of the algorithm. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
41. Distant Supervision Relation Extraction Based on Multilevel Attention Mechanism and Dynamic Thresholding.
- Author
-
赵红燕, 张莹刚, and 谢斌红
- Subjects
*LANGUAGE models , *NOISE control , *MULTILEVEL models , *DATA quality , *SUPERVISION - Abstract
Distant supervision relation extraction faces the problem of data quality, that is, the generated dataset has multiple types of noise, noisy words, noisy sentences and noisy bags. Existing research mainly focuses on the noisy sentences, ignoring the impact of other noise, and cannot completely eliminate the noise. To this end, the paper proposed a distant supervision relation extraction model based on multilevel attention mechanism and dynamic thresholding (MADT). The model firstly used a pre-trained language model to obtain entity-pair semantic representations, then obtained semantic features embedded with keyword information through a bidirectional gated recurrent unit and a self-attention mechanism, and then dealt with the three noise problems sequentially in conjunction with the deep contextual representation of the sentence. In addition, the paper proposed a dynamic thresholding method to further remove noisy sentences, enhance the contribution of positive example sentences to the bag representation, and reduce the impact of noisy bags using a semantic similarity-based attention mechanism. Experiments on the NYT10d and NYT10m datasets show that the MADT model is able to address all levels of noise in distant supervision of relation extraction and effectively improve relation extraction performance. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
42. One novel transfer learning-based CLIP model combined with self-attention mechanism for differentiating the tumor-stroma ratio in pancreatic ductal adenocarcinoma.
- Author
-
Liao, Hongfan, Yuan, Jiang, Liu, Chunhua, Zhang, Jiao, Yang, Yaying, Liang, Hongwei, Liu, Haotian, Chen, Shanxiong, and Li, Yongmei
- Abstract
Purpose: To develop a contrastive language-image pretraining (CLIP) model based on transfer learning and combined with self-attention mechanism to predict the tumor-stroma ratio (TSR) in pancreatic ductal adenocarcinoma on preoperative enhanced CT images, in order to understand the biological characteristics of tumors for risk stratification and guiding feature fusion during artificial intelligence-based model representation. Material and methods: This retrospective study collected a total of 207 PDAC patients from three hospitals. TSR assessments were performed on surgical specimens by pathologists and divided into high TSR and low TSR groups. This study developed one novel CLIP-adapter model that integrates the CLIP paradigm with a self-attention mechanism for better utilizing features from multi-phase imaging, thereby enhancing the accuracy and reliability of tumor-stroma ratio predictions. Additionally, clinical variables, traditional radiomics model and deep learning models (ResNet50, ResNet101, ViT_Base_32, ViT_Base_16) were constructed for comparison. Results: The models showed significant efficacy in predicting TSR in PDAC. The performance of the CLIP-adapter model based on multi-phase feature fusion was superior to that based on any single phase (arterial or venous phase). The CLIP-adapter model outperformed traditional radiomics models and deep learning models, with CLIP-adapter_ViT_Base_32 performing the best, achieving the highest AUC (0.978) and accuracy (0.921) in the test set. Kaplan–Meier survival analysis showed longer overall survival in patients with low TSR compared to those with high TSR. Conclusion: The CLIP-adapter model designed in this study provides a safe and accurate method for predicting the TSR in PDAC. The feature fusion module based on multi-modal (image and text) and multi-phase (arterial and venous phase) significantly improves model performance. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
43. A Review of Abnormal Crowd Behavior Recognition Technology Based on Computer Vision.
- Author
-
Zhao, Rongyong, Hua, Feng, Wei, Bingyu, Li, Cuiling, Ma, Yulong, Wong, Eric S. W., and Liu, Fengnian
- Subjects
COLLECTIVE behavior ,COMPUTER vision ,COMPUTER engineering ,PUBLIC spaces ,SOFTWARE development tools ,DEEP learning - Abstract
Abnormal crowd behavior recognition is one of the research hotspots in computer vision. Its goal is to use computer vision technology and abnormal behavior detection models to accurately perceive, predict, and intervene in potential abnormal behaviors of the crowd and monitor the status of the crowd system in public places in real time, to effectively prevent and deal with public security risks and ensure public life safety and social order. To this end, focusing on the abnormal crowd behavior recognition technology in the computer vision system, a systematic review study of its theory and cutting-edge technology is conducted. First, the crowd level and abnormal behaviors in public places are defined, and the challenges faced by abnormal crowd behavior recognition are expounded. Then, from the dimensions based on traditional methods and based on deep learning, the mainstream technologies of abnormal behavior recognition are discussed, and the design ideas, advantages, and limitations of various methods are analyzed. Next, the mainstream software tools are introduced to provide a comprehensive reference for the technical framework. After that, typical abnormal behavior datasets at home and abroad are sorted out, and the characteristics of these datasets are compared in detail from multiple perspectives such as scale, characteristics, and uses, and the performance indicators of different algorithms on the datasets are compared and analyzed. Finally, the full text is summarized and future development directions of abnormal crowd behavior recognition technology are outlined. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
44. Self-Attention Spatio-Temporal Deep Collaborative Network for Robust FDIA Detection in Smart Grids.
- Author
-
Zu, Tong and Li, Fengyong
- Subjects
INDEPENDENT system operators ,ELECTRIC power distribution grids ,DEEP learning ,TIME series analysis ,NOISE - Abstract
False data injection attacks (FDIAs) can affect the state estimation of the power grid by tampering with the measured values of power grid data, thereby disrupting the stable operation of the smart grid. Existing work usually trains a detection model by fusing the data-driven features from diverse power data streams. Data-driven features, however, cannot effectively capture the differences between noisy data and attack samples. As a result, slight noise disturbances in the power grid may cause a large number of false detections for FDIA attacks. To address this problem, this paper designs a deep collaborative self-attention network to achieve robust FDIA detection, in which the spatio-temporal features of cascaded FDIA attacks are fully integrated. Firstly, a high-order Chebyshev polynomials-based graph convolution module is designed to effectively aggregate the spatial information between grid nodes, and the spatial self-attention mechanism is applied to dynamically assign attention weights to each node, which guides the network to pay more attention to the node information that is conducive to FDIA detection. Furthermore, the bi-directional Long Short-Term Memory (LSTM) network is introduced to conduct time series modeling and long-term dependence analysis for power grid data, and the temporal self-attention mechanism is used to describe the time correlation of data and assign different weights to different time steps. Our designed deep collaborative network can effectively mine subtle perturbations from spatiotemporal feature information, efficiently distinguish power grid noise from FDIA attacks, and adapt to diverse attack intensities. Extensive experiments demonstrate that our method can obtain an efficient detection performance over actual load data from New York Independent System Operator (NYISO) in IEEE 14, IEEE 39, and IEEE 118 bus systems, and outperforms state-of-the-art FDIA detection schemes in terms of detection accuracy and robustness. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
45. Crop Classification of Remote Sensing Image Time Series Based on Multi-Scale Spatiotemporal Global Attention.
- Author
-
张, 伟雄, 唐, 娉, 孟, 瑜, 赵, 理君, 赵, 智韬, and 张, 正
- Subjects
TRANSFORMER models ,NATURAL language processing ,RECURRENT neural networks ,COMPUTER vision ,AGRICULTURAL resources ,DEEP learning - Abstract
Copyright of Journal of Remote Sensing is the property of Editorial Office of Journal of Remote Sensing & Science Publishing Co. and its content may not be copied or emailed to multiple sites or posted to a listserv without the copyright holder's express written permission. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)
- Published
- 2024
- Full Text
- View/download PDF
46. A Strip Steel Surface Defect Salient Object Detection Based on Channel, Spatial and Self-Attention Mechanisms.
- Author
-
Sun, Yange, Geng, Siyu, Guo, Huaping, Zheng, Chengyi, and Zhang, Li
- Subjects
STEEL strip ,OBJECT recognition (Computer vision) ,TRANSFORMER models ,FEATURE extraction ,SURFACE defects - Abstract
Strip steel is extensively utilized in industries such as automotive manufacturing and aerospace due to its superior machinability, economic benefits, and adaptability. However, defects on the surface of steel strips, such as inclusions, patches, and scratches, significantly affect the performance and service life of the product. Therefore, the salient object detection of surface defects on strip steel is crucial to ensure the quality of the final product. Many factors, such as the low contrast of surface defects on strip steel, the diversity of defect types, complex texture structures, and irregular defect distribution, hinder existing detection technologies from accurately identifying and segmenting defect areas against complex backgrounds. To address the above problems, we propose a novel detector called S3D-SOD for the salient object detection of strip steel surface defects. For the encoding stage, a residual self-attention block is proposed to explore semantic information cues of high-level features to locate and guide low-level feature information. In addition, we apply a general residual channel and spatial attention to low-level features, enabling the model to adaptively focus on the key channels and spatial areas of feature maps with high resolutions, thereby enhancing the encoder features and accelerating the convergence of the model. For the decoding stage, a simple residual decoder block with an upsampling operation is proposed to realize the integration and interaction of feature information between different layers. Here, the simple residual decoder block is used for feature integration due to the following observation: backbone networks like ResNet and the Swin Transformer, after being pretrained on the large dataset ImageNet and then fine-tuned on a smaller dataset for strip steel surface defects, are capable of extracting feature maps that contain both general image features and the specific characteristics required for the salient object detection of strip steel surface defects. The experimental results on the SD-saliency-900 dataset show that S3D-SOD is better than advanced methods, and it has strong generalization ability and robustness. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
47. HADNet: A Novel Lightweight Approach for Abnormal Sound Detection on Highway Based on 1D Convolutional Neural Network and Multi-Head Self-Attention Mechanism.
- Author
-
Liang, Cong, Chen, Qian, Li, Qiran, Wang, Qingnan, Zhao, Kang, Tu, Jihui, and Jafaripournimchahi, Ammar
- Subjects
CONVOLUTIONAL neural networks ,EXTREME weather ,FEATURE extraction ,TRANSFORMER models ,SAFETY ,TRAFFIC safety - Abstract
Video surveillance is an effective tool for traffic management and safety, but it may face challenges in extreme weather, low visibility, areas outside the monitoring field of view, or during nighttime conditions. Therefore, abnormal sound detection is used in traffic management and safety as an auxiliary tool to complement video surveillance. In this paper, a novel lightweight method for abnormal sound detection on embedded systems, named HADNet, is proposed based on a 1D CNN and a Multi-Head Self-Attention Mechanism. First, the 1D CNN is employed for local feature extraction, which minimizes information loss from the audio signal during time-frequency conversion and reduces computational complexity. Second, the proposed block based on the Multi-Head Self-Attention Mechanism not only effectively mitigates the issue of vanishing gradients, but also enhances detection accuracy. Finally, the joint loss function is employed to detect abnormal audio. This choice helps address issues related to unbalanced training data and class overlap, thereby improving model performance on imbalanced datasets. The proposed HADNet method was evaluated on the MIVIA Road Events and UrbanSound8K datasets. The results demonstrate that the proposed method for abnormal audio detection on embedded systems achieves a high accuracy of 99.6% and an efficient detection time of 0.06 s. This approach proves to be robust and suitable for practical applications in traffic management and safety. By addressing the challenges posed by traditional video surveillance methods, HADNet offers a valuable and complementary solution for enhancing safety measures in diverse traffic conditions. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
48. Evaluating the effectiveness of self-attention mechanism in tuberculosis time series forecasting
- Author
-
Zhihong Lv, Rui Sun, Xin Liu, Shuo Wang, Xiaowei Guo, Yuan Lv, Min Yao, and Junhua Zhou
- Subjects
Tuberculosis ,Time series forecasting ,Self-attention mechanism ,ARIMA model ,LSTM model ,Infectious and parasitic diseases ,RC109-216 - Abstract
Background: With the increasing impact of tuberculosis on public health, accurately predicting future tuberculosis cases is crucial for optimizing the allocation of health resources and medical services. This study applies a self-attention mechanism to predict the number of tuberculosis cases, aiming to evaluate its effectiveness in forecasting. Methods: Monthly tuberculosis case data from Changde City between 2010 and 2021 were used to construct a self-attention model, a long short-term memory (LSTM) model, and an autoregressive integrated moving average (ARIMA) model. The performance of these models was evaluated using three metrics: root mean square error (RMSE), mean absolute error (MAE), and mean absolute percentage error (MAPE). Results: The self-attention model outperformed the other models in terms of prediction accuracy. On the test set, the RMSE of the self-attention model was approximately 7.41% lower than that of the LSTM model, MAE was reduced by about 10.99%, and MAPE was reduced by approximately 9.87%. Compared to the ARIMA model, RMSE was reduced by about 28.86%, MAE by about 32.22%, and MAPE by approximately 29.89%. Conclusion: The self-attention model can effectively improve the prediction accuracy of tuberculosis cases, providing guidance for health departments in optimizing the allocation of health resources and medical services.
- Published
- 2024
- Full Text
- View/download PDF
49. Utilization of deep learning in ideological and political education
- Author
-
Cai Sulong
- Subjects
ideological and political education ,deep learning ,teaching methods ,lstm model ,self-attention mechanism ,educational effectiveness ,Science ,Electronic computers. Computer science ,QA75.5-76.95 - Abstract
As society develops and educational needs continue to change, the traditional way of teaching ideology and politics is facing challenges in terms of efficiency and effectiveness evaluation. In response to the low efficiency of ideological and political education (IPE) methods and the difficulty in accurately and comprehensively evaluating students’ ideological and political literacy and moral qualities, this article used the Long Short-Term Memory with Self-Attention Mechanism (LSTM-SAM) model to conduct experiments on the evaluation of IPE effectiveness. First, information on IPE was collected from a research center of a certain university in 2023, and the LSTM (Long Short-Term Memory) model was used to capture the long-term dependencies of IPE, so that the learning trajectory and changing trends of students could be better understood. The self-attention mechanism was applied to dynamically learn and distinguish the importance of different parts in the input sequence, better weighting key features such as student learning behavior and participation level, thereby enhancing the accuracy and robustness of effectiveness evaluation. Finally, the splicing method was adopted to integrate the LSTM model and self-attention mechanism for the experiment, and the teaching efficiency of different teaching methods was statistically analyzed through a questionnaire survey. The test results indicated that the classification accuracy of the LSTM-SAM model reached 98.41%, which was 1.61% higher than the LSTM model. The teaching efficiency was the highest under the gamified teaching method, providing an effective method for evaluating the effectiveness of IPE and a useful reference for optimizing teaching methods.
- Published
- 2024
- Full Text
- View/download PDF
50. A rate of penetration (ROP) prediction method based on improved dung beetle optimization algorithm and BiLSTM-SA
- Author
-
Mengyuan Xiong, Shuangjin Zheng, Wei Liu, Rongsheng Cheng, Lihui Wang, Haijun Zhang, and Guona Wang
- Subjects
Rate of Penetration ,Bidirectional long short-term Memory Network ,Self-attention mechanism ,Optimization algorithm ,Data Analysis ,Medicine ,Science - Abstract
In the field of oil drilling, accurately predicting the Rate of Penetration (ROP) is crucial for improving drilling efficiency and reducing costs. Traditional prediction methods and existing machine learning approaches often lack accuracy and generalization capabilities, leading to suboptimal results in practical applications. This study proposes an end-to-end ROP prediction model based on BiLSTM-SA-IDBO, which integrates Bidirectional Long Short-Term Memory (BiLSTM), a Self-Attention mechanism (SA), and an Improved Dung Beetle Optimization algorithm (IDBO), incorporating the Bingham physical equation. We enhanced the DBO algorithm by using Sobol sequences for population initialization and integrating the Golden Sine algorithm and dynamic subtraction factors to develop a more robust IDBO. This optimized the BiLSTM-SA model, resulting in a BiLSTM-SA-IDBO model with an RMSE of 0.065, an R² of 0.963, and an MAE of 0.05 on the test set. Compared to the original BiLSTM-SA model, these metrics improved by 78%, 21%, and 83%, respectively. Additionally, we compared this model with BP Neural Network, Random Forest, XGBoost, and LSTM models, and found that our proposed model significantly outperformed these traditional models. Finally, through practical testing, the model’s excellent predictive ability and generalization were verified, demonstrating its great potential for practical applications.
- Published
- 2024
- Full Text
- View/download PDF