TCAMixer: A lightweight Mixer based on a novel triple concepts attention mechanism for NLP.
- Source :
- Engineering Applications of Artificial Intelligence. Aug2023:Part C, Vol. 123, pN.PAG-N.PAG. 1p.
- Publication Year :
- 2023
Abstract
- The large size and high computational cost of pre-trained models make them difficult to deploy and apply. This paper therefore presents a novel Triple Concepts Attention Mechanism and a lightweight TCAMixer model for text classification on edge devices. The TCAMixer abstracts textual concepts in a human-like way, a capability unmatched by counterparts such as pNLP-Mixer (a projection-based MLP-Mixer model for Natural Language Processing) and HyperMixer (a hypernetwork using dynamic token-mixing layers). Experimental results on several public datasets show that the TCAMixer outperforms these counterparts by a significant margin, for example achieving 3% higher accuracy with a smaller model size of 0.177M. In addition, on most test datasets the TCAMixer achieves 85% to 98.7% of the performance of large pre-trained models while occupying only 1/3000 to 1/2000 of their size. [ABSTRACT FROM AUTHOR]
- Subjects :
- *NATURAL language processing
- *DEEP learning
Details
- Language :
- English
- ISSN :
- 0952-1976
- Volume :
- 123
- Database :
- Academic Search Index
- Journal :
- Engineering Applications of Artificial Intelligence
- Publication Type :
- Academic Journal
- Accession number :
- 164285231
- Full Text :
- https://doi.org/10.1016/j.engappai.2023.106471