
TCAMixer: A lightweight Mixer based on a novel triple concepts attention mechanism for NLP.

Authors :
Liu, Xiaoyan
Tang, Huanling
Zhao, Jie
Dou, Quansheng
Lu, Mingyu
Source :
Engineering Applications of Artificial Intelligence, Vol. 123, Part C, August 2023.
Publication Year :
2023

Abstract

Large model sizes and high computing costs make it challenging to deploy and apply large pre-trained models. Hence, this paper presents a novel Triple Concepts Attention Mechanism and a lightweight TCAMixer model for text classification on edge devices. The TCAMixer abstracts textual concepts in a human-like way, which is unmatched by counterparts such as pNLP-Mixer (a projection-based MLP-Mixer model for Natural Language Processing) and HyperMixer (a hypernetwork-based model with dynamic token-mixing layers). Experimental results on several public datasets demonstrate that the TCAMixer outperforms these counterparts by a significant margin, for example, achieving 3% higher accuracy with a smaller model size of 0.177M parameters. Additionally, on most test datasets the TCAMixer reaches 85% to 98.7% of the performance of large pre-trained models while occupying only 1/3000 to 1/2000 of their size. [ABSTRACT FROM AUTHOR]
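The abstract does not detail the triple concepts attention mechanism itself, but TCAMixer, pNLP-Mixer, and HyperMixer all build on the MLP-Mixer pattern of alternating token mixing (across the sequence) and channel mixing (within each token). The sketch below illustrates that generic pattern only; the class name, dimensions, and layer choices are illustrative assumptions and are not the authors' implementation.

```python
# Minimal sketch of a generic MLP-Mixer style block for text (illustrative only;
# this is NOT the paper's triple concepts attention mechanism).
import torch
import torch.nn as nn


class MixerBlock(nn.Module):
    """One Mixer block: token-mixing MLP over the sequence, then channel-mixing MLP."""

    def __init__(self, seq_len: int, hidden_dim: int,
                 token_dim: int = 64, channel_dim: int = 256):
        super().__init__()
        self.norm1 = nn.LayerNorm(hidden_dim)
        # Token mixing acts along the sequence axis: information flows between tokens.
        self.token_mlp = nn.Sequential(
            nn.Linear(seq_len, token_dim), nn.GELU(), nn.Linear(token_dim, seq_len)
        )
        self.norm2 = nn.LayerNorm(hidden_dim)
        # Channel mixing acts along the feature axis: information flows within each token.
        self.channel_mlp = nn.Sequential(
            nn.Linear(hidden_dim, channel_dim), nn.GELU(), nn.Linear(channel_dim, hidden_dim)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, hidden_dim)
        y = self.norm1(x).transpose(1, 2)           # (batch, hidden_dim, seq_len)
        x = x + self.token_mlp(y).transpose(1, 2)   # residual token mixing
        x = x + self.channel_mlp(self.norm2(x))     # residual channel mixing
        return x


if __name__ == "__main__":
    block = MixerBlock(seq_len=128, hidden_dim=64)
    out = block(torch.randn(2, 128, 64))
    print(out.shape)  # torch.Size([2, 128, 64])
```

Models in this family replace self-attention with such fixed-cost mixing MLPs, which is what keeps parameter counts and inference costs small enough for edge deployment; TCAMixer's contribution, per the abstract, is a triple concepts attention mechanism layered on this lightweight design.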

Details

Language :
English
ISSN :
0952-1976
Volume :
123
Database :
Academic Search Index
Journal :
Engineering Applications of Artificial Intelligence
Publication Type :
Academic Journal
Accession number :
164285231
Full Text :
https://doi.org/10.1016/j.engappai.2023.106471