A Study on Performance Enhancement by Integrating Neural Topic Attention with Transformer-Based Language Model
- Author
- Um, Taehum and Kim, Namhyoung
- Subjects
- LANGUAGE models, ARTIFICIAL neural networks, TRANSFORMER models, LATENT variables, STOCHASTIC models
- Abstract
- As an extension of the transformer architecture, the BERT model has introduced a new paradigm for natural language processing, achieving impressive results in various downstream tasks. However, high-performance BERT-based models—such as ELECTRA, ALBERT, and RoBERTa—suffer from limitations such as poor continuous learning capability and insufficient understanding of domain-specific documents. To address these issues, we propose the use of an attention mechanism to combine BERT-based models with neural topic models. Unlike traditional stochastic topic modeling, neural topic modeling employs artificial neural networks to learn topic representations. Furthermore, neural topic models can be integrated with other neural models and trained to identify latent variables in documents, thereby enabling BERT-based models to sufficiently comprehend the contexts of specific fields. We conducted experiments on three datasets—Movie Review Dataset (MRD), 20Newsgroups, and YELP—to evaluate our model's performance. Compared to the vanilla model, the proposed model achieved an accuracy improvement of 1–2% for the ALBERT model in multi-class classification tasks across all three datasets, while the ELECTRA model showed an accuracy improvement of less than 1%. [ABSTRACT FROM AUTHOR]
- Published
- 2024
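The abstract describes fusing a BERT-style encoder with a neural topic model through an attention mechanism. The paper's exact architecture is not given in this record, so the following is only a minimal sketch of the general idea: project a document-topic distribution (as a neural topic model might produce) into the encoder's hidden space and let the token representations attend to it via cross-attention before classification. All names, dimensions, and layer choices below are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class TopicAttentionClassifier(nn.Module):
    """Hypothetical sketch: fuse transformer hidden states with a neural-topic
    representation through cross-attention, then classify.
    Dimensions and layers are assumptions for illustration only."""

    def __init__(self, hidden_dim=768, num_topics=50, num_classes=5, num_heads=8):
        super().__init__()
        # Project the document-topic proportions (e.g. from a VAE-style neural
        # topic model) into the transformer's hidden dimension.
        self.topic_proj = nn.Linear(num_topics, hidden_dim)
        # Cross-attention: token states (queries) attend to the topic embedding.
        self.cross_attn = nn.MultiheadAttention(hidden_dim, num_heads, batch_first=True)
        self.classifier = nn.Linear(hidden_dim, num_classes)

    def forward(self, token_states, topic_dist):
        # token_states: (batch, seq_len, hidden_dim) from a BERT-style encoder
        # topic_dist:   (batch, num_topics) document-topic proportions
        topic_emb = self.topic_proj(topic_dist).unsqueeze(1)   # (batch, 1, hidden_dim)
        fused, _ = self.cross_attn(token_states, topic_emb, topic_emb)
        pooled = fused.mean(dim=1)                             # simple mean pooling
        return self.classifier(pooled)                         # (batch, num_classes)

# Usage with random tensors standing in for encoder outputs and topic proportions.
model = TopicAttentionClassifier()
tokens = torch.randn(2, 128, 768)
topics = torch.softmax(torch.randn(2, 50), dim=-1)
logits = model(tokens, topics)
```

In practice the token states would come from a pretrained encoder such as ALBERT or ELECTRA, and the topic proportions from a jointly or separately trained neural topic model; the cross-attention step is one plausible way to realize the "neural topic attention" named in the title.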