
Capturing word positions does help: A multi-element hypergraph gated attention network for document classification.

Authors :
Jin, Yilun
Yin, Wei
Wang, Haoseng
He, Fang
Source :
Expert Systems with Applications. Oct2024, Vol. 251, pN.PAG-N.PAG. 1p.
Publication Year :
2024

Abstract

Over the last few years, graph-based methods have significantly improved document mining applications such as spam detection, news recommendation, and legal document classification. However, existing graph-based methods have a limited ability to exploit word position and multi-element information within documents, which limits their effectiveness in practice. To address this limitation, we propose a novel multi-element hypergraph gated attention network that captures word position and multi-element information for accurate document classification. Specifically, a new multi-element hypergraph is first proposed to describe the word positions, sentences, and full content of a document. Then, a new multi-element homogenization module is applied to mitigate the heterogeneity of the constructed hypergraph. Meanwhile, a new hypergraph gated attention module is proposed to filter noise in the constructed hypergraph and derive various element representations that incorporate word position information. Finally, a new block-wise read-out module is designed to fuse the learned element representations into comprehensive document representations for classification. Extensive experiments on several real-world datasets demonstrate that the proposed method not only outperforms related state-of-the-art methods but is also faster, making it suitable for a wide range of practical applications. For instance, our method achieved an accuracy improvement of 1.1% over the best comparative method on some datasets while also running faster. Additionally, it achieved a 14% improvement in accuracy over the well-known Generative Pre-trained Transformer 3.5 (GPT-3.5) on one dataset. [ABSTRACT FROM AUTHOR]
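To make the idea of a multi-element hypergraph more concrete, the following is a minimal illustrative sketch, not the authors' construction. The abstract does not specify how hyperedges are built, so everything here is an assumption: words are nodes, a sliding window over word positions stands in for "word position" hyperedges, each sentence forms one hyperedge, and the full document forms one more. The function names `build_hypergraph` and `incidence_matrix` and the `window` parameter are hypothetical.

```python
from collections import OrderedDict


def build_hypergraph(document, window=3):
    """Build a toy multi-element hypergraph over the words of a document.

    Assumed design (not taken from the paper): nodes are unique lowercase
    tokens; hyperedges are tagged "position" (sliding windows inside a
    sentence), "sentence" (all words of one sentence), or "document"
    (all words of the document).
    """
    # Naive sentence and token splitting, sufficient for illustration only.
    sentences = [s.split() for s in document.lower().split(".") if s.strip()]

    # Assign node ids in order of first appearance.
    vocab = OrderedDict()
    for sent in sentences:
        for tok in sent:
            vocab.setdefault(tok, len(vocab))

    hyperedges = []  # list of (kind, sorted list of node ids)
    for sent in sentences:
        ids = [vocab[t] for t in sent]
        # Position hyperedges: each window groups words that are close together.
        for start in range(max(1, len(ids) - window + 1)):
            members = sorted(set(ids[start:start + window]))
            hyperedges.append(("position", members))
        # Sentence hyperedge: every word that occurs in this sentence.
        hyperedges.append(("sentence", sorted(set(ids))))
    # Document hyperedge: every word in the document.
    hyperedges.append(("document", list(range(len(vocab)))))
    return list(vocab), hyperedges


def incidence_matrix(num_nodes, hyperedges):
    """Return a nodes-by-hyperedges 0/1 incidence matrix as nested lists."""
    return [
        [1 if node in members else 0 for _, members in hyperedges]
        for node in range(num_nodes)
    ]


if __name__ == "__main__":
    words, edges = build_hypergraph("The cat sat. The dog sat down.", window=2)
    for kind, members in edges:
        print(kind, [words[i] for i in members])
```

An incidence matrix of this kind is the usual input to hypergraph neural network layers; the homogenization, gated attention, and block-wise read-out modules named in the abstract would operate on top of it and are not sketched here.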

Details

Language :
English
ISSN :
0957-4174
Volume :
251
Database :
Academic Search Index
Journal :
Expert Systems with Applications
Publication Type :
Academic Journal
Accession number :
177514298
Full Text :
https://doi.org/10.1016/j.eswa.2024.124002