Self-Interaction Attention Mechanism-Based Text Representation for Document Classification

Authors :
Honghui Chen
Fei Cai
Taihua Shao
Jianming Zheng
Source :
Applied Sciences, Vol 8, Iss 4, p 613 (2018)
Publication Year :
2018
Publisher :
MDPI AG, 2018.

Abstract

Document classification has broad applications in sentiment classification, document ranking, and topic labeling. Previous neural network-based work has mainly investigated a so-called forward implication, i.e., the preceding text segments are taken as the context of the following text segments when generating the text representation. Such an approach ignores the fact that the semantics of a document are a product of the mutual implication of all its text segments. In this paper, we therefore introduce the concept of interaction and propose a text representation model with a Self-interaction Attention Mechanism (TextSAM) for document classification. In particular, we design three aggregation strategies to integrate the interaction into a hierarchical architecture: averaging the interaction, maximizing the interaction, and adding one more attention layer on the interaction, which lead to three models, TextSAM_AVE, TextSAM_MAX and TextSAM_ATT, respectively. Comprehensive experiments on two public datasets, Yelp 2016 and Amazon Reviews (Electronics), show that our proposals significantly outperform the state-of-the-art neural baselines for document classification, with accuracy improvements ranging from 5.97% to 14.05% over the best baseline. Furthermore, we find that the self-interaction attention mechanism clearly alleviates the impact of a growing number of sentences: the relative improvement of our proposals over the baselines widens as the sentence number increases.
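
The three aggregation strategies named in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, since this record does not give the paper's equations: the dot-product interaction score, the parameter-free softmax used in place of a learned attention layer, and all function names (`interaction_matrix`, `aggregate`, `document_vector`) are hypothetical.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def interaction_matrix(segments):
    """Pairwise scores between all segment vectors.

    Assumption: a plain dot product stands in for the interaction score.
    """
    return [[dot(u, v) for v in segments] for u in segments]


def aggregate(interaction, strategy):
    """Collapse each row of the interaction matrix into one score per segment."""
    if strategy == "ave":   # average the interaction
        return [sum(row) / len(row) for row in interaction]
    if strategy == "max":   # maximize the interaction
        return [max(row) for row in interaction]
    if strategy == "att":   # one more attention layer on the interaction
        # Assumption: a parameter-free softmax replaces learned attention weights.
        return [dot(softmax(row), row) for row in interaction]
    raise ValueError(f"unknown strategy: {strategy}")


def document_vector(segments, strategy):
    """Weight each segment by its aggregated interaction score and sum."""
    weights = softmax(aggregate(interaction_matrix(segments), strategy))
    dim = len(segments[0])
    return [sum(w * seg[k] for w, seg in zip(weights, segments)) for k in range(dim)]


if __name__ == "__main__":
    # Toy segment vectors; real inputs would be encoder outputs.
    toy = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    for name in ("ave", "max", "att"):
        print(name, document_vector(toy, name))
```

In this sketch every segment is scored against all other segments, not only the preceding ones, which is the contrast with forward implication that the abstract draws. The three strategies differ only in how each row of scores is reduced to a single weight.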

Details

Language :
English
ISSN :
2076-3417
Volume :
8
Issue :
4
Database :
OpenAIRE
Journal :
Applied Sciences
Accession number :
edsair.doi.dedup.....d6095a5b11b30cec5effb2e4cf1fdd9e