Back to Search Start Over

When attention is not enough to unveil a text's author profile: Enhancing a transformer with a wide branch.

Authors :
López-Santillán, Roberto
González, Luis C.
Montes-y-Gómez, Manuel
López-Monroy, A. Pastor
Source :
Neural Computing & Applications. May2023, Vol. 35 Issue 13, p9607-9626. 20p.
Publication Year :
2023

Abstract

Author profiling (AP) is a highly relevant natural language processing (NLP) problem; it deals with predicting features of authors such as gender, age and personality traits. It is done by analyzing texts written by the authors themselves; take for instance documents such as books, articles, and more recently posts in social media platforms. In the present study, we focus in the latter, which is an scenario with a number of applications in marketing, security, health and others. Surprisingly, given the achievements of deep learning (DL) strategies on other NLP tasks, for AP DL architectures regularly underperform, left behind by classical machine learning (ML) approaches. In this study we show how a deep learning architecture based on transformers offers competitive results by exploiting a joint-intermediate fusion strategy called the Wide & Deep Transformer (WD-T). Our methodology implements a fusion of contextualized word vector representations and handcrafted features, by using a self-attention mechanism and a novel encoding technique that incorporates stylistic, topic, and personal information from authors. This allows for the creation of more accurate, fine-grained predictions. Our approach attained competitive performance against top-quartile results from the 2017–2019 editions at the Plagiarism analysis, Authorship identification, and Near-duplicate detection forum (PAN) in English and Spanish languages for gender and language variety predictions, and the Kaggle Myers–Briggs-type indicator (MBTI) dataset for personality forecasting. Our proposal consistently surpasses all other deep learning methods in PAN collections by as much as 2.4%, and up to 3.4% in the MBTI dataset. These results suggest that this DL strategy effectively addresses and improves upon the limitations of previous techniques and paves the way for new avenues of inquiry. [ABSTRACT FROM AUTHOR]

Details

Language :
English
ISSN :
09410643
Volume :
35
Issue :
13
Database :
Academic Search Index
Journal :
Neural Computing & Applications
Publication Type :
Academic Journal
Accession number :
163165352
Full Text :
https://doi.org/10.1007/s00521-023-08198-5