Back to Search Start Over

Recursively Autoregressive Autoencoder for Pyramidal Text Representation

Authors :
Karlo Babic
Ana Mestrovic
Source :
IEEE Access, Vol 12, Pp 71361-71370 (2024)
Publication Year :
2024
Publisher :
IEEE, 2024.

Abstract

We introduce Pyramidal Recursive learning (PyRv), a novel method for text representation learning. This approach constructs a pyramidal hierarchy by recursively building representations of phrases, starting from tokens (characters, subwords, or words). At each level, N representations are recursively combined, resulting in N-1 representations on the level above, abstracting the input text from characters or subwords to words, phrases, and potentially sentences. The proposed method employs two learning approaches: autoencoding and autoregression. The autoencoding head decodes encoded representation pairs, while the autoregressive head predicts neighboring representations on both the left and right. This method exhibits four key properties: hierarchical representation, representation compositionality, representation decodability, and self-supervised learning. To implement and validate the proposed method, we train the Pyramidal Recursive Neural Network (PyRvNN) model. Evaluation metrics include autoencoder decodability, plagiarism detection, memorization, and readability. The accuracy of autoencoder decodability serves as an indicator of the validity of the four key properties. Preliminary assessments demonstrate promising results, particularly in machine-paraphrased plagiarism, text readability, and a memorization experiment.

Details

Language :
English
ISSN :
21693536
Volume :
12
Database :
Directory of Open Access Journals
Journal :
IEEE Access
Publication Type :
Academic Journal
Accession number :
edsdoj.f49777cddb9472aa9c34754c8835a50
Document Type :
article
Full Text :
https://doi.org/10.1109/ACCESS.2024.3402830