Realizing Disentanglement in LM Latent Space via Vocabulary-Defined Semantics

Authors :
Gu, Jian
Aleti, Aldeida
Chen, Chunyang
Zhang, Hongyu
Publication Year :
2024

Abstract

Understanding the latent space of language models (LMs) is important for improving the performance and interpretability of LMs. Existing analyses often fail to provide insights that take advantage of the semantic properties of language models and often overlook crucial aspects of language model adaptation. In response, we introduce a pioneering approach called vocabulary-defined semantics, which establishes a reference frame grounded in LM vocabulary within the LM latent space. We propose a novel technique to compute disentangled logits and gradients in latent space, not entangled ones on vocabulary. Further, we perform semantical clustering on data representations as a novel way of LM adaptation. Through extensive experiments across diverse text understanding datasets, our approach outperforms state-of-the-art methods of retrieval-augmented generation and parameter-efficient finetuning, showcasing its effectiveness and efficiency.
Comment: under peer-review

Details

Database :
arXiv
Publication Type :
Report
Accession number :
edsarx.2401.16184
Document Type :
Working Paper