Improving Large Language Model Russian Adaptation with Preliminary Vocabulary Optimization.
- Source :
- Lobachevskii Journal of Mathematics; Jul 2024, Vol. 45, Issue 7, p3211-3219, 9p
- Publication Year :
- 2024
Abstract
- Most of a Large Language Model's (LLM) text-comprehension capability comes from generative pre-training on large corpora spanning different domains, languages, and tasks. As a consequence, LLM performance in a specific language depends on that language's representation in the training data, which for most state-of-the-art models is biased towards English. The issue is commonly alleviated by further pre-training on the target language; however, due to limited model capacity, this often results in knowledge forgetting and text-understanding degradation. We argue that the performance drop can be avoided by employing parameter-efficient tuning methods that preserve the integrity of the original model. In this work, we investigate the effectiveness of different vocabulary optimization and adapter-tuning schemes for LLM Russian adaptation. Our experimental results with the Solar-10.7B LLM show that the language adaptation process can be substantially accelerated by transferring the embeddings from smaller language-tuned counterparts. Moreover, we find that preliminary vocabulary optimization stabilizes further adapter tuning, thus improving target-language generalization. By applying our two-stage language adaptation approach, we obtain state-of-the-art results on the Russian SuperGLUE and MMLU-RU language understanding benchmarks for sub-30B-parameter open-source LLMs. [ABSTRACT FROM AUTHOR]
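- Note: the abstract describes a two-stage adaptation pattern (embedding transfer from a smaller language-tuned model, followed by parameter-efficient adapter tuning). The sketch below is not the authors' exact procedure, which the abstract does not specify; it is a minimal illustration of that generic pattern using Hugging Face transformers and peft, with a hypothetical donor model name and the simplifying assumption that donor and base embeddings share the same dimensionality.

```python
# Illustrative sketch only; the paper's vocabulary-optimization and
# embedding-transfer details are not given in this record.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

BASE = "upstage/SOLAR-10.7B-v1.0"        # base LLM named in the abstract
DONOR = "some-small-russian-tuned-model"  # hypothetical smaller Russian-tuned counterpart

base_tok = AutoTokenizer.from_pretrained(BASE)
model = AutoModelForCausalLM.from_pretrained(BASE, torch_dtype=torch.bfloat16)

# Stage 1 (assumed form): initialize embeddings of vocabulary items shared
# with the smaller language-tuned donor instead of training them from scratch.
# Assumes matching embedding dimensionality, which real models may not have.
donor = AutoModelForCausalLM.from_pretrained(DONOR, torch_dtype=torch.bfloat16)
donor_tok = AutoTokenizer.from_pretrained(DONOR)
src = donor.get_input_embeddings().weight
dst = model.get_input_embeddings().weight
with torch.no_grad():
    for token, donor_id in donor_tok.get_vocab().items():
        base_id = base_tok.convert_tokens_to_ids(token)
        if base_id is not None and base_id != base_tok.unk_token_id:
            dst[base_id].copy_(src[donor_id])

# Stage 2: parameter-efficient adapter tuning (LoRA here) that keeps the
# original weights frozen, in line with the abstract's argument for
# preserving the integrity of the base model.
lora = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora)
model.print_trainable_parameters()
```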
Details
- Language :
- English
- ISSN :
- 1995-0802
- Volume :
- 45
- Issue :
- 7
- Database :
- Complementary Index
- Journal :
- Lobachevskii Journal of Mathematics
- Publication Type :
- Academic Journal
- Accession number :
- 180368870
- Full Text :
- https://doi.org/10.1134/S1995080224604120