Adapting Large Language Models for Document-Level Machine Translation
- Publication Year :
- 2024
Abstract
- Large language models (LLMs) have significantly advanced various natural language processing (NLP) tasks. Recent research indicates that moderately-sized LLMs often outperform larger ones after task-specific fine-tuning. This study focuses on adapting LLMs for document-level machine translation (DocMT) for specific language pairs. We first investigate the impact of prompt strategies on translation performance and then conduct extensive experiments using two fine-tuning methods, three LLM backbones, and 18 translation tasks across nine language pairs. Our results show that specialized models can sometimes surpass GPT-4 in translation performance but still face issues like off-target translation due to error propagation in decoding. We provide an in-depth analysis of these LLMs tailored for DocMT, examining translation errors, discourse phenomena, strategies for training and inference, the data efficiency of parallel documents, recent test set evaluations, and zero-shot crosslingual transfer. Our findings highlight the strengths and limitations of LLM-based DocMT models and provide a foundation for future research.
- Comment: 25 pages, 18 tables, 7 figures; ARR Feb 2024, 4/3/2, meta 2, rejected by ACL2024; ARR June 2024, 4.5/3/2, meta 3, rejected by EMNLP2024
- Subjects :
- Computer Science - Computation and Language
Details
- Database :
- arXiv
- Publication Type :
- Report
- Accession number :
- edsarx.2401.06468
- Document Type :
- Working Paper