AstroLLaMA: Towards Specialized Foundation Models in Astronomy

Authors :: Nguyen, Tuan Dung
Ting, Yuan-Sen
Ciucă, Ioana
O'Neill, Charlie
Sun, Ze-Chang
Jabłońska, Maja
Kruk, Sandor
Perkowski, Ernest
Miller, Jack
Li, Jason
Peek, Josh
Iyer, Kartheik
Różański, Tomasz
Khetarpal, Pranav
Zaman, Sharaf
Brodrick, David
Méndez, Sergio J. Rodríguez
Bui, Thang
Goodman, Alyssa
Accomazzi, Alberto
Naiman, Jill
Cranney, Jesse
Schawinski, Kevin
UniverseTBD
Publication Year :: 2023
Abstract: Large language models excel in many human-language tasks but often falter in highly specialized domains like scholarly astronomy. To bridge this gap, we introduce AstroLLaMA, a 7-billion-parameter model fine-tuned from LLaMA-2 using over 300,000 astronomy abstracts from arXiv. Optimized for traditional causal language modeling, AstroLLaMA achieves a 30% lower perplexity than LLaMA-2, showing marked domain adaptation. Our model produces more insightful and scientifically relevant text completions and embeddings than state-of-the-art foundation models despite having significantly fewer parameters. AstroLLaMA serves as a robust, domain-specific model with broad fine-tuning potential. Its public release aims to spur astronomy-focused research, including automatic paper summarization and conversational agent development.
Comment: 6 pages, 3 figures, submitted to IJCNLP-AACL 2023. Comments are welcome. The model can be found on Hugging Face: https://huggingface.co/universeTBD/astrollama
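
The perplexity comparison in the abstract can be made concrete with a minimal sketch. It assumes perplexity is computed in the standard way, as the exponential of the mean negative log-likelihood per token; the abstract does not spell out the evaluation procedure. The function names and all numeric values below are hypothetical and chosen only to illustrate what a 30% relative reduction means; they are not results from the paper.

```python
import math


def perplexity(token_log_probs):
    """Perplexity as exp of the mean negative log-likelihood per token.

    `token_log_probs` holds natural-log probabilities that a causal
    language model assigned to each observed token.
    """
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    mean_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(mean_nll)


def relative_reduction(baseline_ppl, adapted_ppl):
    """Fractional drop in perplexity of the adapted model vs. the baseline."""
    return (baseline_ppl - adapted_ppl) / baseline_ppl


# Hypothetical per-token probabilities (illustration only, not paper data):
# the baseline assigns 0.10 to every token, the adapted model 0.10 / 0.7.
baseline_log_probs = [math.log(0.10)] * 5
adapted_log_probs = [math.log(0.10 / 0.7)] * 5

baseline_ppl = perplexity(baseline_log_probs)
adapted_ppl = perplexity(adapted_log_probs)

# With these made-up inputs the relative reduction is roughly 0.30.
print(f"relative reduction: {relative_reduction(baseline_ppl, adapted_ppl):.2f}")
```

Because perplexity is an exponentiated average, a 30% lower perplexity corresponds to the adapted model assigning, on geometric average, about 1/0.7 ≈ 1.43 times more probability to each observed token than the baseline.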