
SaulLM-7B: A pioneering Large Language Model for Law

Authors :
Colombo, Pierre
Pires, Telmo Pessoa
Boudiaf, Malik
Culver, Dominic
Melo, Rui
Corro, Caio
Martins, Andre F. T.
Esposito, Fabrizio
Raposo, Vera LĂșcia
Morgado, Sofia
Desa, Michael
Publication Year :
2024

Abstract

In this paper, we introduce SaulLM-7B, a large language model (LLM) tailored for the legal domain. With 7 billion parameters, SaulLM-7B is the first LLM designed explicitly for legal text comprehension and generation. Leveraging the Mistral 7B architecture as its foundation, SaulLM-7B is trained on an English legal corpus of over 30 billion tokens. SaulLM-7B exhibits state-of-the-art proficiency in understanding and processing legal documents. Additionally, we present a novel instructional fine-tuning method that leverages legal datasets to further enhance SaulLM-7B's performance in legal tasks. SaulLM-7B is released under the MIT License.
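The abstract notes that the model weights are released under the MIT License. A minimal sketch of loading and querying the released checkpoint with Hugging Face Transformers is shown below; the repository identifier "Equall/Saul-7B-Instruct-v1" is an assumption not confirmed by this record, so the actual id should be taken from the paper's release page.

# Minimal sketch: running the released SaulLM-7B checkpoint with Transformers.
# NOTE: the repo id below is assumed for illustration; verify the published
# identifier before use.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Equall/Saul-7B-Instruct-v1"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

prompt = "Summarize the doctrine of consideration in contract law."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=200, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

Because the model builds on the Mistral 7B architecture, any tooling that supports Mistral-style causal language models (vLLM, llama.cpp after conversion, etc.) should in principle serve it as well.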

Details

Database :
arXiv
Publication Type :
Report
Accession number :
edsarx.2403.03883
Document Type :
Working Paper