
Assessing the Proficiency of LLMs with Various Tasks and Evaluators.

Authors :
Kim TM
Lee Y
Kim C
Ko T
Source :
Studies in health technology and informatics [Stud Health Technol Inform] 2024 Aug 22; Vol. 316, pp. 552-553.
Publication Year :
2024

Abstract

Previous studies have typically assigned only one or two tasks to Large Language Models (LLMs) and relied on a small number of evaluators from a single domain to judge the models' answers. We assessed the proficiency of four LLMs by applying eight tasks and having 17 evaluators from diverse domains rate the resulting 32 outputs, demonstrating the importance of using varied tasks and evaluators when assessing LLMs.
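The abstract implies a simple crossed design: 4 LLMs x 8 tasks yields 32 answers, each scored by all 17 human evaluators. The paper does not publish code, so the following is only a minimal sketch of that structure; the model names and the generate/collect_rating helpers are hypothetical placeholders, not the authors' implementation.

```python
# Hypothetical sketch of a 4-model x 8-task x 17-evaluator assessment.
# generate() and collect_rating() are stubs standing in for real LLM
# queries and human judgments, which the study collected manually.
from itertools import product
from statistics import mean

models = ["model_a", "model_b", "model_c", "model_d"]  # 4 LLMs (placeholders)
tasks = [f"task_{i}" for i in range(1, 9)]             # 8 tasks
n_evaluators = 17

def generate(model: str, task: str) -> str:
    """Placeholder for prompting an LLM with a task."""
    return f"{model} answer to {task}"

def collect_rating(evaluator: int, answer: str) -> int:
    """Placeholder for one human evaluator's score (e.g., 1-5 Likert)."""
    return 3  # stub value; real scores come from the human raters

# 4 models x 8 tasks = 32 answers, each rated by all 17 evaluators.
scores = {}
for model, task in product(models, tasks):
    answer = generate(model, task)
    ratings = [collect_rating(e, answer) for e in range(n_evaluators)]
    scores[(model, task)] = mean(ratings)

# Aggregate each model's proficiency across the eight tasks.
for model in models:
    print(model, round(mean(scores[(model, t)] for t in tasks), 2))
```

Averaging each answer over all 17 raters before comparing models is one way to reduce single-domain evaluator bias, which is the gap in prior work the abstract highlights.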

Subjects

Language
Computer Simulation

Details

Language :
English
ISSN :
1879-8365
Volume :
316
Database :
MEDLINE
Journal :
Studies in health technology and informatics
Publication Type :
Academic Journal
Accession Number :
39176801
Full Text :
https://doi.org/10.3233/SHTI240473