Author: "Runhan Shi" / Publisher: elsevier - Searchworks@Jio Institute Digital Library Search Results

1. Benchmarking four large language models’ performance of addressing Chinese patients' inquiries about dry eye disease: A two-phase study

Author: Runhan Shi, Steven Liu, Xinwei Xu, Zhengqiang Ye, Jin Yang, Qihua Le, Jini Qiu, Lijia Tian, Anji Wei, Kun Shan, Chen Zhao, Xinghuai Sun, Xingtao Zhou, and Jiaxu Hong
Subjects: Large language model, Ophthalmology, Dry eye disease, Patient education, Real world interview, Science (General), Q1-390, Social sciences (General), H1-99
Abstract: Purpose: To evaluate the performance of four large language models (LLMs), namely GPT-4, PaLM 2, Qwen, and Baichuan 2, in generating responses to inquiries from Chinese patients about dry eye disease (DED). Design: Two-phase study, comprising a cross-sectional test in the first phase and a real-world clinical assessment in the second phase. Subjects: Eight board-certified ophthalmologists and 46 patients with DED. Methods: The chatbots' responses to Chinese patients' inquiries about DED were evaluated. In the first phase, six senior ophthalmologists subjectively rated the chatbots' responses on a 5-point Likert scale across five domains: correctness, completeness, readability, helpfulness, and safety. Objective readability analysis was performed using a Chinese readability analysis platform. In the second phase, 46 representative patients with DED posed questions to the two LLMs that performed best in the first phase (GPT-4 and Baichuan 2) and then rated the answers for satisfaction and readability. Two senior ophthalmologists then assessed the responses across the five domains. Main outcome measures: In the first phase, subjective scores for the five domains and objective readability scores; in the second phase, patient satisfaction, readability scores, and subjective scores for the five domains. Results: In the first phase, GPT-4 exhibited superior performance across the five domains (correctness: 4.47; completeness: 4.39; readability: 4.47; helpfulness: 4.49; safety: 4.47, p
Published: 2024
