A Comparative Study of Large Language Models for Generating Summaries of Breast Cancer Patient-Reported Treatment Toxicities.
- Source :
- International Journal of Radiation Oncology, Biology, Physics. 2024 Supplement, Vol. 120, Issue 2, p. e666-e666. 1p.
- Publication Year :
- 2024
Abstract
- Recent advances in artificial intelligence such as large language models (LLMs) offer a promising avenue for enhancing clinical documentation and monitoring patient-reported outcomes (PRO). This study aims to compare four leading open-source and proprietary LLMs, Mixtral-8x7B, Llama-2, Qwen-1.5, and GPT-4, in generating summaries of patient-reported symptoms using an adapted Physician Documentation Quality Index (PDQI). A previously reported web-based application utilizing 35 items from the PRO-CTCAE scale was used to create an interactive form for breast cancer patients to report treatment-related symptoms. The four LLMs were used to generate natural language summaries for four hypothetical patients with non-identifiable patient data. Twelve resident physician raters evaluated the summaries using an abbreviated PDQI questionnaire, rating accuracy, usefulness, comprehensibility, and succinctness on a 5-point Likert scale. IRB approval was not required in accordance with the NIH 2018 Revised Common Rule requirements, as the study used researcher-generated, non-identifiable data. Forty-seven physician ratings were collected. A repeated measures ANOVA showed significant differences in accuracy among the models (F(2.120, 97.52) = 15.30, p < 0.0001), with Mixtral-8x7B (M = 3.60), GPT-4 (M = 3.78), and Qwen-1.5 (M = 3.62) significantly surpassing Llama-2 (M = 2.77, p ≤ 0.001) and no significant differences among the three. Mixtral-8x7B (M = 3.47) outperformed Llama-2 in usefulness (M = 2.87, p < 0.05) and outperformed Llama-2 and Qwen-1.5 in succinctness (p < 0.05). No significant differences were found in comprehensibility. Reviewers noted 6 mistakes in Llama-2 summaries and 1 mistake each in Mixtral-8x7B and Qwen-1.5 summaries. This study demonstrates that the latest open-source LLMs, such as Mixtral-8x7B and Qwen-1.5, can match the performance of their closed-source counterpart, GPT-4, on physician-rated measures of documentation quality and outperform their predecessor, Llama-2. This study highlights the narrowing gap between open-source and proprietary LLMs in medical documentation. These findings may help inform the strategic selection of cost-effective and data-safe LLMs for future clinical research and practice integration, potentially democratizing advanced AI tools for a broader healthcare audience. However, further validation in real-world clinical settings is necessary to assess the impact of these models on patient care efficiency and efficacy. [ABSTRACT FROM AUTHOR]
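The analysis reported in the abstract (a repeated measures ANOVA across the four models' ratings, followed by pairwise comparisons) can be outlined in code. The sketch below is illustrative only and is not the authors' analysis: the model names and the twelve raters come from the abstract, while the simulated ratings, the choice of statsmodels and scipy, and the Bonferroni-adjusted paired t-tests are assumptions. It also omits the sphericity correction (e.g., Greenhouse-Geisser) implied by the fractional degrees of freedom (2.120) reported above.

```python
# Minimal sketch of a repeated-measures comparison of physician ratings
# across four LLM-generated summaries, using hypothetical data.
# Requires numpy, pandas, scipy, and statsmodels.
from itertools import combinations

import numpy as np
import pandas as pd
from scipy.stats import ttest_rel
from statsmodels.stats.anova import AnovaRM

rng = np.random.default_rng(0)
models = ["Mixtral-8x7B", "Llama-2", "Qwen-1.5", "GPT-4"]

# Long-format table: one hypothetical accuracy rating (1-5 Likert) per rater x model.
ratings = pd.DataFrame(
    [
        {"rater": r, "model": m, "accuracy": int(np.clip(rng.normal(3.5, 1.0), 1, 5))}
        for r in range(12)
        for m in models
    ]
)

# Omnibus test: does mean accuracy differ across models within raters?
anova = AnovaRM(ratings, depvar="accuracy", subject="rater", within=["model"]).fit()
print(anova.anova_table)

# Post-hoc paired comparisons between models, Bonferroni-adjusted alpha.
pairs = list(combinations(models, 2))
alpha = 0.05 / len(pairs)
wide = ratings.pivot(index="rater", columns="model", values="accuracy")
for a, b in pairs:
    t, p = ttest_rel(wide[a], wide[b])
    print(f"{a} vs {b}: t = {t:.2f}, p = {p:.4f} (significant if p < {alpha:.4f})")
```

In practice, a sphericity-corrected ANOVA and the authors' actual post-hoc procedure would be needed to reproduce the reported statistics; this sketch only shows the general shape of such an analysis.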
Details
- Language :
- English
- ISSN :
- 03603016
- Volume :
- 120
- Issue :
- 2
- Database :
- Academic Search Index
- Journal :
- International Journal of Radiation Oncology, Biology, Physics
- Publication Type :
- Academic Journal
- Accession number :
- 179876380
- Full Text :
- https://doi.org/10.1016/j.ijrobp.2024.07.1461