ChatGPT-4 Consistency in Interpreting Laryngeal Clinical Images of Common Lesions and Disorders.

Authors :: Maniaci A
Chiesa-Estomba CM
Lechien JR
Source :: Otolaryngology--head and neck surgery : official journal of American Academy of Otolaryngology-Head and Neck Surgery [Otolaryngol Head Neck Surg] 2024 Oct; Vol. 171 (4), pp. 1106-1113. Date of Electronic Publication: 2024 Jul 24.
Publication Year :: 2024
Abstract :: Objective: To investigate the consistency of Chatbot Generative Pretrained Transformer (ChatGPT)-4 in the analysis of clinical pictures of common laryngological conditions. Study Design: Prospective uncontrolled study. Setting: Multicenter study. Methods: Patient history and clinical videolaryngostroboscopic images were presented to ChatGPT-4 for differential diagnoses, management, and treatment(s). ChatGPT-4 responses were assessed by 3 blinded laryngologists with the artificial intelligence performance instrument (AIPI). The complexity of cases and the consistency between practitioners and ChatGPT-4 for interpreting clinical images were evaluated with a 5-point Likert scale. The intraclass correlation coefficient (ICC) was used to measure the strength of interrater agreement. Results: Forty patients with a mean complexity score of 2.60 ± 1.15 were included. The mean consistency score for ChatGPT-4 image interpretation was 2.46 ± 1.42. ChatGPT-4 perfectly analyzed the clinical images in 6 cases (15%; 5/5), while the consistency between ChatGPT-4 and judges was high in 5 cases (12.5%; 4/5). Judges reported an ICC of 0.965 for the consistency score (P = .001). ChatGPT-4 erroneously documented vocal fold irregularity (mass or lesion), glottic insufficiency, and vocal cord paralysis in 21 (52.5%), 2 (5.0%), and 5 (12.5%) cases, respectively. ChatGPT-4 and practitioners indicated 153 and 63 additional examinations, respectively (P = .001). The ChatGPT-4 primary diagnosis was correct in 20.0% to 25.0% of cases. The clinical image consistency score was significantly associated with the AIPI score (r_s = 0.830; P = .001). Conclusion: ChatGPT-4 is more efficient in primary diagnosis than in image analysis or in selecting the most adequate additional examinations and treatments. (© 2024 American Academy of Otolaryngology–Head and Neck Surgery Foundation.)

Subjects :: Humans
Prospective Studies
Male
Female
Middle Aged
Adult
Diagnosis, Differential
Aged
Stroboscopy
Image Interpretation, Computer-Assisted
Artificial Intelligence
Video Recording
Laryngeal Diseases diagnostic imaging
Laryngeal Diseases diagnosis
Laryngoscopy

Details

Language :: English
ISSN :: 1097-6817
Volume :: 171
Issue :: 4
Database :: MEDLINE
Journal :: Otolaryngology--head and neck surgery : official journal of American Academy of Otolaryngology-Head and Neck Surgery
Publication Type :: Academic Journal
Accession number :: 39045737
Full Text :: https://doi.org/10.1002/ohn.897
