Author: "Moawad MHED" / Topic: educational measurement - Searchworks@Jio Institute Digital Library Search Results

1. AI chatbots show promise but limitations on UK medical exam questions: a comparative performance study.

Author: Sadeq MA, Ghorab RMF, Ashry MH, Abozaid AM, Banihani HA, Salem M, Aisheh MTA, Abuzahra S, Mourid MR, Assker MM, Ayyad M, and Moawad MHED
Subjects: Humans; United Kingdom; Education, Medical/methods; Artificial Intelligence; Students, Medical; Educational Measurement/methods
Abstract: Large language models (LLMs) like ChatGPT have potential applications in medical education, such as helping students study for their licensing exams by discussing unclear questions with them. However, they require evaluation on these complex tasks. The purpose of this study was to evaluate how well publicly accessible LLMs performed on simulated UK medical board exam questions. A total of 423 board-style questions from 9 UK exams (MRCS, MRCP, etc.) were answered by seven LLMs (ChatGPT-3.5, ChatGPT-4, Bard, Perplexity, Claude, Bing, Claude Instant). There were 406 multiple-choice, 13 true/false, and 4 "choose N" questions covering topics in surgery, pediatrics, and other disciplines. The accuracy of the output was graded, and statistical tests were used to analyze differences among LLMs. Leaked questions were excluded from the primary analysis. ChatGPT-4 scored 78.2%, Bing 67.2%, Claude 64.4%, and Claude Instant 62.9%; Perplexity scored the lowest (56.1%). Scores differed significantly between LLMs overall (p < 0.001) and in pairwise comparisons. All LLMs scored higher on multiple-choice questions than on true/false or "choose N" questions. LLMs demonstrated limitations in answering certain questions, indicating that refinements are needed before they can be relied on as a primary resource in medical education. However, their expanding capabilities suggest a potential to improve training if thoughtfully implemented. Further research should explore specialty-specific LLMs and optimal integration into medical curricula. (© 2024. The Author(s).)
Published: 2024
