
Evaluation of ChatGPT's Performance in the Turkish Board of Orthopaedic Surgery Examination.

Authors :
Yigitbay, Ahmet
Source :
Medical Bulletin of Haseki / Haseki Tip Bulteni. Sep 2024, Vol. 62, Issue 4, p243-249. 7p.
Publication Year :
2024

Abstract

Aim: Technological advances are significantly changing education and evaluation processes in medicine. In particular, developments in artificial intelligence and natural language processing offer new opportunities in the health sector. This article evaluates the performance of Chat Generative Pre-Trained Transformer (ChatGPT) in the Turkish Orthopaedics and Traumatology Education Council (TOTEK) Qualifying Written Examination, and its applicability.

Methods: To evaluate ChatGPT's performance, TOTEK Qualifying Written Examination questions from the last five years were entered as data. ChatGPT's results were assessed under four parameters, compared with the actual exam results, and analyzed statistically.

Results: Of the 500 questions, 458 were used as data in this study. ChatGPT scored 40.2%, 26.3%, 37.3%, 32.9%, and 35.8% in the 2019, 2020, 2021, 2022, and 2023 TOTEK Qualifying Written Examinations, respectively. A simple linear regression model applied to the yearly correct answer percentages showed a slightly decreasing trend in the correct answer rate as the years progressed. ChatGPT's performance in the TOTEK Qualifying Written Examination differed from the actual exam results to a statistically significant degree, and its correct answer percentage was below the general average success score of the exam in every year.

Conclusions: Analyzing the applicability of artificial intelligence in the field and its role in training processes is essential for assessing ChatGPT's potential uses and limitations. ChatGPT can serve as a training tool, especially for knowledge-based and logical questions on specific topics. Still, its current performance is not at a level that can replace human decision-making in specialized medical fields. [ABSTRACT FROM AUTHOR]
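The Results describe a simple linear regression over the five yearly correct answer percentages. The abstract does not give the regression details, so the following is only a minimal sketch, assuming an ordinary least-squares fit of score against exam year using the five percentages quoted above. The function name `ols_slope_intercept` and the variable names are illustrative and do not come from the article.

```python
# Yearly correct-answer percentages as quoted in the abstract.
scores = {2019: 40.2, 2020: 26.3, 2021: 37.3, 2022: 32.9, 2023: 35.8}


def ols_slope_intercept(points):
    """Ordinary least-squares fit of y = slope * x + intercept.

    Assumption: this is a plain unweighted fit on five points; the
    article may have used different software or settings.
    """
    xs = list(points.keys())
    ys = list(points.values())
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of cross-deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = ols_slope_intercept(scores)

# Unweighted mean of the five yearly percentages (not the pooled
# accuracy over all 458 questions, which the abstract does not report).
mean_score = sum(scores.values()) / len(scores)

print(f"mean yearly score: {mean_score:.1f}%")
print(f"slope: {slope:.2f} percentage points per year")
```

Under these assumptions the fitted slope is about -0.22 percentage points per year, which matches the abstract's description of a slightly decreasing trend. With only five data points, such a fit says little about whether the trend is meaningful.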

Details

Language :
English
ISSN :
13020072
Volume :
62
Issue :
4
Database :
Academic Search Index
Journal :
Medical Bulletin of Haseki / Haseki Tip Bulteni
Publication Type :
Academic Journal
Accession number :
180648383
Full Text :
https://doi.org/10.4274/haseki.galenos.2024.10038