Fink, Anna, Nattenmüller, Johanna, Rau, Stephan, Rau, Alexander, Tran, Hien, Bamberg, Fabian, Reisert, Marco, Kotter, Elmar, Diallo, Thierno, and Russe, Maximilian F.
Objectives: This study evaluated the effect of enhancing a GPT-4 model with retrieval-augmented generation on its ability to diagnose and classify traumatic injuries based on radiology reports.
Materials and methods: In this prospective proof-of-concept study, we used retrieval-augmented generation as a zero-shot learning approach to provide expert knowledge from the RadioGraphics top ten reading list for trauma radiology to the GPT-4 model, creating the context-aware TraumaCB. Radiological report findings of 50 traumatic injuries were independently generated by two radiologists. The performance of the TraumaCB compared to the generic GPT-4 was evaluated by three board-certified radiologists, assessing the accuracy and trustworthiness of the chatbot responses in the 100 reports created.
Results: The TraumaCB achieved 100% correct diagnoses, 96% correct classification, and 87% correct grading, outperforming the generic GPT-4 with 93% correct diagnoses, 70% correct classification, and 48% correct grading. TraumaCB sources consistently achieved a median rating of 5.0 for explanation and trust. Challenges encountered mainly involved traumatic injuries lacking widely accepted classification systems.
Conclusion: Augmenting a commercial GPT-4 model with retrieval-augmented generation improves its diagnostic and classification capabilities, positioning it as a valuable tool for efficiently assessing traumatic injuries across various anatomical regions in trauma radiology.
Key Points:
Question: Retrieval-augmented generation has the potential to enhance generic chatbots with task-specific knowledge of emergency radiology.
Findings: The TraumaCB excelled in accuracy, particularly in injury classification and grading, and provided explanations along with the sources used, increasing transparency and facilitating verification.
Clinical relevance: The TraumaCB provides accurate, fast, and transparent access to trauma radiology classifications, potentially increasing the efficiency of image interpretation in emergency departments and enabling customized reports based on local or individual preferences.
[ABSTRACT FROM AUTHOR]
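Illustrative sketch: the abstract does not describe how the TraumaCB retrieves passages or queries GPT-4, so the following is only a minimal, hypothetical illustration of the general retrieval-augmented generation pattern, not the authors' implementation. It assumes a tiny in-memory corpus with placeholder wording (`CORPUS`), substitutes a simple bag-of-words cosine similarity for whatever retriever the study used, and stops at assembling the prompt rather than calling any model. All names (`CORPUS`, `retrieve`, `build_prompt`, the `source-A` labels) are invented for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical stand-in corpus of (source label, passage). The wording is a
# placeholder, not content from the RadioGraphics reading list.
CORPUS = [
    ("source-A", "classification of splenic laceration and grading criteria"),
    ("source-B", "classification of pelvic ring fracture patterns"),
    ("source-C", "cervical spine injury description and classification"),
]


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a)  # Counter yields 0 for missing tokens
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, corpus, k=2):
    """Return up to k passages with non-zero similarity to the query."""
    q = Counter(tokenize(query))
    scored = [(cosine(q, Counter(tokenize(text))), label, text)
              for label, text in corpus]
    scored.sort(key=lambda item: item[0], reverse=True)
    return [(label, text) for score, label, text in scored[:k] if score > 0]


def build_prompt(findings, passages):
    """Combine retrieved passages and report findings into one prompt."""
    context = "\n".join(f"[{label}] {text}" for label, text in passages)
    return (
        "Use only the context below and cite the source labels.\n"
        f"Context:\n{context}\n\n"
        f"Report findings: {findings}\n"
        "Task: give the diagnosis, classification, and grade."
    )


findings = "CT shows a splenic laceration"
passages = retrieve(findings, CORPUS)
prompt = build_prompt(findings, passages)
print(prompt)
```

Carrying the source labels into the prompt is what would let a model cite the passages it relied on, which is the kind of transparency the raters scored under explanation and trust. A production system would more plausibly use embedding-based retrieval over the full reference texts; that, too, is an assumption rather than something stated in the abstract.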