
Multimodal Healthcare AI: Identifying and Designing Clinically Relevant Vision-Language Applications for Radiology

Authors:
Yildirim, Nur
Richardson, Hannah
Wetscherek, Maria T.
Bajwa, Junaid
Jacob, Joseph
Pinnock, Mark A.
Harris, Stephen
de Castro, Daniel Coelho
Bannur, Shruthi
Hyland, Stephanie L.
Ghosh, Pratik
Ranjit, Mercy
Bouzid, Kenza
Schwaighofer, Anton
Pérez-García, Fernando
Sharma, Harshita
Oktay, Ozan
Lungren, Matthew
Alvarez-Valle, Javier
Nori, Aditya
Thieme, Anja
Publication Year:
2024

Abstract

Recent advances in AI combine large language models (LLMs) with vision encoders, bringing forward unprecedented technical capabilities to leverage for a wide range of healthcare applications. Focusing on the domain of radiology, vision-language models (VLMs) perform well on tasks such as generating radiology findings based on a patient's medical image, or answering visual questions (e.g., 'Where are the nodules in this chest X-ray?'). However, the clinical utility of potential applications of these capabilities is currently underexplored. We engaged in an iterative, multidisciplinary design process to envision clinically relevant VLM interactions, and co-designed four VLM use concepts: Draft Report Generation, Augmented Report Review, Visual Search and Querying, and Patient Imaging History Highlights. We studied these concepts with 13 radiologists and clinicians who assessed the VLM concepts as valuable, yet articulated many design considerations. Reflecting on our findings, we discuss implications for integrating VLM capabilities in radiology, and for healthcare AI more generally.

Comment: to appear at CHI 2024

Details

Database:
arXiv
Publication Type:
Report
Accession number:
edsarx.2402.14252
Document Type:
Working Paper
Full Text:
https://doi.org/10.1145/3613904.3642013