CliniDigest: A Case Study in Large Language Model Based Large-Scale Summarization of Clinical Trial Descriptions

Authors :
White, Renee D.
Peng, Tristan
Sripitak, Pann
Johansen, Alexander Rosenberg
Snyder, Michael
Publication Year :
2023

Abstract

A clinical trial is a study that evaluates new biomedical interventions. To design new trials, researchers draw inspiration from current and completed ones. In 2022, there were on average more than 100 clinical trials submitted to ClinicalTrials.gov every day, with each trial having a mean of approximately 1500 words [1]. This makes it nearly impossible to keep up to date. To mitigate this issue, we have created a batch clinical trial summarizer called CliniDigest using GPT-3.5. CliniDigest is, to our knowledge, the first tool able to provide real-time, truthful, and comprehensive summaries of clinical trials. CliniDigest can reduce up to 85 clinical trial descriptions (approximately 10,500 words) into a concise 200-word summary with references and limited hallucinations. We have tested CliniDigest on its ability to summarize 457 trials divided across 27 medical subdomains. For each field, CliniDigest generates summaries of $\mu=153,\ \sigma=69$ words, each of which utilizes $\mu=54\%,\ \sigma=30\%$ of the sources. A more comprehensive evaluation is planned and outlined in this paper.

Comment: 7 pages, 3 figures, 3 tables; conference: ACM GoodIT '23; Second co-author: Tristan Peng; Citation: White, Peng, et al.

Details

Database :
arXiv
Publication Type :
Report
Accession number :
edsarx.2307.14522
Document Type :
Working Paper
Full Text :
https://doi.org/10.1145/3582515.3609559