1. Structuring and centralizing breast cancer real-world biomarker data from pathology reports through C-LAB artificial intelligence platform
- Author
- Florent Le Borgne, Camille Garnier, Camille Morisseau, Yanis Navarrete, Yanina Echeverria, Juan Mir, Jaume Calafell, Tanguy Perennec, Olivier Kerdraon, Jean-Sébastien Frenel, Judith Raimbourg, Mario Campone, Maria Fe Paz, and François Bocquet
- Subjects
- Computer applications to medicine. Medical informatics, R858-859.7
- Abstract
Purpose To evaluate the effectiveness of C-LAB®, an artificial intelligence (AI) platform, in extracting, structuring, and centralizing biomarker data from breast cancer pathology reports within the challenging, heterogeneous dataset of the Institut de Cancérologie de l’Ouest (ICO).
Methods C-LAB® was deployed at the ICO to analyze HER2 and hormone receptor data from breast cancer pathology reports. During the development phase, 292 anatomic pathology reports were used to design and refine the rule-based extraction algorithm through an iterative process of monitoring and adjustment. Once finalized, the algorithm was applied to 2323 anatomic pathology reports belonging to 487 patients. Performance metrics could be calculated only for the subset of these reports that were also available in the structured National Epidemiological Strategy and Medical Economics (ESME) database: 666 reports, corresponding to 97 patients. The structured ESME data for these reports served as the gold standard against which the outputs of the C-LAB® algorithm were compared.
Results C-LAB® achieved over 80% agreement with human extractions (precision, recall, and F1-score) in structuring biomarker data from complex, unstructured pathology reports, despite dataset variability and optical character recognition errors. Although the ESME database served as the benchmark, it relies on single manual data entry without secondary review, which introduces potential inaccuracies. This suggests that the observed performance reflects close alignment between human and algorithmic extractions rather than absolute accuracy. C-LAB® demonstrates significant potential to reduce manual workload, centralize data, and enable scalable, real-time reporting.
Conclusion AI technologies like C-LAB® show significant potential for creating accessible and actionable digital factories from complex pathology data, aiding the precision management of diseases such as breast cancer, including diagnostics and treatment.
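To make the evaluation described in the abstract concrete, the following is a minimal sketch of how a rule-based extraction could be scored against a structured reference using precision, recall, and F1-score. It is not the C-LAB® algorithm, which the abstract does not detail: the regular expression, the score-to-status mapping, the function names, and the toy reports are all hypothetical and exist only for illustration.

```python
import re

# Hypothetical rule: find an immunohistochemistry score (0 to 3) shortly after "HER2".
HER2_SCORE = re.compile(r"HER2\D{0,20}([0-3])\s*\+?", re.IGNORECASE)

# Simplified score-to-status mapping, for illustration only.
SCORE_TO_STATUS = {"0": "negative", "1": "negative", "2": "equivocal", "3": "positive"}


def extract_her2(report_text):
    """Return a HER2 status label, or None when the rule does not match."""
    match = HER2_SCORE.search(report_text)
    return SCORE_TO_STATUS[match.group(1)] if match else None


def precision_recall_f1(predicted, reference, label):
    """One-vs-rest precision, recall, and F1 for a single label."""
    pairs = list(zip(predicted, reference))
    tp = sum(p == label and r == label for p, r in pairs)
    fp = sum(p == label and r != label for p, r in pairs)
    fn = sum(p != label and r == label for p, r in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Invented toy reports; the last one mimics an unreadable (OCR-damaged) value.
reports = [
    "HER2: score 3+",
    "HER2 : score 0",
    "HER2 status, score 2+",
    "HER2 : illegible",
]
# Invented reference labels standing in for a manually curated database.
reference = ["positive", "negative", "equivocal", "positive"]

predicted = [extract_her2(text) for text in reports]
precision, recall, f1 = precision_recall_f1(predicted, reference, "positive")
print(predicted)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

In this toy case the unreadable report yields no extraction, which lowers recall for the "positive" label while leaving precision untouched. As the abstract notes, such scores measure agreement with the reference, so errors in the reference itself would also show up as disagreements.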
- Published
- 2025