Xulong Wang, Anna B. Berry, Yuhang Liu, Diane Wuest, Kristine Rinn, Iya Khalil, Patricia Dawson, Mariko Tameishi, Anna Holman, Vivek Mehta, Lauren K. Summers, George Richard Birchfield, Tanya A. Wahl, Mary Atwood, Wei Zheng, Boris Hayete, Erin Ellis, Henry S. Kaplan, Xiaoyu Liu, J D. Beatty, Candy Bonham, Thomas D. Brown, and Shlece Alexander
Background: In the era of personalized medicine, a major challenge is harnessing longitudinal data across the cancer care continuum, which includes multimodal data sets of biologic, molecular, and clinical information about patients (pts) and their tumors. There is a growing need for new computing analytics such as machine learning, an important tool in healthcare bioinformatics. We report our approach to building cancer disease models in an unbiased manner using a causal machine learning and simulation platform. Methods: The Swedish Cancer Institute (SCI) Personalized Medicine Research Program (PMRP) is a prospective registration protocol with the objective of establishing a centralized longitudinal, molecular, and phenotypic data repository. Since 2014, over 1,030 pts have been enrolled, all of whom have undergone next-generation sequencing (NGS) profiling of their tumors. Of these pts, we identified 100 breast cancer pts who also have detailed longitudinal clinical annotation within our SCI Breast Cancer Registry. All de-identified data, variables, and data points across the multimodal data types are integrated into normalized data frames that include demographics, cancer risks, tumor specifications, tumor sequencing, initial and subsequent cancer treatments, and outcomes data. A reverse engineering approach, via the Reverse Engineering and Forward Simulation (REFS) platform, is being used to discover the complex causal mechanisms that determine which therapies will produce the best outcomes for an individual pt. This method goes beyond traditional approaches that rely on data correlations to match treatments to pts. The breast cancer causal model uncovers many of the possible combinations of causal relationships that drive outcomes and enables “what if?” simulations of a variety of interventions, across pts, to determine optimal therapies.
Performance metrics and model robustness will be explored using a stratified, n-fold (e.g., 10-fold) cross-validation procedure, which is designed to provide an unbiased estimate of how well the model generalizes to new observations. Results: The causal model and simulations can improve providers' ability to understand treatment responses based on pts' unique clinical data and mutational statuses; to study different treatment options to optimize management; and to understand the complex interactions among variables that lead to a range of treatment outcomes. Conclusions: Knowledge generated from simulations of the disease model can potentially streamline and support the clinical decision-making process, including molecular tumor board deliberations, and ultimately assist providers in arriving at optimal treatment recommendations for pts. Citation Format: Henry Kaplan, Anna Berry, Kristine Rinn, Erin Ellis, George Birchfield, Tanya Wahl, Xiaoyu Liu, Mariko Tameishi, J D. Beatty, Patricia Dawson, Vivek Mehta, Anna Holman, Mary Atwood, Shlece Alexander, Candy Bonham, Lauren Summers, Iya Khalil, Boris Hayete, Diane Wuest, Wei Zheng, Yuhang Liu, Xulong Wang, Thomas David Brown. Machine learning approach to personalized medicine in breast cancer patients: Development of data-driven, personalized, causal modeling through identification and understanding of optimal treatments for predicting better disease outcomes [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2018; 2018 Apr 14-18; Chicago, IL. Philadelphia (PA): AACR; Cancer Res 2018;78(13 Suppl):Abstract nr 5299.
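Illustrative note: the stratified, 10-fold cross-validation mentioned in the Methods can be sketched as follows. This is a minimal sketch, not the study's pipeline: the function name `stratified_k_fold`, the binary outcome labels, and the 70/30 class split are all hypothetical and synthetic, and model fitting is omitted. It shows only how the 100 pts could be partitioned so that every fold keeps roughly the same outcome proportions as the full cohort.

```python
import random
from collections import defaultdict


def stratified_k_fold(labels, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs whose class mix mirrors the full data set."""
    rng = random.Random(seed)

    # Group sample indices by outcome class.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    # Shuffle within each class, then deal indices round-robin across folds.
    # The running offset keeps overall fold sizes balanced across classes.
    folds = [[] for _ in range(k)]
    offset = 0
    for indices in by_class.values():
        rng.shuffle(indices)
        for i, idx in enumerate(indices):
            folds[(offset + i) % k].append(idx)
        offset += len(indices)

    # Each fold serves once as the held-out test set.
    for i in range(k):
        test_idx = sorted(folds[i])
        train_idx = sorted(j for f in folds[:i] + folds[i + 1:] for j in f)
        yield train_idx, test_idx


# Synthetic labels for 100 hypothetical pts (assumed 70/30 split, not study data).
labels = [1] * 70 + [0] * 30
splits = list(stratified_k_fold(labels, k=10))
```

In use, a model would be fit on each `train_idx` and scored on the matching `test_idx`, and the ten scores averaged to estimate performance on new observations.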