11 results on '"Pfaff, Emily"'
Search Results
2. Preexisting Autoimmunity Is Associated With Increased Severity of Coronavirus Disease 2019: A Retrospective Cohort Study Using Data From the National COVID Cohort Collaborative (N3C).
- Author
- Yadaw, Arjun S, Sahner, David K, Sidky, Hythem, Afzali, Behdad, Hotaling, Nathan, Pfaff, Emily R, and Mathé, Ewy A
- Subjects
COVID-19, CONFIDENCE intervals, ANTI-inflammatory agents, AUTOIMMUNE diseases, RETROSPECTIVE studies, ACQUISITION of data, SEVERITY of illness index, RISK assessment, CATASTROPHIC illness, MEDICAL records, DESCRIPTIVE statistics, HOSPITAL care, RESEARCH funding, LOGISTIC regression analysis, IMMUNOSUPPRESSIVE agents, ODDS ratio, LONGITUDINAL method, DISEASE risk factors, DISEASE complications
- Abstract
Background: Identifying individuals with a higher risk of developing severe coronavirus disease 2019 (COVID-19) outcomes will inform targeted and more intensive clinical monitoring and management. To date, there is mixed evidence regarding the impact of preexisting autoimmune disease (AID) diagnosis and/or immunosuppressant (IS) exposure on developing severe COVID-19 outcomes. Methods: A retrospective cohort of adults diagnosed with COVID-19 was created in the National COVID Cohort Collaborative enclave. Two outcomes, life-threatening disease and hospitalization, were evaluated by using logistic regression models with and without adjustment for demographics and comorbidities. Results: Of the 2 453 799 adults diagnosed with COVID-19, 191 520 (7.81%) had a preexisting AID diagnosis and 278 095 (11.33%) had a preexisting IS exposure. Logistic regression models adjusted for demographics and comorbidities demonstrated that individuals with a preexisting AID (odds ratio [OR], 1.13; 95% confidence interval [CI]: 1.09–1.17; P < .001), IS exposure (OR, 1.27; 95% CI: 1.24–1.30; P < .001), or both (OR, 1.35; 95% CI: 1.29–1.40; P < .001) were more likely to have a life-threatening disease. These results were consistent when hospitalization was evaluated. A sensitivity analysis evaluating specific IS revealed that tumor necrosis factor inhibitors were protective against life-threatening disease (OR, 0.80; 95% CI: .66–.96; P = .017) and hospitalization (OR, 0.80; 95% CI: .73–.89; P < .001). Conclusions: Patients with preexisting AID, IS exposure, or both are more likely to have a life-threatening disease or hospitalization. These patients may thus require tailored monitoring and preventative measures to minimize negative consequences of COVID-19. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
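The abstract of result 2 reports odds ratios with 95% confidence intervals from adjusted logistic regression models. As a rough illustration of how an odds ratio and its interval are read, the sketch below computes an unadjusted odds ratio with a Wald interval from a 2x2 table. This is a simpler technique than the adjusted models used in the study, and the function name and the example counts are hypothetical, not taken from the paper.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Unadjusted odds ratio with a Wald confidence interval.

    a: exposed, outcome present     b: exposed, outcome absent
    c: unexposed, outcome present   d: unexposed, outcome absent
    z: normal quantile (1.96 gives an approximate 95% interval)
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) for a 2x2 table.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts, for illustration only.
or_, lower, upper = odds_ratio_ci(20, 80, 10, 90)
print(f"OR = {or_:.2f}, 95% CI: {lower:.2f}-{upper:.2f}")
```

An interval that excludes 1, as with the intervals quoted in the abstract, indicates an association at the chosen confidence level; an interval that spans 1 does not.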
3. De-black-boxing health AI: demonstrating reproducible machine learning computable phenotypes using the N3C-RECOVER Long COVID model in the All of Us data repository.
- Author
- Pfaff, Emily R, Girvin, Andrew T, Crosskey, Miles, Gangireddy, Srushti, Master, Hiral, Wei, Wei-Qi, Kerchberger, V Eric, Weiner, Mark, Harris, Paul A, Basford, Melissa, Lunt, Chris, Chute, Christopher G, Moffitt, Richard A, Haendel, Melissa, and Consortia, N3C and RECOVER
- Abstract
Machine learning (ML)-driven computable phenotypes are among the most challenging to share and reproduce. Despite this difficulty, the urgent public health considerations around Long COVID make it especially important to ensure the rigor and reproducibility of Long COVID phenotyping algorithms such that they can be made available to a broad audience of researchers. As part of the NIH Researching COVID to Enhance Recovery (RECOVER) Initiative, researchers with the National COVID Cohort Collaborative (N3C) devised and trained an ML-based phenotype to identify patients highly probable to have Long COVID. Supported by RECOVER, N3C and NIH's All of Us study partnered to reproduce the output of N3C's trained model in the All of Us data enclave, demonstrating model extensibility in multiple environments. This case study in ML-based phenotype reuse illustrates how open-source software best practices and cross-site collaboration can de-black-box phenotyping algorithms, prevent unnecessary rework, and promote open science in informatics. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
4. Clinical encounter heterogeneity and methods for resolving in networked EHR data: a study from N3C and RECOVER programs.
- Author
- Leese, Peter, Anand, Adit, Girvin, Andrew, Manna, Amin, Patel, Saaya, Yoo, Yun Jae, Wong, Rachel, Haendel, Melissa, Chute, Christopher G, Bennett, Tellen, Hajagos, Janos, Pfaff, Emily, and Moffitt, Richard
- Abstract
Objective: Clinical encounter data are heterogeneous and vary greatly from institution to institution. These problems of variance affect interpretability and usability of clinical encounter data for analysis. These problems are magnified when multisite electronic health record (EHR) data are networked together. This article presents a novel, generalizable method for resolving encounter heterogeneity for analysis by combining related atomic encounters into composite "macrovisits." Materials and Methods: Encounters were composed of data from 75 partner sites harmonized to a common data model as part of the NIH Researching COVID to Enhance Recovery Initiative, a project of the National Covid Cohort Collaborative. Summary statistics were computed for overall and site-level data to assess issues and identify modifications. Two algorithms were developed to refine atomic encounters into cleaner, analyzable longitudinal clinical visits. Results: Atomic inpatient encounters data were found to be widely disparate between sites in terms of length-of-stay (LOS) and numbers of OMOP CDM measurements per encounter. After aggregating encounters to macrovisits, LOS and measurement variance decreased. A subsequent algorithm to identify hospitalized macrovisits further reduced data variability. Discussion: Encounters are a complex and heterogeneous component of EHR data and native data issues are not addressed by existing methods. These types of complex and poorly studied issues contribute to the difficulty of deriving value from EHR data, and these types of foundational, large-scale explorations, and developments are necessary to realize the full potential of modern real-world data. Conclusion: This article presents method developments to manipulate and resolve EHR encounter data issues in a generalizable way as a foundation for future research and analysis. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
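The abstract of result 4 describes combining related atomic encounters into composite "macrovisits" but does not state the merge rules. The sketch below shows one plausible reading, assuming that a single patient's encounters are merged whenever their date ranges overlap or fall within a configurable gap. The function name, the `max_gap_days` parameter, and the sample dates are hypothetical and are not the published algorithm.

```python
from datetime import date
from typing import List, Tuple

# One atomic encounter for a single patient: (start_date, end_date), inclusive.
Encounter = Tuple[date, date]


def merge_into_macrovisits(
    encounters: List[Encounter], max_gap_days: int = 0
) -> List[Encounter]:
    """Merge overlapping or near-adjacent encounters into composite visits.

    Assumption: an encounter joins the current macrovisit when it starts no
    more than max_gap_days after that macrovisit's latest end date.
    """
    macrovisits: List[Encounter] = []
    for start, end in sorted(encounters):
        if macrovisits and (start - macrovisits[-1][1]).days <= max_gap_days:
            prev_start, prev_end = macrovisits[-1]
            # Extend the current macrovisit to cover this encounter.
            macrovisits[-1] = (prev_start, max(prev_end, end))
        else:
            macrovisits.append((start, end))
    return macrovisits


# Hypothetical encounters for one patient.
atomic = [
    (date(2021, 1, 2), date(2021, 1, 5)),
    (date(2021, 1, 1), date(2021, 1, 3)),
    (date(2021, 1, 10), date(2021, 1, 10)),
]
for start, end in merge_into_macrovisits(atomic):
    print(start, end, "length of stay:", (end - start).days, "days")
```

Sorting first keeps the merge to a single pass. Length of stay is then computed per macrovisit rather than per atomic encounter, which is the quantity the abstract reports as becoming less variable after aggregation.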
5. Harmonizing units and values of quantitative data elements in a very large nationally pooled electronic health record (EHR) dataset.
- Author
- Bradwell, Katie R, Wooldridge, Jacob T, Amor, Benjamin, Bennett, Tellen D, Anand, Adit, Bremer, Carolyn, Yoo, Yun Jae, Qian, Zhenglong, Johnson, Steven G, Pfaff, Emily R, Girvin, Andrew T, Manna, Amin, Niehaus, Emily A, Hong, Stephanie S, Zhang, Xiaohan Tanner, Zhu, Richard L, Bissell, Mark, Qureshi, Nabeel, Saltz, Joel, and Haendel, Melissa A
- Abstract
Objective: The goals of this study were to harmonize data from electronic health records (EHRs) into common units, and impute units that were missing. Materials and Methods: The National COVID Cohort Collaborative (N3C) table of laboratory measurement data comprised over 3.1 billion patient records and over 19 000 unique measurement concepts in the Observational Medical Outcomes Partnership (OMOP) common-data-model format from 55 data partners. We grouped ontologically similar OMOP concepts together for 52 variables relevant to COVID-19 research, and developed a unit-harmonization pipeline consisting of (1) selecting a canonical unit for each measurement variable, (2) arriving at a formula for conversion, (3) obtaining clinical review of each formula, (4) applying the formula to convert data values in each unit into the target canonical unit, and (5) removing any harmonized value that fell outside of accepted value ranges for the variable. For data with missing units for all the results within a lab test for a data partner, we compared values with pooled values of all data partners, using the Kolmogorov-Smirnov test. Results: Of the concepts without missing values, we harmonized 88.1% of the values, and imputed units for 78.2% of records where units were absent (41% of contributors' records lacked units). Discussion: The harmonization and inference methods developed herein can serve as a resource for initiatives aiming to extract insight from heterogeneous EHR collections. Unique properties of centralized data are harnessed to enable unit inference. Conclusion: The pipeline we developed for the pooled N3C data enables use of measurements that would otherwise be unavailable for analysis. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
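The unit-harmonization pipeline listed in the abstract of result 5 can be sketched for a single variable. The example below covers steps 1, 2, 4, and 5 for body weight; the clinical review step and the Kolmogorov-Smirnov unit inference are not shown. The canonical unit, the conversion table, the accepted range, and the function name are all assumptions chosen for illustration, not values from the paper.

```python
from typing import Optional

# Step 1 (assumed): canonical unit chosen for the "body weight" variable.
CANONICAL_UNIT = "kg"

# Step 2 (assumed): multiplicative conversion factors into the canonical unit.
TO_CANONICAL = {
    "kg": 1.0,
    "g": 0.001,
    "lb": 0.45359237,  # international avoirdupois pound
}

# Step 5 (assumed): accepted value range in canonical units; illustrative only.
ACCEPTED_RANGE = (0.5, 500.0)


def harmonize(value: float, unit: str) -> Optional[float]:
    """Convert a measurement to the canonical unit, or return None.

    None is returned when the unit has no known conversion or when the
    converted value falls outside the accepted range.
    """
    factor = TO_CANONICAL.get(unit)
    if factor is None:
        return None
    converted = value * factor  # Step 4: apply the conversion formula.
    low, high = ACCEPTED_RANGE
    if not (low <= converted <= high):
        return None  # Step 5: drop implausible harmonized values.
    return converted


# Hypothetical records: (value, unit).
records = [(70.0, "kg"), (154.0, "lb"), (70000.0, "kg"), (70.0, "stone")]
for value, unit in records:
    print(value, unit, "->", harmonize(value, unit), CANONICAL_UNIT)
```

Returning `None` rather than raising keeps the function usable in a bulk pass over records, where unconvertible or out-of-range rows are simply filtered out afterward.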
6. Supporting research, protecting data: one institution's approach to clinical data warehouse governance.
- Author
- Walters, Kellie M, Jojic, Anna, Pfaff, Emily R, Rape, Marie, Spencer, Donald C, Shaheen, Nicholas J, Lamm, Brent, and Carey, Timothy S
- Abstract
Institutions must decide how to manage the use of clinical data to support research while ensuring appropriate protections are in place. Questions about data use and sharing often go beyond what the Health Insurance Portability and Accountability Act of 1996 (HIPAA) considers. In this article, we describe our institution's governance model and approach. Common questions we consider include (1) Is a request limited to the minimum data necessary to carry the research forward? (2) What plans are there for sharing data externally?, and (3) What impact will the proposed use of data have on patients and the institution? In 2020, 302 of the 319 requests reviewed were approved. The majority of requests were approved in less than 2 weeks, with few or no stipulations. For the remaining requests, the governance committee works with researchers to find solutions to meet their needs while also addressing our collective goal of protecting patients. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
7. Synergies between centralized and federated approaches to data quality: a report from the national COVID cohort collaborative.
- Author
- Pfaff, Emily R, Girvin, Andrew T, Gabriel, Davera L, Kostka, Kristin, Morris, Michele, Palchuk, Matvey B, Lehmann, Harold P, Amor, Benjamin, Bissell, Mark, Bradwell, Katie R, Gold, Sigfried, Hong, Stephanie S, Loomba, Johanna, Manna, Amin, McMurry, Julie A, Niehaus, Emily, Qureshi, Nabeel, Walden, Anita, Zhang, Xiaohan Tanner, and Zhu, Richard L
- Abstract
Objective: In response to COVID-19, the informatics community united to aggregate as much clinical data as possible to characterize this new disease and reduce its impact through collaborative analytics. The National COVID Cohort Collaborative (N3C) is now the largest publicly available HIPAA limited dataset in US history with over 6.4 million patients and is a testament to a partnership of over 100 organizations. Materials and Methods: We developed a pipeline for ingesting, harmonizing, and centralizing data from 56 contributing data partners using 4 federated Common Data Models. N3C data quality (DQ) review involves both automated and manual procedures. In the process, several DQ heuristics were discovered in our centralized context, both within the pipeline and during downstream project-based analysis. Feedback to the sites led to many local and centralized DQ improvements. Results: Beyond well-recognized DQ findings, we discovered 15 heuristics relating to source Common Data Model conformance, demographics, COVID tests, conditions, encounters, measurements, observations, coding completeness, and fitness for use. Of 56 sites, 37 sites (66%) demonstrated issues through these heuristics. These 37 sites demonstrated improvement after receiving feedback. Discussion: We encountered site-to-site differences in DQ which would have been challenging to discover using federated checks alone. We have demonstrated that centralized DQ benchmarking reveals unique opportunities for DQ improvement that will support improved research analytics locally and in aggregate. Conclusion: By combining rapid, continual assessment of DQ with a large volume of multisite data, it is possible to support more nuanced scientific questions with the scale and rigor that they require. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
8. The National COVID Cohort Collaborative (N3C): Rationale, design, infrastructure, and deployment.
- Author
- Haendel, Melissa A, Chute, Christopher G, Bennett, Tellen D, Eichmann, David A, Guinney, Justin, Kibbe, Warren A, Payne, Philip R O, Pfaff, Emily R, Robinson, Peter N, Saltz, Joel H, Spratt, Heidi, Suver, Christine, Wilbanks, John, Wilcox, Adam B, Williams, Andrew E, Wu, Chunlei, Blacketer, Clair, Bradford, Robert L, Cimino, James J, and Clark, Marshall
- Abstract
Objective: Coronavirus disease 2019 (COVID-19) poses societal challenges that require expeditious data and knowledge sharing. Though organizational clinical data are abundant, these are largely inaccessible to outside researchers. Statistical, machine learning, and causal analyses are most successful with large-scale data beyond what is available in any given organization. Here, we introduce the National COVID Cohort Collaborative (N3C), an open science community focused on analyzing patient-level data from many centers. Materials and Methods: The Clinical and Translational Science Award Program and scientific community created N3C to overcome technical, regulatory, policy, and governance barriers to sharing and harmonizing individual-level clinical data. We developed solutions to extract, aggregate, and harmonize data across organizations and data models, and created a secure data enclave to enable efficient, transparent, and reproducible collaborative analytics. Results: Organized in inclusive workstreams, we created legal agreements and governance for organizations and researchers; data extraction scripts to identify and ingest positive, negative, and possible COVID-19 cases; a data quality assurance and harmonization pipeline to create a single harmonized dataset; population of the secure data enclave with data, machine learning, and statistical analytics tools; dissemination mechanisms; and a synthetic data pilot to democratize data access. Conclusions: The N3C has demonstrated that a multisite collaborative learning health network can overcome barriers to rapidly build a scalable infrastructure incorporating multiorganizational clinical data for COVID-19 analytics. We expect this effort to save lives by enabling rapid collaboration among clinicians, researchers, and data scientists to identify treatments and specialized care and thereby reduce the immediate and long-term impacts of COVID-19. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
9. A novel approach for exposing and sharing clinical data: the Translator Integrated Clinical and Environmental Exposures Service.
- Author
- Fecho, Karamarie, Pfaff, Emily, Xu, Hao, Champion, James, Cox, Steve, Stillwell, Lisa, Peden, David B, Bizon, Chris, Krishnamurthy, Ashok, Tropsha, Alexander, and Ahalt, Stanley C
- Abstract
Objective: This study aimed to develop a novel, regulatory-compliant approach for openly exposing integrated clinical and environmental exposures data: the Integrated Clinical and Environmental Exposures Service (ICEES). Materials and Methods: The driving clinical use case for research and development of ICEES was asthma, which is a common disease influenced by hundreds of genes and a plethora of environmental exposures, including exposures to airborne pollutants. We developed a pipeline for integrating clinical data on patients with asthma-like conditions with data on environmental exposures derived from multiple public data sources. The data were integrated at the patient and visit level and used to create de-identified, binned, "integrated feature tables," which were then placed behind an OpenAPI. Results: Our preliminary evaluation results demonstrate a relationship between exposure to high levels of particulate matter ≤2.5 µm in diameter (PM2.5) and the frequency of emergency department or inpatient visits for respiratory issues. For example, 16.73% of patients with average daily exposure to PM2.5 >9.62 µg/m³ experienced 2 or more emergency department or inpatient visits for respiratory issues in year 2010 compared with 7.93% of patients with lower exposures (n = 23 093). Discussion: The results validated our overall approach for openly exposing and sharing integrated clinical and environmental exposures data. We plan to iteratively refine and expand ICEES by including additional years of data, feature variables, and disease cohorts. Conclusions: We believe that ICEES will serve as a regulatory-compliant model and approach for promoting open access to and sharing of integrated clinical and environmental exposures data. [ABSTRACT FROM AUTHOR]
- Published
- 2019
- Full Text
- View/download PDF
10. Recruiting for a pragmatic trial using the electronic health record and patient portal: successes and lessons learned.
- Author
- Pfaff, Emily, Lee, Adam, Bradford, Robert, Pae, Jinhee, Potter, Clarence, Blue, Paul, Knoepp, Patricia, Thompson, Kristie, Roumie, Christianne L, Crenshaw, David, Servis, Remy, and DeWalt, Darren A
- Abstract
Objective: Querying electronic health records (EHRs) to find patients meeting study criteria is an efficient method of identifying potential study participants. We aimed to measure the effectiveness of EHR-driven recruitment in the context of ADAPTABLE (Aspirin Dosing: A Patient-centric Trial Assessing Benefits and Long-Term Effectiveness), a pragmatic trial aiming to recruit 15 000 patients. Materials and Methods: We compared the participant yield of 4 recruitment methods: in-clinic recruitment by a research coordinator, letters, direct email, and patient portal messages. Taken together, the latter 2 methods comprised our EHR-driven electronic recruitment workflow. Results: The electronic recruitment workflow sent electronic messages to 12 254 recipients; 13.5% of these recipients visited the study website, and 4.2% enrolled in the study. Letters were sent to 427 recipients; 5.6% visited the study website, and 3.3% enrolled in the study. Coordinators recruited 339 participants in clinic; 23.6% visited the study website, and 16.8% enrolled in the study. Five-hundred-nine of the 580 UNC enrollees (87.8%) were recruited using an electronic method. Discussion: Electronic recruitment reached a wide net of patients, recruited many participants to the study, and resulted in a workflow that can be reused for future studies. In-clinic recruitment saw the highest yield, suggesting that a combination of recruitment methods may be the best approach. Future work should account for demographic skew that may result by recruiting from a pool of patient portal users. Conclusion: The success of electronic recruitment for ADAPTABLE makes this workflow well worth incorporating into an overall recruitment strategy, particularly for a pragmatic trial. [ABSTRACT FROM AUTHOR]
- Published
- 2019
- Full Text
- View/download PDF
11. An efficient approach for surveillance of childhood diabetes by type derived from electronic health record data: the SEARCH for Diabetes in Youth Study.
- Author
- Zhong, Victor W., Obeid, Jihad S., Craig, Jean B., Pfaff, Emily R., Thomas, Joan, Jaacks, Lindsay M., Beavers, Daniel P., Carey, Timothy S., Lawrence, Jean M., Dabelea, Dana, Hamman, Richard F., Bowlby, Deborah A., Pihoker, Catherine, Saydah, Sharon H., and Mayer-Davis, Elizabeth J.
- Abstract
Objective: To develop an efficient surveillance approach for childhood diabetes by type across 2 large US health care systems, using phenotyping algorithms derived from electronic health record (EHR) data. Materials and Methods: Presumptive diabetes cases <20 years of age from 2 large independent health care systems were identified as those having ≥1 of the 5 indicators in the past 3.5 years, including elevated HbA1c, elevated blood glucose, diabetes-related billing codes, patient problem list, and outpatient anti-diabetic medications. EHRs of all the presumptive cases were manually reviewed, and true diabetes status and diabetes type were determined. Algorithms for identifying diabetes cases overall and classifying diabetes type were either prespecified or derived from classification and regression tree analysis. Surveillance approach was developed based on the best algorithms identified. Results: We developed a stepwise surveillance approach using billing code-based prespecified algorithms and targeted manual EHR review, which efficiently and accurately ascertained and classified diabetes cases by type, in both health care systems. The sensitivity and positive predictive values in both systems were approximately ≥90% for ascertaining diabetes cases overall and classifying cases with type 1 or type 2 diabetes. About 80% of the cases with "other" type were also correctly classified. This stepwise surveillance approach resulted in a >70% reduction in the number of cases requiring manual validation compared to traditional surveillance methods. Conclusion: EHR data may be used to establish an efficient approach for large-scale surveillance for childhood diabetes by type, although some manual effort is still needed. [ABSTRACT FROM AUTHOR]
- Published
- 2016
- Full Text
- View/download PDF