10 results on '"Malin, Bradley"'
Search Results
2. Dobbs and the future of health data privacy for patients and healthcare organizations.
- Author
-
Clayton, Ellen Wright, Embí, Peter J, and Malin, Bradley A
- Abstract
The Supreme Court recently overturned settled case law that affirmed a pregnant individual's Constitutional right to an abortion. While many states will commit to protect this right, a large number of others have enacted laws that limit or outright ban abortion within their borders. Additional efforts are underway to prevent pregnant individuals from seeking care outside their home state. These changes have significant implications for delivery of healthcare as well as for patient-provider confidentiality. In particular, these laws will influence how information is documented in and accessed via electronic health records and how personal health applications are utilized in the consumer domain. We discuss how these changes may lead to confusion and conflict regarding use of health information, both within and across state lines, why current health information security practices may need to be reconsidered, and what policy options may be possible to protect individuals' health information. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
3. Ensuring electronic medical record simulation through better training, modeling, and evaluation.
- Author
-
Zhang, Ziqi, Yan, Chao, Mesa, Diego A, Sun, Jimeng, and Malin, Bradley A
- Abstract
Objective: Electronic medical records (EMRs) can support medical research and discovery, but privacy risks limit the sharing of such data on a wide scale. Various approaches have been developed to mitigate risk, including record simulation via generative adversarial networks (GANs). While showing promise in certain application domains, GANs lack a principled approach for EMR data that induces subpar simulation. In this article, we improve EMR simulation through a novel pipeline that (1) enhances the learning model, (2) incorporates evaluation criteria for data utility that informs learning, and (3) refines the training process. Materials and Methods: We propose a new electronic health record generator using a GAN with a Wasserstein divergence and layer normalization techniques. We designed 2 utility measures to characterize similarity in the structural properties of real and simulated EMRs in the original and latent space, respectively. We applied a filtering strategy to enhance GAN training for low-prevalence clinical concepts. We evaluated the new and existing GANs with utility and privacy measures (membership and disclosure attacks) using billing codes from over 1 million EMRs at Vanderbilt University Medical Center. Results: The proposed model outperformed the state-of-the-art approaches with significant improvement in retaining the nature of real records, including prediction performance and structural properties, without sacrificing privacy. Additionally, the filtering strategy achieved higher utility when the EMR training dataset was small. Conclusions: These findings illustrate that EMR simulation through GANs can be substantially improved through more appropriate training, modeling, and evaluation criteria. [ABSTRACT FROM AUTHOR]
- Published
- 2020
- Full Text
- View/download PDF
4. Interaction patterns of trauma providers are associated with length of stay.
- Author
-
Chen, You, Patel, Mayur B., McNaughton, Candace D., and Malin, Bradley A.
- Abstract
Background: Trauma-related hospitalizations drive a high percentage of health care expenditure and inpatient resource consumption, which is directly related to length of stay (LOS). Robust and reliable interactions among health care employees can reduce LOS. However, there is little known about whether certain patterns of interactions exist and how they relate to LOS and its variability. The objective of this study is to learn interaction patterns and quantify the relationship to LOS within a mature trauma system and long-standing electronic medical record (EMR). Methods: We adapted a spectral co-clustering methodology to infer the interaction patterns of health care employees based on the EMR of 5588 hospitalized adult trauma survivors. The relationship between interaction patterns and LOS was assessed via a negative binomial regression model. We further assessed the influence of potential confounders by age, number of health care encounters to date, number of access action types care providers committed to patient EMRs, month of admission, phenome-wide association study codes, procedure codes, and insurance status. Results: Three types of interaction patterns were discovered. The first pattern exhibited the most collaboration between employees and was associated with the shortest LOS. Compared to this pattern, LOS for the second and third patterns was 0.61 days (P = 0.014) and 0.43 days (P = 0.037) longer, respectively. Although the 3 interaction patterns dealt with different numbers of patients in each admission month, our results suggest that care was provided for similar patients. Discussion: The results of this study indicate there is an association between LOS and the extent to which health care employees interact in the care of an injured patient. The findings further suggest that there is merit in ascertaining the content of these interactions and the factors that induce these differences in interaction patterns within a trauma system. [ABSTRACT FROM AUTHOR]
- Published
- 2018
- Full Text
- View/download PDF
5. Identifying collaborative care teams through electronic medical record utilization patterns.
- Author
-
Chen, You, Lorenzi, Nancy M., Sandberg, Warren S., Wolgast, Kelly, and Malin, Bradley A.
- Abstract
Objective: The goal of this investigation was to determine whether automated approaches can learn patient-oriented care teams via utilization of an electronic medical record (EMR) system. Materials and Methods: To perform this investigation, we designed a data-mining framework that relies on a combination of latent topic modeling and network analysis to infer patterns of collaborative teams. We applied the framework to the EMR utilization records of over 10 000 employees and 17 000 inpatients at a large academic medical center during a 4-month window in 2010. Next, we conducted an extrinsic evaluation of the patterns to determine the plausibility of the inferred care teams via surveys with knowledgeable experts. Finally, we conducted an intrinsic evaluation to contextualize each team in terms of collaboration strength (via a cluster coefficient) and clinical credibility (via associations between teams and patient comorbidities). Results: The framework discovered 34 collaborative care teams, 27 (79.4%) of which were confirmed as administratively plausible. Of those, 26 teams depicted strong collaborations, with a cluster coefficient > 0.5. There were 119 diagnostic conditions associated with 34 care teams. Additionally, to provide clarity on how the survey respondents arrived at their determinations, we worked with several oncologists to develop an illustrative example of how a certain team functions in cancer care. Discussion: Inferred collaborative teams are plausible; translating such patterns into optimized collaborative care will require administrative review and integration with management practices. Conclusions: EMR utilization records can be mined for collaborative care patterns in large complex medical centers. [ABSTRACT FROM AUTHOR]
- Published
- 2017
- Full Text
- View/download PDF
6. Optimizing annotation resources for natural language de-identification via a game theoretic framework.
- Author
-
Li, Muqun, Carrell, David, Aberdeen, John, Hirschman, Lynette, Kirby, Jacqueline, Li, Bo, Vorobeychik, Yevgeniy, and Malin, Bradley A.
- Abstract
Objective: Electronic medical records (EMRs) are increasingly repurposed for activities beyond clinical care, such as to support translational research and public policy analysis. To mitigate privacy risks, healthcare organizations (HCOs) aim to remove potentially identifying patient information. A substantial quantity of EMR data is in natural language form and there are concerns that automated tools for detecting identifiers are imperfect and leak information that can be exploited by ill-intentioned data recipients. Thus, HCOs have been encouraged to invest as much effort as possible to find and detect potential identifiers, but such a strategy assumes the recipients are sufficiently incentivized and capable of exploiting leaked identifiers. In practice, such an assumption may not hold true and HCOs may overinvest in de-identification technology. The goal of this study is to design a natural language de-identification framework, rooted in game theory, which enables an HCO to optimize their investments given the expected capabilities of an adversarial recipient. Methods: We introduce a Stackelberg game to balance risk and utility in natural language de-identification. This game represents a cost-benefit model that enables an HCO with a fixed budget to minimize their investment in the de-identification process. We evaluate this model by assessing the overall payoff to the HCO and the adversary using 2100 clinical notes from Vanderbilt University Medical Center. We simulate several policy alternatives using a range of parameters, including the cost of training a de-identification model and the loss in data utility due to the removal of terms that are not identifiers. In addition, we compare policy options where, when an attacker is fined for misuse, a monetary penalty is paid to the publishing HCO as opposed to a third party (e.g., a federal regulator). Results: Our results show that when an HCO is forced to exhaust a limited budget (set to $2000 in the study), the precision and recall of the de-identification of the HCO are 0.86 and 0.8, respectively. A game-based approach enables a more refined cost-benefit tradeoff, improving both privacy and utility for the HCO. For example, our investigation shows that it is possible for an HCO to release the data without spending all their budget on de-identification and still deter the attacker, with a precision of 0.77 and a recall of 0.61 for the de-identification. There also exist scenarios in which the model indicates an HCO should not release any data because the risk is too great. In addition, we find that the practice of paying fines back to a HCO (an artifact of suing for breach of contract), as opposed to a third party such as a federal regulator, can induce an elevated level of data sharing risk, where the HCO is incentivized to bait the attacker to elicit compensation. Conclusions: A game theoretic framework can be applied in leading HCO's to optimized decision making in natural language de-identification investments before sharing EMR data. [ABSTRACT FROM AUTHOR]
- Published
- 2016
- Full Text
- View/download PDF
7. Building bridges across electronic health record systems through inferred phenotypic topics.
- Author
-
Chen, You, Ghosh, Joydeep, Bejan, Cosmin Adrian, Gunter, Carl A., Gupta, Siddharth, Kho, Abel, Liebovitz, David, Sun, Jimeng, Denny, Joshua, and Malin, Bradley
- Abstract
Objective Data in electronic health records (EHRs) is being increasingly leveraged for secondary uses, ranging from biomedical association studies to comparative effectiveness. To perform studies at scale and transfer knowledge from one institution to another in a meaningful way, we need to harmonize the phenotypes in such systems. Traditionally, this has been accomplished through expert specification of phenotypes via standardized terminologies, such as billing codes. However, this approach may be biased by the experience and expectations of the experts, as well as the vocabulary used to describe such patients. The goal of this work is to develop a data-driven strategy to (1) infer phenotypic topics within patient populations and (2) assess the degree to which such topics facilitate a mapping across populations in disparate healthcare systems. Methods We adapt a generative topic modeling strategy, based on latent Dirichlet allocation, to infer phenotypic topics. We utilize a variance analysis to assess the projection of a patient population from one healthcare system onto the topics learned from another system. The consistency of learned phenotypic topics was evaluated using (1) the similarity of topics, (2) the stability of a patient population across topics, and (3) the transferability of a topic across sites. We evaluated our approaches using four months of inpatient data from two geographically distinct healthcare systems: (1) Northwestern Memorial Hospital (NMH) and (2) Vanderbilt University Medical Center (VUMC). Results The method learned 25 phenotypic topics from each healthcare system. The average cosine similarity between matched topics across the two sites was 0.39, a remarkably high value given the very high dimensionality of the feature space. The average stability of VUMC and NMH patients across the topics of two sites was 0.988 and 0.812, respectively, as measured by the Pearson correlation coefficient. 
Also the VUMC and NMH topics have smaller variance of characterizing patient population of two sites than standard clinical terminologies (e.g., ICD9), suggesting they may be more reliably transferred across hospital systems. Conclusions Phenotypic topics learned from EHR data can be more stable and transferable than billing codes for characterizing the general status of a patient population. This suggests that EHR-based research may be able to leverage such phenotypic topics as variables when pooling patient populations in predictive models. [ABSTRACT FROM AUTHOR]
- Published
- 2015
- Full Text
- View/download PDF
8. Secure construction of k-unlinkable patient records from distributed providers
- Author
-
Malin, Bradley
- Subjects
-
*MEDICAL records, *LEGAL status of patients, *HEALTH facilities, *COMPUTER algorithms, *COMPUTERS in medicine, *HOSPITAL admission & discharge, *DATABASES, *MEDICAL informatics
- Abstract
Abstract: Objectives: Healthcare organizations must adopt measures to uphold their patients’ right to anonymity when sharing sensitive records, such as DNA sequences, to publicly accessible databanks. This is often achieved by suppressing patient identifiable information; however, such a practice is insufficient because the same organizations may disclose identified patient information, devoid of the sensitive information, for other purposes and patients’ organization-visit patterns, or trails, can re-identify records to the identities from which they were derived. There exist various algorithms that healthcare organizations can apply to ascertain when a patient's record is susceptible to trail re-identification, but they require organizations to exchange information regarding the identities of their patients prior to data protection certification. In this paper, we introduce an algorithmic approach to formally thwart trail re-identification in a secure setting. Methods and materials: We present a framework that allows data holders to securely collaborate through a third party. In doing so, healthcare organizations keep all sensitive information in an encrypted state until the third party certifies that the data to be disclosed satisfies a formal data protection model. The model adopted for this work is an extended form of k-unlinkability, a protection model that, until this work, was applied in a non-secure setting only. Given the framework and protection model, we develop an algorithm to generate data that satisfies the protection model. In doing so, we enable healthcare organizations to prevent trail re-identification without revealing identified information. Results: Theoretically, we prove that the proposed data protection model does not leak information, even in the context of an organization's prior knowledge. Empirically, we use real world hospital discharge records to demonstrate that, while the secure protocol induces additional suppression of patient information in comparison to an existing non-secure approach, the quantity of data disclosed by the secure protocol remains substantial. For instance, in a population of over 7700 sickle cell anemia patients, the non-secure protocol discloses 99.48% of DNA records whereas the secure protocol permits the disclosure of 99.41%. Conclusions: Our results demonstrate healthcare organizations can collaborate to disclose significant quantities of personal biomedical data without violating their anonymity in the process. [Copyright Elsevier]
- Published
- 2010
- Full Text
- View/download PDF
9. A computational model to protect patient data from location-based re-identification
- Author
-
Malin, Bradley
- Subjects
-
*ARTIFICIAL intelligence in medicine, *RIGHT of privacy, *PATIENTS' rights, *GENOMICS, *MEDICAL record access control, *DATABASES
- Abstract
Objective: Health care organizations must preserve a patient's anonymity when disclosing personal data. Traditionally, patient identity has been protected by stripping identifiers from sensitive data such as DNA. However, simple automated methods can re-identify patient data using public information. In this paper, we present a solution to prevent a threat to patient anonymity that arises when multiple health care organizations disclose data. In this setting, a patient's location visit pattern, or “trail”, can re-identify seemingly anonymous DNA to patient identity. This threat exists because health care organizations (1) cannot prevent the disclosure of certain types of patient information and (2) do not know how to systematically avoid trail re-identification. In this paper, we develop and evaluate computational methods that health care organizations can apply to disclose patient-specific DNA records that are impregnable to trail re-identification. Methods and materials: To prevent trail re-identification, we introduce a formal model called k-unlinkability, which enables health care administrators to specify different degrees of patient anonymity. Specifically, k-unlinkability is satisfied when the trail of each DNA record is linkable to no less than k identified records. We present several algorithms that enable health care organizations to coordinate their data disclosure, so that they can determine which DNA records can be shared without violating k-unlinkability. We evaluate the algorithms with the trails of patient populations derived from publicly available hospital discharge databases. Algorithm efficacy is evaluated using metrics based on real world applications, including the number of suppressed records and the number of organizations that disclose records. Results: Our experiments indicate that it is unnecessary to suppress all patient records that initially violate k-unlinkability. Rather, only portions of the trails need to be suppressed. For example, if each hospital discloses 100% of its data on patients diagnosed with cystic fibrosis, then 48% of the DNA records are 5-unlinkable. A naïve solution would suppress the 52% of the DNA records that violate 5-unlinkability. However, by applying our protection algorithms, the hospitals can disclose 95% of the DNA records, all of which are 5-unlinkable. Similar findings hold for all populations studied. Conclusion: This research demonstrates that patient anonymity can be formally protected in shared databases. Our findings illustrate that significant quantities of patient-specific data can be disclosed with provable protection from trail re-identification. The configurability of our methods allows health care administrators to quantify the effects of different levels of privacy protection and formulate policy accordingly. [Copyright Elsevier]
- Published
- 2007
- Full Text
- View/download PDF
10. De-identification of clinical narratives through writing complexity measures.
- Author
-
Li, Muqun, Carrell, David, Aberdeen, John, Hirschman, Lynette, and Malin, Bradley A.
- Subjects
-
*ELECTRONIC health records, *MACHINE learning, *STYLOMETRY, *RADIOLOGY, *COMPARATIVE studies, *HOSPITAL admission & discharge
- Abstract
Purpose Electronic health records contain a substantial quantity of clinical narrative, which is increasingly reused for research purposes. To share data on a large scale and respect privacy, it is critical to remove patient identifiers. De-identification tools based on machine learning have been proposed; however, model training is usually based on either a random group of documents or a pre-existing document type designation (e.g., discharge summary). This work investigates if inherent features, such as the writing complexity, can identify document subsets to enhance de-identification performance. Methods We applied an unsupervised clustering method to group two corpora based on writing complexity measures: a collection of over 4500 documents of varying document types (e.g., discharge summaries, history and physical reports, and radiology reports) from Vanderbilt University Medical Center (VUMC) and the publicly available i2b2 corpus of 889 discharge summaries. We compare the performance (via recall, precision, and F-measure) of de-identification models trained on such clusters with models trained on documents grouped randomly or VUMC document type. Results For the Vanderbilt dataset, it was observed that training and testing de-identification models on the same stylometric cluster (with the average F-measure of 0.917) tended to outperform models based on clusters of random documents (with an average F-measure of 0.881). It was further observed that increasing the size of a training subset sampled from a specific cluster could yield improved results (e.g., for subsets from a certain stylometric cluster, the F-measure raised from 0.743 to 0.841 when training size increased from 10 to 50 documents, and the F-measure reached 0.901 when the size of the training subset reached 200 documents). For the i2b2 dataset, training and testing on the same clusters based on complexity measures (average F-score 0.966) did not significantly surpass randomly selected clusters (average F-score 0.965). Conclusions Our findings illustrate that, in environments consisting of a variety of clinical documentation, de-identification models trained on writing complexity measures are better than models trained on random groups and, in many instances, document types. [ABSTRACT FROM AUTHOR]
- Published
- 2014
- Full Text
- View/download PDF
Discovery Service for Jio Institute Digital Library