1. Classification with imbalance: A similarity-based method for predicting respiratory failure
- Author
Sakyajit Bhattacharya, Vijay Huddar, Harsh Shrivastava, and Vaibhav Rajan
- Subjects
Computer science, business.industry, Pattern recognition, Machine learning, computer.software_genre, Imbalanced data, ComputingMethodologies_PATTERNRECOGNITION, Respiratory failure, Binary classification, Nursing notes, Intensive care, Icu stay, Acute respiratory failure, Artificial intelligence, business, computer, Classifier (UML)
- Abstract
Methods based on binary classification are commonly used for designing predictive models in healthcare. A common problem in many healthcare datasets is class imbalance, where the training data contain far more observations in one class than in the other. Under such conditions, most classifiers have poor predictive accuracy on the under-represented class. We design a new similarity-based classifier to learn from imbalanced datasets, wherein input features are transformed using similarity with respect to a chosen subset of training points. We empirically demonstrate the superiority of our algorithm over state-of-the-art methods for imbalanced data classification on real and synthetic datasets. We also illustrate the application of our classifier in predicting Acute Respiratory Failure (ARF), a critical complication in Intensive Care Units (ICUs), using semi-structured text contained in nursing notes recorded during a patient's ICU stay. Our experiments on more than 800 patient records show that our new classifier, learning from text-based features, can effectively predict ARF and, potentially, other complications in ICUs.
- Published
- 2015
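
The abstract describes transforming input features into similarities with respect to a chosen subset of training points. The sketch below illustrates that general idea only and is not the authors' algorithm: the Gaussian (RBF) similarity, the equal-per-class random choice of reference points (`pick_landmarks`), the average-similarity decision rule in `predict`, the `gamma` parameter, and the toy data are all assumptions introduced here for illustration.

```python
import math
import random


def rbf_similarity(a, b, gamma=1.0):
    """Gaussian (RBF) similarity between two equal-length feature vectors."""
    sq_dist = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-gamma * sq_dist)


def pick_landmarks(X, y, per_class, seed=0):
    """Hypothetical selection rule: sample the same number of points per class,
    so the majority class does not dominate the reference set."""
    rng = random.Random(seed)
    landmarks = []
    for label in sorted(set(y)):
        members = [x for x, lab in zip(X, y) if lab == label]
        chosen = rng.sample(members, min(per_class, len(members)))
        landmarks.extend((x, label) for x in chosen)
    return landmarks


def transform(x, landmarks, gamma=1.0):
    """Map a raw feature vector to its similarities with each landmark."""
    return [rbf_similarity(x, lm, gamma) for lm, _ in landmarks]


def predict(x, landmarks, gamma=1.0):
    """Assign the class whose landmarks are most similar to x on average."""
    sims = transform(x, landmarks, gamma)
    totals, counts = {}, {}
    for s, (_, label) in zip(sims, landmarks):
        totals[label] = totals.get(label, 0.0) + s
        counts[label] = counts.get(label, 0) + 1
    return max(totals, key=lambda lab: totals[lab] / counts[lab])


# Toy imbalanced data (assumed): 20 majority points near (0, 0),
# 3 minority points near (3, 3).
rng = random.Random(1)
X = [[rng.gauss(0, 0.5), rng.gauss(0, 0.5)] for _ in range(20)]
X += [[rng.gauss(3, 0.5), rng.gauss(3, 0.5)] for _ in range(3)]
y = [0] * 20 + [1] * 3

landmarks = pick_landmarks(X, y, per_class=3)
```

Calling `predict([3.0, 3.0], landmarks)` classifies a new point from its similarity profile rather than its raw features. Because each class contributes the same number of landmarks and scores are averaged per class, the 20-to-3 imbalance in the toy data does not by itself tilt the decision toward the majority class.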