Machine learning method using position-specific mutation based classification outperforms one hot coding for disease severity prediction in haemophilia 'A'.

Authors :
Singh VK
Maurya NS
Mani A
Yadav RS
Source :
Genomics [Genomics] 2020 Nov; Vol. 112 (6), pp. 5122-5128. Date of Electronic Publication: 2020 Sep 11.
Publication Year :
2020

Abstract

Haemophilia is an X-linked genetic disorder whose most common types, A and B, result from the absence or deficiency of protein factors VIII and IX, respectively. Severity of the disease depends on the mutation. Available Machine Learning (ML) methods that predict mutational severity using traditional encoding approaches generally have high time complexity and compromised accuracy. In this study, a Haemophilia 'A' patient mutation dataset containing 7784 mutations was processed with the proposed Position-Specific Mutation (PSM) technique and with One-Hot Encoding (OHE) to predict disease severity. The datasets produced by the PSM and OHE methods were analyzed and used to train various ML algorithms to classify mutation severity level. Surprisingly, PSM outperformed OHE in both time efficiency and accuracy, with training and prediction times improving by approximately 91 to 98% and 80 to 99%, respectively. Severity prediction accuracy also improved when PSM was used with different ML algorithms. (Copyright © 2020 Elsevier Inc. All rights reserved.)
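
The abstract contrasts one-hot encoding with a more compact position-based representation but does not describe how PSM is constructed. The sketch below is therefore only an illustration of why encoding width can affect training and prediction time; it is not the authors' method. The function names, the missense-only assumption, the 20-letter amino acid alphabet, and the toy sequence length of 1000 are all assumptions introduced here.

```python
# Illustrative sketch only: the record does not specify how PSM encodes a mutation.
# All names and the toy sequence length below are hypothetical.

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # standard 20-letter alphabet
AA_INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}

SEQUENCE_LENGTH = 1000  # arbitrary illustrative value, not the real protein length


def one_hot_encode(position, ref, alt, sequence_length=SEQUENCE_LENGTH):
    """One-hot encode a missense mutation as three concatenated blocks:
    position (sequence_length wide), reference residue (20), variant residue (20)."""
    n_aa = len(AMINO_ACIDS)
    vec = [0] * (sequence_length + 2 * n_aa)
    vec[position - 1] = 1                            # 1-based position
    vec[sequence_length + AA_INDEX[ref]] = 1         # reference residue
    vec[sequence_length + n_aa + AA_INDEX[alt]] = 1  # variant residue
    return vec


def compact_encode(position, ref, alt):
    """Hypothetical compact encoding: three integers per mutation.
    A stand-in for a position-based scheme, not the published PSM definition."""
    return [position, AA_INDEX[ref], AA_INDEX[alt]]


if __name__ == "__main__":
    wide = one_hot_encode(5, "A", "V")
    narrow = compact_encode(5, "A", "V")
    print(len(wide), len(narrow))
```

Under these assumptions each mutation becomes a sparse vector over a thousand entries wide in the one-hot case and a three-element vector in the compact case, which is the kind of difference in feature count that could plausibly account for a gap in training and prediction time.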

Details

Language :
English
ISSN :
1089-8646
Volume :
112
Issue :
6
Database :
MEDLINE
Journal :
Genomics
Publication Type :
Academic Journal
Accession number :
32927010
Full Text :
https://doi.org/10.1016/j.ygeno.2020.09.020