Medical data quality assessment: On the development of an automated framework for medical data curation

Authors :
Pezoulas, V.C.; Kourou, K.D.; Kalatzis, F.; Exarchos, T.P.; Venetsanopoulou, A.; Zampeli, E.; Gandolfo, S.; Skopouli, F.; De Vita, S.; Tzioufas, A.G.; Fotiadis, D.I.
Publication Year :
2019

Abstract

Data quality assessment has gained attention in recent years, as more and more companies and medical centers highlight the importance of an automated framework for effectively managing the quality of their big data. Data cleaning, also known as data curation, lies at the heart of data quality assessment and is a key step prior to the development of any data analytics service. In this work, we present the objectives, functionalities, and methodological advances of an automated framework for data curation from a medical perspective. The steps towards the development of a system for data quality assessment are first described, along with multidisciplinary data quality measures. A three-layer architecture that realizes these steps is then presented. Emphasis is placed on the detection and tracking of inconsistencies, missing values, outliers, and similarities, as well as on data standardization, to finally enable data harmonization. A case study is conducted to demonstrate the applicability and reliability of the proposed framework on two well-established cohorts with clinical data related to primary Sjögren's Syndrome (pSS). Our results confirm the validity of the proposed framework for the automated and fast identification of outliers, inconsistencies, and highly correlated and duplicated terms, as well as the successful matching of more than 85% of the pSS-related medical terms in both cohorts, yielding more accurate, relevant, and consistent clinical data. © 2019
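
The abstract does not specify which detection methods the framework uses. As a rough illustration only, the kinds of checks it names (missing values, outliers, highly correlated terms) could be sketched as follows. The interquartile-range rule, the Pearson threshold of 0.95, the function names, and the toy table are all assumptions made for this sketch and are not taken from the paper.

```python
from math import sqrt
from statistics import mean, quantiles


def missing_ratio(column):
    """Fraction of entries in a column that are missing (None)."""
    return sum(v is None for v in column) / len(column)


def iqr_outliers(column, k=1.5):
    """Values outside [Q1 - k*IQR, Q3 + k*IQR]; None entries are ignored.

    The IQR rule is an assumed stand-in for the paper's outlier detection.
    """
    values = [v for v in column if v is not None]
    q1, _, q3 = quantiles(values, n=4)
    low, high = q1 - k * (q3 - q1), q3 + k * (q3 - q1)
    return [v for v in values if v < low or v > high]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def correlated_pairs(table, threshold=0.95):
    """Pairs of column names whose |Pearson r| reaches the threshold.

    Rows where either value is missing are skipped for that pair.
    """
    names = sorted(table)
    pairs = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            rows = [(u, v) for u, v in zip(table[a], table[b])
                    if u is not None and v is not None]
            xs, ys = zip(*rows)
            if abs(pearson(xs, ys)) >= threshold:
                pairs.append((a, b))
    return pairs


# Hypothetical toy table: one missing age, one implausible age, and a
# weight column recorded twice in different units.
weight_kg = [70, 82, 65, 58, 77, 90, 61, 73, 68, 80]
table = {
    "age": [34, 41, None, 29, 38, 45, 52, 36, 47, 420],
    "weight_kg": weight_kg,
    "weight_lb": [w * 2.2 for w in weight_kg],
}

report = {
    "missing": {name: missing_ratio(col) for name, col in table.items()},
    "outliers": {name: iqr_outliers(col) for name, col in table.items()},
    "correlated": correlated_pairs(table),
}
```

In this sketch, `report` collects the findings per column so that flagged entries can be reviewed rather than silently dropped, which matches the abstract's emphasis on detection and tracking.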

Details

Language :
English
Database :
OpenAIRE
Accession number :
edsair.od......2127..460bc569ea40825f84e13e8aa39d9c5c