Semantic preservation of standardized healthcare documents in big data.

Authors :
Hussain, Shujaat
Hussain, Maqbool
Afzal, Muhammad
Hussain, Jamil
Bang, Jaehun
Seung, Hyonwoo
Lee, Sungyoung
Source :
International Journal of Medical Informatics. Sep 2019, Vol. 129, p133-145. 13p.
Publication Year :
2019

Abstract

Background: Standardized healthcare documents are widely adopted in today's hospitals. Processing them at scale strains the infrastructure, and their complexity compounds the difficulty of handling them, which is why applying big data techniques is necessary. However, partitioning health documents for processing, as big data techniques do, can cause a loss of accuracy or semantics. This semantic loss is critical for clinical use as well as for insurance and medical education.

Methods: In this paper we propose a novel technique to avoid the semantic loss that occurs during conventional partitioning of healthcare documents in big data. The technique uses a constraint model based on conformance to the clinical document standard and on user-based use cases. We used Clinical Document Architecture (CDA®) datasets on the Hadoop Distributed File System (HDFS) through a uniquely configured setup. We identified the documents affected by semantic loss after partitioning and separated them into two sets: conflict-free documents and conflicted documents. Conflicted documents were resolved using different resolution strategies mapped according to the CDA® specification. The first part of the technique focuses on identifying the type of conflict that arises in the blocks after partitioning. The second part focuses on mapping the conflicts to resolutions based on the constraints applied, depending on the validation and user scenario.

Results: We used a publicly available dataset of CDA® documents, identified all conflicted documents, and resolved all of them successfully to avoid any semantic loss. In our experiment we tested up to 87,000 CDA® documents and successfully identified the conflicts and resolved the semantic issues.

Conclusion: We have presented a novel study that focuses on the semantics of big data; the approach did not compromise performance and resolved the semantic issues that arose during the processing of clinical documents. [ABSTRACT FROM AUTHOR]
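The separation into conflict-free and conflicted documents described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes that documents are stored back to back in one byte stream, that the stream is cut into fixed-size blocks, and that a document counts as "conflicted" when it crosses a block boundary. The names `Placement`, `place_documents`, and `split_by_conflict` are hypothetical, and no Hadoop API is used.

```python
from dataclasses import dataclass


@dataclass
class Placement:
    doc_id: int
    start: int     # byte offset of the document in the concatenated stream
    end: int       # exclusive end offset
    blocks: range  # indices of the fixed-size blocks the document touches


def place_documents(doc_sizes, block_size):
    """Map each document (given by its size in bytes) onto fixed-size blocks.

    Assumption: documents are non-empty and laid out contiguously.
    """
    placements = []
    offset = 0
    for doc_id, size in enumerate(doc_sizes):
        start, end = offset, offset + size
        first_block = start // block_size
        last_block = (end - 1) // block_size
        placements.append(
            Placement(doc_id, start, end, range(first_block, last_block + 1))
        )
        offset = end
    return placements


def split_by_conflict(placements):
    """Separate documents held in one block from those cut by a boundary."""
    conflict_free = [p for p in placements if len(p.blocks) == 1]
    conflicted = [p for p in placements if len(p.blocks) > 1]
    return conflict_free, conflicted


if __name__ == "__main__":
    # Three 40-byte documents and 64-byte blocks (illustrative numbers only).
    free, cut = split_by_conflict(place_documents([40, 40, 40], block_size=64))
    print("conflict-free:", [p.doc_id for p in free])
    print("conflicted:", [p.doc_id for p in cut])
```

A resolution step would then decide, for each conflicted document, how to make it whole again (for example by assigning it entirely to the block in which it starts). The abstract ties that choice to constraints from the CDA® specification and the user scenario, which this sketch does not model.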

Details

Language :
English
ISSN :
13865056
Volume :
129
Database :
Academic Search Index
Journal :
International Journal of Medical Informatics
Publication Type :
Academic Journal
Accession number :
138293937
Full Text :
https://doi.org/10.1016/j.ijmedinf.2019.05.024