OCR-MRD: Performance Analysis of Different Optical Character Recognition Engines for Medical Report Digitization

Authors :
Pulkit Batra
Nimish Phalnikar
Deepesh Kurmi
Jitendra Tembhurne
Parul Sahare
Tausif Diwan
Publication Year :
2023
Publisher :
Research Square Platform LLC, 2023.

Abstract

The need for digitization is growing rapidly, and the healthcare industry is working towards a paperless environment. Digitizing medical lab records helps patients manage their medical data without hassle. It may also help insurance companies design medical insurance policies that are patient-centric rather than generalized. Optical Character Recognition (OCR) technology has demonstrated its usefulness in such cases; to identify the best solution for digitizing medical lab records, an extensive comparative study of the available OCR techniques is needed. Current research focuses mainly on image pre-processing techniques for OCR development; however, their effects on OCR performance, especially for medical report digitization, have not yet been studied. In this work, three OCR engines (Tesseract, EasyOCR and DocTR) and six pre-processing techniques (image binarization, brightness transformations, gamma correction, sigmoid stretching, bilateral filtering and image sharpening) are surveyed in detail. In addition, an extensive comparative study is presented of the performance of the OCR engines under different combinations of the image pre-processing techniques, and of the effect of those techniques on OCR accuracy.
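
As a rough illustration of three of the pre-processing techniques named in the abstract (gamma correction, sigmoid stretching and binarization), the following minimal sketch applies textbook forms of each to 8-bit grayscale intensities. It is not taken from the paper: the function names, the `gain` and `cutoff` parameters, the fixed threshold of 128 and the tiny sample image are all assumptions chosen for illustration, and the authors' actual implementations and settings may differ.

```python
import math


def gamma_correct(pixel, gamma):
    """Power-law mapping: out = 255 * (in / 255) ** gamma.

    gamma < 1 brightens mid-tones, gamma > 1 darkens them.
    """
    return round(255 * (pixel / 255) ** gamma)


def sigmoid_stretch(pixel, gain=10.0, cutoff=0.5):
    """Logistic contrast stretch around `cutoff` (both parameters are
    illustrative defaults, not values from the paper)."""
    x = pixel / 255
    return round(255 / (1 + math.exp(gain * (cutoff - x))))


def binarize(pixel, threshold=128):
    """Global thresholding: map each pixel to pure black or pure white."""
    return 255 if pixel >= threshold else 0


def apply_to_image(image, fn, **kwargs):
    """Apply a per-pixel function to a grayscale image stored as nested lists."""
    return [[fn(p, **kwargs) for p in row] for row in image]


# Hypothetical 2x3 grayscale image, used only to exercise the functions.
image = [[0, 64, 128], [160, 200, 255]]

brightened = apply_to_image(image, gamma_correct, gamma=0.5)
stretched = apply_to_image(image, sigmoid_stretch)
binary = apply_to_image(image, binarize)
```

In a comparison like the one the abstract describes, each transformed image (or a chain of several transforms) would be passed to an OCR engine and the recognized text scored against a reference transcription.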

Details

Database :
OpenAIRE
Accession number :
edsair.doi...........886a76c7360c69fe2124b0944096fe17
Full Text :
https://doi.org/10.21203/rs.3.rs-2513255/v1