Improved parcel sorting by combining automatic speech and character recognition.

Authors :
Singh, Amriteshwar
Sangwan, Abhijeet
Hansen, John H. L.
Source :
2012 IEEE International Conference on Emerging Signal Processing Applications; 1/1/2012, p52-55, 4p
Publication Year :
2012

Abstract

Automatic postal sorting systems have traditionally relied on optical character recognition (OCR) technology. While OCR systems perform well for flat mail items such as envelopes, their performance deteriorates for parcels. In this study, we propose a new multimodal solution for parcel sorting which combines automatic speech recognition (ASR) technology with OCR in order to deliver better performance. Our multimodal approach is based on estimating OCR output confidence, and then optionally using ASR system output when OCR results show low confidence. In particular, we propose a Levenshtein edit distance (LED) based measure to compute OCR confidence. Based on the OCR confidence measure, a dynamic fusion strategy is developed that forms its final decision based on (i) OCR output alone, (ii) ASR output alone, or (iii) a combination of ASR and OCR outputs. The proposed system is evaluated on speech and image data collected in real-world conditions. Our experiments show that the proposed multimodal solution achieves an overall zip code recognition rate of 90.2%, which is a substantial improvement over the ASR-alone (81%) and OCR-alone (80.6%) systems. This advancement represents an important contribution that leverages OCR and ASR technologies to improve address recognition in parcels. [ABSTRACT FROM PUBLISHER]
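
Illustrative note: the abstract does not give the exact form of the LED-based confidence measure or the fusion rules, so the following Python sketch is only one plausible reading, not the authors' method. It assumes a list of valid zip codes is available, scores OCR confidence as one minus the normalized edit distance to the nearest valid zip code, and switches between the three decision modes using two thresholds. The names `ocr_confidence`, `fuse`, `high`, and `low`, as well as the threshold values, are hypothetical.

```python
def levenshtein(a: str, b: str) -> int:
    """Levenshtein edit distance via row-by-row dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def ocr_confidence(ocr_text: str, valid_zips: list[str]) -> float:
    """Assumed measure: 1 - (distance to nearest valid zip / length), clipped at 0."""
    best = min(levenshtein(ocr_text, z) for z in valid_zips)
    return max(0.0, 1.0 - best / max(len(ocr_text), 5))


def fuse(ocr_text: str, asr_text: str, valid_zips: list[str],
         high: float = 0.9, low: float = 0.5) -> str:
    """Assumed fusion rule with hypothetical thresholds `high` and `low`."""
    conf = ocr_confidence(ocr_text, valid_zips)
    if conf >= high:
        return ocr_text   # (i) OCR output alone
    if conf < low:
        return asr_text   # (ii) ASR output alone
    # (iii) combination: the valid zip code closest to both outputs jointly;
    # ties are broken by string order so the result is deterministic.
    return min(valid_zips,
               key=lambda z: (levenshtein(ocr_text, z) + levenshtein(asr_text, z), z))


if __name__ == "__main__":
    zips = ["75080", "75081", "10001"]   # made-up lookup list for the demo
    print(fuse("75080", "75081", zips))  # OCR matches a valid zip exactly
    print(fuse("7S08O", "75080", zips))  # medium confidence: combine both
    print(fuse("#@!x", "10001", zips))   # OCR unusable: fall back to ASR
```

In this sketch the thresholds decide how often the ASR channel is consulted; a real system would presumably tune them on held-out data.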

Details

Language :
English
ISBNs :
9781467308991
Database :
Complementary Index
Journal :
2012 IEEE International Conference on Emerging Signal Processing Applications
Publication Type :
Conference
Accession number :
86556461
Full Text :
https://doi.org/10.1109/ESPA.2012.6152444