Review of techniques and models used in optical chemical structure recognition in images and scanned documents.
- Author
Musazade, Fidan; Jamalova, Narmin; Hasanov, Jamaladdin
- Subjects
COMPUTER vision, CHEMICAL structure, IMAGE recognition (Computer vision), NATURAL language processing, ARTIFICIAL intelligence, CHEMICAL formulas, MACHINE learning
- Abstract
Extracting chemical formulas from images was not a top priority among Computer Vision tasks for a long time. Complexity on both the input and the prediction side has made the task challenging for conventional Artificial Intelligence and Machine Learning approaches. A binary input image, which might seem trivial for convolutional analysis, was not easy to classify because a single sample is not representative of the molecule: the same formula can be drawn in a variety of graphical representations that do not resemble each other. Given the variety of molecules, the problem shifted from classification to formula generation, which makes Natural Language Processing (NLP) a good candidate for an effective solution. This paper describes the evolution of approaches from rule-based structure analysis to complex statistical models, and compares the efficiency of the models and methodologies used in recent years. Although the latest approaches deliver ideal results on particular datasets, the authors point out possible problems in various scenarios and provide suggestions for further development. [ABSTRACT FROM AUTHOR]
- Published
- 2022