Optical Music Recognition with Convolutional Sequence-to-Sequence Models

Authors :: van der Wel, E.
Ullrich, K.
Hu, X.
Cunningham, S.J.
Turnbull, D.
Duan, Z.
Amsterdam Machine Learning lab (IVI, FNWI)
Source :: ISMIR 2017: Proceedings of the 18th International Society for Music Information Retrieval Conference : October 23-27, 2017, Suzhou, China, pp. 731-737
Publication Year :: 2017
Publisher :: Zenodo, 2017.
Abstract: Optical Music Recognition (OMR) is an important technology within Music Information Retrieval. Deep learning models show promising results on OMR tasks, but symbol-level annotated data sets of sufficient size to train such models are not available and difficult to develop. We present a deep learning architecture called a Convolutional Sequence-to-Sequence model to both move towards an end-to-end trainable OMR pipeline, and apply a learning process that trains on full sentences of sheet music instead of individually labeled symbols. The model is trained and evaluated on a human-generated data set, with various image augmentations based on real-world scenarios. This data set is the first publicly available set in OMR research with sufficient size to train and evaluate deep learning models. With the introduced augmentations, a pitch recognition accuracy of 81% and a duration accuracy of 94% are achieved, resulting in a note-level accuracy of 80%. Finally, the model is compared to commercially available methods, showing a large improvement over these applications.
Comment: ISMIR 2017

Subjects :: FOS: Computer and information sciences
Sound (cs.SD)
Computer Vision and Pattern Recognition (cs.CV)
Computer Science - Computer Vision and Pattern Recognition
Computer Science - Sound
Information Retrieval (cs.IR)
Computer Science - Information Retrieval

Details

Database :: OpenAIRE
Journal :: ISMIR 2017: Proceedings of the 18th International Society for Music Information Retrieval Conference : October 23-27, 2017, Suzhou, China, pp. 731-737
Accession number :: edsair.doi.dedup.....3fa645042f8ec2d1e5a310e6e4869c96
Full Text :: https://doi.org/10.5281/zenodo.1415664