1. Best practices for convolutional neural networks applied to visual document analysis
- Author
- John Platt, David W. Steinkraus, and Patrice Y. Simard
- Subjects
- Training set, Artificial neural network, Computer science, Machine learning, Convolutional neural network, Support vector machine, Set (abstract data type), Handwriting recognition, Artificial intelligence, MNIST database
- Abstract
- Neural networks are a powerful technology for classification of visual inputs arising from documents. However, there is a confusing plethora of different neural network methods that are used in the literature and in industry. This paper describes a set of concrete best practices that document analysis researchers can use to get good results with neural networks. The most important practice is getting a training set as large as possible: we expand the training set by adding a new form of distorted data. The next most important practice is that convolutional neural networks are better suited for visual document tasks than fully connected networks. We propose that a simple "do-it-yourself" implementation of convolution with a flexible architecture is suitable for many visual document problems. This simple convolutional neural network does not require complex methods, such as momentum, weight decay, structure-dependent learning rates, averaging layers, tangent prop, or even finely-tuning the architecture. The end result is a very simple yet general architecture which can yield state-of-the-art performance for document analysis. We illustrate our claims on the MNIST set of English digit images.
- Published
- 2005