
Activation Functions: Experimentation and Comparison

Authors :
Disha Gangadia
Source :
2021 6th International Conference for Convergence in Technology (I2CT).
Publication Year :
2021
Publisher :
IEEE, 2021.

Abstract

Activation functions are mathematical functions used to activate the neurons of an Artificial Neural Network. Non-linear activation functions mainly help a neural network converge faster while learning and finding patterns in complex input data. A neural network learns by updating its weights with the Back Propagation algorithm, which uses the first-order derivatives of the activation functions to compute the gradients needed for gradient descent. This paper tests various existing and proposed activation functions on the MNIST and CIFAR-10 datasets for image classification using a shallow Convolutional Neural Network (CNN) architecture. Based on the results, some of the proposed activation functions, including $\mathrm{SMod} = x \cdot \tanh(x)$, the Absolute/Mod function, and a scaled version of Swish, are found to be promising. Some of these are then tested with deeper neural networks on various datasets, and the average error rate is observed to improve by 2.77. In addition, suggestions are provided on which activation functions to use for the shallow and deep layers of a Deep Neural Network, resulting in better performance.
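For readers who want to experiment with the functions named in the abstract, below is a minimal NumPy sketch. Only $\mathrm{SMod} = x \cdot \tanh(x)$ and the Absolute/Mod function are defined explicitly in the abstract; the function names, the derivative helper, and the scaling factor used in the Swish variant are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def smod(x):
    """SMod activation as given in the abstract: x * tanh(x)."""
    return x * np.tanh(x)

def smod_grad(x):
    """First-order derivative, as used by backpropagation:
    d/dx [x * tanh(x)] = tanh(x) + x * (1 - tanh(x)**2)."""
    t = np.tanh(x)
    return t + x * (1.0 - t ** 2)

def mod_act(x):
    """Absolute/Mod activation: |x|."""
    return np.abs(x)

def scaled_swish(x, beta=1.5):
    """A scaled Swish variant: x * sigmoid(beta * x).
    NOTE: beta=1.5 is a placeholder; the paper's scaling factor
    is not stated in this record."""
    return x / (1.0 + np.exp(-beta * x))

if __name__ == "__main__":
    # Sanity check: compare the analytic SMod derivative against
    # a central finite difference.
    x = np.linspace(-3.0, 3.0, 7)
    eps = 1e-6
    fd = (smod(x + eps) - smod(x - eps)) / (2 * eps)
    print(np.allclose(smod_grad(x), fd, atol=1e-5))  # expected: True
```

In a framework with automatic differentiation (e.g., PyTorch or TensorFlow), only the forward expressions would be needed; the explicit derivative above simply mirrors the abstract's point that backpropagation relies on the activation's first-order derivative.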

Details

Database :
OpenAIRE
Journal :
2021 6th International Conference for Convergence in Technology (I2CT)
Accession number :
edsair.doi...........1c0fb37fa4b4d95d642972a54eacae06
Full Text :
https://doi.org/10.1109/i2ct51068.2021.9417890