473 results for "activation functions"
Search Results
2. fKAN: Fractional Kolmogorov–Arnold Networks with trainable Jacobi basis functions
- Author
- Afzal Aghaei, Alireza
- Published
- 2025
- Full Text
- View/download PDF
3. Performance analysis of activation functions in molecular property prediction using Message Passing Graph Neural Networks
- Author
- Chanana, Garima
- Published
- 2025
- Full Text
- View/download PDF
4. AFX-PE: Adaptive Fixed-Point Processing Engine for Neural Network Accelerators
- Author
- Raut, Gopal, Thakur, Ritambhara, Edavoor, Pranose, Selvakumar, David, Angrisani, Leopoldo, Series Editor, Arteaga, Marco, Series Editor, Chakraborty, Samarjit, Series Editor, Chen, Shanben, Series Editor, Chen, Tan Kay, Series Editor, Dillmann, Rüdiger, Series Editor, Duan, Haibin, Series Editor, Ferrari, Gianluigi, Series Editor, Ferre, Manuel, Series Editor, Hirche, Sandra, Series Editor, Jabbari, Faryar, Series Editor, Jia, Limin, Series Editor, Kacprzyk, Janusz, Series Editor, Khamis, Alaa, Series Editor, Kroeger, Torsten, Series Editor, Li, Yong, Series Editor, Liang, Qilian, Series Editor, Martín, Ferran, Series Editor, Ming, Tan Cher, Series Editor, Minker, Wolfgang, Series Editor, Misra, Pradeep, Series Editor, Mukhopadhyay, Subhas, Series Editor, Ning, Cun-Zheng, Series Editor, Nishida, Toyoaki, Series Editor, Oneto, Luca, Series Editor, Panigrahi, Bijaya Ketan, Series Editor, Pascucci, Federica, Series Editor, Qin, Yong, Series Editor, Seng, Gan Woon, Series Editor, Speidel, Joachim, Series Editor, Veiga, Germano, Series Editor, Wu, Haitao, Series Editor, Zamboni, Walter, Series Editor, Tan, Kay Chen, Series Editor, Gupta, Anu, editor, Pandey, Jai Gopal, editor, Chaturvedi, Nitin, editor, and Dwivedi, Devesh, editor
- Published
- 2025
- Full Text
- View/download PDF
5. Enhancing Histopathological Image Analysis: A Study on Effect of Color Normalization and Activation Functions
- Author
- Sudhamsh, G. V. S., Rashmi, R., Girisha, S., Ghosh, Ashish, Editorial Board Member, Zhou, Lizhu, Editorial Board Member, Bairwa, Amit Kumar, editor, Tiwari, Varun, editor, Vishwakarma, Santosh Kumar, editor, Tuba, Milan, editor, and Ganokratanaa, Thittaporn, editor
- Published
- 2025
- Full Text
- View/download PDF
6. Determination of the performance of training algorithms and activation functions in meteorological drought index prediction with nonlinear autoregressive neural network.
- Author
- Gümüş, Münevver Gizem, Çiftçi, Hasan Çağatay, and Gümüş, Kutalmış
- Abstract
Analysis of long-term meteorological data is critical for monitoring climate trends and understanding the drought situation in a given region. In this study, monthly average precipitation data from the Niğde meteorological station in Turkey covering the period 1950–2020 were used. Within the scope of the study, seven different drought index methods were used for drought analysis, and the number and percentages of drought conditions were calculated according to these indices. For example, according to the Standardized Precipitation Index (SPI) method, the proportion of dry periods was determined as 16.2% and the proportion of humid periods as 83.8%. The Mann-Kendall trend analysis performed to determine the drought trends of the region revealed an increasing trend towards humidity in all indices (e.g., z = 1.299, p = 0.194 for SPI). In the study, 60-month drought forecasts covering the years 2020–2025 were produced using the Nonlinear Autoregressive Neural Network (NARNN) model, and the results were compared with the Autoregressive (AR) model. In the performance analysis, the NARNN model showed superior prediction performance for all indices with lower RMSE values (e.g., NARNN RMSE = 0.977 for SPI; AR RMSE = 1.704). The prediction performances of different training algorithms and activation functions used in the NARNN model were analyzed. The best performance was obtained with the trainbr training algorithm and sigmoid activation function (e.g., RMSE = 0.997 for SPI). Based on these best parameters, more than 70% of the drought conditions during the 2020–2025 period were found to be normal or humid according to NARNN predictions. This study demonstrates the superiority of the NARNN model in nonlinear time series analyses and that it is a reliable tool, especially for future drought forecasts. In addition, comprehensive analyses with different index methods have significantly contributed to understanding the long-term drought trends in the Niğde region. [ABSTRACT FROM AUTHOR]
- Published
- 2025
- Full Text
- View/download PDF
7. Unification of popular artificial neural network activation functions.
- Author
- Mostafanejad, Mohammad
- Subjects
- ARTIFICIAL neural networks, IMAGE recognition (Computer vision), FRACTIONAL calculus, MACHINE learning, ALGORITHMS, FRACTIONAL programming
- Abstract
We present a unified representation of the most popular neural network activation functions. Adopting Mittag-Leffler functions of fractional calculus, we propose a flexible and compact functional form that is able to interpolate between various activation functions and mitigate common problems in training deep neural networks such as vanishing and exploding gradients. The presented gated representation extends the scope of fixed-shape activation functions to their adaptive counterparts whose shape can be learnt from the training data. The derivatives of the proposed functional form can also be expressed in terms of Mittag-Leffler functions making it suitable for backpropagation algorithms. By training an array of neural network architectures of different complexities on various benchmark datasets, we demonstrate that adopting a unified gated representation of activation functions offers a promising and affordable alternative to individual built-in implementations of activation functions in conventional machine learning frameworks. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
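The unified representation in result 7 is built from the two-parameter Mittag-Leffler function E_{α,β}(z) = Σ_{k≥0} z^k / Γ(αk + β), which reduces to e^z when α = β = 1. Below is a minimal sketch of that idea, assuming a truncated series and a sigmoid-style gate 1/(1 + E_{α,1}(−x)); the gate form and function names are our illustrative assumptions, not the paper's exact construction.

```python
import numpy as np
from scipy.special import gamma

def mittag_leffler(z, alpha=1.0, beta=1.0, n_terms=60):
    """Truncated two-parameter Mittag-Leffler series E_{alpha,beta}(z).
    Adequate for moderate |z|; the series loses precision for large |z|."""
    z = np.asarray(z, dtype=float)
    return sum(z**k / gamma(alpha * k + beta) for k in range(n_terms))

def ml_gate(x, alpha=1.0):
    """Illustrative gated activation: for alpha = 1 this is exactly the
    logistic sigmoid, since E_{1,1}(-x) = exp(-x); other alpha values
    deform the gate's shape."""
    return 1.0 / (1.0 + mittag_leffler(-x, alpha=alpha, beta=1.0))
```

Treating α as a trainable parameter is what turns this fixed-shape gate into the adaptive representation the abstract describes.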
8. System identification of a nonlinear continuously stirred tank reactor using fractional neural network
- Author
- Meshach Kumar, Utkal Mehta, and Giansalvo Cirrincione
- Subjects
- Fractional calculus, Neural Networks, System modeling, Chemical process, Activation functions, CSTR, Chemical engineering, TP155-156
- Abstract
Chemical processes are vital in various industries but are often complex and nonlinear, making accurate modeling essential. Traditional linear approaches struggle with dynamic behaviour and changing conditions. This paper explores the advantages of the new theory of fractional neural networks (FNNs), focusing on applying fractional activation functions for continuous stirred tank reactor (CSTR) modeling. The proposed approach offers promising solutions for real-time modeling of a CSTR. Various numerical analyses demonstrate the robustness of FNNs in handling data reduction, achieving better generalization, and tolerating noise, all of which are crucial for real-world applications. The identification process is more generalized and can enhance adaptability and improve industrial plant management efficiency. This research contributes to the growing field of real-time modeling, highlighting its potential to address the complexities in chemical processes.
- Published
- 2024
- Full Text
- View/download PDF
9. Three-Dimensional Instance Segmentation Using the Generalized Hough Transform and the Adaptive n-Shifted Shuffle Attention.
- Author
- Mulindwa, Desire Burume, Du, Shengzhi, and Liu, Qingxue
- Subjects
- AUGMENTED reality, AUTONOMOUS vehicles, ROBOTICS, NOISE, HOUGH transforms
- Abstract
The progress of 3D instance segmentation techniques has made it essential for several applications, such as augmented reality, autonomous driving, and robotics. Traditional methods usually have challenges with complex indoor scenes made of multiple objects with different occlusions and orientations. In this work, the authors present an innovative model that integrates a new adaptive n-shifted shuffle (ANSS) attention mechanism with the Generalized Hough Transform (GHT) for robust 3D instance segmentation of indoor scenes. The proposed technique leverages the n-shifted sigmoid activation function, which improves the adaptive shuffle attention mechanism, permitting the network to dynamically focus on relevant features across various regions. A learnable shuffling pattern is produced through the proposed ANSS attention mechanism to spatially rearrange the relevant features, thus augmenting the model's ability to capture the object boundaries and their fine-grained details. The integration of GHT furnishes a vigorous framework to localize and detect objects in the 3D space, even when heavy noise and partial occlusions are present. The authors evaluate the proposed method on the challenging Stanford 3D Indoor Spaces Dataset (S3DIS), where it establishes its superiority over existing methods. The proposed approach achieves state-of-the-art performance in both mean Intersection over Union (IoU) and overall accuracy, showcasing its potential for practical deployment in real-world scenarios. These results illustrate that the integration of the ANSS and the GHT yields a robust solution for 3D instance segmentation tasks. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
10. Smoothing piecewise linear activation functions based on mollified square root functions.
- Author
- Pan, Tony Yuxiang, Yang, Guangyu, Zhao, Junli, and Ding, Jieyu
- Subjects
- SQUARE root, SMOOTHNESS of functions, SKILLETS, DEEP learning
- Abstract
Activation functions (AFs) are crucial components in neural networks for deep learning. Piecewise linear functions (PLFs) have been widely employed as AFs thanks to their computational efficiency and simplicity. However, PLFs are not everywhere differentiable, which can cause problems during training. The analytical expressions of AFs based on pure PLFs can be smoothed via the mollified square root function (MSRF) method, inspired by the SquarePlus approximation of ReLU. In this paper, we prove a proposition expressing AFs as the maximum or minimum of two PLFs, and transform the results into smoothed functions via the MSRF method. Based on MSRF, we systematically regularize well-known AFs composed of two, three, or four PLFs, including the ReLU, LReLU, vReLU, Step, Bipolar, BReLU (Bounded ReLU), Htanh (Hard Tanh), Pan (Frying pan function), STF (Soft Thresholding Formulas), HTF (Hard Thresholding Formulas), SReLU (S-shaped ReLU), MReLU (Mexican hat type ReLU), and TSF (Trapezoid-shaped function) functions. Additionally, by the equivalence of the SquarePlus and SoftPlus functions, some classic compound AFs, such as the ELU, Swish, Mish, SoftSign, Logish, and DLU functions, can also be expressed via the MSRF method. The derivatives of their mollified versions demonstrate their smoothness properties. The proposed method extends easily to AFs composed of multiple PLFs, which we will investigate further in future work. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
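The construction in result 10 follows from two identities: max(f, g) = (f + g + |f − g|)/2, and the mollified square root |u| ≈ √(u² + b), which applied to ReLU(x) = max(x, 0) gives SquarePlus. A minimal sketch under those identities (the smoothing parameter b and the helper names are illustrative):

```python
import numpy as np

def smooth_max(f, g, b=0.2):
    """Mollified maximum: max(f, g) = (f + g + |f - g|) / 2,
    with |u| replaced by sqrt(u**2 + b)."""
    return 0.5 * (f + g + np.sqrt((f - g) ** 2 + b))

def squareplus(x, b=0.2):
    """Smoothed ReLU: smooth_max(x, 0) = (x + sqrt(x**2 + b)) / 2."""
    x = np.asarray(x, dtype=float)
    return smooth_max(x, np.zeros_like(x), b)

def smooth_leaky_relu(x, slope=0.01, b=0.2):
    """Smoothed Leaky ReLU via LReLU(x) = max(x, slope * x)."""
    x = np.asarray(x, dtype=float)
    return smooth_max(x, slope * x, b)
```

As b → 0 each mollified function converges to its exact piecewise-linear counterpart, which is why the whole ReLU family can be treated systematically.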
11. Adaptive activation functions for predictive modeling with sparse experimental data.
- Author
- Pourkamali-Anaraki, Farhad, Nasrin, Tahamina, Jensen, Robert E., Peterson, Amy M., and Hansen, Christopher J.
- Subjects
- IMAGE recognition (Computer vision), PREDICTION models, DATA modeling, CLASSIFICATION, ENGINEERING
- Abstract
A pivotal aspect in the design of neural networks lies in selecting activation functions, crucial for introducing nonlinear structures that capture intricate input–output patterns. While the effectiveness of adaptive or trainable activation functions has been studied in domains with ample data, like image classification problems, significant gaps persist in understanding their influence on classification accuracy and predictive uncertainty in settings characterized by limited data availability. This research aims to address these gaps by investigating the use of two types of adaptive activation functions. These functions incorporate shared and individual trainable parameters per hidden layer and are examined in three testbeds derived from additive manufacturing problems containing fewer than 100 training instances. Our investigation reveals that adaptive activation functions, such as Exponential Linear Unit (ELU) and Softplus, with individual trainable parameters, result in accurate and confident prediction models that outperform fixed-shape activation functions and the less flexible method of using identical trainable activation functions in a hidden layer. Therefore, this work presents an elegant way of facilitating the design of adaptive neural networks in scientific and engineering problems. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
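Result 11's distinction between shared and individual trainable parameters per hidden layer maps naturally onto a small module; the sketch below uses ELU with a trainable α in PyTorch (the paper's exact parameterization may differ, and the class name is ours):

```python
import torch
import torch.nn as nn

class AdaptiveELU(nn.Module):
    """ELU whose alpha is learned: either one alpha shared by the whole
    layer, or one alpha per neuron (individual=True)."""
    def __init__(self, width, individual=True, alpha0=1.0):
        super().__init__()
        n = width if individual else 1
        self.alpha = nn.Parameter(torch.full((n,), alpha0))

    def forward(self, x):
        # ELU: x for x > 0, alpha * (exp(x) - 1) otherwise
        return torch.where(x > 0, x, self.alpha * (torch.exp(x) - 1.0))
```

With individual=True every neuron learns its own α, the variant the abstract reports as most accurate on the sparse additive-manufacturing testbeds.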
12. MANNOSE - PEROXYDISULFATE REACTION: QUALITATIVE PRODUCT ANALYSIS, SURFACE EFFECT AND EXPERIMENTAL KINETIC MEASUREMENTS.
- Author
- ABUALREISH, MUSTAFA J. A.
- Subjects
- OXIDIZING agents, MANNOSE, ACTIVATION energy, FORMIC acid, REDUCING agents
- Abstract
Peroxydisulfate (PDS) is a powerful oxidizing agent for reducing sugars. Throughout the redox interaction with the D(+)mannose molecule, the current work used the iodometric technique of measurement to track the unconsumed PDS at various time intervals. The presence of formaldehyde and formic acid in the qualitative test of the volatile redox reaction products suggests that the mannose molecule undergoes oxidative cleavage at its C-C bonds and at the aldehydic and primary alcoholic groups. According to surface effect experiments, the redox reaction is characterized by chain reactions. Based on the results of the kinetics experiments, it was determined that the reaction rate is first order in [PDS] and fractional order in [Mannose], and that both respond more rapidly as the concentrations of these substances rise. Investigating the effect of temperature on the redox reaction under controlled experimental settings revealed that increasing the temperature resulted in a faster rate. The activation energy of the redox process was calculated to be 26.85 kcal/mole based on the oxidation rate measurements taken at different temperatures. Other activation functions, such as the frequency factor, the change in free energy, and the change in entropy, were also measured at different temperatures. A rate law was developed, and appropriate reaction pathways for the oxidation of D(+)mannose were proposed based on the collected experimental data... [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
13. Constructing a smoothed Leaky ReLU using a linear combination of the smoothed ReLU and identity function
- Author
- Zhu, Meng, Min, Weidong, Li, Jiahao, Liu, Mengxue, Deng, Ziyang, and Zhang, Yao
- Published
- 2025
- Full Text
- View/download PDF
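The title of result 13 states the construction directly: a Leaky ReLU with slope α satisfies LReLU(x) = α·x + (1 − α)·ReLU(x), so substituting any smoothed ReLU yields a smoothed Leaky ReLU. A sketch assuming SquarePlus as the smooth surrogate (the paper's choice of surrogate may differ):

```python
import numpy as np

def squareplus(x, b=0.2):
    # One smooth ReLU surrogate; softplus would serve equally well here.
    return 0.5 * (x + np.sqrt(x * x + b))

def smoothed_leaky_relu(x, alpha=0.01, b=0.2):
    """Linear combination of the identity and a smoothed ReLU:
    alpha * x + (1 - alpha) * smooth_relu(x)."""
    x = np.asarray(x, dtype=float)
    return alpha * x + (1.0 - alpha) * squareplus(x, b)
```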
14. Time-variant quadratic programming solving by using finitely-activated RNN models with exact settling time
- Author
- Sun, Mingxuan, Zhang, Yu, Wang, Liming, Wu, Yuxin, and Zhong, Guomin
- Published
- 2025
- Full Text
- View/download PDF
15. Comparison of Multilayer Perceptron with an Optimal Activation Function and Long Short-Term Memory for Rainfall-Runoff Simulations and Ungauged Catchment Runoff Prediction
- Author
- Shin, Mun-Ju and Jung, Yong
- Published
- 2024
- Full Text
- View/download PDF
16. A novel optimized parametric hyperbolic tangent swish activation function for 1D-CNN: application of sensor-based human activity recognition and anomaly detection.
- Author
- Ankalaki, Shilpa and Thippeswamy, M. N.
- Subjects
- HUMAN activity recognition, INTRUSION detection systems (Computer security), DEEP learning, CONVOLUTIONAL neural networks, ACTIVITIES of daily living
- Abstract
Human activity recognition (HAR) and abnormal/anomaly detection have significant applications for health monitoring in a smart environment. Abnormal/anomaly prediction during daily activities helps to indicate whether a person is healthy or has behavioral issues that need assistance. HAR has been accomplished using deep learning approaches. The activation functions employed in deep learning models have a significant impact on the training and the reliability of the model. Several activation functions have been developed for deep learning models; nevertheless, most existing activation functions suffer from the dying gradient problem and fail to utilize large negative input values. This work proposes a novel activation function called Optimized Parametric Hyperbolic Tangent Swish (OP-Tanish), which is non-monotonic, unbounded in both the negative and positive directions, varies smoothly, and introduces a higher degree of non-linearity than ReLU and other state-of-the-art activation functions. We test this activation function by training customized shallow 1D Convolutional Neural Networks (CS1DCNN) for recognition of human activities and anomaly detection. The contributions are compared to state-of-the-art activation functions (ReLU and its variants, SWISH, MISH, and LiSHT) on benchmark datasets: the UCI-HAR, PAMPA2, Opportunity, and Daphnet Gait HAR datasets, and the UP-FALL and Simulated Activities of Daily Living (SIMADL) anomaly datasets. The proposed OP-Tanish activation function outperforms other state-of-the-art activation functions with accuracies of 99.58%, 99.58%, 95.14%, and 97.79% over the UCI, Opportunity, PAMPA2, and Daphnet Gait datasets, respectively, and achieves accuracies of 97.28% and 98% on the UP-Fall and SIMADL datasets. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
17. Custom Convolutional Neural Network Model for Identification of Nutritional Deficiencies in Children.
- Author
- Ankalaki, Shilpa, Biradar, Vidyadevi G., G., Kushal, and N., Kavya
- Subjects
- ARTIFICIAL neural networks, CONVOLUTIONAL neural networks, IMAGE recognition (Computer vision), COGNITIVE development, MALNUTRITION in children, DEEP learning
- Abstract
Undernutrition occurs when there are deficiencies in essential vitamins, while overnutrition refers to consuming an excessive amount of nutrients, leading to issues such as obesity, diabetes, and related health problems. Malnutrition often stems from the economic and social status of parents. In children, malnutrition can significantly impair physical and mental growth. Therefore, there is a need for early prediction of malnutrition in children to mitigate its adverse effects. Images of children contain a great amount of information that can be analyzed to distinguish between nourished and malnourished children. The success stories of convolutional neural networks in image classification are the motivation to develop a deep learning model for this classification. In this work, a custom-designed convolutional neural network model is proposed to classify children as nourished or malnourished. The dataset consists of 630 images of nourished and 1530 images of malnourished children. The model is trained on a database of children's images that includes both web-scraped and synthetic images. The convolutional neural network model is optimized by selecting the optimal activation function through experimentation. The model is trained for 100 epochs with the Adam optimizer and various activation functions. The CNN with ReLU and weight decay obtained considerably good results, with an accuracy of 93.44%, precision of 0.89, recall of 0.85, and F1 score of 0.87 for the nourished class, and precision of 0.95, recall of 0.96, and F1 score of 0.96 for the malnourished class. The model obtained a good accuracy of 92.83% for Leaky ReLU, with precision of 0.8, recall of 0.9, and F1 score of 0.85 for the nourished class, and precision of 0.97, recall of 0.94, and F1 score of 0.95 for the malnourished class. Further, to develop trust in the model, visualizations of activation maps, convolutional layers, and filters are implemented. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
18. Evaluation of UDP-Based DDoS Attack Detection by Neural Network Classifier with Convex Optimization and Activation Functions.
- Author
- Dasari, Kishorebabu, Mekala, Srinivas, and Kaka, Jhansi Rani
- Subjects
- DENIAL of service attacks, CLASSIFICATION algorithms, FEATURE selection, PEARSON correlation (Statistics), INTERNET security, RECURRENT neural networks
- Abstract
Distributed Denial of Service (DDoS) stands as a critical cybersecurity concern, representing a malicious tactic employed by hackers to disrupt online services, network resources, or host systems, rendering them inaccessible to legitimate users. DDoS attack detection is essential, as such attacks have a wide-ranging impact on the field of computer science. This is a quantitative study evaluating the Multilayer Perceptron (MLP) classification algorithm with different optimization methods and different activation functions for UDP-based DDoS attack detection. The CIC-DDoS2019 DDoS evaluation dataset, compiled by the Canadian Institute for Cybersecurity and known for its inclusion of modern DDoS attack types, was instrumental in this study. The CIC-DDoS2019 dataset encompasses eleven DDoS attack datasets, of which the UDP, UDP-Lag, NTP, and TFTP datasets were utilized in this investigation. This study proposes a novel feature selection approach targeting datasets related to UDP-based DDoS attacks. The approach aims to identify groups of features that are mutually uncorrelated; that is, none of the features within a subset has a significant relationship with any other, as measured by three correlation methods: Pearson, Spearman, and Kendall. To further validate the proposed approach, the researchers conducted experiments on a specially crafted DDoS attack dataset. The MLP classification algorithm with the Adam optimization method and the Tanh activation function produces the best results for UDP-based DDoS attack detection, with accuracy values of 99.97% for the UDP flood attack, 99.77% for the UDP-Lag attack, 99.70% for the NTP attack, 99.93% for the TFTP attack, and 99.76% for the customized UDP DDoS attack. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
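Result 18's selection criterion, subsets in which no pair of features is significantly correlated under Pearson, Spearman, and Kendall simultaneously, can be sketched in a few lines of pandas; the threshold, subset size, and function name below are illustrative assumptions:

```python
from itertools import combinations
import pandas as pd

def uncorrelated_subsets(df: pd.DataFrame, k: int = 3, thresh: float = 0.3):
    """Return k-feature subsets whose every pair stays below `thresh` in
    absolute correlation under all three correlation methods."""
    corrs = [df.corr(method=m).abs() for m in ("pearson", "spearman", "kendall")]
    def weak(a, b):
        return all(c.loc[a, b] < thresh for c in corrs)
    # Exhaustive search: fine for tens of features, combinatorial beyond that.
    return [s for s in combinations(df.columns, k)
            if all(weak(a, b) for a, b in combinations(s, 2))]
```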
19. A COMPARATIVE EXPLORATION OF ACTIVATION FUNCTIONS FOR IMAGE CLASSIFICATION IN CONVOLUTIONAL NEURAL NETWORKS.
- Author
- MAKHDOOM, FAIZA and RAHMAN, JAMSHAID UL
- Subjects
- ARTIFICIAL neural networks, DEEP learning, ARTIFICIAL intelligence, MACHINE learning, DIGITAL image processing, COMPUTER vision
- Abstract
Activation functions play a crucial role in enabling neural networks to carry out tasks with increased flexibility by introducing non-linearity. The selection of appropriate activation functions becomes even more crucial, especially in the context of deeper networks where the objective is to learn more intricate patterns. Among various deep learning tools, Convolutional Neural Networks (CNNs) stand out for their exceptional ability to learn complex visual patterns. In practice, ReLU is commonly employed in convolutional layers of CNNs, yet other activation functions like Swish can demonstrate superior training performance while maintaining good testing accuracy on different datasets. This paper presents an optimally refined strategy for deep learning-based image classification tasks by incorporating CNNs with advanced activation functions and an adjustable setting of layers. A thorough analysis has been conducted to support the effectiveness of various activation functions when coupled with the favorable softmax loss, rendering them suitable for ensuring a stable training process. The results obtained on the CIFAR-10 dataset demonstrate the favorability and stability of the adopted strategy throughout the training process. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
20. Performance of Drought Indices on Maize Production in Northern Nigeria Using Artificial Neural Network Model.
- Author
- Adepoju, Adedayo A., Ogundunmade, Tayo P., and Adenuga, Grace O.
- Subjects
- ARTIFICIAL neural networks, DROUGHTS
- Abstract
Drought is widely known to put the ecosystem at risk. It ensues when a major rainfall shortage causes hydrological discrepancies and alters the land's productive structures. The degree of rainfall influences the growth and harvest of maize, particularly where irrigation is not practicable. In some parts of northern Nigeria, rainfall is unpredictable and often lower than the quantity needed for a viable crop. For the detection, classification, and control of drought conditions, drought indices are used. There has been notable progress in the last few years in modelling droughts using statistical or physical models. Despite the successes documented by most of these approaches, a plain, effective, and well-built statistical model is the artificial neural network (ANN). The use of artificial neural networks (ANN) to evaluate the impact of drought indices on maize output in the 17 northern Nigerian states is presented in this research. Observed annual data for a 25-year period from 1993 to 2018 were used, comprising the drought index RDI and the Palmer drought indices (SCPDSI, SCPHDI, and SCWLPM), as well as maize yield (measured in tonnes) in the northern states of Nigeria. The ANN model was evaluated using several activation functions (sigmoid, hyperbolic tangent, and rectified linear unit), hidden layers (1, 2, and 3), and training sets (70%, 80%, and 90%). The Mean Square Error (MSE) was employed to evaluate each ANN model's performance. In summary, most of the states' lowest mean square errors (MSEs) were obtained with ReLU. Also, as the training percentage increases, the mean square error increases. [ABSTRACT FROM AUTHOR]
- Published
- 2024
21. Dynamic base station allocation for 6G wireless networks through narrow neural network.
- Author
- Kamble, Pradnya and Shaikh, Alam N.
- Subjects
- ARTIFICIAL neural networks, ARTIFICIAL intelligence, TERAHERTZ technology, COMPUTER network traffic, WIRELESS communications, RESOURCE allocation
- Abstract
The 6G wireless communication system will utilize the terahertz (THz) frequency band (0.1-10 THz) to meet customer demand for increased data rates and ultra-high-speed communication in future applications. The exponential surge in data traffic must be supported by dynamic resource allocation. To address this challenge, artificial intelligence-based methods such as the narrow neural network (NNN) can help smooth the performance of the network. In this paper, an NNN-based approach for dynamic base station allocation in 6G wireless networks is proposed. Fourteen different 6G parameters were used to train the NNN model, initially achieving an accuracy of 89.5% and an F1 score of 0.72 for 200 users. Results demonstrate the efficacy of the proposed NNN approach for dynamic decision-making in 6G networks and its potential for application in other domains where similar problems exist. Moreover, the proposed narrow neural network model shows improved results with an increase in the number of users and a decrease in the number of fully connected layers and the regularization strength (lambda). Validation accuracies of 98.9% and 99.6% are achieved for one thousand users with a single fully connected layer, a linear (identity) activation function, and regularization strength (lambda) values of 0.01 and 0.001. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
22. Building Blocks
- Author
- Singh, Pradeep, Raman, Balasubramanian, Kacprzyk, Janusz, Series Editor, Singh, Pradeep, and Raman, Balasubramanian
- Published
- 2024
- Full Text
- View/download PDF
23. Hybrid Approach—Diabetic Retinopathy Classification Through Activation Function Optimization
- Author
- Hegde, Nikhil Venkatraman, Lewis, Jebon Tarun, Malghan, Rashmi Laxmikant, Kacprzyk, Janusz, Series Editor, Pal, Nikhil R., Advisory Editor, Bello Perez, Rafael, Advisory Editor, Corchado, Emilio S., Advisory Editor, Hagras, Hani, Advisory Editor, Kóczy, László T., Advisory Editor, Kreinovich, Vladik, Advisory Editor, Lin, Chin-Teng, Advisory Editor, Lu, Jie, Advisory Editor, Melin, Patricia, Advisory Editor, Nedjah, Nadia, Advisory Editor, Nguyen, Ngoc Thanh, Advisory Editor, Wang, Jun, Advisory Editor, Malik, Hasmat, editor, Mishra, Sukumar, editor, Sood, Y. R., editor, García Márquez, Fausto Pedro, editor, and Ustun, Taha Selim, editor
- Published
- 2024
- Full Text
- View/download PDF
24. Can Monetary Policy Uncertainty Predict Exchange Rate Volatility? New Evidence from Hybrid Neural Network−GARCH Model
- Author
- Maneejuk, Parevee, Chitkasame, Terdthiti, Klinlampu, Chaiwat, Rakpho, Pichayakone, Kacprzyk, Janusz, Series Editor, Novikov, Dmitry A., Editorial Board Member, Shi, Peng, Editorial Board Member, Cao, Jinde, Editorial Board Member, Polycarpou, Marios, Editorial Board Member, Pedrycz, Witold, Editorial Board Member, Kreinovich, Vladik, editor, Yamaka, Woraphon, editor, and Leurcharusmee, Supanika, editor
- Published
- 2024
- Full Text
- View/download PDF
25. Neural Networks and Deep Learning
- Author
- Hashemi, Amin, Dowlatshahi, Mohammad Bagher, Kulkarni, Anand J, Section editor, Kulkarni, Anand J., editor, and Gandomi, Amir H., editor
- Published
- 2024
- Full Text
- View/download PDF
26. Cross-Relational Reasoning for Neural Tensor Networks
- Author
- Falck, Tristan, Coulter, Duncan, Rannenberg, Kai, Editor-in-Chief, Soares Barbosa, Luís, Editorial Board Member, Carette, Jacques, Editorial Board Member, Tatnall, Arthur, Editorial Board Member, Neuhold, Erich J., Editorial Board Member, Stiller, Burkhard, Editorial Board Member, Stettner, Lukasz, Editorial Board Member, Pries-Heje, Jan, Editorial Board Member, Kreps, David, Editorial Board Member, Rettberg, Achim, Editorial Board Member, Furnell, Steven, Editorial Board Member, Mercier-Laurent, Eunika, Editorial Board Member, Winckler, Marco, Editorial Board Member, Malaka, Rainer, Editorial Board Member, Maglogiannis, Ilias, editor, Iliadis, Lazaros, editor, Macintyre, John, editor, Avlonitis, Markos, editor, and Papaleonidas, Antonios, editor
- Published
- 2024
- Full Text
- View/download PDF
27. Bits and Beats: Computing Rhythmic Information as Bitwise Operations Optimized for Machine Learning
- Author
- Gualda, Fernando, Hartmanis, Juris, Founding Editor, Goos, Gerhard, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Noll, Thomas, editor, Montiel, Mariana, editor, Gómez, Francisco, editor, Hamido, Omar Costa, editor, Besada, José Luis, editor, and Martins, José Oliveira, editor
- Published
- 2024
- Full Text
- View/download PDF
28. Importance of the Activation Function in Extreme Learning Machine for Acid Sulfate Soil Classification
- Author
- Estévez, Virginia, Mattbäck, Stefan, Björk, Kaj-Mikael, Lim, Meng-Hiot, Series Editor, and Björk, Kaj-Mikael, editor
- Published
- 2024
- Full Text
- View/download PDF
29. Exploring the Efficiency of Neural Networks for Solving Dynamic Process Problems: The Fisher Equation Investigation
- Author
- Karachurin, Raul, Ladygin, Stanislav, Ryabov, Pavel, Shilnikov, Kirill, Kudryashov, Nikolay, Kacprzyk, Janusz, Series Editor, Samsonovich, Alexei V., editor, and Liu, Tingting, editor
- Published
- 2024
- Full Text
- View/download PDF
30. Filtering Approaches and Mish Activation Function Applied on Handwritten Chinese Character Recognition
- Author
- Zhong Yingna, Kauthar Mohd Daud, Kohbalan Moorthy, and Ain Najiha Mohamad Nor
- Subjects
- activation functions, convolutional neural network, CNN, filtering approaches, machine learning, Electronic computers. Computer science, QA75.5-76.95
- Abstract
Handwritten Chinese Characters (HCC) have recently received much attention as a global means of exchanging information and knowledge. The start of the information age has increased the number of paper documents that must be electronically saved and shared. The recognition accuracy of online handwritten Chinese characters has reached its limit, as online characters are more straightforward than offline characters; furthermore, online character recognition enables stronger involvement and flexibility than offline recognition. Deep learning techniques such as convolutional neural networks (CNN) have superseded conventional Handwritten Chinese Character Recognition (HCCR) solutions, as proven in image identification. Nonetheless, because of the large number of similar characters and styles, there is still an opportunity to improve the present recognition accuracy by adopting different activation functions, including Mish, Sigmoid, Tanh, and ReLU. The main goal of this study is to apply the filter and activation function that have the best impact on the recognition system, to improve the performance of the recognition CNN model. In this study, we implemented different filtering techniques and activation functions in a CNN applied to offline Chinese characters to understand their effects on the model's recognition outcome. Two CNN layers are proposed, given that fewer-layer CNNs achieve comparable performance. The results demonstrate that the Wiener filter yields better recognition performance than the median and average filters. Furthermore, the Mish activation function performs better than the Sigmoid, Tanh, and ReLU functions.
- Published
- 2024
- Full Text
- View/download PDF
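For reference, the Mish function that result 30 finds superior to Sigmoid, Tanh, and ReLU has the standard closed form Mish(x) = x · tanh(softplus(x)); a numerically stable sketch:

```python
import numpy as np

def softplus(x):
    # Stable softplus: log(1 + e^x) = max(x, 0) + log1p(e^(-|x|))
    return np.maximum(x, 0.0) + np.log1p(np.exp(-np.abs(x)))

def mish(x):
    """Mish(x) = x * tanh(softplus(x)): smooth, non-monotonic, unbounded above."""
    x = np.asarray(x, dtype=float)
    return x * np.tanh(softplus(x))
```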
31. Advancing brain tumor detection with neurofusion: an innovative CNN-LSTM model featuring a novel activation function
- Author
- Rawat, Usha and Rai, C. S.
- Published
- 2024
- Full Text
- View/download PDF
32. Data-driven solitons dynamics and parameters discovery in the generalized nonlinear dispersive mKdV-type equation via deep neural networks learning.
- Author
- Wang, Xiaoli, Han, Wenjing, Wu, Zekang, and Yan, Zhenya
- Abstract
In this paper, we study the dynamics of data-driven solutions and identify the unknown parameters of the nonlinear dispersive modified KdV-type (mKdV-type) equation based on physics-informed neural networks (PINNs). Specifically, we learn the soliton solution, the combination of a soliton and an anti-soliton solution, the combination of two solitons and one anti-soliton solution, and the combination of two solitons and two anti-solitons solution of the mKdV-type equation by two different transformations. Meanwhile, we learn the data-driven kink solution, peakon solution, and periodic solution using the PINNs method. By utilizing image simulations, we conduct a detailed analysis of the nonlinear dynamical behaviors of the aforementioned solutions in the spatial-temporal domain. Our findings indicate that the PINNs method solves the mKdV-type equation with relative errors of O(10^-3) or O(10^-4) for the multi-soliton and kink solutions, respectively, while relative errors for the peakon and periodic solutions reach O(10^-2). In addition, the tanh function has the best training effect among eight common activation functions (ReLU(x), ELU(x), SiLU(x), sigmoid(x), swish(x), sin(x), cos(x), and tanh(x)). For the inverse problem, we invert the soliton solution and identify the unknown parameters with relative errors reaching O(10^-2) or O(10^-3). Furthermore, we discover that adding appropriate noise to the initial condition enhances the robustness of the model. Our research results are crucial for understanding phenomena such as interactions in travelling waves, aiding in the discovery of physical processes and dynamic features in nonlinear systems, which have significant implications in fields such as nonlinear optics and plasma physics. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
33. Comparison of neural networks techniques to predict subsurface parameters based on seismic inversion: a machine learning approach.
- Author
- Verma, Nitin, Maurya, S. P., Kant, Ravi, Singh, K. H., Singh, Raghav, Singh, A. P., Hema, G., Srivastava, M. K., Tiwari, Alok K., Kushwaha, P. K., and Singh, Richa
- Subjects
- ARTIFICIAL neural networks, MACHINE learning, FEEDFORWARD neural networks, RADIAL basis functions, ACOUSTIC impedance, PARAMETER estimation
- Abstract
Seismic inversion, complemented by machine learning algorithms, significantly improves the accuracy and efficiency of subsurface parameter estimation from seismic data. In this comprehensive study, a comparative analysis of machine learning techniques is conducted to predict subsurface parameters within the inter-well region. The objective involves employing three separate machine learning algorithms, namely the Probabilistic Neural Network (PNN), multilayer feedforward neural network (MLFNN), and Radial Basis Function Neural Network (RBFNN). The study commences by generating synthetic data, which is then subjected to machine learning techniques for inversion into subsurface parameters. The results unveil exceptionally detailed subsurface information across various methods. Subsequently, these algorithms are applied to real data from the Blackfoot field in Canada to predict porosity, density, and P-wave velocity within the inter-well region. The inverted results exhibit a remarkable alignment with well-log parameters, achieving average correlations of 0.75, 0.77, and 0.86 for the MLFNN, RBFNN, and PNN algorithms, respectively. The inverted volumes portray a consistent pattern of impedance variations spanning 7000–18000 (m/s)·(g/cc), porosity ranging from 5 to 20%, and density within the range of 1.9–2.9 g/cc across the region. Importantly, all these methods yield mutually corroborative results, with PNN displaying a slight edge in estimation precision. Additionally, the interpretation of the inverted findings highlights anomalous zones characterized by low impedance, low density, and high porosity, seamlessly aligning with well-log data and identified as a sand channel. This study underscores the potential for seismic inversion, driven by machine learning techniques, to swiftly and cost-effectively determine critical subsurface parameters like acoustic impedance and porosity. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
34. Fast deep learning with tight frame wavelets.
- Author
- Cao, Haitao
- Subjects
- ARTIFICIAL neural networks, DEEP learning, DERIVATIVES (Mathematics), COST functions, WAVELETS (Mathematics), ENERGY transfer
- Abstract
The vanishing or exploding cost-function gradient problem and slow convergence are key issues when training deep neural networks (DNNs). In this paper, we investigate the forward and backward propagation processes of DNN training and explore the properties of the activation function and derivative function (ADF) employed. Keeping the output distribution of the ADF near zero mean is proposed to reduce gradient problems, and maintaining constant energy transfer of the propagating data during training is proposed to further speed up convergence. Based on wavelet frame theory, we derive a novel ADF, i.e., the tight frame wavelet activation function (TFWAF) and tight frame wavelet derivative function (TFWDF) of the Mexican hat wavelet, to stabilize and accelerate DNN training. The nonlinearity of wavelet functions can strengthen the learning capacity of DNN models, while the sparsity of the derived wavelets can reduce overfitting and enhance the robustness of models. Experiments demonstrate that the proposed method stabilizes the DNN training process and accelerates convergence. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
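Result 34 builds both the activation and its derivative from the Mexican hat wavelet; in unnormalized form these are as below (the paper's tight-frame scaling is omitted, and the function names are ours):

```python
import numpy as np

def mexican_hat(x):
    """Mexican hat (Ricker) wavelet, unnormalized: (1 - x^2) * exp(-x^2 / 2)."""
    x = np.asarray(x, dtype=float)
    return (1.0 - x * x) * np.exp(-0.5 * x * x)

def mexican_hat_prime(x):
    # d/dx [(1 - x^2) e^(-x^2/2)] = (x^3 - 3x) e^(-x^2/2)
    x = np.asarray(x, dtype=float)
    return (x ** 3 - 3.0 * x) * np.exp(-0.5 * x * x)
```

The near-zero mean and rapid decay of this pair are exactly the ADF properties the abstract argues reduce gradient problems.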
35. Analyzing Activation Functions With Transfer Learning-Based Layer Customization for Improved Brain Tumor Classification
- Author
- Soumyarashmi Panigrahi, Dibya Ranjan Das Adhikary, and Binod Kumar Pattanayak
- Subjects
- Activation functions, brain tumor, pre-trained models, GELU, Mish, SELU, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
- Abstract
Brain tumors pose a significant global health concern, requiring early and accurate detection for effective treatment. Our study presents a binary brain tumor classification architecture leveraging Deep Neural Network (DNN) pre-trained models to reduce misclassification rates. We modified five Convolutional Neural Network (CNN) models using Transfer Learning (TL) and evaluated the effects of seven different Activation Functions (AFs). Our proposed architecture was trained, tested, and validated using the "Br35H: Brain Tumor Detection 2020" dataset. The results show that our modified DenseNet121 with the Swish AF achieves the best classification performance, with a balanced test accuracy of 99.14% and high scores in Area Under the Curve (AUC), Cohen's Kappa, Precision, Recall, F1-Score, and Specificity. The proposed architecture also demonstrates practical value in improving medical outcomes, enabling radiologists to focus on complex cases and patient care. It also reduces manual classification time and effort, leading to cost savings for healthcare facilitators. Our study highlights the potential of DNNs in brain tumor classification, paving the way for advancements in medical imaging and healthcare technology. The proposed architecture can be adapted for various medical imaging tasks, making it a valuable tool for medical professionals and contributing to improved patient outcomes and enhanced healthcare efficiency.
- Published
- 2024
- Full Text
- View/download PDF
36. Enhancing neural network classification using fractional-order activation functions
- Author
- Meshach Kumar, Utkal Mehta, and Giansalvo Cirrincione
- Subjects
- Fractional calculus, Neural networks, Classification, Multilayer perceptron, Activation functions, Accuracy, Electronic computers. Computer science, QA75.5-76.95
- Abstract
In this paper, a series of novel activation functions is presented, derived using the improved Riemann–Liouville conformable fractional derivative (RLCFD). This study investigates the use of fractional activation functions in Multilayer Perceptron (MLP) models and their impact on the performance of classification tasks, verified using the IRIS, MNIST, and FMNIST datasets. Fractional activation functions introduce a non-integer power exponent, allowing for improved capturing of complex patterns and representations. The experiment compares MLP models employing fractional activation functions, such as the fractional sigmoid, hyperbolic tangent, and rectified linear units, against traditional models using standard activation functions, their improved versions, and existing fractional functions. The numerical studies have confirmed the theoretical observations mentioned in the paper. The findings highlight the potential of the new functions as a valuable tool for deep learning classification. The study suggests that incorporating fractional activation functions in MLP architectures can lead to superior accuracy and robustness.
- Published
- 2024
- Full Text
- View/download PDF
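The abstract of result 36 notes that fractional activation functions "introduce a non-integer power exponent." As a purely illustrative stand-in for that idea (not the paper's RLCFD-derived functions), a fractional-power ReLU looks like:

```python
import numpy as np

def fractional_relu(x, alpha=0.8):
    """ReLU with a non-integer power exponent: x**alpha for x > 0, else 0.
    Illustrative only; the paper derives its functions from an improved
    Riemann-Liouville conformable fractional derivative (RLCFD)."""
    x = np.asarray(x, dtype=float)
    return np.where(x > 0, np.maximum(x, 0.0) ** alpha, 0.0)
```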
37. Conditional random k satisfiability modeling for k = 1, 2 (CRAN2SAT) with non-monotonic Smish activation function in discrete Hopfield neural network
- Author
- Nurshazneem Roslan, Saratha Sathasivam, and Farah Liyana Azizan
- Subjects
- discrete Hopfield neural network (DHNN), conditional random two satisfiability, non-systematic logic, activation functions, Smish activation function, potential logic mining, Mathematics, QA1-939
- Abstract
The current development of logic satisfiability in discrete Hopfield neural networks (DHNN) has been segregated into systematic logic and non-systematic logic. Most research tends to improve non-systematic logical rules to various extents, such as introducing the ratio of negative literals and a flexible hybrid logical structure that combines systematic and non-systematic structures. However, the existing non-systematic logical rule exhibits a drawback concerning the impact of negative literals within the logical structure. Therefore, this paper presents a novel class of non-systematic logic called conditional random k satisfiability for k = 1, 2, which intentionally disregards both positive literals in second-order clauses. The proposed logic was embedded into the discrete Hopfield neural network with the ultimate goal of minimizing the cost function. Moreover, a novel non-monotonic Smish activation function is introduced with the aim of enhancing the quality of the final neuronal state. The performance of the proposed logic with the new activation function was compared with other state-of-the-art logical rules in conjunction with five different types of activation functions. Based on the findings, the proposed logic obtained a lower learning error, the highest total neuron variation (TV = 857), and the lowest average Jaccard index (JSI = 0.5802). On top of that, the Smish activation function demonstrates its capability in the DHNN based on the ratio of improvement Zm and TV: the ratio of improvement for Smish is consistently the highest across all types of activation functions, showing that Smish outperforms the others in terms of Zm and TV. This new development of a logical rule with the non-monotonic Smish activation function presents an alternative strategy for the logic mining technique. This finding will be of particular interest to researchers in artificial neural networks, logic satisfiability in DHNN, and activation functions.
- Published
- 2024
- Full Text
- View/download PDF
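The Smish function used in result 37 (and in its duplicate record, result 44) has the published closed form Smish(x) = x · tanh(ln(1 + sigmoid(x))); a direct sketch:

```python
import numpy as np

def smish(x):
    """Smish(x) = x * tanh(ln(1 + sigmoid(x))): smooth, non-monotonic,
    with a small bounded negative lobe."""
    x = np.asarray(x, dtype=float)
    sig = 1.0 / (1.0 + np.exp(-x))
    return x * np.tanh(np.log1p(sig))
```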
38. SWAG: A Novel Neural Network Architecture Leveraging Polynomial Activation Functions for Enhanced Deep Learning Efficiency
- Author
- Saeid Safaei, Zerotti Woods, Khaled Rasheed, Thiab R. Taha, Vahid Safaei, Juan B. Gutierrez, and Hamid R. Arabnia
- Subjects
- Activation functions, factorial coefficient, neural network design, polynomial activation function, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
- Abstract
Deep learning techniques have demonstrated significant capabilities across numerous applications, with deep neural networks (DNNs) showing promising results. However, training these networks efficiently, especially when determining the most suitable nonlinear activation functions, remains a significant challenge. While the ReLU activation function has been widely adopted, other hand-designed functions have been proposed; one such approach is trainable activation functions. This paper introduces a novel neural network design, SWAG. In this structure, rather than evolving during training, the activation functions consistently form a polynomial basis. Each hidden layer in this architecture comprises k sub-layers that use polynomial activation functions adjusted by a factorial coefficient, followed by a concatenation layer and a layer employing a linear activation function. Leveraging the Stone-Weierstrass approximation theorem, we demonstrate that utilizing a diverse set of polynomial activation functions allows neural networks to retain universal approximation capabilities. The SWAG algorithm's architecture is then presented, with an emphasis on data normalization, and a new optimized version of SWAG is proposed that reduces the computational challenge of managing higher degrees of input. This optimization harnesses the Taylor series method by utilizing lower-degree terms to compute higher-degree terms efficiently. This paper thus contributes an innovative neural network architecture that optimizes polynomial activation functions, promising more efficient and robust deep learning applications.
- Published
- 2024
- Full Text
- View/download PDF
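The SWAG layer of result 38 is described concretely enough to sketch: k dense sub-layers activated by x^j / j!, a concatenation, then a linear layer. A minimal PyTorch rendering (the dimensions and class name are our choices; inputs should be normalized so the powers stay well-scaled, as the abstract emphasizes):

```python
import math
import torch
import torch.nn as nn

class SWAGBlock(nn.Module):
    """One SWAG hidden layer: k dense sub-layers with activations
    x**j / j! (j = 1..k), concatenated, then a linear dense layer."""
    def __init__(self, d_in, d_sub, d_out, k=4):
        super().__init__()
        self.subs = nn.ModuleList([nn.Linear(d_in, d_sub) for _ in range(k)])
        self.out = nn.Linear(k * d_sub, d_out)  # linear activation

    def forward(self, x):
        feats = [self.subs[j](x).pow(j + 1) / math.factorial(j + 1)
                 for j in range(len(self.subs))]
        return self.out(torch.cat(feats, dim=-1))
```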
39. Introducing activation functions into segmented regression model to address lag effects of interventions
- Author
- Xiangliang Zhang, Kunpeng Wu, Yan Pan, Wenfang Zhong, Yixiang Zhou, Tingting Guo, Rong Yin, and Wen Chen
- Subjects
- Intervention evaluation, Interrupted time series, Activation functions, Segmented regression, Statistical methods, Simulation study, Medicine (General), R5-920
- Abstract
The interrupted time series (ITS) design is widely used to examine the effects of large-scale public health interventions and has the highest level of evidence validity. However, there is a notable gap regarding methods that account for lag effects of interventions. To address this, we introduced activation functions (ReLU and Sigmoid) into the classic segmented regression (CSR) of the ITS design during the lag period, leading to the proposal of an optimized segmented regression (OSR), namely OSR-ReLU and OSR-Sig. To compare the performance of the models, we simulated data under multiple scenarios, including positive or negative impacts of interventions, linear or nonlinear lag patterns, different lag lengths, and different degrees of fluctuation in the outcome time series. Based on the simulated data, we examined the bias, mean relative error (MRE), mean square error (MSE), mean width of the 95% confidence interval (CI), and coverage rate of the 95% CI for the long-term impact estimates of interventions among the different models. OSR-ReLU and OSR-Sig yielded approximately unbiased estimates of the long-term impacts across all scenarios, whereas CSR did not. In terms of accuracy, OSR-ReLU and OSR-Sig outperformed CSR, exhibiting lower MRE and MSE values. With increasing lag length, the optimized models provided robust estimates of long-term impacts. Regarding precision, OSR-ReLU and OSR-Sig surpassed CSR, demonstrating narrower mean widths of the 95% CI and higher coverage rates. Our optimized models are powerful tools, as they can model the lag effects of interventions and provide more accurate and precise estimates of the long-term impact of interventions. The introduction of activation functions provides new ideas for improving the CSR model.
- Published
- 2023
- Full Text
- View/download PDF
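Result 39's core idea, ramping the intervention terms of classic segmented regression through an activation during the lag period, can be sketched with a clipped-ReLU ramp; the exact parameterization in the paper may differ, and t0 and lag below are illustrative:

```python
import numpy as np
import statsmodels.api as sm

def osr_relu_design(t, t0, lag):
    """Design matrix for a segmented regression whose level-change term is
    ramped from 0 to 1 by a clipped ReLU over the lag period."""
    t = np.asarray(t, dtype=float)
    ramp = np.clip((t - t0) / lag, 0.0, 1.0)   # 0 before t0, 1 after the lag
    slope = np.maximum(t - (t0 + lag), 0.0)    # post-lag slope change
    return sm.add_constant(np.column_stack([t, ramp, slope]))

# Fit, e.g.: sm.OLS(y, osr_relu_design(t, t0=60, lag=6)).fit()
```

Swapping the clipped ramp for a sigmoid of (t − t0) gives the OSR-Sig analogue.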
40. Modified state activation functions of deep learning-based SC-FDMA channel equalization system
- Author
- Mohamed A. Mohamed, Hassan A. Hassan, Mohamed H. Essai, Hamada Esmaiel, Ahmed S. Mubarak, and Osama A. Omer
- Subjects
- Activation functions, Deep artificial neural networks, Deep learning, Channel equalization, Symbol detection, Long-short-term memory, Telecommunication, TK5101-6720, Electronics, TK7800-8360
- Abstract
The most important function of deep learning (DL) channel equalization and symbol detection systems is the ability to predict the user's original transmitted data. Generally, the behavior and performance of deep artificial neural networks (DANNs) rely on three main aspects: the network structure, the learning algorithms, and the activation functions (AFs) used in each node of the network. Long short-term memory (LSTM) recurrent neural networks have shown some success in channel equalization and symbol detection. The AFs used in a DANN play a significant role in how the learning algorithms converge. Our article shows how modifying the AFs used in the tanh units (block input and output) of the LSTM units can significantly boost the DL equalizer's performance. Additionally, the learning process of the DL model was optimized with the help of two distinct error-measuring functions: the default (cross-entropy) and the sum of squared errors (SSE). The DL model's performance with different AFs is compared using three distinct learning algorithms: Adam, RMSProp, and SGdm. The findings clearly demonstrate that the most frequently used AFs (the sigmoid and hyperbolic tangent functions) do not contribute significantly to optimal network behavior in channel equalization; on the contrary, many less common AFs can outperform the frequently employed ones. Furthermore, the outcomes demonstrate that the recommended loss function (SSE) exhibits superior performance in addressing the channel equalization challenge compared to the default loss function (cross-entropy).
- Published
- 2023
- Full Text
- View/download PDF
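Result 40 modifies the tanh nonlinearities at the LSTM block input and block output while leaving the sigmoid gates untouched; in a hand-rolled cell that is a one-argument swap (a sketch, with class and argument names of our choosing):

```python
import torch
import torch.nn as nn

class SwappableLSTMCell(nn.Module):
    """LSTM cell whose block-input/output activations (normally tanh)
    can be replaced, while the three sigmoid gates stay fixed."""
    def __init__(self, input_size, hidden_size, act=torch.tanh):
        super().__init__()
        self.act = act
        self.x2h = nn.Linear(input_size, 4 * hidden_size)
        self.h2h = nn.Linear(hidden_size, 4 * hidden_size)

    def forward(self, x, state):
        h, c = state
        i, f, g, o = (self.x2h(x) + self.h2h(h)).chunk(4, dim=-1)
        i, f, o = torch.sigmoid(i), torch.sigmoid(f), torch.sigmoid(o)
        c = f * c + i * self.act(g)  # block input: tanh -> swappable
        h = o * self.act(c)          # block output: tanh -> swappable
        return h, c
```

Passing, e.g., act=torch.nn.functional.softsign reproduces the kind of substitution the article evaluates.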
41. Mitigating bias through random activation function selection.
- Author
- Locke, James M., Paradice, David, and Rainer, R. Kelly
- Subjects
- MACHINE learning, ARTIFICIAL intelligence
- Abstract
Neural networks have evolved into strong and dependable machine learning systems. However, training these systems requires human intervention in selecting neural network parameters and evaluating results. This human intervention exposes the training of a neural network to human bias. One key task in neural network learning success is selecting optimal activation functions for the hidden and output layers. Activation functions are hyperparameters that cannot be adjusted during learning. By programming a neural network to repeatedly monitor the loss function to sense learning decline, an underperforming activation function can be discarded at an appropriate point in favor of another one until convergence, thus minimizing human intervention and potential bias. The results of this study indicate that parameterizing activation functions can improve neural network accuracy while achieving convergence. Swapping activation functions during training offers solutions to both the user bias problem and the risk of suboptimal learning. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
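Result 41's mechanism, monitoring the loss and discarding an underperforming activation in favor of a randomly selected one, reduces to a small training-loop hook; the plateau rule, candidate pool, and attribute name below are illustrative assumptions:

```python
import random
import torch.nn as nn

ACT_POOL = [nn.ReLU(), nn.Tanh(), nn.SiLU(), nn.ELU()]  # candidate activations

def maybe_swap_activation(model, losses, patience=5, tol=1e-3):
    """Swap `model.act` for a random candidate when the loss has not
    improved by at least `tol` over the last `patience` epochs (assumes
    the model's forward pass routes hidden layers through `model.act`)."""
    if len(losses) > patience and \
            min(losses[-patience:]) > min(losses[:-patience]) - tol:
        model.act = random.choice(ACT_POOL)
```

Called once per epoch with the running loss history, this keeps human intervention, and hence the bias the article targets, out of the activation choice.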
42. Revisiting activation functions: empirical evaluation for image understanding and classification.
- Author
- Verma, Shradha, Chug, Anuradha, and Singh, Amit Prakash
- Abstract
In this paper, the authors have devised four novel activation functions by coupling and combining a few existing functions, implemented with four standard CNN architectures, namely VGG19, ResNet50, InceptionV3, and DenseNet121, on the existing benchmark datasets CIFAR10, CIFAR100, and MNIST. The best-performing function is also implemented with EfficientNet (B0-B7) models on the abovementioned datasets, with improved performance as compared to the Swish function. Apart from classification accuracy, measures such as top-5 accuracy, Cohen's kappa, precision, recall, f1-score, RMSE, MSE, and loss have been recorded. The authors have also implemented Monte-Carlo Dropout to evaluate the uncertainty of the implementations in terms of Brier Score, mean, variance, standard deviation, and peak accuracy values. Lastly, a few instances of classification with the Tiny-ImageNet dataset, semantic segmentation, object detection, and hyperspectral image classification have been implemented for evaluation and comparison purposes. The research work presented in this paper aims to experiment with combinations of different activation functions, analyze the performance, and find a better version that improves the performance of DNNs for the task of image understanding and classification. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
43. Improving the Prediction Accuracy of MRI Brain Tumor Detection and Segmentation.
- Author
- Padmapriya, S. T., Chandrakumar, T., and Kalaiselvi, T.
- Subjects
- BRAIN tumors, ARTIFICIAL neural networks, MAGNETIC resonance imaging
- Abstract
Brain tumors are the most common kind of tumor in humans and can be detected with various imaging technologies. The proposed research work strives to improve the prediction accuracy of brain tumor detection and segmentation from MRI scans of the human head by using a novel activation function, E-Tanh. The role of activation functions is to perform computations and make decisions in artificial neural networks (ANN). We developed three ANN models for brain tumor detection by modifying the hidden layers, trained these models using the E-Tanh activation function, and evaluated their performance. This novel activation function achieved 98% prediction accuracy for the MRI brain tumor detection neural network model, higher than the existing activation functions. We have also segmented brain tumors from the BraTS2020 dataset by using this activation function in a U-Net-based architecture, attaining Dice scores of 83%, 95%, and 85% for the whole, core, and enhancing tumors, which are significantly higher than those obtained with the ReLU activation function. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
44. Conditional random k satisfiability modeling for k =1,2 (CRAN2SAT) with non-monotonic Smish activation function in discrete Hopfield neural network.
- Author
- Roslan, Nurshazneem, Sathasivam, Saratha, and Azizan, Liyana
- Subjects
- HOPFIELD networks, ARTIFICIAL neural networks, COST functions, CONDITIONALS (Logic)
- Abstract
The current development of logic satisfiability in discrete Hopfield neural networks (DHNN) has been segregated into systematic logic and non-systematic logic. Most research tends to improve non-systematic logical rules to various extents, such as introducing the ratio of negative literals and a flexible hybrid logical structure that combines systematic and non-systematic structures. However, the existing non-systematic logical rule exhibits a drawback concerning the impact of negative literals within the logical structure. Therefore, this paper presents a novel class of non-systematic logic called conditional random k satisfiability for k = 1, 2, which intentionally disregards both positive literals in second-order clauses. The proposed logic was embedded into the discrete Hopfield neural network with the ultimate goal of minimizing the cost function. Moreover, a novel non-monotonic Smish activation function is introduced with the aim of enhancing the quality of the final neuronal state. The performance of the proposed logic with the new activation function was compared with other state-of-the-art logical rules in conjunction with five different types of activation functions. Based on the findings, the proposed logic obtained a lower learning error, the highest total neuron variation (TV = 857), and the lowest average Jaccard index (JSI = 0.5802). On top of that, the Smish activation function demonstrates its capability in the DHNN based on the ratio of improvement Zm and TV: the ratio of improvement for Smish is consistently the highest across all types of activation functions, showing that Smish outperforms the others in terms of Zm and TV. This new development of a logical rule with the non-monotonic Smish activation function presents an alternative strategy for the logic mining technique. This finding will be of particular interest to researchers in artificial neural networks, logic satisfiability in DHNN, and activation functions. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
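As a companion to the entry above, here is a minimal Python sketch of the Smish activation, assuming the definition commonly used in the literature, smish(x) = x · tanh(ln(1 + sigmoid(x))); the DHNN and logic-embedding machinery of the paper are not reproduced.

```python
import torch

def smish(x: torch.Tensor) -> torch.Tensor:
    """Non-monotonic Smish activation: x * tanh(ln(1 + sigmoid(x)))."""
    return x * torch.tanh(torch.log1p(torch.sigmoid(x)))

# Smish is non-monotonic: it dips slightly below zero for negative inputs
# before saturating, unlike monotonic functions such as tanh or sigmoid.
xs = torch.linspace(-5.0, 5.0, 5)
print(smish(xs))
```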
45. The effect of activation functions on accuracy, convergence speed, and misclassification confidence in CNN text classification: a comprehensive exploration.
- Author
-
Emanuel, Rebecca H. K., Docherty, Paul D., Lunt, Helen, and Möller, Knut
- Subjects
- *
CONVOLUTIONAL neural networks, *CONFIDENCE - Abstract
Convolutional neural networks (CNNs) have become a useful tool for a wide range of applications such as text classification. However, CNNs are not always sufficiently accurate for certain applications, and the choice of activation functions within a CNN architecture can affect its efficacy; there is limited research regarding which activation functions are best for CNN text classification. This study tested sixteen activation functions across three text classification datasets and six CNN structures to determine the effects of the activation function on accuracy, iterations to convergence, and Positive Confidence Difference (PCD). PCD is a novel metric introduced to compare how activation functions affect a network's classification confidence. Tables compare the performance of the activation functions across the different CNN architectures and datasets. Top-performing activation functions across the tests included the symmetrical multi-state activation function, sigmoid, penalised hyperbolic tangent, and generalised swish. An activation function's PCD was the most consistent evaluation metric during assessment, implying a close relationship between activation functions and network confidence that has yet to be explored. [ABSTRACT FROM AUTHOR] (see the sketch after this record)
- Published
- 2024
- Full Text
- View/download PDF
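The abstract above does not define PCD precisely, so this Python sketch encodes one plausible reading as an explicit assumption: the mean top-class confidence on correctly classified samples minus the mean top-class confidence on misclassified samples.

```python
import torch

def positive_confidence_difference(probs: torch.Tensor,
                                   labels: torch.Tensor) -> torch.Tensor:
    """Hypothetical reading of PCD: mean top-class confidence on correct
    predictions minus mean top-class confidence on misclassifications.
    Assumes at least one correct and one incorrect prediction."""
    conf, pred = probs.max(dim=1)      # top-class confidence and prediction
    correct = pred == labels
    return conf[correct].mean() - conf[~correct].mean()

probs = torch.softmax(torch.randn(100, 4), dim=1)   # dummy CNN outputs
labels = torch.randint(0, 4, (100,))
print(positive_confidence_difference(probs, labels))
```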
46. A unified and constructive framework for the universality of neural networks.
- Author
-
Bui-Thanh, Tan
- Subjects
- *
NUMERICAL analysis, *FUNCTION spaces, *PROBABILITY theory, *CONTINUOUS functions, *NEURONS, *FUNCTIONAL analysis - Abstract
One of the reasons why many neural networks are capable of replicating complicated tasks or functions is their universal approximation property. Though the past few decades have seen tremendous advances in the theory of neural networks, a single constructive and elementary framework for neural network universality remains unavailable. This paper is an effort to provide a unified and constructive framework for the universality of a large class of activation functions, including most of the existing ones. At the heart of the framework is the concept of neural network approximate identity (nAI). The main result is as follows: any nAI activation function is universal in the space of continuous functions on compacta. It turns out that most of the existing activation functions are nAI, and thus universal. The framework offers several advantages over contemporary counterparts. First, it is constructive, using elementary means from functional analysis, probability theory, and numerical analysis. Second, it is one of the first unified and constructive attempts valid for most of the existing activation functions. Third, it provides new proofs for most activation functions. Fourth, for a given activation and error tolerance, the framework provides precisely the architecture of the corresponding one-hidden-layer neural network with a predetermined number of neurons and the values of the weights and biases. Fifth, the framework allows us to abstractly present the first universal approximation with a favorable non-asymptotic rate. Sixth, the framework provides insights into, and hence constructive derivations of, some of the existing approaches. [ABSTRACT FROM AUTHOR] (see the worked statement after this record)
- Published
- 2024
- Full Text
- View/download PDF
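The nAI condition is specific to the paper above, but the shape of the result it generalizes is classical; as a hedged orientation, the one-hidden-layer universal approximation statement (Cybenko/Hornik form) can be written as:

```latex
% Classical one-hidden-layer universal approximation; the paper's nAI
% condition replaces the classical hypotheses on the activation \sigma.
\begin{theorem}[Universal approximation on compacta]
Let $K \subset \mathbb{R}^d$ be compact and let $\sigma$ be an admissible
activation. For every $f \in C(K)$ and every $\varepsilon > 0$ there exist
$N \in \mathbb{N}$, coefficients $a_i, b_i \in \mathbb{R}$, and weights
$w_i \in \mathbb{R}^d$ such that
\[
  \sup_{x \in K}\,\Bigl|\, f(x) - \sum_{i=1}^{N} a_i\,
    \sigma\bigl(w_i^{\top} x + b_i\bigr) \Bigr| < \varepsilon .
\]
\end{theorem}
```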
47. Optimizing activation functions and hidden neurons in Backpropagation neural networks for real-time NOx concentration prediction.
- Author
-
Chang Duan and Gongping Mao
- Subjects
- *
ARTIFICIAL neural networks, *STANDARD deviations, *NEURONS - Abstract
The Backpropagation (BP) neural network model is optimized in this research to predict NOx concentrations under dynamic conditions. Various activation functions (elliotsig, logsig, poslin, radbas, satlin, satlins, and tansig) were explored, and hidden neuron counts were adjusted to determine the optimal configuration for maximal prediction accuracy. Of the 8791 dataset entries, 7033 were used for training and 1758 for testing. The model was evaluated using four primary metrics: Mean Absolute Error (MAE), Root Mean Square Error (RMSE), the coefficient of determination (R²), and Model Efficiency (ME). Activation functions had a significant impact on the computational time of the BP model: the satlins, poslin, and satlin functions operated swiftly, with a running time of approximately 11 ms for an ANN model with 100 neurons, while the computational complexity of logsig, tansig, and radbas made their computation time more than three times longer. Overall, this research illustrates the effectiveness of the BP neural network in predicting NOx concentrations when optimized with the appropriate architecture, and it highlights the elliotsig function for its combination of accuracy and efficiency. Within the microcontroller's 10 ms operating window, the elliotsig function with 68 hidden neurons produced the highest R² and ME values, 0.9934 and 0.9861, respectively. This configuration also demonstrated exceptional accuracy, with the lowest recorded RMSE and MAE values of 45.4654 and 23.2959, respectively. [ABSTRACT FROM AUTHOR] (see the sketch after this record)
- Published
- 2024
- Full Text
- View/download PDF
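For context on why the entry above finds elliotsig fast, a minimal Python sketch: the Elliot sigmoid x / (1 + |x|), as implemented in MATLAB's elliotsig, needs no exponential, unlike tansig or logsig, while keeping the same (-1, 1) output range.

```python
import numpy as np

def elliotsig(x: np.ndarray) -> np.ndarray:
    """Elliot sigmoid, x / (1 + |x|): a cheap, exponential-free
    alternative to tansig (tanh) with the same (-1, 1) range."""
    return x / (1.0 + np.abs(x))

x = np.linspace(-6, 6, 7)
print(elliotsig(x))   # saturates toward -1 and 1, more slowly than tanh
print(np.tanh(x))
```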
48. A novel autoencoder modeling method for intelligent assessment of bearing health based on Short-Time Fourier Transform and ensemble strategy.
- Author
-
Hao, Yong, Zhang, Chengxiang, Lu, Yuanhang, Zhang, Long, Lei, Zuxiang, and Li, Zhihao
- Subjects
- *
FOURIER transforms, *FAULT diagnosis, *ROLLER bearings, *DEEP learning, *PYRAMIDS - Abstract
In recent years, deep learning (DL) has developed rapidly in mechanical fault diagnosis. However, when developing rolling-bearing fault-diagnosis DL models, especially for multi-category fault diagnosis, the selection of hyperparameters is critical and difficult. This paper proposed a novel method for the intelligent assessment of bearing health based on the Short-Time Fourier Transform (STFT) and an ensemble strategy, namely Ensemble STFT-multi-AE networks (ESAEs). The method was composed of four steps. First, the vibration signal was converted into three-channel time-frequency maps by the STFT. Second, three multi-autoencoder (multi-AE) networks were established using the three-channel time-frequency maps, and the multi-channel features extracted by the three networks were fused. Third, a series of multi-AE networks was built based on 11 activation functions (AFs) with different characteristics such as output range, monotonicity, and smoothness; a consensus strategy was adopted to integrate the modeling results of each sub-model, and the advantages of each member model were analyzed. Finally, a pyramid grading strategy was applied to the member models with larger weights among the 11 sub-models for a second allocation of weights, eliminating the influence of extreme models on the stability of the ensemble results. The results show that the ESAEs method can effectively and automatically extract deep vibration features of bearings, reducing the parameter-selection steps introduced by AFs. The pyramid grading strategy effectively balances the differences in prediction results among AE models built with different AFs. The ESAEs model achieved the best detection results for locomotive bearing defects, with an accuracy of 99.2%, and it maintains good detection performance and stability in fault diagnosis under data imbalance. • The vibration features extracted from the STFT time-frequency map are more effective. • Multiple autoencoders can effectively extract features from time-frequency maps without relying on complex prior knowledge. • The application of AFs and ensemble strategies improves the accuracy and robustness of bearing health assessment. • The pyramid grading strategy resolves accuracy fluctuations across multiple AFs and simplifies model parameter selection. [ABSTRACT FROM AUTHOR] (see the sketch after this record)
- Published
- 2024
- Full Text
- View/download PDF
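A minimal Python sketch of the first step described above, producing a single-channel time-frequency map from a simulated vibration signal with scipy; the sampling rate, impulse spacing, and window length are illustrative assumptions, and the paper's three-channel construction and autoencoders are not reproduced.

```python
import numpy as np
from scipy.signal import stft

# Simulated bearing vibration: a carrier tone plus fault-like impulses
# and noise (all parameters illustrative).
fs = 12_000                                   # sample rate in Hz
t = np.arange(0, 1.0, 1 / fs)
signal = np.sin(2 * np.pi * 3_000 * t)
signal[::120] += 5.0                          # periodic impacts
signal += 0.5 * np.random.randn(t.size)

# Time-frequency map of the kind fed to the autoencoders above.
f, seg_t, Zxx = stft(signal, fs=fs, nperseg=256)
tf_map = np.abs(Zxx)                          # magnitude spectrogram
print(tf_map.shape)                           # (freq bins, time frames)
```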
49. On Specific Features of an Approach Based on Feedforward Neural Networks to Solve Problems Based on Differential Equations.
- Author
-
Ladygin, S. A., Karachurin, R. N., Ryabov, P. N., and Kudryashov, N. A.
- Subjects
- *
FEEDFORWARD neural networks, *DIFFERENTIAL equations, *PARTIAL differential equations, *PROBLEM solving, *FINITE difference method - Abstract
To date, a multitude of methods have been developed for the numerical solution of problems based on ordinary differential equations (ODEs) and partial differential equations (PDEs). The most common of these are the finite-difference, finite-element, and finite-volume methods. In this study, an alternative numerical approach is implemented, based on the approximation of functions by feedforward neural networks. The solution obtained with this approach is a differentiable analytical expression; in this respect it differs significantly from methods that offer either discrete solutions or solutions with limited differentiability. We examine the influence of neural-network parameters (such as activation functions and the weights in the error function) on the rate of convergence and the accuracy of the obtained approximation for three types of differential equations: ordinary differential equations, integrable partial differential equations, and non-integrable partial differential equations. As model equations, we consider the Korteweg–de Vries and Kudryashov–Sinelshchikov partial differential equations and second-order ordinary differential equations. In each case, the optimal ratios of the weight coefficients are found, and the activation functions most efficient for each problem are determined. [ABSTRACT FROM AUTHOR] (see the sketch after this record)
- Published
- 2023
- Full Text
- View/download PDF
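In the spirit of the entry above, a minimal Python sketch of solving an ODE with a feedforward network: the residual of u'(t) = -u(t) with u(0) = 1 is minimized, so the trained network becomes a differentiable analytical approximation of e^{-t}. The architecture, the weighting of the boundary term, and the optimizer settings are illustrative assumptions, not the paper's.

```python
import torch
import torch.nn as nn

# Fit u(t) ≈ exp(-t) by minimizing the residual of u' = -u with u(0) = 1.
net = nn.Sequential(nn.Linear(1, 32), nn.Tanh(), nn.Linear(32, 1))
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

t = torch.linspace(0.0, 2.0, 64).unsqueeze(1).requires_grad_(True)
for step in range(2000):
    u = net(t)
    du = torch.autograd.grad(u.sum(), t, create_graph=True)[0]
    pde_loss = ((du + u) ** 2).mean()                        # residual u' + u = 0
    bc_loss = ((net(torch.zeros(1, 1)) - 1.0) ** 2).mean()   # u(0) = 1
    loss = pde_loss + bc_loss
    opt.zero_grad()
    loss.backward()
    opt.step()

print(net(torch.tensor([[1.0]])).item())   # roughly exp(-1) ≈ 0.37
```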
50. Comparing Activation Functions in Machine Learning for Finite Element Simulations in Thermomechanical Forming.
- Author
-
Pantalé, Olivier
- Subjects
- *
MACHINE learning, *DATA compression, *EXPONENTIAL functions, *PARAMETER identification, *MODEL validation - Abstract
Finite element (FE) simulations have been effective in modeling thermomechanical forming processes, yet challenges arise when applying them to new materials because of their nonlinear behavior. To address this, machine learning techniques and artificial neural networks play an increasingly vital role in developing complex models. This paper presents an innovative approach to parameter identification in flow laws, utilizing an artificial neural network that learns directly from test data and automatically generates a Fortran subroutine for the Abaqus standard or explicit FE codes. We investigate the impact of activation functions on prediction and computational efficiency by comparing Sigmoid, Tanh, ReLU, Swish, Softplus, and the less common Exponential function. Despite its infrequent use, the Exponential function demonstrates noteworthy performance and reduced computation times. Model validation involves comparing predictive capabilities with experimental data from compression tests, and simulations confirm the implementation in the Abaqus explicit FE code. [ABSTRACT FROM AUTHOR] (see the sketch after this record)
- Published
- 2023
- Full Text
- View/download PDF
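A minimal Python sketch of the kind of timing comparison reported above, using numpy versions of the six activations; absolute times are machine-dependent, and the ordering here is only indicative, not the paper's measurement.

```python
import timeit
import numpy as np

x = np.random.randn(1_000_000)

# Rough per-activation cost comparison on a large array.
candidates = {
    "sigmoid":  lambda v: 1.0 / (1.0 + np.exp(-v)),
    "tanh":     np.tanh,
    "relu":     lambda v: np.maximum(v, 0.0),
    "swish":    lambda v: v / (1.0 + np.exp(-v)),
    "softplus": lambda v: np.log1p(np.exp(v)),
    "exp":      np.exp,
}
for name, fn in candidates.items():
    elapsed = timeit.timeit(lambda: fn(x), number=20)
    print(f"{name:9s} {elapsed:.3f} s")
```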