190 results
Search Results
2. Special issue: Advances in artificial neural networks, machine learning and computational intelligence. Selected papers from the 23rd European Symposium on Artificial Neural Networks (ESANN 2015).
- Author
-
Aiolli, Fabio, Bunte, Kerstin, Hérault, Romain, and Kanevski, Mikhail
- Subjects
- *
ARTIFICIAL neural networks, *MACHINE learning, *COMPUTATIONAL intelligence, *CONFERENCES & conventions, *ARTIFICIAL intelligence - Published
- 2016
- Full Text
- View/download PDF
3. Self-supervised anomaly pattern detection for large scale industrial data.
- Author
-
Tang, Xiaoyue, Zeng, Shan, Yu, Fang, Yu, Wei, Sheng, Zhongyin, and Kang, Zhen
- Subjects
- *
ANOMALY detection (Computer security), *ARTIFICIAL neural networks, *AUTOMATIC speech recognition, *PATTERN recognition systems, *INDUSTRY 4.0, *TIME series analysis, *MACHINE learning - Abstract
Detecting anomalies in large amounts of high-dimensional data is a challenging task. In the Industry 4.0 environment, large-scale high-dimensional monitoring data exhibit complex patterns with high-level semantics. To provide enterprise-wide monitoring solutions, it is necessary to identify the high-level semantic patterns of the anomalies in these data without splitting them. Existing end-to-end deep neural networks for time series can recognize high-level semantics in natural language or speech signals, but they are rarely applied to real-time anomaly detection of industrial data because of their large time cost. In this paper, we leverage the self-supervised contrastive learning methodology and propose a Composite Semantic Augmentation Encoder (CSAE) to provide an appropriate representation of industrial data and to enable quick detection of anomalies in industrial application environments. CSAE is a non-sequential deep neural network with two augmentation layers and a mandatory layer. The two data-augmentation layers expand the number of samples of both low-level and high-level semantic anomalies, which enables CSAE to discover diverse anomalies and improves its accuracy in high-level semantic pattern recognition. The mandatory layer compresses and preserves the temporal information in the industrial data to accelerate anomaly detection. As a non-sequential contrastive learning model, CSAE therefore converges faster in training than the usual sequence models. The experimental results verify that CSAE achieves higher prediction accuracy with less time consumption than existing machine learning models in high-dimensional anomaly pattern detection tasks. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
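The self-supervised contrastive objective that the CSAE entry above leverages can be illustrated with a minimal NT-Xent-style loss in plain Python. This is an editor's sketch of the general technique, not the paper's actual encoder or augmentation layers; all names and values below are illustrative.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def nt_xent(anchor, positive, negatives, tau=0.5):
    """Contrastive loss for one anchor: pull the positive (an augmented view
    of the same sample) close, push the negatives away; tau is temperature."""
    pos = math.exp(cosine(anchor, positive) / tau)
    neg = sum(math.exp(cosine(anchor, n) / tau) for n in negatives)
    return -math.log(pos / (pos + neg))

# An anchor close to its positive and far from its negative yields a low loss;
# the reverse arrangement yields a high loss.
low = nt_xent([1.0, 0.0], [0.9, 0.1], [[-1.0, 0.0]])
high = nt_xent([1.0, 0.0], [-0.9, 0.1], [[1.0, 0.0]])
```

Data augmentation, as in the CSAE abstract, supplies the positive views; the loss then shapes the representation without labels.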
4. Intrusion detection approach based on optimised artificial neural network.
- Author
-
Choraś, Michał and Pawlicki, Marek
- Subjects
- *
ARTIFICIAL neural networks, *MACHINE learning, *BIOLOGICALLY inspired computing - Abstract
Intrusion detection, the ability to detect malware and other attacks, is a crucial aspect of ensuring cybersecurity, as is the ability to identify this myriad of attacks. Artificial Neural Networks (like other bio-inspired machine learning approaches) are an established and proven method of accurate classification. ANNs are extremely versatile: a wide range of setups can achieve significantly different classification results. The main objective and contribution of this paper is an evaluation of how hyperparameters can influence the final classification result. A wide range of ANN setups is compared. We performed our experiments on two benchmark datasets, NSL-KDD and CICIDS2017. The most effective arrangement achieves a multi-class classification accuracy of 99.909% on an established benchmark dataset. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
5. Predicting energy cost of public buildings by artificial neural networks, CART, and random forest.
- Author
-
Zekić-Sušac, Marijana, Has, Adela, and Knežević, Marinela
- Subjects
- *
ARTIFICIAL neural networks, *RANDOM forest algorithms, *PUBLIC buildings, *CONSTRUCTION cost estimates, *REGRESSION trees, *MACHINE learning, *BUILDING repair - Abstract
• ANN, CART, and RF regression trees have shown potential in modeling energy cost.
• Three different variable-selection strategies were tested and compared.
• Machine learning combined with the Boruta method produced the highest prediction accuracy.
• The model extracted heating and occupational data as the most important.
• The created model could be used to assess the concept of smart buildings and cities.
The paper deals with modeling the cost of energy consumed in public buildings using three machine learning methods: artificial neural networks, CART, and random forest regression trees. Energy consumption is one of the major issues in global and national policies; scientific efforts to create prediction models of energy consumption and cost are therefore highly important. One of the largest energy consumers in every state is its public sector, consisting of educational, health, public administration, military, and other types of public buildings. Recent technologies based on sensor networks and Big Data platforms enable the collection of large amounts of data that can be used to analyze energy consumption and cost. Real data from the Croatian public sector are used in this paper, including a large number of constructional, energetic, occupational, climate, and other attributes. Algorithms for data pre-processing and for modeling by optimizing parameters are suggested. Three strategies were tested: (1) with all available variables, (2) with filter-based variable selection, and (3) with wrapper-based variable selection that integrates the Boruta algorithm and random forest. Prediction models of energy cost are created using two approaches: (a) comparative usage of artificial neural networks and two types of regression trees, CART and random forest, and (b) integration of RF-Boruta variable selection with machine learning methods for prediction. A cross-validation procedure was used to optimize the artificial neural network and regression tree topology, as well as to select the most appropriate activation function. Along with creating a prediction model, the aim of the paper was also to extract the relevant predictors of energy cost in public buildings, which are important in planning the construction or renovation of buildings. The results show that the second approach, which integrates machine learning with the Boruta method and uses the random forest algorithm for both variable reduction and prediction modeling, produced higher prediction accuracy than the individual usage of the three machine learning methods. These findings confirm the potential of the hybrid machine learning methods suggested in previous research, but in favor of the random forest method over CART and artificial neural networks. Regarding variable selection, the model extracted heating and occupational data as the most important, followed by constructional, cooling, electricity, and lighting attributes. The model could be implemented in public buildings' information systems and their IoT networks within the concept of smart buildings and smart cities. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
6. Solve routing problems with a residual edge-graph attention neural network.
- Author
-
Lei, Kun, Guo, Peng, Wang, Yi, Wu, Xiao, and Zhao, Wenchao
- Subjects
- *
ARTIFICIAL neural networks, *PROBLEM solving, *REINFORCEMENT learning, *DEEP learning, *COMBINATORIAL optimization, *POLYNOMIAL time algorithms, *MACHINE learning - Abstract
For NP-hard combinatorial optimization problems, it is usually challenging to find high-quality solutions in polynomial time. Designing either an exact or an approximate algorithm for these problems often requires significant specialized knowledge. Recently, deep learning methods have provided new directions for solving such problems. In this paper, an end-to-end deep reinforcement learning framework is proposed to solve this type of combinatorial optimization problem. The framework can be applied to different problems with only slight changes to the input, masks, and decoder context vectors. It aims to improve on models in the literature in terms of both the neural network model and the training algorithm. The solution quality for the TSP and the CVRP with up to 100 nodes is significantly improved by our framework. Compared with the best results of the state-of-the-art methods, the average optimality gap is reduced from 4.53% to 3.67% for the TSP with 100 nodes and from 7.34% to 6.68% for the CVRP with 100 nodes when using the greedy decoding strategy. Besides, the proposed framework can solve a multi-depot CVRP case without any structural modification. Furthermore, our framework uses about 1/3 to 3/4 of the training samples of other existing learning methods while achieving better results. Results on randomly generated instances and on benchmark instances from TSPLIB and CVRPLIB confirm that our framework has a running time linear in the problem size (number of nodes) during the training and testing phases and generalizes well from training on random instances to testing on real-world instances. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
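The "greedy decoding strategy" mentioned in the routing entry above can be sketched for the TSP: at each step, commit to the locally best next node. In the paper a learned policy scores the candidates; here, as an editor's stand-in assumption, plain Euclidean distance plays that role.

```python
import math

def greedy_tour(coords, start=0):
    """Greedy decoding for the TSP: from the current node, always move to
    the nearest unvisited node. A learned policy would replace this distance
    heuristic; the decoding loop is the same idea."""
    unvisited = set(range(len(coords))) - {start}
    tour = [start]
    while unvisited:
        cur = coords[tour[-1]]
        # sorted() makes tie-breaking deterministic (lowest index wins)
        nxt = min(sorted(unvisited), key=lambda i: math.dist(cur, coords[i]))
        tour.append(nxt)
        unvisited.remove(nxt)
    return tour

def tour_length(coords, tour):
    """Total length of the closed tour."""
    return sum(math.dist(coords[tour[i]], coords[tour[(i + 1) % len(tour)]])
               for i in range(len(tour)))

square = [(0, 0), (0, 1), (1, 1), (1, 0)]
tour = greedy_tour(square)  # visits the square's corners in order
```

The optimality gaps quoted in the abstract measure how far such a decoded tour is from the best known solution.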
7. Attractive and repulsive training to address inter-task forgetting issues in continual learning.
- Author
-
Choi, Hong-Jun and Choi, Dong-Wan
- Subjects
- *
MACHINE learning, *DEEP learning, *ARTIFICIAL neural networks, *EMPIRICAL research - Abstract
In continual learning over deep neural networks (DNNs), the rehearsal strategy, in which previous exemplars are jointly trained with new samples, is commonly employed to address catastrophic forgetting. Unfortunately, due to the memory limit, rehearsal-based techniques inevitably cause a class-imbalance issue, leading to a DNN biased toward new tasks, which have more samples. Existing works mostly focus on correcting such a bias in the fully connected layer or classifier. In this paper, we newly discover that class imbalance tends to make old classes even more highly correlated with their similar new classes in the feature space, which turns out to be the major reason behind catastrophic forgetting, called inter-task forgetting. To alleviate inter-task forgetting, we propose a novel class-incremental learning method, called attractive & repulsive training (ART), which effectively captures the previous feature space in a set of class-wise flags, and thereby makes similar old and new classes less correlated in the new feature space. In our empirical study, ART is observed to be quite effective at improving the performance of state-of-the-art methods by substantially mitigating inter-task forgetting. Our implementation is available at: https://github.com/bigdata-inha/ART/. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
8. Predicting short-term next-active-object through visual attention and hand position.
- Author
-
Jiang, Jingjing, Nan, Zhixiong, Chen, Hui, Chen, Shitao, and Zheng, Nanning
- Subjects
- *
ARTIFICIAL neural networks, *HUMAN-robot interaction, *HAND, *DISTRIBUTION (Probability theory), *MACHINE learning - Abstract
Human intention prediction is of great significance in many applications, such as human-robot interaction and intelligent rehabilitation robots. This paper studies the problem of short-term next-active-object prediction in egocentric images. The short-term next-active-object is the object that a human is going to interact with in the near future, which is an embodiment of human intention. Most current methods use object-centered cues, such as the deviation of object appearance change and the unique shape of the egocentric object trajectory, to predict the next-active-object. In this paper, inspired by the fact that human intention is also revealed by human-centered cues, we propose a deep neural network model that integrates cues from visual attention and hand positions to predict the next-active-object. First, probability maps of visual attention and hand positions are constructed, and then the probability distribution of the next-active-object is generated. We experimentally compare our method with several baseline methods on two datasets and confirm its effectiveness. In addition, ablation experiments are conducted, and crucial points concerning the next-active-object are discussed. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
9. Designing efficient convolutional neural network structure: A survey.
- Author
-
Mi, Jian-Xun, Feng, Jie, and Huang, Ke-Yang
- Subjects
- *
CONVOLUTIONAL neural networks, *ARTIFICIAL neural networks, *DEEP learning, *MACHINE learning, *STRUCTURAL design - Abstract
As a powerful machine learning method, deep learning has attracted the attention of numerous researchers. While high-performance neural network models are being explored, their floating-point operation counts are also increasing. In recent years, many researchers have noticed that efficiency is also one of the important indicators for measuring the quality of neural network models, and efficient models are easier to deploy on mobile and embedded devices. Therefore, efficient neural network models have become a hot research topic. In this paper, we review recent methods for the structural design of efficient convolutional neural networks. According to their characteristics, we divide these methods into three categories: model pruning, efficient architecture, and neural architecture search. Detailed analyses of each method are presented to demonstrate its advantages and disadvantages. We then compare them comprehensively and propose many suggestions for designing efficient convolutional neural network model structures. Inspired by these suggestions, we built a new efficient neural network model, SharedNet, which obtains the best accuracy among manually designed efficient CNN models on the ImageNet dataset. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
10. Artificial intelligence within the interplay between natural and artificial computation: Advances in data science, trends and applications.
- Author
-
Górriz, Juan M., Ramírez, Javier, Ortíz, Andrés, Martínez-Murcia, Francisco J., Segovia, Fermin, Suckling, John, Leming, Matthew, Zhang, Yu-Dong, Álvarez-Sánchez, Jose Ramón, Bologna, Guido, Bonomini, Paula, Casado, Fernando E., Charte, David, Charte, Francisco, Contreras, Ricardo, Cuesta-Infante, Alfredo, Duro, Richard J., Fernández-Caballero, Antonio, Fernández-Jover, Eduardo, and Gómez-Vilda, Pedro
- Subjects
- *
ARTIFICIAL intelligence, *DATA science, *BRAIN-computer interfaces, *MACHINE learning, *ARTIFICIAL neural networks, *COMPUTER interfaces - Abstract
Artificial intelligence and all its supporting tools, e.g. machine and deep learning in computational intelligence-based systems, are rebuilding our society (economy, education, life-style, etc.) and promising a new era for the social welfare state. In this paper we summarize recent advances in data science and artificial intelligence within the interplay between natural and artificial computation. Recent works published in the latter field and the state of the art are reviewed in a comprehensive and self-contained way to provide a baseline framework for the international community in artificial intelligence. Moreover, this paper provides an analysis and discussion of current trends and insights in several theoretical and application fields, from theoretical models in artificial intelligence and machine learning to the most prospective applications in robotics, neuroscience, brain-computer interfaces, medicine, and society in general. [ABSTRACT FROM AUTHOR]
- Published
- 2020
- Full Text
- View/download PDF
11. Text classification using capsules.
- Author
-
Kim, Jaeyoung, Jang, Sion, Park, Eunjeong, and Choi, Sungchul
- Subjects
- *
ARTIFICIAL neural networks, *CLASSIFICATION, *COMPUTATIONAL complexity - Abstract
This paper presents an empirical exploration of the use of capsule networks for text classification. While capsule networks have been shown to be effective for image classification, research on their validity in the text domain has only recently begun. In this paper, we show that capsule networks indeed have potential for text classification and that they have several advantages over convolutional neural networks. We also compare our proposed model to the initial studies on capsule network-based text classification. We further suggest a simple routing method that effectively reduces the computational complexity of dynamic routing. We use seven benchmark datasets to demonstrate that capsule networks, along with the proposed routing method, provide comparable results. [ABSTRACT FROM AUTHOR]
- Published
- 2020
- Full Text
- View/download PDF
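For readers unfamiliar with the dynamic routing whose cost the capsule entry above reduces, here is a minimal sketch of the standard squash non-linearity and routing-by-agreement for a single output capsule. This illustrates the baseline mechanism only; the paper's proposed simplified routing method is not reproduced here.

```python
import math

def squash(s):
    """Capsule squashing non-linearity: preserves direction, maps the
    vector's length into [0, 1)."""
    norm = math.sqrt(sum(x * x for x in s))
    if norm == 0.0:
        return [0.0] * len(s)
    scale = (norm * norm) / (1.0 + norm * norm) / norm
    return [scale * x for x in s]

def route(predictions, iterations=3):
    """Simplified dynamic routing for ONE output capsule: agreement (dot
    product) between the current output and each prediction vector updates
    that prediction's routing logit; coupling is a softmax over logits."""
    logits = [0.0] * len(predictions)
    for _ in range(iterations):
        exps = [math.exp(b) for b in logits]
        total = sum(exps)
        coupling = [e / total for e in exps]
        s = [sum(c * u[d] for c, u in zip(coupling, predictions))
             for d in range(len(predictions[0]))]
        v = squash(s)
        logits = [b + sum(vi * ui for vi, ui in zip(v, u))
                  for b, u in zip(logits, predictions)]
    return v

# Two agreeing predictions come to dominate one outlier after a few iterations.
out = route([[1.0, 0.0], [0.9, 0.1], [-1.0, 0.0]])
```

Each routing iteration recomputes the softmax and the weighted sum, which is exactly the per-iteration cost a cheaper routing scheme tries to cut.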
12. On the ideal number of groups for isometric gradient propagation.
- Author
-
Kim, Bum Jun, Choi, Hyeyeon, Jang, Hyeonah, and Kim, Sang Woo
- Subjects
- *
ARTIFICIAL neural networks - Abstract
Recently, various normalization layers have been proposed to stabilize the training of deep neural networks. Among them, group normalization generalizes layer normalization and instance normalization by allowing a degree of freedom in the number of groups it uses. However, determining the optimal number of groups requires trial-and-error hyperparameter tuning, and such experiments are time-consuming. In this study, we discuss a principled method for setting the number of groups. First, we find that the number of groups influences the gradient behavior of the group normalization layer. Based on this observation, we derive the ideal number of groups, which calibrates the gradient scale to facilitate gradient descent optimization. This paper is the first to propose an optimal number of groups that is theoretically grounded, architecture-aware, and able to provide a proper value layer-wise for all layers. The proposed method exhibited improved performance over existing methods across numerous neural network architectures, tasks, and datasets.
• We propose a method to determine the number of groups in group normalization.
• A theoretical analysis of group normalization with activation functions is provided.
• The proposed method is validated on various practical problems. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
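The degree of freedom discussed in the group-normalization entry above is easy to see in code: the number of groups decides how the channels are partitioned before standardization. The sketch below (an editor's illustration with gamma = 1, beta = 0, one spatial position per channel) shows the two limiting cases; the paper's formula for the ideal group count is not reproduced.

```python
import math

def group_norm(channels, num_groups, eps=1e-5):
    """Group normalization over a list of per-channel values: partition the
    channels into num_groups contiguous groups and standardize each group
    independently."""
    assert len(channels) % num_groups == 0
    size = len(channels) // num_groups
    out = []
    for g in range(num_groups):
        group = channels[g * size:(g + 1) * size]
        mean = sum(group) / size
        var = sum((x - mean) ** 2 for x in group) / size
        out.extend((x - mean) / math.sqrt(var + eps) for x in group)
    return out

# num_groups = 1 behaves like layer normalization over the channels;
# num_groups = len(channels) behaves like per-channel instance normalization.
x = [1.0, 2.0, 3.0, 4.0]
ln_like = group_norm(x, 1)
```

Between those two extremes lies the group count the paper derives analytically rather than by trial and error.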
13. Distributed Bayesian optimisation framework for deep neuroevolution.
- Author
-
Chandra, Rohitash and Tiwari, Animesh
- Subjects
- *
ARTIFICIAL neural networks, *DEEP learning, *CONVOLUTIONAL neural networks, *DISTRIBUTED computing, *REINFORCEMENT learning, *MACHINE learning - Abstract
Neuroevolution is a machine learning method for evolving neural network parameters and topology, with a high degree of flexibility that makes it applicable to a wide range of architectures. Neuroevolution has been popular in reinforcement learning and has also been shown to be promising for deep learning. A major limitation of neuroevolution is the high computational time required for convergence, since learning (evolution) typically does not use gradient information. Bayesian optimisation, also known as surrogate-assisted optimisation, has been popular for expensive engineering optimisation problems and hyper-parameter tuning in machine learning. Its major feature is reducing the computational load by approximating the actual model with an acquisition function (surrogate model) that is computationally cheaper, which gives it potential for training deep learning models via neuroevolution given large datasets and complex models. Recent advances in parallel and distributed computing have enabled efficient implementation of neuroevolution for complex and computationally expensive neural models. In this paper, we present a Bayesian optimisation framework for deep neuroevolution that uses a distributed architecture to provide computational efficiency in training. Our results are promising for models ranging from simple to deep neural networks such as convolutional neural networks, which motivates further applications. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
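The surrogate-assisted idea in the neuroevolution entry above hinges on an acquisition function that is cheap to evaluate. As an editor's illustration of that general concept (the paper's specific acquisition function and distributed setup are not reproduced), here is expected improvement, a standard choice, for a maximization problem.

```python
import math

def norm_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def norm_cdf(z):
    """Standard normal cumulative distribution, via erf."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def expected_improvement(mu, sigma, best, xi=0.01):
    """Expected improvement: how much a candidate with surrogate prediction
    (mean mu, std sigma) is expected to beat the best value seen so far."""
    if sigma == 0.0:
        return max(mu - best - xi, 0.0)
    z = (mu - best - xi) / sigma
    return (mu - best - xi) * norm_cdf(z) + sigma * norm_pdf(z)

# A candidate with higher predicted mean and more uncertainty scores higher.
a = expected_improvement(mu=1.2, sigma=0.5, best=1.0)
b = expected_improvement(mu=0.9, sigma=0.1, best=1.0)
```

Evaluating this function costs a few arithmetic operations, versus a full training run for the true objective, which is the computational saving the abstract describes.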
14. Self-organizing radial basis function neural network using accelerated second-order learning algorithm.
- Author
-
Han, Hong-Gui, Ma, Miao-Li, Yang, Hong-Yan, and Qiao, Jun-Fei
- Subjects
- *
RADIAL basis functions, *MACHINE learning, *ARTIFICIAL neural networks, *ALGORITHMS - Abstract
Gradient-based algorithms are commonly used for training radial basis function neural networks (RBFNNs). However, it is still difficult to avoid vanishing gradients and improve learning performance in the training process. For this reason, an accelerated second-order learning (ASOL) algorithm is developed in this paper to train RBFNNs. First, an adaptive expansion and pruning mechanism (AEPM) for the gradient space, based on the integrity and orthogonality of hidden neurons, is designed: effective gradient information is constantly added to the gradient space, and redundant gradient information is eliminated from it. Second, with AEPM, neurons are generated or pruned accordingly. In this way, a self-organizing RBFNN (SORBFNN) is obtained that reduces structural complexity and improves generalization ability, and both the structure and the parameters can be optimized in the learning process by the proposed ASOL-based SORBFNN (ASOL-SORBFNN). Third, theoretical analyses are given of the efficiency of the proposed AEPM in avoiding vanishing gradients and of the stability of SORBFNN during structural adjustment, guaranteeing the successful application of the proposed ASOL-SORBFNN. Finally, several experimental studies illustrate the advantages of the proposed ASOL-SORBFNN. Compared with other existing approaches, the results show that ASOL-SORBFNN performs well in terms of both learning speed and prediction accuracy. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
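As background for the RBFNN entry above: the hidden layer of a radial basis function network responds to how close the input is to each neuron's center. This is a minimal textbook sketch with Gaussian units and a shared width, not the paper's self-organizing structure; center values below are illustrative.

```python
import math

def rbf_layer(x, centers, width=1.0):
    """Hidden-layer output of an RBF network: one Gaussian unit per center,
    responding most strongly when the input lies near that center."""
    out = []
    for c in centers:
        sq = sum((xi - ci) ** 2 for xi, ci in zip(x, c))
        out.append(math.exp(-sq / (2.0 * width ** 2)))
    return out

centers = [[0.0, 0.0], [1.0, 1.0]]
h = rbf_layer([0.0, 0.0], centers)  # first unit fires strongest
```

The structure-learning question the paper tackles is how many such centers to keep and where, which its AEPM handles by adding and pruning neurons.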
15. Time-frequency deep metric learning for multivariate time series classification.
- Author
-
Chen, Zhi, Liu, Yongguo, Zhu, Jiajing, Zhang, Yun, Jin, Rongjiang, He, Xia, Tao, Jing, and Chen, Lidian
- Subjects
- *
DEEP learning, *ARTIFICIAL neural networks, *CONVOLUTIONAL neural networks, *MACHINE learning, *POWER (Social sciences), *CLASSIFICATION - Abstract
• A time–frequency deep metric learning model is proposed for MTS classification.
• A consistency regularizer is designed to capture the correlations among levels.
• An effective optimization strategy is developed to train the model hierarchically.
• Results on 18 MTS datasets show that our method outperforms other methods.
Multivariate time series (MTS) data exist in various fields of study, and MTS classification is an important research topic in the machine learning community. Researchers have proposed many MTS classification models over the years, and distance-based methods combined with a nearest-neighbor classifier achieve good performance. However, current methods mainly focus on defining a distance metric in the time domain of MTS and ignore frequency information. Besides, these methods usually define the same linear distance metric for different datasets, which is not suitable for capturing the nonlinear relationships in MTS and degrades the discriminative power of the distance metric. In this paper, we propose a time–frequency deep metric learning (TFDM) approach for MTS classification. Multilevel discrete wavelet decomposition is first adopted to decompose an MTS into a group of sub-MTS so as to extract multilevel time–frequency representations. Then, a deep convolutional neural network is developed for each level to learn level-specific nonlinear features, and a metric learning layer is added on top of the network to learn the semantic similarity of MTS. Moreover, a cross-level consistency regularization term is designed to encourage the distance metrics of different levels to be consistent, capturing the correlations among levels. Finally, we use a 1-nearest-neighbor classifier to classify MTS according to the learned distance metrics. Extensive experiments on 18 benchmark datasets show the effectiveness of our approach. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
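The multilevel discrete wavelet decomposition used in the TFDM entry above can be sketched with the simplest wavelet, the Haar: each level splits a signal into pairwise averages (approximation) and differences (detail). This is an illustration per univariate channel under an assumed Haar basis; the paper may use a different wavelet family.

```python
def haar_level(signal):
    """One Haar wavelet step: pairwise averages (approximation) and
    pairwise differences (detail). Assumes even length."""
    half = len(signal) // 2
    approx = [(signal[2 * i] + signal[2 * i + 1]) / 2 for i in range(half)]
    detail = [(signal[2 * i] - signal[2 * i + 1]) / 2 for i in range(half)]
    return approx, detail

def multilevel(signal, levels):
    """Multilevel decomposition: repeatedly split the approximation,
    collecting the detail coefficients of each level."""
    details = []
    approx = signal
    for _ in range(levels):
        approx, d = haar_level(approx)
        details.append(d)
    return approx, details

approx, details = multilevel([4.0, 2.0, 1.0, 3.0], 2)
```

Each level's coefficients form one of the "sub-MTS" representations the abstract mentions, with coarser levels carrying lower-frequency content.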
16. Neuromorphic extreme learning machines with bimodal memristive synapses.
- Author
-
Dong, Zhekang, Sing Lai, Chun, Zhang, Zhaowei, Qi, Donglian, Gao, Mingyu, and Duan, Shukai
- Subjects
- *
MACHINE learning, *ARTIFICIAL neural networks, *COMPUTER architecture, *HIGH resolution imaging, *SYNAPSES - Abstract
Biology-inspired intelligent computing systems implemented in neuromorphic hardware are useful for high-speed parallel information processing. However, the traditional Von Neumann computer architecture and unsatisfactory signal transmission approaches have jointly limited the overall performance of such hardware implementations. In this paper, a compact extreme learning machine (ELM) architecture is presented, synthesized from a spintronic memristor-based synaptic circuit, a biasing circuit, and an activation function circuit. Notably, due to the threshold characteristic of the memristive device, the synaptic circuit has a bimodal behavior: it can provide both constant and adjustable network weights between adjacent layers of the ELM. Furthermore, two major limitations (process variations and the sneak-path issue) are taken into account in a detailed robustness analysis of the whole network. Finally, the entire scheme is verified with case studies in single-image super-resolution (SR) reconstruction. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
17. A joint learning model for click-through prediction in display advertising.
- Author
-
Liu, Mengjuan, Cai, Shijia, Lai, Zhi, Qiu, Lizhou, Hu, Zhengning, and Ding, Yi
- Subjects
- *
DISPLAY advertising, *PREDICTION models, *MACHINE learning, *ARTIFICIAL neural networks, *LOGISTIC regression analysis - Abstract
Click-through rate (CTR) prediction is essential for targeted advertising and recommendation. At present, machine learning models are widely used to build CTR estimators, including logistic regression (LR), factorization machines (FM), and deep neural networks (DNN). Unfortunately, these models adopt a single structure that considers only either low-order feature interactions (such as LR and FM) or high-order feature interactions (such as DNN), which is not sufficient for CTR prediction. Therefore, joint learning models such as Wide & Deep and DeepFM have been proposed, which exploit both high- and low-order feature interactions to predict CTR by combining two different models. In this paper, we first analyze the structures and performance of typical CTR prediction models, and then summarize the general form and design rules of CTR estimators. Based on the general form, we further design a new joint learning model that combines two different residual networks to explore feature interactions automatically. Compared with the widely adopted feed-forward neural network, the residual network is more capable of exploring complex feature interactions at different layers. Additionally, we introduce a neural attention network to learn the importance of each second-order feature interaction across fields. Finally, we evaluate the prediction performance of the proposed model on two real-world datasets (Criteo and Avazu) in terms of LogLoss and AUC metrics. Extensive experimental results demonstrate that our model outperforms the state-of-the-art baselines. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
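The low-order vs. high-order distinction drawn in the CTR entry above can be made concrete with a toy estimator: a linear term plus explicit second-order interactions, squashed into a click probability. This is an editor's sketch of the "wide + cross" idea only; the paper combines two residual networks instead, and all weights below are illustrative.

```python
import math

def sigmoid(z):
    """Logistic function, mapping a real score to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

def joint_ctr(x, linear_w, pair_w, bias=0.0):
    """Toy joint CTR score: a linear (low-order) term plus weighted
    second-order feature interactions, then a sigmoid."""
    low = sum(w * xi for w, xi in zip(linear_w, x))
    high = sum(w * x[i] * x[j] for (i, j), w in pair_w.items())
    return sigmoid(bias + low + high)

x = [1.0, 0.0, 1.0]  # one-hot-style feature vector
p = joint_ctr(x, [0.2, -0.1, 0.3], {(0, 2): 0.5})
```

Models such as FM learn the pairwise weights from factorized embeddings, and DNN-style components replace the explicit pair table with learned higher-order crosses.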
18. High-parallelism Inception-like Spiking Neural Networks for Unsupervised Feature Learning.
- Author
-
Meng, Mingyuan, Yang, Xingyu, Bi, Lei, Kim, Jinman, Xiao, Shanlin, and Yu, Zhiyi
- Subjects
- *
ARTIFICIAL neural networks, *MACHINE learning, *NEUROPLASTICITY - Abstract
Spiking Neural Networks (SNNs) are brain-inspired, event-driven machine learning algorithms that are widely recognized for enabling ultra-high-energy-efficiency hardware. Among existing SNNs, unsupervised SNNs based on synaptic plasticity, especially Spike-Timing-Dependent Plasticity (STDP), are considered to have great potential for imitating the learning process of the biological brain. Nevertheless, existing STDP-based SNNs are limited by constrained learning capability and/or slow learning speed: most adopt a slow-learning Fully-Connected (FC) architecture and use a sub-optimal vote-based scheme for spike decoding. In this paper, we overcome these limitations with: 1) a high-parallelism network architecture, inspired by the Inception module in Artificial Neural Networks (ANNs); 2) a Vote-for-All (VFA) decoding layer that replaces the standard vote-based spike decoding scheme to reduce information loss in spike decoding; and 3) a proposed adaptive repolarization (resetting) mechanism that accelerates SNN learning by enhancing spiking activity. Our experimental results on two established benchmark datasets (MNIST/EMNIST) show that our network architecture achieves superior performance compared to the widely used FC architecture and a more advanced Locally-Connected (LC) architecture, and that our SNN achieves results competitive with state-of-the-art unsupervised SNNs (95.64%/80.11% accuracy on the MNIST/EMNIST datasets) while having superior learning efficiency and robustness against hardware damage. Our SNN achieves high classification accuracy with only hundreds of training iterations, and random destruction of large numbers of synapses or neurons leads only to negligible performance degradation. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
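For context on the STDP rule central to the SNN entry above: the weight change depends on the relative timing of pre- and post-synaptic spikes. Below is the standard pair-based form with exponential windows; the amplitudes and time constant are illustrative, and the paper's adaptive repolarization mechanism is not reproduced.

```python
import math

def stdp_dw(dt, a_plus=0.05, a_minus=0.055, tau=20.0):
    """Pair-based STDP weight update, dt = t_post - t_pre (ms).
    Pre-before-post (dt > 0) potentiates the synapse; post-before-pre
    (dt < 0) depresses it, each decaying exponentially in |dt|."""
    if dt > 0:
        return a_plus * math.exp(-dt / tau)
    if dt < 0:
        return -a_minus * math.exp(dt / tau)
    return 0.0

ltp = stdp_dw(5.0)   # causal pairing: positive weight change
ltd = stdp_dw(-5.0)  # anti-causal pairing: negative weight change
```

Because the rule needs only local spike times, it trains without labels, which is why STDP-based SNNs are classed as unsupervised in the abstract.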
19. Residual attention and other aspects module for aspect-based sentiment analysis.
- Author
-
Wu, Chao, Xiong, Qingyu, Yang, Zhengyi, Gao, Min, Li, Qiude, Yu, Yang, Wang, Kaige, and Zhu, Qiwu
- Subjects
- *
SENTIMENT analysis, *ARTIFICIAL neural networks, *TASK analysis, *MACHINE learning - Abstract
• Residual attention mechanisms adjust the flow of sentiment information.
• The other-aspect-terms module reduces the effect of other aspect terms' sentiment.
• Experimental results show our model achieves state-of-the-art performance.
Aspect-based sentiment analysis (ABSA) is a fine-grained sentiment analysis task designed to predict the sentiment polarity of each aspect term in a text. Recent research mainly uses neural networks to model text and attention mechanisms to associate aspect terms with their context to obtain more effective feature representations. However, a general attention mechanism easily loses the original information. Besides, in multi-aspect text, the sentiment information of other aspect terms is likely to interfere with the sentiment analysis of the current aspect term. In this paper, we propose two models for ABSA tasks, named RA-CNN and RAO-CNN. In RA-CNN, we apply a CNN to model the aspect term and use a specially designed residual attention mechanism to interact with the text. Building on RA-CNN, RAO-CNN adds an other-aspect-terms module, which reduces the interference of sentiment information from other aspect terms in multi-aspect text. To verify the proposed models' effectiveness, we conduct a large number of experiments and comparisons on seven public datasets. Experimental results show that our proposed models are useful and achieve state-of-the-art results. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
20. A fuzzy process neural network model and its application in process signal classification.
- Author
-
Xu, Shaohua, Liu, Kun, and Li, Xuegui
- Subjects
- *
ARTIFICIAL neural networks , *CLASSIFICATION , *FUZZY systems , *INFORMATION organization , *MACHINE learning - Abstract
Abstract Aiming at process signal classification combined with fuzzy decision rules, a fuzzy process neural network (FPNN) is proposed in this paper. The FPNN is structured with a process signal input layer, a fuzzy process neuron (FPN) hidden layer, a signal pattern layer, and a fuzzy decision output layer. The spatio-temporal aggregation in the FPN is formulated as a generalized inner product operation, which can measure the fuzzy similarity of the distributed features among process signals, and the FPN uses an exponential fuzzy membership function as its activation function. The FPNN selectively sums the output of the FPN hidden layer into the pattern layer according to the category of the input signal. By attaching a Takagi-Sugeno fuzzy classifier behind the pattern layer, direct classification of the process signals is achieved. The FPNN addresses deficiencies of existing time-varying signal classification methods, such as the requirement for complete training data sets and complicated information processing procedures and algorithms. In this paper, the theoretical properties of the FPNN are analyzed, and a comprehensive learning algorithm for FPNNs is given. The discrimination of reservoir water flooding conditions based on multi-channel well logging process signals is used as an example for experimental analysis; the results verify the validity of the model and algorithm. [ABSTRACT FROM AUTHOR]
- Published
- 2019
- Full Text
- View/download PDF
21. Impulsive generalized high-order recurrent neural networks with mixed delays: Stability and periodicity.
- Author
-
Aouiti, Chaouki, M'hamdi, Mohammed Salah, Chérif, Farouk, and Alimi, Adel M.
- Subjects
- *
ARTIFICIAL neural networks , *GRONWALL inequalities , *TIME-varying systems , *COMPUTER algorithms , *MACHINE learning - Abstract
Highlights • Impulsive generalized high-order recurrent neural networks with time-varying coefficients and continuously distributed delays are studied. • The existence and global exponential stability of piecewise weighted pseudo almost-periodic solutions are established. • An illustrative example is given to demonstrate the effectiveness of our results. Abstract In this paper, by employing a fixed point theorem, a generalized Gronwall–Bellman inequality, and differential inequality techniques, some sufficient conditions are given for the existence and exponential stability of the unique piecewise weighted pseudo almost-periodic solution of impulsive high-order recurrent neural networks with time-varying coefficients and mixed delays. An illustrative example is given at the end of this paper to show the effectiveness of our results. [ABSTRACT FROM AUTHOR]
- Published
- 2018
- Full Text
- View/download PDF
22. Supervised or unsupervised learning? Investigating the role of pattern recognition assumptions in the success of binary predictive prescriptions.
- Author
-
Jafari-Marandi, Ruholla
- Subjects
- *
ARTIFICIAL neural networks , *MACHINE learning , *PATTERN recognition systems , *MEDICAL prescriptions , *CLASSIFICATION algorithms , *SUPERVISED learning - Abstract
Machine learning (ML) employs classification algorithms such as artificial neural networks to make automated decisions. While the proposed solutions have made significant contributions toward improving decision-making efficiency, their effectiveness in serving our multifaceted and complex society has recently come under question. This paper attempts to theorize an answer to why ML lacks effectiveness by studying the assumptions that are made to facilitate efficient pattern recognition. Specifically, this study recognizes five assumptions and investigates their influence on the effectiveness of decision-making for three well-known case studies. The results suggest that including the assumptions needed for metric-optimizing supervised learning is justifiable, and leads to decision-making effectiveness, only for cases in which a fair and equitable definition of success can be formulated as an objective function. In contrast, the results show that using unsupervised learning or non-metric-optimizing supervised learning leads to a more reasonable balance of effectiveness and efficiency when the formulation of a fair and equitable definition of success is not possible. Moreover, the results demonstrate that current ML approaches that employ supervised learning can improve their efficacy by rethinking the assumptions made at the pattern recognition stage. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
23. MIDPhyNet: Memorized infusion of decomposed physics in neural networks to model dynamic systems.
- Author
-
Zhang, Zhibo, Rai, Rahul, Chowdhury, Souma, and Doermann, David
- Subjects
- *
ARTIFICIAL neural networks , *DYNAMICAL systems , *DYNAMIC models , *PHYSICS , *MACHINE learning - Abstract
Integrating simplified or partial physics models with data-driven machine learning models is an emerging concept aimed at improving the generalizability and extrapolability of complex system behavior predictions. In this paper, we introduce a novel machine-learning-based fusion model, MIDPhyNet, that decomposes, memorizes, and integrates first-principle physics-based information with data-driven models. In MIDPhyNet, the output of the partial physics model is decomposed into Intrinsic Mode Functions (IMFs), which are then infused into a Memorization Unit to generate embedded vectors. A Prediction Unit synthesizes all of the data to generate prediction results. We test the performance of MIDPhyNet on modeling the behavior of dynamic systems such as an inverted pendulum under wind drag. The results clearly demonstrate the performance benefits of our hybrid architecture over both purely data-driven models and state-of-the-art hybrid models in terms of generalizability and extrapolability. The MIDPhyNet architecture's superiority is most significant when the models are trained on sparse data sets, and in general, MIDPhyNet provides a generic way to explore how physical information can be infused into data-driven models. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
24. Computationally efficient neural hybrid automaton framework for learning complex dynamics.
- Author
-
Wang, Tao, Yang, Yejiang, and Xiang, Weiming
- Subjects
- *
ARTIFICIAL neural networks , *ROBOTS , *HYBRID systems , *MOTION , *INTERVAL analysis , *MACHINE learning , *LIMIT cycles , *DYNAMICAL systems - Abstract
This paper proposes a computationally efficient and effective data-driven modeling framework for dynamical systems. The proposed modeling framework employs a collection of shallow neural networks known as Extreme Learning Machines (ELMs) to model local system behaviors, along with data-driven inferred transitions among local models, to establish a neural hybrid automaton model. First, the sampled system inputs are mapped to the corresponding feature spaces to obtain data-driven partitions, which subsequently define the transitions and invariants of the neural hybrid automaton model through a novel data-driven mode clustering process. Then, a collection of ELMs is trained to approximate the local dynamics. The learning process integrates a segmented data merging procedure for location identification and a local dynamics modeling process. The proposed neural hybrid automaton models can capture the behaviors of complex dynamical systems with high modeling precision but significantly lower computational complexity in tasks such as training and verification, which are traditionally computationally expensive for neural network models. A computationally efficient set-valued reachability analysis method, commonly used in safety verification, is then developed based on interval analysis and a novel Split and Combine process. Finally, applications to modeling limit cycles and human handwritten motions are presented to show the effectiveness and efficiency of our approach. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
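The interval-analysis style of set-valued reachability mentioned above can be illustrated for a single tanh layer: given a box of possible inputs, compute a box guaranteed to contain all outputs. The paper's Split and Combine process is more elaborate; this is a baseline sketch with illustrative weights:

```python
import numpy as np

def interval_forward(lo, hi, W, b):
    """Propagate an input box [lo, hi] through one tanh layer.
    Split W into positive and negative parts so each bound uses the
    input bound that extremizes the pre-activation; tanh is monotone,
    so the bounds map through it directly."""
    Wp, Wn = np.maximum(W, 0.0), np.minimum(W, 0.0)
    out_lo = np.tanh(lo @ Wp + hi @ Wn + b)
    out_hi = np.tanh(hi @ Wp + lo @ Wn + b)
    return out_lo, out_hi

W = np.array([[0.5, -1.0],
              [1.0,  0.3]])
b = np.zeros(2)
lo, hi = np.array([-0.1, -0.1]), np.array([0.1, 0.1])
out_lo, out_hi = interval_forward(lo, hi, W, b)

x = np.array([0.05, -0.02])          # any point in the input box ...
y = np.tanh(x @ W + b)               # ... lands inside the output box
assert np.all(out_lo <= y) and np.all(y <= out_hi)
```

Chaining this over layers (and over the automaton's local models) gives a sound, if conservative, over-approximation of the reachable set.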
25. Training Feed-Forward Artificial Neural Networks with a modified artificial bee colony algorithm.
- Author
-
Xu, Feiyi, Pun, Chi-Man, Li, Haolun, Zhang, Yushu, Song, Yurong, and Gao, Hao
- Subjects
- *
ARTIFICIAL neural networks , *BEES algorithm , *DEEP learning , *ALGORITHMS , *IMAGE recognition (Computer vision) , *MACHINE learning , *POLLINATORS , *SPEECH perception - Abstract
Deep learning is a branch of neural networks that has been intensively developed in the last decade. Due to their high-accuracy classification ability, deep learning algorithms have been widely used in many fields, such as speech recognition, image recognition, and natural language processing. However, they also show some shortcomings, especially in the selection of network parameters, including hyper-parameters, which is still a time-consuming task. In this paper, a modified ABC (ABC-ISB) optimization algorithm is proposed to automatically train the parameters of Feed-Forward Artificial Neural Networks, a typical class of neural network. In the proposed ABC algorithm, we utilize the information of better-performing neighbors to accelerate the convergence of the employed and onlooker bees, respectively. In addition, a new selection strategy and a gbest-guided strategy are introduced to enhance the global search capability and to balance the exploration and exploitation of the algorithm. The experimental results show that our ABC-ISB is generally leading and competitive. [ABSTRACT FROM AUTHOR]
- Published
- 2020
- Full Text
- View/download PDF
26. Computer vision and deep learning techniques for pedestrian detection and tracking: A survey.
- Author
-
Brunetti, Antonio, Buongiorno, Domenico, Trotta, Gianpaolo Francesco, and Bevilacqua, Vitoantonio
- Subjects
- *
PEDESTRIANS , *TRACKING & trailing , *DEEP learning , *COMPUTER vision , *ARTIFICIAL neural networks - Abstract
Pedestrian detection and tracking have become an important field in the computer vision research area. This growing interest, which started in recent decades, might be explained by the multitude of potential applications that could use the results of this research field, e.g. robotics, entertainment, surveillance, care for the elderly and disabled, and content-based indexing. In this survey paper, vision-based pedestrian detection systems are analysed based on their field of application, acquisition technology, computer vision techniques and classification strategies. Three main application fields have been identified and discussed: video surveillance, human-machine interaction and analysis. Due to the large variety of acquisition technologies, this paper discusses both the differences between 2D and 3D vision systems, and between indoor and outdoor systems. The authors reserve a dedicated section for the analysis of Deep Learning methodologies, including Convolutional Neural Networks, in pedestrian detection and tracking, considering their recent rapid adoption in such systems. Finally, focusing on the classification point of view, different Machine Learning techniques have been analysed, basing the discussion on classification performance across different benchmark datasets. The reported results highlight the importance of testing pedestrian detection systems on different datasets to evaluate the robustness of the computed groups of features used as input to classifiers. [ABSTRACT FROM AUTHOR]
- Published
- 2018
- Full Text
- View/download PDF
27. Multi-scale hierarchical recurrent neural networks for hyperspectral image classification.
- Author
-
Shi, Cheng and Pun, Chi-Man
- Subjects
- *
ARTIFICIAL neural networks , *HYPERSPECTRAL imaging systems , *MACHINE learning , *IMAGE analysis , *INFORMATION science , *THREE-dimensional imaging - Abstract
This paper presents a novel hyperspectral image (HSI) classification framework that exploits multi-scale spectral-spatial features via hierarchical recurrent neural networks. Neighborhood information plays an important role in the image classification process. Convolutional neural networks (CNNs) have been shown to be effective in learning the local features of HSIs. However, CNNs do not consider the spatial dependency of non-adjacent image patches. Recurrent neural networks (RNNs) can effectively establish the relationship between non-adjacent image patches, but they can only be applied to single-dimensional (1D) sequences. In this paper, we propose multi-scale hierarchical recurrent neural networks (MHRNNs) to learn the spatial dependency of non-adjacent image patches in the two-dimensional (2D) spatial domain. First, to better represent objects at different scales, we generate multi-scale 3D image patches of the central pixel and surrounding pixels. Then, 3D CNNs extract the local spectral-spatial features from each 3D image patch. Finally, multi-scale 1D sequences in eight directions are constructed on the 3D local feature domain, and MHRNNs are proposed to capture the spatial dependency of local spectral-spatial features at different scales. The proposed method not only considers the local spectral-spatial features of the HSI, but also captures the spatial dependency of non-adjacent image patches at different scales. Experiments are performed on three real HSI datasets. The results demonstrate the superiority of the proposed method over several state-of-the-art methods in both visual appearance and classification accuracy. [ABSTRACT FROM AUTHOR]
- Published
- 2018
- Full Text
- View/download PDF
28. Generating exponentially stable states for a Hopfield Neural Network.
- Author
-
Cabrera, Erick and Sossa, Humberto
- Subjects
- *
ARTIFICIAL neural networks , *ALGORITHMS , *MACHINE learning , *MACHINE theory , *HOPFIELD networks - Abstract
An algorithm that generates an exponential number of stable states for the well-known Hopfield Neural Network (HNN) is introduced in this paper. We show that the quantity of stable states depends on the dimension and the number of components of the input pattern that support noise. Extensive tests verify that the states generated by our algorithm are stable states and show the exponential storage capacity of an HNN. This paper opens the possibility of designing improved HNNs able to achieve exponential storage, and thus find applicability in complex real-world problems. [ABSTRACT FROM AUTHOR]
- Published
- 2018
- Full Text
- View/download PDF
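A minimal sketch of the classical HNN the paper builds on, with Hebbian (outer-product) storage and a check that a stored pattern is a stable attractor; this is the textbook construction, not the paper's exponential-capacity algorithm:

```python
import numpy as np

def train_hopfield(patterns):
    """Hebbian (outer-product) storage of bipolar patterns in {-1, +1}."""
    W = sum(np.outer(p, p) for p in patterns).astype(float)
    np.fill_diagonal(W, 0.0)             # no self-connections
    return W

def recall(W, state, steps=10):
    """Synchronous updates until a fixed point (a stable state) is reached."""
    for _ in range(steps):
        new = np.sign(W @ state)
        new[new == 0] = 1                # break ties consistently
        if np.array_equal(new, state):
            break
        state = new
    return state

p = np.array([1, -1, 1, -1, 1, -1])
W = train_hopfield(p[None, :])
noisy = p.copy(); noisy[0] = -1          # flip one component
assert np.array_equal(recall(W, noisy), p)   # converges back to the pattern
```

Hebbian storage only supports roughly 0.14n patterns for n neurons, which is why an algorithm achieving exponentially many stable states is notable.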
29. Facial expression recognition via learning deep sparse autoencoders.
- Author
-
Zeng, Nianyin, Zhang, Hong, Song, Baoye, Liu, Weibo, Li, Yurong, and Dobaie, Abdullah M.
- Subjects
- *
HUMAN facial recognition software , *FACIAL expression , *MACHINE learning , *PATTERN recognition systems , *ARTIFICIAL neural networks - Abstract
Facial expression recognition is an important research issue in the pattern recognition field. In this paper, we present a novel framework for facial expression recognition that automatically distinguishes expressions with high accuracy. In particular, a high-dimensional feature composed of a combination of facial geometric and appearance features is introduced for facial expression recognition, as it contains accurate and comprehensive emotion information. Furthermore, deep sparse autoencoders (DSAE) are established to recognize the facial expressions with high accuracy by learning robust and discriminative features from the data. The experimental results indicate that the presented framework can achieve a high recognition accuracy of 95.79% on the extended Cohn–Kanade (CK+) database for seven facial expressions, which outperforms the other three state-of-the-art methods by as much as 3.17%, 4.09% and 7.41%, respectively. In particular, the presented approach is also applied to recognize eight facial expressions (including the neutral one), and it provides a satisfactory recognition accuracy, which successfully demonstrates the feasibility and effectiveness of the approach in this paper. [ABSTRACT FROM AUTHOR]
- Published
- 2018
- Full Text
- View/download PDF
30. Application of self-organizing map to failure modes and effects analysis methodology.
- Author
-
Chang, Wui Lee, Pang, Lie Meng, and Tay, Kai Meng
- Subjects
- *
SELF-organizing maps , *ARTIFICIAL neural networks , *FAILURE mode & effects analysis , *MACHINE learning , *DECISION making - Abstract
In this paper, a self-organizing map (SOM) neural network is used to visualize corrective actions of failure modes and effects analysis (FMEA). SOM is a popular unsupervised neural network model that aims to produce a low-dimensional map (typically a two-dimensional map) for visualizing high-dimensional data. FMEA, in turn, is a popular methodology to identify potential failure modes for a product or a process, to assess the risk associated with those failure modes, and to identify and carry out corrective actions to address the most serious concerns. Despite the popularity of FMEA in a wide range of industries, two well-known shortcomings are the complexity of the FMEA worksheet and its intricacy of use. To the best of our knowledge, the use of computational techniques for addressing these shortcomings is limited, and the use of SOM in FMEA is new. In this paper, corrective actions in FMEA are described by their severity, occurrence and detection scores. SOM is then used as a visualization aid for FMEA users to see the relationships among corrective actions via a map. Color information from the SOM map is then added to the FMEA worksheet for better visualization. In addition, a Risk Priority Number Interval is used to allow corrective actions to be evaluated and ordered in groups. Such an approach provides a quick and easily understandable framework for eliciting important information from a complex FMEA worksheet, thereby facilitating the decision-making tasks of FMEA users. The significance of this study is two-fold, viz., the use of SOM as an effective neural network learning paradigm to facilitate FMEA implementations, and the use of a computational visualization approach to tackle the two well-known shortcomings of FMEA. [ABSTRACT FROM AUTHOR]
- Published
- 2017
- Full Text
- View/download PDF
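The SOM visualization idea can be illustrated with a minimal map trained on (severity, occurrence, detection) score vectors; the grid size, learning-rate schedule, and random data below are illustrative assumptions, not the paper's setup:

```python
import numpy as np

rng = np.random.default_rng(0)

def train_som(data, grid=(4, 4), epochs=200, lr0=0.5, sigma0=2.0):
    """Minimal 2-D SOM: each grid cell holds a prototype vector; the
    best-matching unit (BMU) and its grid neighbours are pulled toward
    each sample, with learning rate and neighbourhood shrinking over time."""
    h, w = grid
    coords = np.array([(i, j) for i in range(h) for j in range(w)], float)
    weights = rng.random((h * w, data.shape[1]))
    for t in range(epochs):
        lr = lr0 * (1 - t / epochs)
        sigma = sigma0 * (1 - t / epochs) + 0.5
        for x in data:
            bmu = np.argmin(((weights - x) ** 2).sum(axis=1))
            d2 = ((coords - coords[bmu]) ** 2).sum(axis=1)
            nb = np.exp(-d2 / (2 * sigma ** 2))      # neighbourhood kernel
            weights += lr * nb[:, None] * (x - weights)
    return weights

# corrective actions scored by (severity, occurrence, detection) on a 0-1 scale
actions = rng.random((30, 3))
som = train_som(actions)
bmu_of_first = np.argmin(((som - actions[0]) ** 2).sum(axis=1))
```

Each corrective action's BMU gives it a 2-D map position; nearby cells hold similar risk profiles, which is what the paper colors into the worksheet.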
31. A Trajectory-based Attention Model for Sequential Impurity Detection.
- Author
-
He, Wenhao, Song, Haitao, Guo, Yue, Wang, Xiaonan, Bian, Guibin, and Yuan, Kui
- Subjects
- *
CONVOLUTIONAL neural networks , *ARTIFICIAL neural networks , *GLASS bottles , *OBJECT tracking (Computer vision) , *MOTION control devices , *MACHINE learning - Abstract
Impurity detection involves detecting small impurities in the liquid inside an opaque glass bottle with complex textures by looking through the bottleneck. Sometimes experts have to observe consecutive frames to determine the existence of an impurity. In recent years, region-based convolutional neural networks have achieved incremental successes in common object detection tasks. However, sequential impurity detection presents more challenging issues than detecting targets in a single frame, because the consecutive motions and appearance changes of impurities cannot be captured by those common object detectors. In this paper, we propose a simple and controllable ensemble architecture to alleviate this problem. Specifically, a siamese fusion network is used to generate impurity proposals; then an attention model based on visual features and trajectories is proposed to localize a unique region proposal in each frame; finally, a sequential region proposal classifier using a long-term recurrent convolutional network is applied to refine impurity detection performance. The proposed method achieves 79.81% mAP on the IML-DET dataset, outperforming a comparable state-of-the-art Mask R-CNN model. [ABSTRACT FROM AUTHOR]
- Published
- 2020
- Full Text
- View/download PDF
32. XCS with opponent modelling for concurrent reinforcement learners.
- Author
-
Chen, Hao, Wang, Chang, Huang, Jian, Kong, Jiangtao, and Deng, Hanqiang
- Subjects
- *
REINFORCEMENT learning , *ARTIFICIAL neural networks , *OPPONENTS , *MACHINE learning , *ACTION theory (Psychology) , *MOBILE learning - Abstract
Reinforcement learning (RL) of optimal policies against an opponent agent that also has learning capability is still challenging in Markov games. A variety of algorithms have been proposed for solving this problem, such as the traditional Q-learning-based RL (QbRL) algorithms as well as the state-of-the-art neural-network-based RL (NNbRL) algorithms. However, the QbRL approaches have poor generalization capability for complex problems with non-stationary opponents, while the policies learned by NNbRL algorithms lack explainability and transparency. In this paper, we propose an algorithm X-OMQ(λ) that integrates the eXtended Classifier System (XCS) with opponent modelling for concurrent reinforcement learners in zero-sum Markov games. The algorithm can learn general, accurate, and interpretable action selection rules and allows policy optimization using a genetic algorithm (GA). Besides, the X-OMQ(λ) agent optimizes the established opponent model while simultaneously learning to select actions in a goal-directed manner. In addition, we use the eligibility trace mechanism to further speed up the learning process. In the reinforcement component, not only the classifiers in the action set are updated, but other relevant classifiers are also updated in a certain proportion. We demonstrate the performance of the proposed algorithm in the hunter-prey problem and two adversarial soccer scenarios where the opponent is allowed to learn with several benchmark QbRL and NNbRL algorithms. The results show that our method achieves learning performance similar to the NNbRL algorithms while requiring no prior knowledge of the opponent or the environment. Moreover, the learned action selection rules are interpretable while retaining generalization capability. [ABSTRACT FROM AUTHOR]
- Published
- 2020
- Full Text
- View/download PDF
33. On a granular functional link network for classification.
- Author
-
Colace, Francesco, Loia, Vincenzo, Pedrycz, Witold, and Tomasiello, Stefania
- Subjects
- *
GRANULAR computing , *ARTIFICIAL neural networks , *ITERATIVE learning control , *MACHINE learning - Abstract
In this paper, we present a new granular classifier in two versions (iterative and non-iterative), by adopting some ideas originating from a kind of Functional Link Artificial Neural Network and from Functional Network schemes. These two architectures are substantially the same: they both use a function basis instead of the usual activation function, but they differ in the learning algorithm. We augment them from the perspective of Granular Computing and information granules, designing a new kind of classifier and two learning algorithms that take the granularity of information into account. The proposed classifier exhibits the advantages of granular architectures, namely higher accuracy and transparency. We formally discuss the convergence of the iterative learning scheme. We carry out numerical experiments using publicly available data, comparing the results against those produced by state-of-the-art methods. In particular, we achieved sound results with the iterative learning scheme. [ABSTRACT FROM AUTHOR]
- Published
- 2020
- Full Text
- View/download PDF
34. A freight inspection volume forecasting approach using an aggregation/disaggregation procedure, machine learning and ensemble models.
- Author
-
Ruiz-Aguilar, Juan Jesús, Urda, Daniel, Moscoso-López, José Antonio, González-Enrique, Javier, and Turias, Ignacio J.
- Subjects
- *
MACHINE learning , *INSPECTION & review , *AGGREGATION (Statistics) , *FORECASTING , *TIME series analysis , *ARTIFICIAL neural networks , *HARBOR management - Abstract
Machine learning methods are a powerful tool to detect workload peaks and congestion in goods inspection facilities of seaports. In this paper, time series data of freight inspection volume at the Border Inspection Posts in the Port of Algeciras Bay was used to construct four datasets based on different sizes of the autoregressive window, and several machine learning and ensemble models were used to aid decision-making in the inspection process. Moreover, an aggregation/disaggregation procedure for making predictions was proposed and compared across two different prediction horizons: daily (t + 1) and weekly (t + 7) predictions. In general, results showed that neural networks performed better than any other model, independently of the size of the autoregressive window. The result obtained by a weighted average ensemble model was better, with statistical significance, than any other model. Moreover, the proposed aggregation/disaggregation procedure provided performance results that were better, and more robust in terms of variance, than considering daily or weekly predictions alone. [ABSTRACT FROM AUTHOR]
- Published
- 2020
- Full Text
- View/download PDF
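One common way to realize a weighted average ensemble is to weight each model by its inverse validation error; the abstract does not specify the paper's weighting scheme, so the weighting, data, and model count below are illustrative assumptions:

```python
import numpy as np

def weighted_average_ensemble(preds, y_val, preds_val):
    """Combine model forecasts with weights inversely proportional to
    each model's validation MSE, so accurate models dominate the average."""
    mse = np.mean((preds_val - y_val) ** 2, axis=1)   # one MSE per model
    w = (1.0 / mse) / (1.0 / mse).sum()               # normalized weights
    return w @ preds                                  # weighted forecast

# three hypothetical models forecasting daily inspection volume
y_val = np.array([100.0, 120.0, 90.0])
preds_val = np.array([[101.0, 119.0,  91.0],          # accurate model
                      [110.0, 130.0, 100.0],          # biased model
                      [ 90.0, 110.0,  80.0]])         # biased the other way
test_preds = np.array([[105.0], [115.0], [95.0]])
forecast = weighted_average_ensemble(test_preds, y_val, preds_val)
```

Here the accurate model's validation MSE is 100x smaller than the others, so the combined forecast stays close to its prediction of 105.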
35. Keywords extraction with deep neural network model.
- Author
-
Zhang, Yu, Tuo, Mingxiang, Yin, Qingyu, Qi, Le, Wang, Xuxiang, and Liu, Ting
- Subjects
- *
ARTIFICIAL neural networks , *DEEP learning , *NATURAL language processing , *KEYWORDS , *FEATURE selection , *MACHINE learning - Abstract
Keywords can express the main content of an article or a sentence. Keywords extraction is a critical issue in many Natural Language Processing (NLP) applications and can improve the performance of many NLP systems. Traditional methods of keywords extraction are based on machine learning or graph models. The performance of these methods is influenced by feature selection and manually defined rules. In recent years, with the emergence of deep learning technology, learning features automatically with deep learning algorithms can improve the performance of many tasks. In this paper, we propose a deep neural network model for the task of keywords extraction. We make two extensions on the basis of the traditional LSTM model. First, to better utilize both the historical and following contextual information of the given target word, we propose a target center-based LSTM model (TC-LSTM), which learns to encode the target word by considering its contextual information. Second, on the basis of the TC-LSTM model, we apply the self-attention mechanism, which enables our model to focus on informative parts of the associated text. In addition, we also introduce a two-stage training method, which takes advantage of large-scale pseudo training data. Experimental results show the advantage of our method: our model beats all the baseline systems across the board. The two-stage training method is also of great significance for improving the effectiveness of the model. [ABSTRACT FROM AUTHOR]
- Published
- 2020
- Full Text
- View/download PDF
36. LRP-Based path relevances for global explanation of deep architectures.
- Author
-
Guerrero-Gómez-Olmedo, Ricardo, Salmeron, Jose L., and Kuchkovsky, Carlos
- Subjects
- *
ARTIFICIAL neural networks , *MACHINE learning - Abstract
Understanding what Machine Learning models are doing is not always trivial. This is especially true for complex models such as Deep Neural Networks (DNNs), which are the best-suited algorithms for modeling very complex and nonlinear relationships. But this need for understanding has become a must, since privacy regulations are tightening the industrial use of these models. There are different techniques to address the interpretability issues that Machine Learning models raise. This paper is focused on opening the so-called black box of deep neural architectures. This research extends the technique called Layer-wise Relevance Propagation (LRP), enhancing it to compute the most critical paths in different deep neural architectures using multicriteria analysis. We call this technique Ranked-LRP, and it was tested on four different datasets and tasks, including classification and regression. The results show the worth of our proposal. [ABSTRACT FROM AUTHOR]
- Published
- 2020
- Full Text
- View/download PDF
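Standard LRP, which Ranked-LRP extends, redistributes a prediction's relevance layer by layer onto the inputs. A minimal epsilon-rule sketch for a single dense layer is shown below (the paper's path-ranking and multicriteria steps are not modeled):

```python
import numpy as np

def lrp_linear(a, W, b, R_out, eps=1e-6):
    """LRP-epsilon rule for one dense layer: redistribute the output
    relevances R_out onto the inputs in proportion to each input's
    contribution to the pre-activations."""
    z = a @ W + b                             # forward pre-activations
    s = R_out / (z + eps * np.sign(z))        # stabilized relevance ratio
    return a * (W @ s)                        # relevance per input unit

a = np.array([1.0, 2.0, 0.5])                 # layer input (activations)
W = np.array([[0.3, -0.1],
              [0.2,  0.4],
              [-0.5, 0.6]])
b = np.zeros(2)
R_out = a @ W + b                             # start: relevance = the output
R_in = lrp_linear(a, W, b, R_out)
# relevance is (approximately) conserved across the layer
assert abs(R_in.sum() - R_out.sum()) < 1e-4
```

Applying this rule backwards through every layer yields per-input relevances; Ranked-LRP additionally ranks the propagation paths these relevances flow along.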
37. DeepANF: A deep attentive neural framework with distributed representation for chromatin accessibility prediction.
- Author
-
Guo, Yanbu, Zhou, Dongming, Nie, Rencan, Ruan, Xiaoli, and Li, Weihua
- Subjects
- *
RECURRENT neural networks , *NUCLEOTIDE sequence , *ARTIFICIAL neural networks , *DEEP learning , *MACHINE learning , *CHROMATIN - Abstract
The identification of chromatin accessibility is a significant part of genomics and genetics. However, high-throughput experimental techniques are costly and impractical for the systematic identification of accessibility. Many computational methods have been proposed to predict the functional regions of chromatin purely from DNA sequences, but they could not take full advantage of sequence information to capture hidden complex motifs in DNA sequences. Recently, deep learning algorithms have been incorporated into chromatin accessibility prediction and have achieved remarkable results. Nevertheless, there remains a problem in chromatin accessibility prediction: how to effectively represent complex features merely from DNA sequences. Thus, developing efficient computational methods is becoming increasingly urgent for identifying functional regions of the genome. In this paper, combining convolutional and gated recurrent unit neural networks with an attention mechanism, we develop a discriminative computational framework, DeepANF, to adaptively extract hidden pattern features and identify chromatin accessibility based on distributed representations of DNA sequences. To verify the efficacy of the DeepANF framework, we conduct extensive experiments on five large-scale datasets. The experimental results reveal that our framework not only consistently outperforms published methods for chromatin accessibility prediction tasks, but also extracts more discriminative features from pure DNA sequences than published methods, especially on the MCF-7 dataset. [ABSTRACT FROM AUTHOR]
- Published
- 2020
- Full Text
- View/download PDF
38. Fuzzy logic interpretation of quadratic networks.
- Author
-
Fan, Fenglei and Wang, Ge
- Subjects
- *
FUZZY logic , *ARTIFICIAL neural networks , *FUZZY systems , *DEEP learning - Abstract
Over the past several years, deep learning has achieved huge successes in various applications. However, such a data-driven approach is often criticized for its lack of interpretability. Recently, we proposed artificial quadratic neural networks consisting of quadratic neurons in potentially many layers. At the cellular level, a quadratic function is used to replace the inner product in a traditional neuron, and its output then undergoes a nonlinear activation. With a single quadratic neuron, any fuzzy logic operation, such as XOR, can be implemented. In this sense, any deep network constructed with quadratic neurons can be interpreted as a deep fuzzy logic system. Since traditional neural networks and their quadratic counterparts can represent each other, and fuzzy logic operations are naturally implemented in quadratic neural networks, it is plausible to explain how a deep neural network works with a quadratic network as the system model. In this paper, we generalize and categorize the fuzzy logic operations implementable with individual quadratic neurons, and then perform statistical/information-theoretic analyses of exemplary quadratic neural networks. [ABSTRACT FROM AUTHOR]
- Published
- 2020
- Full Text
- View/download PDF
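The claim that a single quadratic neuron can implement XOR, which no single inner-product (linear) neuron can do, is easy to verify; the particular quadratic form below is one illustrative choice:

```python
def quadratic_neuron_xor(x1, x2):
    """A single quadratic neuron computing XOR: on binary inputs the
    quadratic pre-activation x1 + x2 - 2*x1*x2 equals XOR exactly,
    so a simple threshold activation recovers the logic gate."""
    z = x1 + x2 - 2 * x1 * x2        # quadratic function of the inputs
    return 1 if z >= 0.5 else 0      # threshold activation

table = [(a, b, quadratic_neuron_xor(a, b))
         for a in (0, 1) for b in (0, 1)]
# table is [(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 0)]
```

The cross term `x1*x2` is exactly what the inner product of a traditional neuron lacks, which is why XOR needs at least two linear layers but only one quadratic neuron.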
39. Extreme learning machine with local connections.
- Author
-
Li, Feng, Yang, Jie, Yao, Mingchen, Yang, Sibo, and Wu, Wei
- Subjects
- *
MACHINE learning , *BENCHMARK problems (Computer science) , *ARTIFICIAL neural networks , *LEAST squares , *FEEDFORWARD neural networks - Abstract
This paper is concerned with the sparsification of the input-hidden weights of the extreme learning machine (ELM). For ordinary feedforward neural networks, sparsification is usually done by introducing a regularization technique into the learning process of the network. However, this strategy cannot be applied to ELM, since the input-hidden weights of ELM are supposed to be randomly chosen rather than iteratively learned. To this end, we propose a modified ELM, called ELM-LC (ELM with local connections), which sparsifies the input-hidden weights as follows: the hidden nodes and the input nodes are divided into corresponding groups, and each input node group is fully connected with its corresponding hidden node group but is not connected with any other hidden node group. As in the usual ELM, the input-hidden weights are given randomly, and the hidden-output weights are obtained through least squares learning. In numerical simulations on some benchmark problems, the new ELM-LC behaves better than the traditional ELM and an ELM with ordinary sparse input-hidden weights. [ABSTRACT FROM AUTHOR]
- Published
- 2019
- Full Text
- View/download PDF
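The grouped connectivity described above can be sketched directly: random weights are drawn only inside each matching input/hidden group, and all cross-group weights stay zero. Function and parameter names below are illustrative assumptions, not the paper's notation.

```python
# Illustrative sketch of ELM-LC connectivity: a block-structured random
# input-hidden weight matrix with zero weights across groups.

import random

def local_input_hidden_weights(n_in, n_hidden, n_groups, seed=0):
    """Return an n_in x n_hidden weight matrix whose nonzero entries lie
    only in matching input/hidden group blocks (assumes even divisibility)."""
    rng = random.Random(seed)
    in_size, hid_size = n_in // n_groups, n_hidden // n_groups
    W = [[0.0] * n_hidden for _ in range(n_in)]
    for g in range(n_groups):
        for i in range(g * in_size, (g + 1) * in_size):
            for j in range(g * hid_size, (g + 1) * hid_size):
                W[i][j] = rng.uniform(-1.0, 1.0)
    return W

# Two groups: inputs {0,1} feed hidden {0,1,2}; inputs {2,3} feed hidden {3,4,5}.
W = local_input_hidden_weights(n_in=4, n_hidden=6, n_groups=2)
```

The hidden-output weights would then be fit by least squares exactly as in a standard ELM; only this input-side connectivity changes.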
40. Frequency-based slow feature analysis.
- Author
-
Doumanoglou, Alexandros, Vretos, Nicholas, and Daras, Petros
- Subjects
- *
ARTIFICIAL neural networks , *SUPPORT vector machines , *MACHINE learning - Abstract
Slow Feature Analysis (SFA) is an unsupervised learning algorithm that extracts slowly varying features from a temporal vectorial signal. In SFA, feature slowness is measured by the average value of its squared time-derivative. In this paper, we introduce Frequency-Based Slow Feature Analysis (FSFA) and prove that it is a generalization of SFA in the frequency domain. In FSFA, the low-pass filtered versions of the extracted slow features have maximum energy, making slowness a filter-dependent measurement. Experimental results show that the extracted features depend on the selected filter kernel and differ from the signals extracted using SFA. However, it is proven that there is one filter kernel that makes FSFA equivalent to SFA. Furthermore, experiments on the UCF-101 video action recognition dataset show that the features extracted by FSFA with proper filter kernels yield improved classification performance compared to the features extracted by standard SFA. Finally, an experiment on UCF-101 with an indicative, simple and shallow neural network composed of FSFA and SFA nodes demonstrates that this network can transform the features extracted by a known Convolutional Neural Network into a new feature space in which classification performance through a Support Vector Machine can be improved. [ABSTRACT FROM AUTHOR]
- Published
- 2019
- Full Text
- View/download PDF
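The slowness criterion mentioned above, the average squared time-derivative, is simple to compute in its discrete form. The sketch below shows only that measurement (not the SFA optimization itself, which also imposes zero-mean, unit-variance and decorrelation constraints); signal choices are illustrative.

```python
# Minimal sketch of the SFA slowness objective: the mean of the squared
# discrete time-derivative of a feature. Smaller values mean slower features.

import math

def slowness(y):
    """Mean squared difference of consecutive samples."""
    diffs = [(y[t + 1] - y[t]) ** 2 for t in range(len(y) - 1)]
    return sum(diffs) / len(diffs)

T = 200
slow = [math.sin(2 * math.pi * t / T) for t in range(T)]       # one period
fast = [math.sin(2 * math.pi * 10 * t / T) for t in range(T)]  # ten periods
```

FSFA replaces this time-domain derivative penalty with the energy of a low-pass filtered version of the feature, which reduces to the measure above for one particular filter kernel.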
41. Unsupervised pre-trained filter learning approach for efficient convolution neural network.
- Author
-
Rehman, Sadaqat ur, Tu, Shanshan, Waqas, Muhammad, Huang, Yongfeng, Rehman, Obaid ur, Ahmad, Basharat, and Ahmad, Salman
- Subjects
- *
MATHEMATICAL convolutions , *COMPUTER vision , *ARTIFICIAL neural networks , *MACHINE learning , *SYSTEMS design - Abstract
The concept of the Convolutional Neural Network (ConvNet or CNN) is derived from the animal visual cortex. Just as humans learn through experience, a ConvNet adjusts its weights through backpropagation to accomplish the desired output. In this paper, we provide a comprehensive survey of the relationship between ConvNets and different pre-trained learning methodologies, and of the resulting optimization effects. These hybrid networks advance the state-of-the-art algorithms in the recognition, classification, and detection of images, speech, text, and video. Furthermore, some task-specific applications of ConvNets in computer vision are introduced. To validate the survey, we also perform experiments on a public face and skin detection dataset to provide an authentic solution. The experimental results on the benchmark dataset highlight the merit of efficient pre-trained learning algorithms for optimized ConvNets. To motivate follow-up research, we identify open problems and present future directions with regard to optimized ConvNet system design parameters and unsupervised learning. [ABSTRACT FROM AUTHOR]
- Published
- 2019
- Full Text
- View/download PDF
42. Single image super-resolution via multi-scale residual channel attention network.
- Author
-
Cao, Feilong and Liu, Huan
- Subjects
- *
HIGH resolution imaging , *ARTIFICIAL neural networks , *IMAGE reconstruction algorithms , *MACHINE learning - Abstract
Recently, various single image super-resolution (SR) methods based on convolutional neural networks (CNNs) have been vigorously explored, and many impressive results have emerged. Unfortunately, most of these methods focus mainly on increasing the depth of the network to improve reconstruction performance. In fact, a deeper network usually means an increase in parameters and computations and, worse still, this increase often makes the network difficult to train. This paper develops a new SR approach called the multi-scale residual channel attention network (MSRCAN), a comparatively shallow two-stage neural network structure that can extract more details to effectively improve the quality of SR. Specifically, a multi-scale residual channel attention block (MSRCAB) is designed to fully exploit image features with convolutional kernels of different sizes. At the same time, a channel attention mechanism is introduced to adaptively recalibrate the channel-wise significance of feature mappings. Furthermore, multiple short skip connections and a long skip connection are used in each MSRCAB to compensate for information loss. Moreover, the two-stage design helps fully uncover low-level and high-level information. Evaluation on benchmark data sets indicates that the proposed method rivals state-of-the-art convolutional methods. [ABSTRACT FROM AUTHOR]
- Published
- 2019
- Full Text
- View/download PDF
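The channel attention mechanism mentioned above follows the general squeeze-and-excitation pattern: pool each channel to a descriptor, pass it through a small gating function, and rescale the channel. The toy sketch below shows that pattern only; the paper's actual block is more elaborate, and all names and weights here are illustrative assumptions.

```python
# Hedged sketch of squeeze-and-excitation style channel attention:
# global-average-pool each channel, gate it with a sigmoid, rescale.

import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def channel_attention(feature_maps, gate_weights):
    """feature_maps: list of channels, each a flat list of activations.
    gate_weights: one scalar per channel standing in for the gating MLP."""
    out = []
    for channel, w in zip(feature_maps, gate_weights):
        pooled = sum(channel) / len(channel)  # squeeze: global average pool
        scale = sigmoid(w * pooled)           # excite: per-channel gate in (0, 1)
        out.append([a * scale for a in channel])
    return out

gated = channel_attention([[1.0, 3.0], [2.0, 2.0]], [0.5, -0.5])
```

In a real network the scalar gate would be a two-layer MLP shared across channels, and the rescaled maps would feed the next convolution.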
43. Co-evolutionary multi-task learning with predictive recurrence for multi-step chaotic time series prediction.
- Author
-
Rohitash Chandra, Yew-Soon Ong, and Chi-Keong Goh
- Subjects
- *
TIME series analysis , *MACHINE learning , *PREDICTION models , *EVOLUTIONARY algorithms , *ARTIFICIAL neural networks - Abstract
Multi-task learning employs a shared representation of knowledge for learning several instances of the same problem. The multi-step time series problem is one of the most challenging problems for machine learning methods. The performance of a prediction model faces challenges at higher prediction horizons due to the accumulation of errors. Cooperative coevolution employs a divide-and-conquer approach for training neural networks and has been very promising for single-step-ahead time series prediction. Recently, co-evolutionary multi-task learning has been proposed for dynamic time series prediction. In this paper, we adapt co-evolutionary multi-task learning for multi-step prediction, where predictive recurrence is developed to carry knowledge from previous states into future prediction horizons. The goal of the paper is to present a network architecture with predictive recurrence that is capable of multi-step prediction through a form of multi-task learning. We employ cooperative neuro-evolution and an evolutionary algorithm as baselines for comparison. The results show that the proposed method provides the best generalization performance in most cases. Comparison with results from the literature is promising, which motivates further application of the approach to related real-world problems. [ABSTRACT FROM AUTHOR]
- Published
- 2017
- Full Text
- View/download PDF
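The error accumulation mentioned above arises when a one-step predictor is iterated: each forecast is fed back as input for the next step, so mistakes compound over the horizon. The sketch below illustrates that recurrence with a stand-in model; it is not the paper's co-evolutionary architecture, and all names are assumptions.

```python
# Illustrative sketch of multi-step forecasting by iterating a one-step
# predictor, reusing its own predictions as inputs (the mechanism through
# which errors accumulate at longer horizons).

def multi_step_forecast(history, one_step_model, horizon):
    """Predict `horizon` steps ahead by sliding the window over forecasts."""
    window = list(history)
    forecasts = []
    for _ in range(horizon):
        y_next = one_step_model(window)
        forecasts.append(y_next)
        window = window[1:] + [y_next]  # feed the prediction back in
    return forecasts

# Toy stand-in model for the series x_{t+1} = 2 * x_t.
double_last = lambda window: 2 * window[-1]
preds = multi_step_forecast([1, 2, 4], double_last, horizon=3)
```

Any bias in `one_step_model` enters the window and is amplified at each subsequent step, which is why dedicated multi-step architectures are studied.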
44. Gram–Schmidt process based incremental extreme learning machine.
- Author
-
Zhao, Yong-Ping, Li, Zhi-Qiang, Xi, Peng-Peng, Liang, Dong, Sun, Liguo, and Chen, Ting-Hao
- Subjects
- *
MACHINE learning , *COMPUTER architecture , *COMPUTER algorithms , *INFINITE series (Mathematics) , *ARTIFICIAL neural networks , *REGRESSION analysis - Abstract
To compact the architecture of the extreme learning machine (ELM), two incremental learning algorithms are proposed in this paper. Previous incremental learning algorithms for ELM recruit hidden nodes randomly, which is equivalent to a random selection from a candidate set of infinite size. Hence, it is impossible to recruit good hidden nodes, and such algorithms usually require more hidden nodes than traditional neural networks to achieve matched performance. To improve the quality of the recruited hidden nodes, an incremental learning algorithm for ELM is presented based on the Gram–Schmidt process (GSI-ELM), which recruits the best hidden node from a random subset of fixed size via an evaluating criterion at each learning step. However, a "nesting effect" exists in GSI-ELM: hidden nodes once recruited cannot be later discarded. To treat this nesting problem, an improved GSI-ELM (IGSI-ELM) is developed with an elimination mechanism. At each learning step, IGSI-ELM eliminates the worst hidden node from the already-recruited group if it is not the newly recruited one. Finally, to verify the efficacy and feasibility of the proposed algorithms, GSI-ELM and IGSI-ELM, experiments on regression and classification benchmark data sets are investigated. [ABSTRACT FROM AUTHOR]
- Published
- 2017
- Full Text
- View/download PDF
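The Gram–Schmidt process that underlies the node-selection criterion above is the classical orthogonalization procedure: each new vector has its projections onto the already-accepted basis subtracted out. The sketch below shows only that building block, not the paper's hidden-node scoring.

```python
# Classical Gram-Schmidt orthogonalization on small Python-list vectors.

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def gram_schmidt(vectors):
    """Return an orthogonal basis spanning the input vectors, dropping
    vectors that are (numerically) linearly dependent on earlier ones."""
    basis = []
    for v in vectors:
        w = list(v)
        for b in basis:
            coeff = dot(v, b) / dot(b, b)  # projection onto earlier basis vector
            w = [wi - coeff * bi for wi, bi in zip(w, b)]
        if any(abs(wi) > 1e-12 for wi in w):
            basis.append(w)
    return basis

basis = gram_schmidt([[1.0, 1.0], [1.0, 0.0]])
```

In the incremental-ELM setting, the residual left after this projection indicates how much new information a candidate hidden node adds beyond the nodes already recruited.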
45. Neural networks: An overview of early research, current frameworks and new challenges.
- Author
-
Prieto, Alberto, Prieto, Beatriz, Ortigosa, Eva Martinez, Ros, Eduardo, Pelayo, Francisco, Ortega, Julio, and Rojas, Ignacio
- Subjects
- *
ARTIFICIAL neural networks , *COMPUTER simulation , *NEUROPHYSIOLOGY , *PROBLEM solving , *COMPUTATIONAL neuroscience , *COMPUTATIONAL intelligence , *MACHINE learning - Abstract
This paper presents a comprehensive overview of the modelling, simulation and implementation of neural networks, taking into account that two aims have emerged in this area: the improvement of our understanding of the behaviour of the nervous system, and the need to find inspiration from it to build systems with the advantages provided by nature to perform certain relevant tasks. The development and evolution of different topics related to neural networks are described (simulators, implementations, and real-world applications), showing that the field has acquired maturity and consolidation, proven by its competitiveness in solving real-world problems. The paper also shows how, over time, artificial neural networks have contributed fundamental concepts at the birth and development of other disciplines such as Computational Neuroscience, Neuro-engineering, Computational Intelligence and Machine Learning. A better understanding of the human brain is considered one of the challenges of this century, and to achieve it, as this paper goes on to describe, several important national and multinational projects and initiatives are marking the way to follow in neural-network research. [ABSTRACT FROM AUTHOR]
- Published
- 2016
- Full Text
- View/download PDF
46. Composite learning adaptive sliding mode control for AUV target tracking.
- Author
-
Guo, Yuyan, Qin, Hongde, Xu, Bin, Han, Yi, Fan, Quan-Yong, and Zhang, Pengchao
- Subjects
- *
SLIDING mode control , *AUTONOMOUS underwater vehicles , *NONLINEAR functions , *ARTIFICIAL neural networks , *MACHINE learning - Abstract
This paper studies controller design for an autonomous underwater vehicle (AUV) with a target tracking task. Considering the uncertainty in the nonlinear longitudinal model, a sliding mode controller is designed. Meanwhile, neural networks (NNs) are used to approximate the unknown nonlinear function in the model. To improve the NNs' learning rapidity, a prediction error that reflects the learning performance is constructed, and the updating law is then designed using the composite learning technique. System stability is guaranteed through the Lyapunov approach. The simulation results verify that the designed method can force the AUV to track the target until rendezvous, and that the model uncertainty is better addressed via the composite learning algorithm. [ABSTRACT FROM AUTHOR]
- Published
- 2019
- Full Text
- View/download PDF
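The sliding mode idea behind the controller above can be illustrated on a toy first-order plant: the control combines a linear feedback term with a discontinuous switching term that drives the tracking error toward the sliding surface. This is a heavily simplified sketch under stated assumptions; the AUV dynamics, the NN approximation of the unknown nonlinearity, and all gains in the paper are omitted.

```python
# Toy sketch of sliding-mode-style tracking for x_dot = u: the control
# u = -k*e - eta*sign(e) (e = x - target) mixes linear feedback with a
# switching term. Gains and the plant are illustrative assumptions.

def sign(x):
    return (x > 0) - (x < 0)

def simulate_tracking(x0, target, k=2.0, eta=1.5, dt=0.01, steps=2000):
    """Euler-integrate the closed loop and return the final state."""
    x = x0
    for _ in range(steps):
        e = x - target
        u = -k * e - eta * sign(e)  # feedback + switching term
        x += u * dt
    return x

final = simulate_tracking(x0=5.0, target=1.0)
```

The switching term gives robustness to bounded model uncertainty at the cost of chattering near the surface, which is one reason the paper pairs it with NN-based compensation of the unknown dynamics.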
47. Convolutional neural network based on an extreme learning machine for image classification.
- Author
-
Park, Youngmin and Yang, Hyun S.
- Subjects
- *
CONVOLUTIONAL neural networks , *ARTIFICIAL neural networks , *MACHINE learning , *SUPERVISED learning - Abstract
Over the last decade, substantial advances have been made in various computer vision technologies, many of them based on the convolutional neural network (CNN) architecture. Typically, a CNN is trained by a stochastic gradient descent algorithm using back-propagation (BP), but the training process is adversely affected by slow convergence and the need for extensive parameter tuning. In this paper, we propose a new CNN architecture and training algorithm based on the Extreme Learning Machine (ELM) to overcome these drawbacks. The proposed training algorithm is a layer-wise training method for CNNs and uses an alternating strategy of random convolutional filters and semi-supervised filters to combine the advantages of both approaches. On each semi-supervised layer, the CNN efficiently solves a convex optimization problem based on nonlinear random projection. It is faster and requires less human effort than BP-based training. We experimentally validated the proposed method using well-known character and object recognition benchmarks. In our experiments, our method performs comparably to approaches based on deep features and achieves higher accuracy than other unsupervised feature-learning methods. [ABSTRACT FROM AUTHOR]
- Published
- 2019
- Full Text
- View/download PDF
48. Adaptive entropy-based learning with dynamic artificial neural network.
- Author
-
Pinto, Tiago, Morais, Hugo, and Corchado, Juan Manuel
- Subjects
- *
MACHINE learning , *ARTIFICIAL neural networks , *MULTILAYER perceptrons - Abstract
Entropy models the added information associated with data uncertainty, showing that stochasticity is not purely random. This paper explores the potential improvement of machine learning methodologies through the incorporation of entropy analysis into the learning process. A multi-layer perceptron is applied to identify patterns in previous forecasting errors made by a machine learning methodology. The proposed learning approach adapts to the training data through a re-training process that includes only the most recent and relevant data, thus excluding misleading information from the training process. The learnt error patterns are then combined with the original forecasting results to improve forecasting accuracy, using the Rényi entropy to determine the amount by which the original forecasted value should be adapted given the learnt error patterns. The proposed approach is combined with eleven different machine learning methodologies and applied to the forecasting of electricity market prices using real data from the Iberian electricity market operator, OMIE. Results show that through the identification of patterns in the forecasting error, the proposed methodology is able to improve the learning algorithms' forecasting accuracy and reduce the variability of their forecasting errors. [ABSTRACT FROM AUTHOR]
- Published
- 2019
- Full Text
- View/download PDF
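The Rényi entropy used above to weight the error-pattern adjustment has a simple closed form for a discrete distribution. The sketch below computes only that quantity (the paper's weighting scheme itself is not reproduced); the example distributions are illustrative.

```python
# Renyi entropy of order alpha for a discrete distribution:
# H_alpha(p) = log(sum_i p_i^alpha) / (1 - alpha), for alpha != 1.
# As alpha -> 1 it approaches the Shannon entropy (not handled here).

import math

def renyi_entropy(probs, alpha):
    assert abs(alpha - 1.0) > 1e-12, "alpha = 1 is the Shannon limit"
    return math.log(sum(p ** alpha for p in probs)) / (1.0 - alpha)

uniform = [0.25] * 4                   # maximally uncertain errors
peaked = [0.85, 0.05, 0.05, 0.05]      # highly predictable errors
h_uniform = renyi_entropy(uniform, alpha=2.0)
h_peaked = renyi_entropy(peaked, alpha=2.0)
```

A low entropy signals a predictable error pattern, which justifies a larger correction of the raw forecast; a high entropy signals noise, where the correction should be damped.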
49. Integral reinforcement learning off-policy method for solving nonlinear multi-player nonzero-sum games with saturated actuator.
- Author
-
Ren, He, Zhang, Huaguang, Wen, Yinlei, and Liu, Chong
- Subjects
- *
REINFORCEMENT learning , *MACHINE learning , *ITERATIVE methods (Mathematics) , *NUMERICAL analysis , *ARTIFICIAL neural networks , *ARTIFICIAL intelligence , *COMPUTER algorithms - Abstract
In this paper, an effective off-policy algorithm is proposed to solve the continuous-time nonzero-sum (NZS) control problem for unknown nonlinear systems with a saturated actuator. A class of nonquadratic functions is used to construct the performance functions that deal with constrained inputs. Utilizing the integral reinforcement learning (IRL) technique, an off-policy learning mechanism is introduced to design an iterative method for the continuous-time NZS constrained control problem without requiring knowledge of the system dynamics. To show the convergence of the proposed method, the traditional policy iteration (PI) method is first discussed for the continuous-time NZS control problem with a saturated actuator. The equivalence of the proposed method with the traditional PI method is then proved. Neural networks are introduced to construct the actor-critic structure, where the critic neural networks approximate the iterative value functions and the actor neural networks approximate the iterative control policies. Finally, two cases are simulated to verify the effectiveness of the proposed method. [ABSTRACT FROM AUTHOR]
- Published
- 2019
- Full Text
- View/download PDF
50. Enhanced feature fusion through irrelevant redundancy elimination in intra-class and extra-class discriminative correlation analysis.
- Author
-
Wu, Zuobin, Mao, Kezhi, and Ng, Gee-Wah
- Subjects
- *
DEEP learning , *MACHINE learning , *STATISTICAL correlation , *PRINCIPAL components analysis , *DISCRIMINANT analysis , *ARTIFICIAL neural networks , *ARTIFICIAL intelligence - Abstract
Feature fusion aims to enhance data authenticity in both traditional and deep learning pattern analysis. Canonical correlation analysis (CCA) based feature fusion is a main technique for exploring the mutual relationships of multiple feature sets. In traditional CCA-based feature fusion, the dimensionality of each feature set is usually first reduced using principal component analysis (PCA), linear discriminant analysis (LDA), etc., to ensure the non-singularity and invertibility of covariance matrices. One issue with this standard CCA-based feature fusion is that the reduced feature sets generated by PCA or LDA may neglect certain correlation information among different feature sets that is useful for CCA, which in turn may degrade the subsequent classification performance. Another issue is that most CCA-fused features may still contain redundancies due to the correlation criterion. These redundancies may be relevant or irrelevant to class labels: irrelevant redundancies may degrade pattern recognition performance, while relevant redundancies can make the pattern recognition system more robust. In this paper, we propose an enhanced feature fusion scheme through irrelevant redundancy elimination in intra-class and extra-class discriminative correlation analysis (IEDCA-IRE) that addresses both issues. IEDCA-IRE explores the intra-class correlation, including both the pair-wise correlation used in CCA-based feature fusion approaches and the correlation across different features within the same class. By incorporating kernelized IEDCA into the minimum redundancy maximum relevance (mRMR) criterion, only the relevant redundancy is retained in the fused feature. The proposed IEDCA-IRE can be used in unimodal feature fusion, multimodal feature fusion, fusion of deep features extracted from different deep neural network models, and fusion of deep features with handcrafted features. Extensive experiments have demonstrated its effectiveness. [ABSTRACT FROM AUTHOR]
- Published
- 2019
- Full Text
- View/download PDF