1,242 results for "incremental learning"
Search Results
2. Incremental Learning Using a Grow-and-Prune Paradigm With Efficient Neural Networks
- Author
-
Niraj K. Jha, Hongxu Yin, and Xiaoliang Dai
- Subjects
FOS: Computer and information sciences ,Computer Science - Machine Learning ,Computer science ,Inference ,02 engineering and technology ,Neural network synthesis ,Machine learning ,computer.software_genre ,Machine Learning (cs.LG) ,0202 electrical engineering, electronic engineering, information engineering ,Computer Science (miscellaneous) ,Redundancy (engineering) ,Neural and Evolutionary Computing (cs.NE) ,computer.programming_language ,Artificial neural network ,business.industry ,Computer Science - Neural and Evolutionary Computing ,020207 software engineering ,Computer Science Applications ,Human-Computer Interaction ,Scratch ,Incremental learning ,Deep neural networks ,020201 artificial intelligence & image processing ,Artificial intelligence ,business ,computer ,MNIST database ,Information Systems - Abstract
Deep neural networks (DNNs) have become a widely deployed model for numerous machine learning applications. However, their fixed architecture, substantial training cost, and significant model redundancy make it difficult to efficiently update them to accommodate previously unseen data. To solve these problems, we propose an incremental learning framework based on a grow-and-prune neural network synthesis paradigm. When new data arrive, the neural network first grows new connections based on the gradients to increase the network capacity to accommodate new data. Then, the framework iteratively prunes away connections based on the magnitude of weights to enhance network compactness, and hence recover efficiency. Finally, the model rests at a lightweight DNN that is both ready for inference and suitable for future grow-and-prune updates. The proposed framework improves accuracy, shrinks network size, and significantly reduces the additional training cost for incoming data compared to conventional approaches, such as training from scratch and network fine-tuning. For the LeNet-300-100 (LeNet-5) neural network architectures derived for the MNIST dataset, the framework reduces training cost by up to 64% (67%), 63% (63%), and 69% (73%) compared to training from scratch, network fine-tuning, and grow-and-prune from scratch, respectively. For the ResNet-18 architecture derived for the ImageNet dataset (DeepSpeech2 for the AN4 dataset), the corresponding training cost reductions against training from scratch, network fine-tuning, and grow-and-prune from scratch are 64% (67%), 60% (62%), and 72% (71%), respectively. Our derived models contain fewer network parameters but achieve higher accuracy relative to conventional baselines.
- Published
- 2022
- Full Text
- View/download PDF
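The grow-and-prune cycle described in the abstract above can be sketched with binary connection masks: growing activates the inactive connections with the largest gradient magnitudes, and pruning removes the active connections with the smallest weight magnitudes. This is a minimal NumPy illustration of the idea, not the authors' implementation; the layer size and the grow/prune counts are arbitrary.

```python
import numpy as np

def grow(mask, grad, k):
    """Activate the k inactive connections with the largest gradient magnitude."""
    scores = np.abs(grad) * (1 - mask)          # only inactive entries compete
    idx = np.argsort(scores, axis=None)[-k:]    # flat indices of the top-k scores
    new_mask = mask.copy()
    new_mask.flat[idx] = 1
    return new_mask

def prune(mask, weights, k):
    """Deactivate the k active connections with the smallest weight magnitude."""
    scores = np.where(mask == 1, np.abs(weights), np.inf)
    idx = np.argsort(scores, axis=None)[:k]
    new_mask = mask.copy()
    new_mask.flat[idx] = 0
    return new_mask

# toy 4x4 layer: start with roughly half the connections active
rng = np.random.default_rng(0)
W = rng.normal(size=(4, 4))                     # weights
M = (rng.random((4, 4)) < 0.5).astype(float)    # connection mask
G = rng.normal(size=(4, 4))                     # gradient of the loss on new data

M = grow(M, G, k=3)      # capacity up: add connections where gradients are large
M = prune(M, W, k=3)     # compactness back: drop the smallest-magnitude weights
```

A full training loop would alternate these two steps with weight updates until the network settles at a compact architecture.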
3. Fast and Progressive Misbehavior Detection in Internet of Vehicles Based on Broad Learning and Incremental Learning Systems
- Author
-
Yushan Zhu, Haixia Gu, Han Shuangshuang, Xiao Wang, Fei-Yue Wang, and Linyao Yang
- Subjects
Computer Networks and Communications ,Computer science ,business.industry ,Deep learning ,Key features ,Machine learning ,computer.software_genre ,Computer Science Applications ,Nonlinear system ,Hardware and Architecture ,Signal Processing ,Incremental learning ,Scalability ,In vehicle ,The Internet ,Artificial intelligence ,Raw data ,business ,computer ,Information Systems - Abstract
In recent years, deep learning has been widely used in vehicle misbehavior detection and has attracted great attention due to its powerful nonlinear mapping ability. However, because of the large number of network parameters, the training processes of these methods are time-consuming. Besides, the existing detection methods lack scalability, so they are not suitable for the Internet of Vehicles (IoV), where new data is constantly generated. In this paper, the concept of the Broad Learning System (BLS) is introduced into vehicle misbehavior detection. In order to make better use of vehicle information, key features are first extracted from the collected raw data. Then, a BLS is established, which is able to calculate the connection weights of the network efficiently and effectively by ridge regression approximation. Finally, the system can be updated and refined by an incremental learning algorithm based on the newly generated data in IoV. Experimental results show that the proposed method performs much better than deep learning or traditional classifiers, and can update and optimize the old model quickly and progressively while improving the system’s misbehavior detection accuracy.
- Published
- 2022
- Full Text
- View/download PDF
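The ridge-regression step that makes BLS training fast admits a closed form: stacking the mapped feature nodes and enhancement nodes into a matrix A, the output weights solve W = (AᵀA + λI)⁻¹AᵀY. A schematic NumPy sketch follows; the random mapping sizes and the raw-feature step are placeholders, not the paper's configuration.

```python
import numpy as np

def ridge_output_weights(A, Y, lam=1e-2):
    """Closed-form ridge solution for the output weights of a broad network:
    W = (A^T A + lam*I)^(-1) A^T Y, where A stacks feature and enhancement nodes."""
    d = A.shape[1]
    return np.linalg.solve(A.T @ A + lam * np.eye(d), A.T @ Y)

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 8))                 # key features extracted from raw data
Z = np.tanh(X @ rng.normal(size=(8, 16)))     # mapped feature nodes (random weights)
H = np.tanh(Z @ rng.normal(size=(16, 12)))    # enhancement nodes
A = np.hstack([Z, H])                         # broad expansion layer
Y = rng.normal(size=(200, 3))                 # class targets (one-hot in practice)
W = ridge_output_weights(A, Y)
pred = A @ W                                  # misbehavior scores per class
```

Because only this linear solve is trained, adding nodes or new samples reduces to updating the solve rather than backpropagating through a deep stack.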
4. HarMI: Human Activity Recognition Via Multi-Modality Incremental Learning
- Author
-
Yujun Li, Fuzhen Zhuang, Jingjing Gu, Dongxiao Yu, Zhaochun Ren, Yang Yang, Xiao Zhang, and Hongzheng Yu
- Subjects
business.industry ,Computer science ,Machine learning ,computer.software_genre ,Multi modality ,Computer Science Applications ,Machine Learning ,Activity recognition ,Wearable Electronic Devices ,Text mining ,Health Information Management ,Incremental learning ,Humans ,Human Activities ,Neural Networks, Computer ,Smartphone ,Artificial intelligence ,Electrical and Electronic Engineering ,business ,computer ,Biotechnology - Abstract
Nowadays, with the development of various kinds of sensors in smartphones and wearable devices, human activity recognition (HAR) has been widely researched and has numerous applications in healthcare, smart cities, etc. Many techniques based on hand-crafted feature engineering or deep neural networks have been proposed for sensor-based HAR. However, these existing methods usually recognize activities offline, which means the whole dataset must be collected before training, occupying large-capacity storage space. Moreover, once offline model training is finished, the trained model cannot recognize new activities without retraining from scratch, which incurs a high cost in time and space. In this paper, we propose a multi-modality incremental learning model, called HarMI, with continuous learning ability. The proposed HarMI model can start training quickly with little storage space and easily learn new activities without storing previous training data. In detail, we first adopt an attention mechanism to align heterogeneous sensor data with different frequencies. In addition, to overcome catastrophic forgetting in incremental learning, HarMI utilizes elastic weight consolidation and canonical correlation analysis from a multi-modality perspective. Extensive experiments based on two public datasets demonstrate that HarMI achieves superior performance compared with several state-of-the-art methods.
- Published
- 2022
- Full Text
- View/download PDF
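HarMI counters catastrophic forgetting with elastic weight consolidation (EWC), which penalizes moving parameters that were important for previously learned activities. Below is a generic one-layer sketch of the EWC penalty and its gradient; the Fisher values, learning rate, and λ are toy numbers, and HarMI's multi-modality and canonical-correlation components are not shown.

```python
import numpy as np

def ewc_penalty(theta, theta_old, fisher, lam=10.0):
    """EWC regularizer: parameters important for old activities (large Fisher
    value) are anchored to their old values theta_old."""
    return 0.5 * lam * np.sum(fisher * (theta - theta_old) ** 2)

def ewc_grad(theta, theta_old, fisher, lam=10.0):
    return lam * fisher * (theta - theta_old)

# toy setting: 3 parameters; the first matters a lot for old activities
theta_old = np.array([1.0, -0.5, 0.2])
fisher = np.array([5.0, 0.1, 0.1])
grad_new_task = np.array([1.0, 1.0, 1.0])  # new-activity gradient (held constant
                                           # for illustration)
theta = theta_old.copy()
for _ in range(200):                       # SGD on new loss + EWC anchor
    theta -= 0.01 * (grad_new_task + ewc_grad(theta, theta_old, fisher))
# the important parameter theta[0] moves far less than the unimportant ones
```

The equilibrium is θᵢ = θᵢ* − gᵢ/(λFᵢ), so a large Fisher value pins the parameter close to its old value while unimportant parameters remain free to learn the new activity.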
5. Polarity Classification of Social Media Feeds Using Incremental Learning — A Deep Learning Approach
- Author
-
Sathya Madhusudhanan and Suresh Jaganathan
- Subjects
business.industry ,Polarity (physics) ,Computer science ,Applied Mathematics ,Deep learning ,computer.software_genre ,Computer Graphics and Computer-Aided Design ,Signal Processing ,Incremental learning ,Social media ,Artificial intelligence ,Electrical and Electronic Engineering ,business ,computer ,Natural language processing - Published
- 2022
- Full Text
- View/download PDF
6. Online Semisupervised Broad Learning System for Industrial Fault Diagnosis
- Author
-
Xiaokun Pu and Chunguang Li
- Subjects
Training set ,Manifold regularization ,Generalization ,business.industry ,Computer science ,Deep learning ,Supervised learning ,Process (computing) ,Construct (python library) ,Machine learning ,computer.software_genre ,Fault (power engineering) ,Computer Science Applications ,Data modeling ,Control and Systems Engineering ,Incremental learning ,Artificial intelligence ,Electrical and Electronic Engineering ,business ,computer ,Information Systems - Abstract
Recently, the broad learning system (BLS) has been introduced to solve industrial fault diagnosis problems and has achieved impressive performance. As a flat network, BLS enjoys a simple linear structure, which enables it to train and update the model efficiently in an incremental manner, and it potentially has better generalization capacity than deep learning methods when training data are limited. The basic BLS is a supervised learning method that requires all the training data to be labeled. However, in many practical industrial scenarios, data labels are difficult to obtain. The existing semisupervised variant uses a manifold regularization framework to capture the information in unlabeled data; however, this sacrifices the incremental learning capacity of BLS. Considering that in many practical applications training data are generated sequentially, in this article an online semisupervised broad learning system (OSSBLS) is proposed for fault diagnosis in such cases. The proposed method not only can efficiently construct and incrementally update the model, but can also take advantage of unlabeled data to improve its diagnostic performance. Experimental results on the Tennessee Eastman process and a real-world air compressor working process demonstrate the superiority of OSSBLS in terms of both diagnostic performance and time consumption.
- Published
- 2021
- Full Text
- View/download PDF
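The incremental-update property the abstract relies on can be illustrated with sufficient statistics: keeping G = AᵀA + λI and C = AᵀY lets each new batch be folded in without revisiting old data, and the refreshed weights match batch retraining exactly. This is a generic stand-in for the BLS-style update, not the OSSBLS algorithm itself (which additionally exploits unlabeled data).

```python
import numpy as np

class OnlineRidge:
    """Incrementally updatable ridge regression via sufficient statistics.
    A minimal sketch of the incremental update used by BLS-style models:
    accumulate G = A^T A + lam*I and C = A^T Y, then re-solve for the weights."""

    def __init__(self, dim, n_out, lam=1e-2):
        self.G = lam * np.eye(dim)
        self.C = np.zeros((dim, n_out))

    def update(self, A_new, Y_new):
        """Fold a new batch of (expanded) features and labels into the model."""
        self.G += A_new.T @ A_new
        self.C += A_new.T @ Y_new
        return np.linalg.solve(self.G, self.C)   # refreshed output weights
```

Because the statistics are additive, two sequential updates yield the same weights as one batch fit on the concatenated data, which is exactly the equivalence that makes incremental training safe.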
7. Fast and robust supervised machine learning approach for classification and prediction of Parkinson’s disease onset
- Author
-
Kadambari K and Lavanya Madhuri Bollipo
- Subjects
Parkinson's disease ,business.industry ,Computer science ,Biomedical Engineering ,Computational Mechanics ,Disease ,medicine.disease ,Machine learning ,computer.software_genre ,Computer Science Applications ,Support vector machine ,Frank–Wolfe algorithm ,Motor system ,Incremental learning ,medicine ,Radiology, Nuclear Medicine and imaging ,Artificial intelligence ,business ,computer - Abstract
Parkinson’s disease (PD) is an incurable long-term neurodegenerative disorder that mainly influences the motor system and eventually results in significant morbidity. The use of computational tools ...
- Published
- 2021
- Full Text
- View/download PDF
8. Exploiting abstractions for grammar‐based learning of complex multi‐agent behaviours
- Author
-
Michael Barlow, Dilini Samarasinghe, Erandi Lakshika, and Kathryn Kasmarik
- Subjects
Grammar ,Programming language ,Computer science ,media_common.quotation_subject ,Multi-agent system ,computer.software_genre ,Theoretical Computer Science ,Human-Computer Interaction ,Artificial Intelligence ,Parallel learning ,Grammatical evolution ,Incremental learning ,computer ,Software ,media_common - Published
- 2021
- Full Text
- View/download PDF
9. Multimodal continual learning with sonographer eye-tracking in fetal ultrasound
- Author
-
Yifan Cai, Aris T. Papageorghiou, Harshita Sharma, Lior Drukker, Arijit Patra, Pierre Chatelain, and J. Alison Noble
- Subjects
Forgetting ,business.industry ,Computer science ,Continual learning ,Machine learning ,computer.software_genre ,Article ,Image (mathematics) ,Reduction (complexity) ,Incremental learning ,Sonographer ,Eye tracking ,Artificial intelligence ,Extended time ,business ,computer - Abstract
Deep networks have been shown to achieve impressive accuracy for some medical image analysis tasks where large datasets and annotations are available. However, learning over new sets of classes that arrive over extended time is a different and difficult challenge, because performance on old classes tends to degrade while the model adapts to new ones. Controlling such 'forgetting' is vital for deployed algorithms to evolve incrementally as new data arrive. Usually, incremental learning approaches rely on expert knowledge in the form of manual annotations or active feedback. In this paper, we explore the role that other forms of expert knowledge might play in making deep networks in medical image analysis immune to forgetting over extended time. We introduce a novel framework for mitigating this forgetting effect in deep networks, considering the case of combining ultrasound video with the point of gaze tracked for expert sonographers during model training. This is used along with a novel weighted distillation strategy to reduce the propagation of effects due to class imbalance.
- Published
- 2022
- Full Text
- View/download PDF
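The "weighted distillation" idea can be sketched as a per-class weighted cross-entropy between the softened teacher and student distributions; the exact weighting scheme in the paper is not reproduced here, so the function below is purely illustrative of the mechanism (up-weighting under-represented old classes in the distillation term).

```python
import numpy as np

def softmax(z, T=1.0):
    """Temperature-softened softmax over the last axis."""
    z = z / T
    z = z - z.max(axis=1, keepdims=True)       # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def weighted_distillation_loss(student_logits, teacher_logits, class_w, T=2.0):
    """Per-class weighted distillation: cross-entropy between softened teacher
    and student distributions, with each class's term scaled by class_w.
    (An illustration of the mechanism, not the paper's exact loss.)"""
    p_t = softmax(teacher_logits, T)
    p_s = softmax(student_logits, T)
    per_class = -p_t * np.log(p_s + 1e-12)     # (N, C) cross-entropy terms
    return float((per_class * class_w).sum(axis=1).mean()) * T * T
```

With uniform class weights this reduces to ordinary distillation; raising the weight of a rare old class makes the student's disagreement on that class more costly.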
10. SpaceNet: Make Free Space for Continual Learning
- Author
-
Ghada Sokar, Mykola Pechenizkiy, Decebal Constantin Mocanu, Data Mining, Process Science, EAISI Health, EAISI Foundational, Digital Society Institute, and Datamanagement & Biometrics
- Subjects
FOS: Computer and information sciences ,Computer Science - Machine Learning ,0209 industrial biotechnology ,Computer science ,Computer Vision and Pattern Recognition (cs.CV) ,Cognitive Neuroscience ,cs.LG ,Lifelong learning ,Computer Science - Computer Vision and Pattern Recognition ,Inference ,Machine Learning (stat.ML) ,02 engineering and technology ,Machine learning ,computer.software_genre ,Regularization (mathematics) ,Machine Learning (cs.LG) ,020901 industrial engineering & automation ,Statistics - Machine Learning ,Artificial Intelligence ,Robustness (computer science) ,Deep neural networks ,0202 electrical engineering, electronic engineering, information engineering ,cs.CV ,Sparse training ,Forgetting ,Artificial neural network ,business.industry ,stat.ML ,Computer Science Applications ,Class incremental learning ,Incremental learning ,020201 artificial intelligence & image processing ,Continual learning ,Artificial intelligence ,business ,computer ,MNIST database - Abstract
The continual learning (CL) paradigm aims to enable neural networks to learn tasks continually in a sequential fashion. The fundamental challenge in this learning paradigm is catastrophic forgetting of previously learned tasks when the model is optimized for a new task, especially when their data is not accessible. Current architectural-based methods aim at alleviating the catastrophic forgetting problem, but at the expense of expanding the capacity of the model. Regularization-based methods maintain a fixed model capacity; however, previous studies showed the huge performance degradation of these methods when the task identity is not available during inference (e.g., the class incremental learning scenario). In this work, we propose a novel architectural-based method, referred to as SpaceNet, for the class incremental learning scenario, where we utilize the available fixed capacity of the model intelligently. SpaceNet trains sparse deep neural networks from scratch in an adaptive way that compresses the sparse connections of each task into a compact number of neurons. The adaptive training of the sparse connections results in sparse representations that reduce the interference between the tasks. Experimental results show the robustness of our proposed method against catastrophic forgetting of old tasks and the efficiency of SpaceNet in utilizing the available capacity of the model, leaving space for more tasks to be learned. In particular, when SpaceNet is tested on the well-known benchmarks for CL: split MNIST, split Fashion-MNIST, and CIFAR-10/100, it outperforms regularization-based methods by a large margin. Moreover, it achieves better performance than architectural-based methods without model expansion and comparable results with rehearsal-based methods, while offering a huge memory reduction. (Published in Neurocomputing.)
- Published
- 2021
- Full Text
- View/download PDF
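The fixed-capacity idea can be sketched as follows: each task claims a disjoint sparse subset of a shared weight tensor, so the effective weights of old tasks are never overwritten. This toy version replaces SpaceNet's adaptive sparse training with a random claim-and-initialize step, purely to show the mask bookkeeping that prevents forgetting by construction.

```python
import numpy as np

rng = np.random.default_rng(0)
W = np.zeros((8, 8))                  # fixed model capacity shared by all tasks
assigned = np.zeros((8, 8), bool)     # connections already claimed by old tasks
task_masks = {}

def train_task(task_id, k, rng):
    """Claim k free connections for this task and 'train' only those weights
    (random init here stands in for SpaceNet's adaptive sparse training)."""
    free = np.flatnonzero(~assigned)
    picked = rng.choice(free, size=k, replace=False)
    mask = np.zeros(W.shape, bool)
    mask.flat[picked] = True
    assigned.flat[picked] = True       # old tasks' weights are never touched again
    W.flat[picked] = rng.normal(size=k)
    task_masks[task_id] = mask

train_task("task-0", k=10, rng=rng)
snapshot = (W * task_masks["task-0"]).copy()
train_task("task-1", k=10, rng=rng)
# task-0's effective weights are unchanged: no catastrophic forgetting of old tasks
```

Inference for a given task multiplies W by that task's mask; the remaining unassigned capacity is the "free space" left for future tasks.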
11. Online Tensor-Based Learning Model for Structural Damage Detection
- Author
-
Ali Anaissi, Seid Miad Zandavi, and Basem Suleiman
- Subjects
Damage detection ,General Computer Science ,business.industry ,Computer science ,Online learning ,020206 networking & telecommunications ,02 engineering and technology ,Machine learning ,computer.software_genre ,Online analysis ,Tensor (intrinsic definition) ,Incremental learning ,0202 electrical engineering, electronic engineering, information engineering ,020201 artificial intelligence & image processing ,Anomaly detection ,Artificial intelligence ,Structural health monitoring ,business ,computer - Abstract
The online analysis of multi-way data stored in a tensor has become an essential tool for capturing the underlying structures and extracting the sensitive features that can be used to learn a predictive model. However, data distributions often evolve with time, and a current predictive model may not be sufficiently representative in the future. Therefore, incrementally updating the tensor-based features and model coefficients is required in such situations. A new efficient tensor-based feature extraction method, named Nesterov Stochastic Gradient Descent (NeSGD), is proposed for online CP decomposition. According to the new features obtained from the resultant matrices of NeSGD, a new criterion triggers the update process of the online predictive model. Experimental evaluation in the field of structural health monitoring using laboratory-based and real-life structural datasets shows that our methods provide more accurate results compared with existing online tensor analysis and model learning. The results showed that the proposed methods significantly improved the classification error rates, were able to assimilate the changes in the positive data distribution over time, and maintained a high predictive accuracy in all case studies.
- Published
- 2021
- Full Text
- View/download PDF
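The optimizer named in the abstract builds on Nesterov momentum, where the gradient is evaluated at a look-ahead point rather than at the current iterate. A minimal sketch on a toy quadratic standing in for the CP-decomposition loss (step size and momentum are arbitrary choices):

```python
import numpy as np

def nesterov_sgd(grad_fn, theta, lr=0.1, momentum=0.9, steps=100):
    """Nesterov-momentum gradient descent: the gradient is taken at the
    look-ahead point theta + momentum*v, the accelerated update NeSGD builds on."""
    v = np.zeros_like(theta)
    for _ in range(steps):
        g = grad_fn(theta + momentum * v)   # gradient at the look-ahead position
        v = momentum * v - lr * g
        theta = theta + v
    return theta

# toy objective f(x) = ||x - 3||^2 in place of the CP factor-matrix loss
target = np.array([3.0, 3.0])
theta = nesterov_sgd(lambda x: 2 * (x - target), np.zeros(2))
```

In an online CP setting, `grad_fn` would be the stochastic gradient of the decomposition loss with respect to one factor matrix, recomputed as new tensor slices arrive.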
12. Deep Neural Networks Techniques using for Learning Automata Based Incremental Learning Method
- Author
-
C. Swetha Reddy et al.
- Subjects
Learning automata ,business.industry ,Computer science ,General Mathematics ,Machine learning ,computer.software_genre ,Training (civil) ,Education ,Visual recognition ,Computational Mathematics ,Visual language ,Computational Theory and Mathematics ,Incremental learning ,Deep neural networks ,Artificial intelligence ,business ,Set (psychology) ,computer ,MNIST database - Abstract
Deep learning methods have been widely adopted in large-scale machine learning applications such as visual recognition and natural language processing. Much of their recent success stems from supervised training, which requires a task-specific dataset to be collected before training begins. In reality, however, the tasks of interest accumulate gradually over time, since collecting and labeling training data manually is difficult. Learning from a stream of data and examples of classes that arrive step by step is called incremental learning. In this paper, we propose a machine learning method for the incremental training of deep neural networks. The basic idea is to learn a deep network whose connections can be "activated" or "turned off" at different stages. The proposed approach reduces the disturbance of previously learned representations when training on new examples, which increases the effectiveness of the incremental training phase. Experiments on MNIST and CIFAR-100 show that our approach can be applied over many incremental phases in deep neural models and achieves better results than training from scratch.
- Published
- 2021
- Full Text
- View/download PDF
13. On the Challenges of Open World Recognition Under Shifting Visual Domains
- Author
-
Barbara Caputo, Massimiliano Mancini, Fabio Cermelli, and Dario Fontanel
- Subjects
FOS: Computer and information sciences ,Control and Optimization ,Computer science ,Generalization ,Computer Vision and Pattern Recognition (cs.CV) ,media_common.quotation_subject ,Computer Science - Computer Vision and Pattern Recognition ,Biomedical Engineering ,visual learning ,02 engineering and technology ,Machine learning ,computer.software_genre ,Computer Science - Robotics ,Artificial Intelligence ,020204 information systems ,Deep Learning ,Computer Vision ,Incremental Learning ,Open World Recognition ,Domain Shift ,0202 electrical engineering, electronic engineering, information engineering ,Set (psychology) ,Function (engineering) ,media_common ,Point (typography) ,business.industry ,Mechanical Engineering ,Cognitive neuroscience of visual object recognition ,Deep learning for visual perception ,Computer Science Applications ,Variety (cybernetics) ,Human-Computer Interaction ,recognition ,Control and Systems Engineering ,Benchmark (computing) ,Robot ,Computer Vision and Pattern Recognition ,Artificial intelligence ,business ,Robotics (cs.RO) ,computer - Abstract
Robotic visual systems operating in the wild must act in unconstrained scenarios, under different environmental conditions, while facing a variety of semantic concepts, including unknown ones. To this end, recent works tried to empower visual object recognition methods with the capability to i) detect unseen concepts and ii) extend their knowledge over time, as images of new semantic classes arrive. This setting, called Open World Recognition (OWR), aims to produce systems capable of breaking the semantic limits present in the initial training set. However, this training set imposes on the system not only its own semantic limits, but also environmental ones, due to its bias toward certain acquisition conditions that do not necessarily reflect the high variability of the real world. This discrepancy between training and test distributions is called domain shift. This work investigates whether OWR algorithms are effective under domain shift, presenting the first benchmark setup for fairly assessing the performance of OWR algorithms, with and without domain shift. We then use this benchmark to conduct analyses in various scenarios, showing how existing OWR algorithms indeed suffer a severe performance degradation when train and test distributions differ. Our analysis shows that this degradation is only slightly mitigated by coupling OWR with domain generalization techniques, indicating that the mere plug-and-play of existing algorithms is not enough to recognize new and unknown categories in unseen domains. Our results clearly point toward open issues and future research directions that need to be investigated for building robot visual systems able to function reliably under these challenging yet very real conditions. Code available at https://github.com/DarioFontanel/OWR-VisualDomains. (RAL/ICRA 2021.)
- Published
- 2021
- Full Text
- View/download PDF
14. An Incremental Learning Based Convolutional Neural Network Model for Large-Scale and Short-Term Traffic Flow
- Author
-
Yanli Shao, Bin Chen, Feng Yu, and Jinglong Fang
- Subjects
Information Systems and Management ,Scale (ratio) ,business.industry ,Computer science ,Traffic flow ,Machine learning ,computer.software_genre ,Convolutional neural network ,Computer Science Applications ,Term (time) ,Artificial Intelligence ,Incremental learning ,Artificial intelligence ,business ,computer - Abstract
Traffic flow prediction is very important for smooth road conditions in cities and convenient travel for residents. With the explosive growth of traffic flow data, traditional machine learning algorithms cannot fit large-scale training data effectively, and deep learning algorithms do not work well because of their huge training and update costs; moreover, prediction accuracy may need to be further improved when an emergency affecting traffic occurs. In this study, an incremental learning based convolutional neural network model, TF-net, is proposed to achieve efficient and accurate prediction of large-scale, short-term traffic flow. The key idea is to introduce uncertainty features into the model without increasing the training cost, so as to improve prediction accuracy. Meanwhile, based on the idea of combining incremental learning with active learning, a certain percentage of typical samples in historical traffic flow data are sampled to fine-tune the prediction model, so as to further improve the prediction accuracy for special situations and meet the real-time requirement. The experimental results show that the proposed traffic flow prediction model performs better than existing methods.
- Published
- 2021
- Full Text
- View/download PDF
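The step of sampling "a certain percentage of typical samples" to fine-tune the model can be illustrated with one concrete selection rule: picking the historical samples the current model is least sure about (highest predictive entropy), a common active-learning heuristic. The paper's actual sampling criterion may differ; this is an assumed stand-in.

```python
import numpy as np

def select_finetune_samples(probs, fraction=0.1):
    """Pick the given fraction of historical samples with the highest
    predictive entropy, as one possible reading of 'typical sample' selection
    for fine-tuning (an illustrative heuristic, not the paper's rule)."""
    entropy = -(probs * np.log(probs + 1e-12)).sum(axis=1)
    k = max(1, int(len(probs) * fraction))
    return np.argsort(entropy)[-k:]             # indices of the k most uncertain

# toy predicted class probabilities over 100 historical traffic-flow windows
rng = np.random.default_rng(0)
p = rng.dirichlet(alpha=[0.5, 0.5, 0.5], size=100)
chosen = select_finetune_samples(p, fraction=0.05)
```

The selected indices would then be fed back into a short fine-tuning pass, keeping the incremental update cheap while targeting the model's weak spots.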
15. Concept-Cognitive Learning Model for Incremental Concept Learning
- Author
-
Yong Shi, Yunlong Mi, Wenqi Liu, and Jinhai Li
- Subjects
Context model ,Computer science ,business.industry ,02 engineering and technology ,Machine learning ,computer.software_genre ,Computer Science Applications ,Data modeling ,Human-Computer Interaction ,Control and Systems Engineering ,020204 information systems ,Concept learning ,Incremental learning ,Still face ,Cognitive learning ,0202 electrical engineering, electronic engineering, information engineering ,Task analysis ,020201 artificial intelligence & image processing ,Artificial intelligence ,Electrical and Electronic Engineering ,business ,computer ,Classifier (UML) ,Software - Abstract
Concept-cognitive learning (CCL) is an emerging field concerning incremental concept learning and dynamic knowledge processing in dynamic environments. Although CCL has been widely researched in theory, the existing studies of CCL share one problem: the concepts obtained by CCL systems do not have generalization ability. In the meantime, existing incremental algorithms still face two challenges: 1) classifiers have to adapt gradually, and 2) previously acquired knowledge should be utilized efficiently. To address these problems, building on the advantage that CCL can naturally integrate new data into itself to enhance the flexibility of concept learning, we first propose a new CCL model (CCLM) that extends the classical methods of CCL and is not only a new classifier but also well suited to incremental learning. Unlike the existing CCL systems, the theory of CCLM is mainly based on a formal decision context rather than a formal context. In learning concepts from dynamic environments, we show that CCLM can naturally incorporate new data into itself, with a sufficient theoretical guarantee for incremental learning. For classification and knowledge storage, our results on various datasets demonstrate that CCLM can simultaneously: 1) achieve state-of-the-art performance on static and dynamic classification tasks and 2) directly preserve previously acquired knowledge (or concepts) under dynamic environments.
- Published
- 2021
- Full Text
- View/download PDF
16. Incremental learning framework for real‐world fraud detection environment
- Author
-
Farzana Anowar and Samira Sadaoui
- Subjects
Computational Mathematics ,Artificial Intelligence ,Computer science ,business.industry ,Incremental learning ,Artificial intelligence ,Machine learning ,computer.software_genre ,business ,computer ,Imbalanced data - Published
- 2021
- Full Text
- View/download PDF
17. Context-aware incremental learning-based method for personalized human activity recognition
- Author
-
Pekka Siirtola and Juha Röning
- Subjects
human activity recognition ,General Computer Science ,Computer science ,Decision tree ,Word error rate ,Computational intelligence ,Context (language use) ,02 engineering and technology ,Machine learning ,computer.software_genre ,Personalization ,Activity recognition ,020204 information systems ,0202 electrical engineering, electronic engineering, information engineering ,incremental learning ,adaptive models ,Ensemble forecasting ,business.industry ,context-awareness ,Quadratic classifier ,Linear discriminant analysis ,Weighting ,Incremental learning ,020201 artificial intelligence & image processing ,Artificial intelligence ,business ,computer - Abstract
This study introduces an ensemble-based personalized human activity recognition method relying on incremental learning, a continuous-learning approach that can not only learn from streaming data but also adapt to different contexts and changes in context. This adaptation is based on a novel weighting approach that gives a bigger weight to those base models of the ensemble that are the most suitable for the current context. In this article, contexts are different body positions for inertial sensors. The experiments are performed in two scenarios: (S1) adapting the model to a known context, and (S2) adapting the model to a previously unknown context. In both scenarios, the models also had to adapt to the data of a previously unknown person, as the initial user-independent dataset did not include any data from the studied user. In the experiments, the proposed ensemble-based approach is compared to a non-weighted personalization method relying on an ensemble-based classifier and to a static user-independent model. Both ensemble models are tested using three different base classifiers (linear discriminant analysis, quadratic discriminant analysis, and classification and regression trees). The results show that the proposed ensemble method performs much better than the non-weighted ensemble model for personalization in both scenarios, no matter which base classifier is used. Moreover, the proposed method outperforms user-independent models. In scenario 1, the error rate of balanced accuracy using the user-independent model was 13.3%, using the non-weighted personalization method 13.8%, and using the proposed method 6.4%. The difference is even bigger in scenario 2, where the error rate using the user-independent model is 36.6%, using the non-weighted personalization method 36.9%, and using the proposed method 14.1%. In addition, F1 scores also show that the proposed method performs much better in both scenarios than the rival methods.
Moreover, as a side result, it was noted that the presented method can also be used to recognize body position of the sensor.
- Published
- 2021
- Full Text
- View/download PDF
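The context-dependent weighting at the heart of the method can be sketched as a weighted average of base-model class probabilities, where models matching the current sensor position dominate the vote. The weights below are hand-picked for illustration; the paper derives them from each base model's suitability to the current context.

```python
import numpy as np

def weighted_ensemble_predict(model_probs, context_weights):
    """Combine base-model class probabilities with context-dependent weights:
    models suited to the current body position get a larger say in the vote."""
    w = np.asarray(context_weights, float)
    w = w / w.sum()                                   # normalize the weights
    combined = np.tensordot(w, model_probs, axes=1)   # (C,) weighted average
    return int(np.argmax(combined))

# toy: 3 base models, 2 activity classes; model 0 was trained on the
# current body position, so it receives the dominant weight
model_probs = np.array([
    [0.9, 0.1],    # model 0: confident the activity is class 0
    [0.2, 0.8],    # models 1-2 were trained on other positions
    [0.3, 0.7],
])
pred = weighted_ensemble_predict(model_probs, [0.7, 0.15, 0.15])
```

With uniform weights the same ensemble would flip to class 1, which is exactly the gap between the weighted and non-weighted personalization methods the abstract measures.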
18. T-DFNN: An Incremental Learning Algorithm for Intrusion Detection Systems
- Author
-
Masayoshi Aritsugi and Mahendra Data
- Subjects
Structure (mathematical logic) ,incremental learning ,Forgetting ,General Computer Science ,Computer science ,business.industry ,Process (engineering) ,catastrophic forgetting ,General Engineering ,deep learning ,Intrusion detection system ,Machine learning ,computer.software_genre ,TK1-9971 ,Tree (data structure) ,Feedforward neural network ,General Materials Science ,Incremental learning algorithm ,Artificial intelligence ,Electrical engineering. Electronics. Nuclear engineering ,Macro ,business ,classification algorithm ,computer ,Network intrusion detection - Abstract
Machine learning has recently become a popular approach for building reliable intrusion detection systems (IDSs). However, most models are static and trained using datasets containing all targeted intrusions. If new intrusions emerge, these trained models must be retrained using old and new datasets to classify all intrusions accurately. In real-world situations, new threats continuously appear; therefore, machine learning algorithms used for IDSs should be able to learn incrementally as these new intrusions emerge. To solve this issue, we propose T-DFNN, an algorithm capable of learning new intrusions incrementally as they emerge. A T-DFNN model is composed of multiple deep feedforward neural network (DFNN) models connected in a tree-like structure. We examined our proposed algorithm using CICIDS2017, an open and widely used network intrusion dataset covering benign traffic and the most common network intrusions. The experimental results showed that the T-DFNN algorithm can incrementally learn new intrusions and reduce the catastrophic forgetting effect. The macro average of the F1-score of the T-DFNN model was over 0.85 for every retraining process. In addition, our proposed T-DFNN model has advantages in several respects compared to other models. Compared to DFNN and Hoeffding tree models trained with a dataset containing only the latest targeted intrusions, our proposed T-DFNN model has higher F1-scores. Moreover, it has significantly shorter training times than a DFNN model trained using a dataset containing all targeted intrusions. Even though several factors can affect the duration of the training process, the T-DFNN algorithm shows promising results in solving the problem of ever-evolving network intrusion variants.
- Published
- 2021
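The abstract above describes a tree of DFNN models that absorbs new intrusion classes without full retraining. As a hedged illustration (not the paper's exact construction), one plausible reading is a root classifier whose predictions can be refined by child models added later, e.g. a child that splits a newly emerged intrusion off an old label; all names here are hypothetical:

```python
# Hedged sketch: adding a new class by attaching a refining child model to a
# root classifier, in the spirit of T-DFNN's tree of models. The routing rule
# and the toy models below are our assumptions, not the paper's design.
class IncrementalTree:
    def __init__(self, root_model):
        self.root = root_model      # classifier over the initially known classes
        self.children = {}          # root label -> refining child model

    def add_child(self, parent_label, child_model):
        # child_model re-classifies samples the root assigns to parent_label,
        # e.g. separating a newly emerged intrusion from an old class
        self.children[parent_label] = child_model

    def predict(self, x):
        label = self.root(x)
        if label in self.children:
            return self.children[label](x)   # defer to the refining child
        return label

# toy usage: a new "dos" vs "portscan" split is learned under "attack"
root = lambda x: "benign" if x < 5.0 else "attack"
tree = IncrementalTree(root)
tree.add_child("attack", lambda x: "dos" if x < 10.0 else "portscan")
```

Only the subtree touching the new class needs training, which matches the abstract's claim of shorter retraining than a monolithic DFNN.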
19. Baseline Model Training in Sensor-Based Human Activity Recognition: An Incremental Learning Approach
- Author
-
Linlin Chen, Jianyu Xiao, Haipeng Chen, and Xuemin Hong
- Subjects
General Computer Science ,Computer science ,Feature extraction ,Wearable computer ,02 engineering and technology ,Machine learning ,computer.software_genre ,baseline model ,Data modeling ,Personalization ,Activity recognition ,Classifier (linguistics) ,0202 electrical engineering, electronic engineering, information engineering ,General Materials Science ,Infomax ,DIM ,incremental learning ,business.industry ,General Engineering ,020206 networking & telecommunications ,TK1-9971 ,broad learning system ,Task analysis ,020201 artificial intelligence & image processing ,Artificial intelligence ,Human activity recognition ,Electrical engineering. Electronics. Nuclear engineering ,business ,computer - Abstract
Human activity recognition (HAR) based on wearable sensors has attracted significant research attention in recent years due to its advantages in availability, accuracy, and privacy-friendliness. A HAR baseline model is essentially a general-purpose classifier trained to recognize the activity patterns of most user types; it provides the input for subsequent model personalization. Training a good baseline model is of fundamental importance because it strongly affects the ultimate HAR accuracy. In practice, baseline model training in HAR is a non-trivial problem that faces two challenges: insufficient training data and biased training data. This paper proposes a novel baseline model training scheme that tackles these two challenges using Deep InfoMax (DIM)-based unsupervised feature extraction and Broad Learning System (BLS)-based incremental learning, respectively. Experimental results demonstrate that the proposed scheme outperforms conventional methods in overall accuracy, computational efficiency, and the ability to adapt to dynamic scenarios with changing data characteristics.
- Published
- 2021
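The BLS-based incremental learning mentioned above relies on the fact that broad-learning output weights are a closed-form ridge solution, so new sample batches can update the weights without revisiting old data. A minimal sketch of that sample-increment idea, assuming a generic feature matrix `A` (the class and variable names are illustrative, not the paper's):

```python
import numpy as np

# Hedged sketch of BLS-style sample-incremental output-weight updates:
# accumulate A^T A and A^T Y so each new batch updates W in closed form
# without storing or revisiting earlier batches.
class IncrementalRidgeHead:
    def __init__(self, n_feat, n_out, lam=1e-2):
        self.G = lam * np.eye(n_feat)        # running A^T A + lam*I
        self.C = np.zeros((n_feat, n_out))   # running A^T Y

    def partial_fit(self, A, Y):
        self.G += A.T @ A
        self.C += A.T @ Y
        self.W = np.linalg.solve(self.G, self.C)   # ridge solution so far
        return self.W

    def predict(self, A):
        return A @ self.W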
20. Parallel Multistage Wide Neural Network
- Author
-
Jiangbo Xi, Jianwu Fang, Xin Wei, Okan K. Ersoy, Tianjun Wu, and Chaoying Zhao
- Subjects
Artificial neural network ,Computer Networks and Communications ,Computer science ,business.industry ,Deep learning ,Decision tree ,Machine learning ,computer.software_genre ,Multistage wide learning ,Ensemble learning ,Computer Science Applications ,Support vector machine ,Tree (data structure) ,Artificial Intelligence ,Multilayer perceptron ,Parallel testing ,Artificial intelligence ,business ,computer ,Software ,MNIST database ,Incremental learning - Abstract
Deep learning networks have achieved great success in many areas, such as large-scale image processing, but they usually demand large computing resources and long training times, and they process easy and hard samples in the same, inefficient way. Another drawback is that a network generally needs to be retrained to learn newly arriving data. Efforts have been made to reduce computing requirements and enable incremental learning by adjusting architectures, such as scalable-effort classifiers, multi-grained cascade forest (gcForest), conditional deep learning (CDL), tree CNN, decision tree structures with knowledge transfer (ERDK), and forests of decision trees with radial basis function (RBF) networks and knowledge transfer (FDRK). In this article, a parallel multistage wide neural network (PMWNN) is presented. It is composed of multiple stages that classify different parts of the data. First, a wide radial basis function (WRBF) network is designed to learn features efficiently in the wide direction; it works on both vector and image instances and can be trained in one epoch using subsampling and least squares (LS). Second, successive stages of WRBF networks are combined to form the PMWNN, with each stage focusing on the samples misclassified by the previous stage. The network can stop growing at an early stage, and a stage can be added incrementally when new training data are acquired. Finally, the stages of the PMWNN can be tested in parallel, speeding up the testing process. In sum, the proposed PMWNN offers: 1) optimized computing resources; 2) incremental learning; and 3) parallel testing with stages.
Experimental results on the MNIST data, several large hyperspectral remote sensing datasets, and a variety of image and non-image datasets from different application areas show that the WRBF and PMWNN work well on both image and non-image data and achieve highly competitive accuracy compared to learning models such as stacked autoencoders, deep belief nets, support vector machines (SVM), multilayer perceptrons (MLP), LeNet-5, RBF networks, the recently proposed CDL, broad learning, gcForest, ERDK, and FDRK.
- Published
- 2021
- Full Text
- View/download PDF
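The stage-wise training described above (each stage fits the samples the previous stages still misclassify) can be sketched generically. The toy nearest-centroid base learner and the max-confidence combining rule below are our assumptions for illustration; the paper's WRBF stages and combination scheme are more elaborate:

```python
# Hedged sketch of multistage training in the spirit of PMWNN: each stage is
# fit on the samples the previous stage misclassified; at test time all stages
# are evaluated (in parallel, in principle) and the most confident one wins.
def train_stages(fit, X, y, max_stages=5):
    stages, Xc, yc = [], list(X), list(y)
    for _ in range(max_stages):
        if not Xc:
            break                     # early stopping: nothing left to fix
        model = fit(Xc, yc)
        stages.append(model)
        wrong = [(x, t) for x, t in zip(Xc, yc) if model(x)[0] != t]
        Xc, yc = [x for x, _ in wrong], [t for _, t in wrong]
    return stages

def predict(stages, x):
    # each model returns (label, confidence); take the most confident stage
    return max((m(x) for m in stages), key=lambda lc: lc[1])[0]

def centroid_fit(X, y):
    # toy 1-D base learner: nearest class centroid, confidence = -distance
    cent = {}
    for x, t in zip(X, y):
        cent.setdefault(t, []).append(x)
    cent = {t: sum(v) / len(v) for t, v in cent.items()}
    return lambda x: max(((t, -abs(x - c)) for t, c in cent.items()),
                         key=lambda tc: tc[1])

# the point at 2.0 is misclassified by stage 1, so a second stage is grown
X = [0.0, 1.0, 10.0, 11.0, 2.0]
y = [0, 0, 1, 1, 1]
stages = train_stages(centroid_fit, X, y)
```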
21. Robust Incremental Outlier Detection Approach Based on a New Metric in Data Streams
- Author
-
Ali Degirmenci and Omer Karal
- Subjects
General Computer Science ,Computer science ,Data stream mining ,General Engineering ,robustness ,outlier detection ,computer.software_genre ,TK1-9971 ,Metric (mathematics) ,General Materials Science ,Anomaly detection ,Data mining ,new metric ,Electrical engineering. Electronics. Nuclear engineering ,Electrical and Electronic Engineering ,computer ,local outlier factor (LOF) ,Incremental learning - Abstract
Detecting outliers in real time from multivariate streaming data is a vital and challenging research topic in many areas. The recently introduced incremental Local Outlier Factor (iLOF) approach and its variants have received considerable attention because they achieve high detection performance in data streams with varying distributions. However, these iLOF-based approaches still have major limitations: i) poor detection in high-dimensional data; ii) difficulty in determining a proper nearest-neighbor number $k$; iii) assigning each sample a score that indicates its probability of being an outlier rather than an outlier label; and iv) inability to detect a long sequence (small cluster) of outliers. This article proposes a new robust outlier detection method (RiLOF) based on iLOF that effectively overcomes these limitations. The RiLOF method uses a novel metric called the Median of Nearest Neighborhood Absolute Deviation (MoNNAD), computed from the median of the local absolute deviation of the samples' LOF values. Unlike previously reported LOF-based approaches, RiLOF can perform outlier detection in different data stream applications using the same hyperparameters. Extensive experiments on 15 different real-world datasets demonstrate that RiLOF remarkably outperforms 12 different state-of-the-art competitors.
- Published
- 2021
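The MoNNAD metric above is described as a median of local absolute deviations of LOF values. A minimal sketch of that statistic as we read the abstract (RiLOF's exact neighbourhood construction and normalisation may differ):

```python
from statistics import median

# Hedged sketch of the MoNNAD idea: a robust, median-of-absolute-deviations
# score over the LOF values in a sample's neighbourhood. Small values mean the
# neighbourhood's LOF scores are mutually consistent; large values flag
# disagreement that a single LOF score could miss.
def monnad(lof_neighbourhood):
    m = median(lof_neighbourhood)
    return median(abs(v - m) for v in lof_neighbourhood)
```

Being median-based, the score is insensitive to a single extreme LOF value in the neighbourhood, which is what makes this family of statistics robust.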
22. Reduce the Difficulty of Incremental Learning With Self-Supervised Learning
- Author
-
Linting Guan and Yan Wu
- Subjects
Forgetting ,General Computer Science ,Artificial neural network ,Computer science ,business.industry ,Deep learning ,Feature extraction ,General Engineering ,deep learning ,Machine learning ,computer.software_genre ,Data modeling ,Task (project management) ,TK1-9971 ,Learning disability ,self-supervised learning ,medicine ,Task analysis ,General Materials Science ,Artificial intelligence ,Electrical engineering. Electronics. Nuclear engineering ,medicine.symptom ,business ,computer ,Incremental learning - Abstract
Incremental learning requires a model to learn new tasks continually without forgetting the tasks it has already learned. However, when a deep learning model learns new tasks, it catastrophically forgets the tasks it learned before. Researchers have proposed methods to alleviate catastrophic forgetting, but these methods only consider extracting features related to previously learned tasks while suppressing feature extraction for tasks not yet learned. As a result, when a deep learning model learns new tasks incrementally, it must quickly learn to extract the features relevant to each newly learned task; this requires a significant change in the model's feature-extraction behavior, which increases the learning difficulty. The model is therefore caught in a dilemma: reduce the learning rate to retain existing knowledge, or increase it to learn new knowledge quickly. We present a study that alleviates this problem by introducing self-supervised learning into incremental learning methods. We believe the task-independent self-supervised learning signal helps the model extract features that are effective not only for the current task but also for tasks that have not yet been learned. We give a detailed algorithm combining self-supervised learning signals with incremental learning methods. Extensive experiments on several datasets show that the self-supervised signal significantly improves the accuracy of most incremental learning methods without requiring additional labeled data, and that it works best for replay-based incremental learning methods.
- Published
- 2021
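The "task-independent self-supervised signal" above needs labels that come for free from the data itself. One classic pretext task of this kind is rotation prediction; the abstract does not commit to this specific pretext, so the sketch below is only an example of how such labels are generated without human annotation:

```python
# Hedged sketch: generating self-supervised labels via the rotation-prediction
# pretext task. Each image yields four training pairs (rotated image, k), where
# k in {0,1,2,3} encodes the rotation; no human labels are involved.
def rotate90(img):
    # rotate a 2-D grid (list of rows) by 90 degrees clockwise
    return [list(row) for row in zip(*img[::-1])]

def rotation_batch(img):
    out, cur = [], img
    for k in range(4):
        out.append((cur, k))   # (augmented image, self-supervised label)
        cur = rotate90(cur)
    return out
```

An auxiliary head predicting `k` can then be trained alongside the incremental-learning objective, which is the combination the paper studies.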
23. Sentiment analysis for customer relationship management: an incremental learning approach
- Author
-
Pierluigi Ritrovato, Mario Vento, Nicola Capuano, and Luca Greco
- Subjects
business.industry ,Computer science ,Customer relationship management ,Hierarchical attention networks ,Machine learning ,Natural language processing ,Sentiment analysis ,02 engineering and technology ,computer.software_genre ,Loyalty business model ,Artificial Intelligence ,020204 information systems ,Incremental learning ,0202 electrical engineering, electronic engineering, information engineering ,020201 artificial intelligence & image processing ,Customer satisfaction ,Artificial intelligence ,business ,computer ,Classifier (UML) ,Corporate management - Abstract
In recent years there has been a significant rethinking of corporate management, which is increasingly based on customer-orientation principles. Customer relationship management (CRM) processes and systems are ever more popular and crucial to facing today's business challenges. However, the large number of customer communication stimuli coming from different (direct and indirect) channels requires automatic language processing techniques to help filter and qualify such stimuli, determine priorities, facilitate the routing of requests, and reduce response times. In this scenario, sentiment analysis plays an important role in measuring customer satisfaction, tracking consumer opinion, interacting with consumers, and building customer loyalty. The research described in this paper proposes an approach based on Hierarchical Attention Networks for detecting the sentiment polarity of customer communications. Unlike other existing approaches, after initial training the model can improve over time during system operation, using the feedback provided by CRM operators through an integrated incremental learning mechanism. The paper also describes the developed prototype as well as the dataset used for training the model, which includes over 30,000 annotated items. The results of two experiments, aimed at measuring classifier performance and validating the retraining mechanism, are also presented and discussed. The classifier accuracy turned out to be better than that of other algorithms for the supported languages (macro-averaged F1-scores of 0.89 and 0.79 for Italian and English, respectively), and the retraining mechanism was able to improve classification accuracy on new samples without degrading overall system performance.
- Published
- 2020
- Full Text
- View/download PDF
24. Active and incremental learning for semantic ALS point cloud segmentation
- Author
-
Yaping Lin, George Vosselman, Yanpeng Cao, Michael Ying Yang, Department of Earth Observation Science, Faculty of Geo-Information Science and Earth Observation, and UT-I-ITC-ACQUAL
- Subjects
Active learning ,010504 meteorology & atmospheric sciences ,Computer science ,UT-Hybrid-D ,0211 other engineering and technologies ,Point cloud ,02 engineering and technology ,Machine learning ,computer.software_genre ,01 natural sciences ,ITC-HYBRID ,Entropy (information theory) ,Segmentation ,Computers in Earth Sciences ,Engineering (miscellaneous) ,Incremental learning ,021101 geological & geomatics engineering ,0105 earth and related environmental sciences ,Artificial neural network ,business.industry ,Deep learning ,Mutual information ,Semantic segmentation ,Atomic and Molecular Physics, and Optics ,Computer Science Applications ,Lidar ,Photogrammetry ,ITC-ISI-JOURNAL-ARTICLE ,Artificial intelligence ,Point clouds ,business ,computer - Abstract
Supervised training of a deep neural network for semantic segmentation of point clouds requires a large amount of labelled data. With current LiDAR and photogrammetric techniques, it is easy to acquire a huge number of high-density points over large-scale areas; however, it is extremely time-consuming to manually label point clouds for model training. In this paper, we propose an active and incremental learning strategy that iteratively queries informative point cloud data for manual annotation, while the model is continuously trained to adapt to the newly labelled samples in each iteration. We evaluate data informativeness step by step and effectively and incrementally enrich the model's knowledge. Informativeness is estimated by two data-dependent uncertainty metrics (point entropy and segment entropy) and one model-dependent metric (mutual information). The proposed methods are tested on two datasets. The results indicate that the proposed uncertainty metrics enrich the current model's knowledge by selecting informative samples, such as points with difficult class labels and target objects with various geometries in the labelled training pool. Compared to random selection, our metrics significantly reduce the number of labelled training samples required. Compared with training from scratch, the incremental fine-tuning strategy significantly saves training time.
- Published
- 2020
- Full Text
- View/download PDF
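Of the three informativeness metrics above, point entropy is the simplest: the Shannon entropy of a point's predicted class distribution. A minimal sketch of entropy-ranked query selection (segment entropy and mutual information are omitted; function names are illustrative):

```python
from math import log

# Hedged sketch of the "point entropy" uncertainty metric: points whose
# predicted class distribution has the highest entropy are the most uncertain
# and are queried for manual labelling first.
def point_entropy(probs):
    return -sum(p * log(p) for p in probs if p > 0)

def query_most_informative(prob_batch, budget):
    ranked = sorted(range(len(prob_batch)),
                    key=lambda i: point_entropy(prob_batch[i]), reverse=True)
    return ranked[:budget]     # indices of the points to annotate
```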
25. An integrated classification model for incremental learning
- Author
-
Hu Ji, Zhiyuan Li, Xin Liu, Chengwei Ren, Yi Yang, Chenggang Yan, Dongliang Peng, and Jiyong Zhang
- Subjects
Computer Networks and Communications ,Process (engineering) ,Computer science ,Image classification ,Feature vector ,Masked-face dataset ,02 engineering and technology ,Machine learning ,computer.software_genre ,Field (computer science) ,Article ,0202 electrical engineering, electronic engineering, information engineering ,Media Technology ,Artificial Intelligence & Image Processing ,Confidence weight ,Incremental learning ,Contextual image classification ,business.industry ,Software Engineering ,020207 software engineering ,Transfer learning ,Statistical classification ,Hardware and Architecture ,Face (geometry) ,0801 Artificial Intelligence and Image Processing, 0803 Computer Software, 0805 Distributed Computing, 0806 Information Systems ,Noise (video) ,Artificial intelligence ,Transfer of learning ,business ,computer ,Software - Abstract
Incremental learning is a form of machine learning that enables a model to be modified incrementally as new data become available, so the model can adapt to new data without the lengthy and time-consuming process of complete retraining. However, existing incremental learning methods face two significant problems: 1) noise in the classification sample data, and 2) poor accuracy of existing classification algorithms when applied to modern classification problems. To deal with these issues, this paper proposes an integrated classification model, the Pre-trained Truncated Gradient Confidence-weighted (Pt-TGCW) model. Since the pre-trained model can extract and transform image information into a feature vector, the integrated model also shows its advantages in image classification. Experimental results on ten datasets demonstrate that the proposed method outperforms its original counterparts.
- Published
- 2020
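The truncated-gradient component named above refers to an online learner whose weights are shrunk toward zero after each gradient step (Langford-style L1 truncation). As a hedged stand-in for the Pt-TGCW online stage, here is a minimal online logistic regression with that shrinkage; the pre-trained feature extractor is abstracted as an input vector, and the simplification theta = infinity is ours:

```python
from math import exp

def truncate(v, a):
    # soft-threshold toward zero (truncated gradient with theta = infinity)
    if v > a:
        return v - a
    if v < -a:
        return v + a
    return 0.0

# Hedged sketch: online logistic regression with truncated-gradient shrinkage.
class TGLogistic:
    def __init__(self, dim, lr=0.1, g=0.01):
        self.w, self.lr, self.g = [0.0] * dim, lr, g

    def predict_proba(self, x):
        z = sum(wi * xi for wi, xi in zip(self.w, x))
        return 1.0 / (1.0 + exp(-z))

    def partial_fit(self, x, y):   # y in {0, 1}
        err = self.predict_proba(x) - y
        self.w = [truncate(wi - self.lr * err * xi, self.lr * self.g)
                  for wi, xi in zip(self.w, x)]
```

The shrinkage drives weights of uninformative features exactly to zero, giving sparse models that are cheap to update online.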
26. BNGBS: An efficient network boosting system with triple incremental learning capabilities for more nodes, samples, and classes
- Author
-
Honglin Qiao, Min Zhou, Chunhui Zhao, Chuan Fu, Yuanlong Li, C. L. Philip Chen, and Liangjun Feng
- Subjects
0209 industrial biotechnology ,Boosting (machine learning) ,Computer science ,business.industry ,Cognitive Neuroscience ,02 engineering and technology ,Machine learning ,computer.software_genre ,Computer Science Applications ,020901 industrial engineering & automation ,Artificial Intelligence ,Incremental learning ,0202 electrical engineering, electronic engineering, information engineering ,020201 artificial intelligence & image processing ,Artificial intelligence ,Gradient boosting ,business ,Additive model ,computer - Abstract
As an ensemble algorithm, network boosting enjoys powerful classification ability but suffers from a tedious and time-consuming training process. To tackle this problem, this paper develops a broad network gradient boosting system (BNGBS) that integrates a gradient boosting machine with broad networks, in which the classification loss incurred by a base broad network is learned and eliminated by subsequent networks in a cascade manner. The proposed system is constructed as an additive model and can be easily optimized by a greedy strategy instead of the tedious back-propagation algorithm, resulting in a more efficient learning process. Meanwhile, triple incremental learning capabilities are designed: increments of feature nodes, of input samples, and of target classes. The system can be efficiently updated and expanded from its current status, instead of being entirely retrained, when more feature nodes, input samples, or target classes are required. The node-increment capability allows more feature nodes to be added to the built system if the current structure is not effective for learning. The sample-increment capability lets the model keep learning from incoming batch data. The class-increment capability handles the case in which incoming batch data contain previously unseen categories. In comparison with popular existing machine learning methods, comprehensive results on eight benchmark datasets illustrate the effectiveness of the proposed broad network gradient boosting system for the classification task.
- Published
- 2020
- Full Text
- View/download PDF
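The cascade described above (each base broad network fits the loss left by its predecessors, trained greedily in closed form rather than by back-propagation) can be sketched for the regression-style residual case. The random tanh feature nodes and least-squares readout below are generic broad-learning ingredients, not BNGBS's exact architecture:

```python
import numpy as np

# Hedged sketch of the BNGBS idea: an additive cascade where each base network
# (random feature nodes + closed-form least squares) fits the residual left by
# the networks before it; no back-propagation is needed.
def fit_cascade(X, Y, n_stages=3, n_nodes=20, seed=0):
    rng = np.random.default_rng(seed)
    stages, R = [], Y.astype(float).copy()
    for _ in range(n_stages):
        W = rng.normal(size=(X.shape[1], n_nodes))
        H = np.tanh(X @ W)                            # random feature nodes
        beta = np.linalg.lstsq(H, R, rcond=None)[0]   # greedy LS on residual
        stages.append((W, beta))
        R = R - H @ beta                              # pass residual onward
    return stages

def predict_cascade(stages, X):
    out = 0.0
    for W, beta in stages:
        out = out + np.tanh(X @ W) @ beta
    return out
```

Because each stage only solves a linear least-squares problem, adding a stage (or more nodes) is cheap, which is what makes the increments in the abstract tractable.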
27. Beyond Cross-Validation—Accuracy Estimation for Incremental and Active Learning Models
- Author
-
Helge Ritter, Christian Limberg, and Heiko Wersing
- Subjects
lcsh:Computer engineering. Computer hardware ,Computer science ,online learning ,lcsh:TK7885-7895 ,02 engineering and technology ,Machine learning ,computer.software_genre ,Cross-validation ,accuracy estimation ,020204 information systems ,active learning ,0202 electrical engineering, electronic engineering, information engineering ,benchmarking ,incremental learning ,business.industry ,error prediction ,Cognitive neuroscience of visual object recognition ,Regression analysis ,Benchmarking ,Standard methods ,ComputingMethodologies_PATTERNRECOGNITION ,classifier evaluation ,Robot ,020201 artificial intelligence & image processing ,Artificial intelligence ,Benchmark data ,business ,Classifier (UML) ,computer - Abstract
For incremental machine-learning applications it is often important to robustly estimate the system accuracy during training, especially when humans perform the supervised teaching. Cross-validation and interleaved test/train error are the standard supervised approaches here. We propose a novel semi-supervised accuracy estimation approach that clearly outperforms these two methods. We introduce the Configram Estimation (CGEM) approach to predict the accuracy of any classifier that delivers confidences. By calculating classification confidences for unseen samples, it is possible to train an offline regression model capable of predicting the classifier's accuracy on novel data in a semi-supervised fashion. We evaluate our method with several diverse classifiers on analytical and real-world benchmark datasets for both incremental and active learning. The results show that our novel method improves accuracy estimation over standard methods and requires less supervised training data after deployment of the model. We demonstrate the application of our approach to a challenging robot object recognition task, where the human teacher can use our method to judge when training is sufficient.
- Published
- 2020
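The "configram" named above is a histogram summary of classifier confidences, on which a regression model is trained to predict accuracy. A minimal sketch of that pipeline, assuming a simple linear regressor (CGEM's actual regression model may differ):

```python
import numpy as np

# Hedged sketch of the CGEM idea: summarise a batch of classifier confidences
# as a normalised histogram ("configram") and regress batch accuracy on it, so
# accuracy on unlabelled data can be estimated from confidences alone.
def configram(confidences, bins=5):
    h, _ = np.histogram(confidences, bins=bins, range=(0.0, 1.0))
    return h / len(confidences)

def fit_estimator(conf_batches, accuracies, bins=5):
    A = np.array([configram(c, bins) for c in conf_batches])
    A = np.hstack([A, np.ones((len(A), 1))])           # bias term
    w, *_ = np.linalg.lstsq(A, np.array(accuracies), rcond=None)
    return w

def estimate_accuracy(w, confidences, bins=5):
    v = np.append(configram(confidences, bins), 1.0)
    return float(v @ w)
```

Once fitted on a few labelled batches, the estimator needs only confidences, no labels, at deployment time, which is the semi-supervised property the abstract highlights.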
28. Adaptive Chunk-Based Dynamic Weighted Majority for Imbalanced Data Streams With Concept Drift
- Author
-
Yang Lu, Yuan Yan Tang, and Yiu-ming Cheung
- Subjects
Data stream ,Concept drift ,Computer Networks and Communications ,Computer science ,02 engineering and technology ,computer.software_genre ,Ensemble learning ,Computer Science Applications ,Weighting ,Data set ,ComputingMethodologies_PATTERNRECOGNITION ,Artificial Intelligence ,Incremental learning ,0202 electrical engineering, electronic engineering, information engineering ,020201 artificial intelligence & image processing ,Data mining ,computer ,Classifier (UML) ,Software ,Statistical hypothesis testing - Abstract
One of the most challenging problems in online learning is concept drift, which deeply influences the classification stability of streaming data. If the data stream is imbalanced, it is even more difficult to detect concept drifts and make an online learner adapt to them. Ensemble algorithms have proven effective for classifying streaming data with concept drift, whereby an individual classifier is built for each incoming data chunk and its associated weight is adjusted to manage the drift. However, it is difficult to adjust the weights so as to balance the stability and adaptability of the ensemble. In addition, when the data stream is imbalanced, using a fixed-size chunk to build a single classifier creates further problems: the chunk may contain too few or even no minority-class samples (i.e., only majority-class samples), and a classifier built on such a chunk is unstable in the ensemble. In this article, we propose a chunk-based incremental learning method called adaptive chunk-based dynamic weighted majority (ACDWM) to deal with imbalanced streaming data containing concept drift. ACDWM utilizes an ensemble framework that dynamically weights the individual classifiers according to their classification performance on the current data chunk. The chunk size is adaptively selected by statistical hypothesis tests that assess whether the classifier built on the current chunk is sufficiently stable. ACDWM has four advantages over existing methods: 1) it maintains stability when processing non-drifted streams and rapidly adapts to new concepts; 2) it is entirely incremental, i.e., no previous data need to be stored; 3) it stores a limited number of classifiers to ensure high efficiency; and 4) it adaptively selects the chunk size in concept drift environments. Experiments on both synthetic and real datasets containing concept drift show that ACDWM outperforms both state-of-the-art chunk-based and online methods.
- Published
- 2020
- Full Text
- View/download PDF
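The core mechanism above (one new ensemble member per chunk, with every member's vote weighted by its accuracy on the current chunk) can be sketched compactly. ACDWM's adaptive chunk sizing and member pruning are omitted here, and the toy centroid learner is our choice for illustration:

```python
# Hedged sketch of chunk-based dynamic weighted majority: each chunk trains one
# new member, and every member's vote is weighted by its accuracy on the
# current chunk, so members invalidated by concept drift lose influence fast.
class ChunkEnsemble:
    def __init__(self):
        self.members = []    # list of (model, weight)

    def process_chunk(self, fit, X, y):
        acc = lambda m: sum(m(x) == t for x, t in zip(X, y)) / len(X)
        self.members = [(m, acc(m)) for m, _ in self.members]   # re-weight
        new = fit(X, y)
        self.members.append((new, acc(new)))

    def predict(self, x):
        votes = {}
        for m, w in self.members:
            votes[m(x)] = votes.get(m(x), 0.0) + w
        return max(votes, key=votes.get)

def centroid_fit(X, y):
    # toy 1-D base learner: predict the label of the nearest class centroid
    cent = {}
    for x, t in zip(X, y):
        cent.setdefault(t, []).append(x)
    cent = {t: sum(v) / len(v) for t, v in cent.items()}
    return lambda x: min(cent, key=lambda t: abs(x - cent[t]))

# chunk 2 flips the labels (concept drift); the chunk-1 member's weight drops to 0
ens = ChunkEnsemble()
ens.process_chunk(centroid_fit, [1.0, 2.0, 8.0, 9.0], [0, 0, 1, 1])
ens.process_chunk(centroid_fit, [1.0, 2.0, 8.0, 9.0], [1, 1, 0, 0])
```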
29. Broad Reinforcement Learning for Supporting Fast Autonomous IoT
- Author
-
Jialin Zhao, Xin Wei, Liang Zhou, and Yi Qian
- Subjects
Computer Networks and Communications ,Computer science ,business.industry ,Control (management) ,Big data ,020206 networking & telecommunications ,02 engineering and technology ,Machine learning ,computer.software_genre ,Computer Science Applications ,Dilemma ,Action (philosophy) ,Hardware and Architecture ,Signal Processing ,Incremental learning ,0202 electrical engineering, electronic engineering, information engineering ,Reinforcement learning ,020201 artificial intelligence & image processing ,Artificial intelligence ,business ,Internet of Things ,computer ,Information Systems - Abstract
The emergence of a massive Internet-of-Things (IoT) ecosystem is changing the human lifestyle. In several practical scenarios, however, IoT still faces significant challenges: reliance on human assistance and unacceptable response times for processing big data. It is therefore urgent to establish a new framework and algorithm for this kind of fast autonomous IoT. Traditional reinforcement learning and deep reinforcement learning (DRL) approaches offer autonomous decision making, but time-consuming modeling and training procedures limit their applications. To overcome this dilemma, this article proposes the broad reinforcement learning (BRL) approach, which fits fast autonomous IoT by combining the broad learning system (BLS) with a reinforcement learning paradigm to improve the agent's efficiency and accuracy in modeling and decision making. Specifically, a BRL framework is first constructed. Then the associated learning algorithm, comprising training-pool introduction, training-sample preparation, and incremental learning for the BLS, is carefully designed. Finally, as a case study of fast autonomous IoT, the proposed BRL approach is applied to traffic light control, aiming to alleviate traffic congestion at intersections in smart cities. The experimental results show that the proposed BRL approach learns a better action policy in a shorter execution time than competing approaches.
- Published
- 2020
- Full Text
- View/download PDF
30. Class Boundary Exemplar Selection Based Incremental Learning for Automatic Target Recognition
- Author
-
Zongjie Cao, Nengyuan Liu, Zongyong Cui, Sihang Dang, and Yiming Pi
- Subjects
Training set ,Computer science ,business.industry ,0211 other engineering and technologies ,Boundary (topology) ,02 engineering and technology ,Machine learning ,computer.software_genre ,Class (biology) ,Data modeling ,Support vector machine ,Set (abstract data type) ,Automatic target recognition ,Incremental learning ,Task analysis ,General Earth and Planetary Sciences ,Artificial intelligence ,Electrical and Electronic Engineering ,business ,computer ,Selection (genetic algorithm) ,021101 geological & geomatics engineering - Abstract
When new tasks/classes are added in an incremental learning scenario, the recognition capabilities trained on the previous training data can be lost. In real-life automatic target recognition (ATR) applications, part of the previous samples may still be usable, yet most incremental learning methods have not considered how to preserve the key previous samples. In this article, class boundary exemplar selection-based incremental learning (CBesIL) is proposed to preserve previous recognition capabilities in the form of class boundary exemplars. For exemplar selection, a class boundary selection method based on local geometrical and statistical information is proposed. When new classes are added continually, a class-boundary-based data reconstruction method is used to update the exemplar set, so that the previous class boundaries are kept complete. Experimental results demonstrate that the proposed CBesIL outperforms other state-of-the-art methods in the accuracy of multiclass recognition and class-incremental recognition.
- Published
- 2020
- Full Text
- View/download PDF
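The idea of keeping only boundary exemplars above can be sketched with a very simple criterion: retain, per class, the samples closest to any sample of another class, since those trace the decision boundary. CBesIL's actual geometric/statistical criterion is richer; the 1-D scalar version below is purely illustrative:

```python
# Hedged sketch of class-boundary exemplar selection: for each class, keep the
# samples whose nearest neighbour from another class is closest (1-D data for
# brevity; a real implementation would use vector distances).
def boundary_exemplars(X, y, per_class=2):
    keep = []
    for label in set(y):
        idx = [i for i, t in enumerate(y) if t == label]
        other = [i for i, t in enumerate(y) if t != label]
        margin = {i: min(abs(X[i] - X[j]) for j in other) for i in idx}
        keep += sorted(idx, key=margin.get)[:per_class]   # smallest margins
    return sorted(keep)
```

When a new class arrives, only this compact exemplar set plus the new data need to be revisited, rather than the full history.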
31. Distributed Incremental Clustering Algorithms: A Bibliometric and Word-Cloud Review Analysis
- Author
-
Rahul Joshi, Archana Chaudhari, and Preeti Mulay
- Subjects
business.industry ,Computer science ,Incremental learning ,Artificial intelligence ,Library and Information Sciences ,Tag cloud ,Machine learning ,computer.software_genre ,Review analysis ,Cluster analysis ,business ,computer - Abstract
“Incremental Learning (IL)” is a niche area of “Machine Learning.” It is essential to keep learning incremental for the ever-increasing data from all domains, to support effectual decisions, predic...
- Published
- 2020
- Full Text
- View/download PDF
32. Broad Convolutional Neural Network Based Industrial Process Fault Diagnosis With Incremental Learning Capability
- Author
-
Chunhui Zhao and Wanke Yu
- Subjects
Computer science ,020208 electrical & electronic engineering ,Feature extraction ,Process (computing) ,02 engineering and technology ,Root cause ,Fault (power engineering) ,computer.software_genre ,Convolutional neural network ,Control and Systems Engineering ,Incremental learning ,0202 electrical engineering, electronic engineering, information engineering ,Data mining ,Electrical and Electronic Engineering ,computer - Abstract
Fault diagnosis, which identifies the root cause of an observed out-of-control status, is essential for counteracting or eliminating faults in industrial processes. Many conventional data-driven fault diagnosis methods ignore the fault tendency of abnormal samples, and they need a complete retraining process to incorporate newly collected abnormal samples or fault classes. In this article, a broad convolutional neural network (BCNN) with incremental learning capability is designed to address these issues. The proposed method combines several consecutive samples into a data matrix and then extracts both the fault tendency and the nonlinear structure from this matrix using convolutional operations. The weights in the fully connected layers are then trained on the obtained features and their corresponding fault labels. Owing to this network architecture, the diagnosis performance of the BCNN model can be improved by adding newly generated features. Finally, the incremental learning capability allows the BCNN model to update itself to include newly arriving abnormal samples and fault classes. The proposed method is applied to both a simulated process and a real industrial process. Experimental results illustrate that it better captures the characteristics of the fault process and effectively updates the diagnosis model to include newly arriving abnormal samples and fault classes.
- Published
- 2020
- Full Text
- View/download PDF
33. Downsizing and enhancing broad learning systems by feature augmentation and residuals boosting
- Author
-
Runshan Xie and Shitong Wang
- Subjects
Boosting (machine learning) ,ComputingMilieux_THECOMPUTINGPROFESSION ,Artificial neural network ,Computer science ,business.industry ,Computational intelligence ,General Medicine ,Overfitting ,Machine learning ,computer.software_genre ,Incremental learning ,Systems architecture ,Artificial intelligence ,Architecture ,business ,computer - Abstract
Recently, the broad learning system (BLS) has been confirmed, both theoretically and experimentally, to be an efficient incremental learning system. To avoid a deep architecture, BLS shares the architecture and learning mechanism of the well-known functional link neural networks (FLNN), but learns in a broad way on both randomly mapped features of the original data and randomly generated enhancement nodes. As such, BLS often requires a very large number of hidden nodes to achieve the prescribed or satisfactory performance, which may cause both overwhelming storage requirements and overfitting. In this study, a stacked architecture of broad learning systems, called D&BLS, is proposed to enhance performance while downsizing the system architecture. By boosting the residuals between previous and current layers, and simultaneously augmenting the original input space with the outputs of the previous layer as inputs to the current layer, D&BLS stacks several lightweight BLS sub-systems to guarantee stronger feature representation capability and better classification/regression performance. Three fast incremental learning algorithms for D&BLS are also developed, which avoid whole re-training. Experimental results on some popular datasets demonstrate the effectiveness of D&BLS in the sense of both enhanced performance and reduced system architecture.
- Published
- 2020
- Full Text
- View/download PDF
34. A comparative study of general fuzzy min-max neural networks for pattern classification problems
- Author
-
Bogdan Gabrys and Thanh Tung Khuat
- Subjects
FOS: Computer and information sciences ,Computer Science - Machine Learning ,0209 industrial biotechnology ,Computer science ,Cognitive Neuroscience ,68T30, 68T20, 68T37, 68W27 ,Fuzzy set ,Machine Learning (stat.ML) ,02 engineering and technology ,Machine learning ,computer.software_genre ,Fuzzy logic ,Machine Learning (cs.LG) ,020901 industrial engineering & automation ,Empirical research ,Statistics - Machine Learning ,Artificial Intelligence ,0202 electrical engineering, electronic engineering, information engineering ,Artificial Intelligence & Image Processing ,Cluster analysis ,I.5.0 ,I.5.1 ,I.2.1 ,I.2.6 ,I.2.m ,I.5.2 ,I.5.3 ,I.5.4 ,Artificial neural network ,business.industry ,Computer Science Applications ,Hierarchical clustering ,ComputingMethodologies_PATTERNRECOGNITION ,Incremental learning ,020201 artificial intelligence & image processing ,Artificial intelligence ,business ,Classifier (UML) ,computer - Abstract
The general fuzzy min-max (GFMM) neural network is a generalization of fuzzy neural networks formed by hyperbox fuzzy sets for classification and clustering problems. Two principal algorithms are deployed to train this type of neural network, i.e., incremental learning and agglomerative learning. This paper presents a comprehensive empirical study of the performance-influencing factors, advantages, and drawbacks of the general fuzzy min-max neural network on pattern classification problems. The subjects of this study include (1) the impact of the maximum hyperbox size, (2) the influence of the similarity threshold and measures on the agglomerative learning algorithm, (3) the effect of data presentation order, and (4) a comparative performance evaluation of the GFMM against other types of fuzzy min-max neural networks and prevalent machine learning algorithms. The experimental results on benchmark datasets widely used in machine learning show the overall strong and weak points of the GFMM classifier. These outcomes also inform potential research directions for this class of machine learning algorithms in the future.
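The incremental-learning side of hyperbox training mentioned above can be sketched compactly. This is a simplified illustration of the idea only (real GFMM also handles hyperbox overlap testing and contraction, which are omitted): each class is covered by axis-aligned boxes that expand to absorb new points up to a maximum size theta, the abstract's "maximum hyperbox size".

```python
class Hyperbox:
    """A min/max box for one class; membership is 1 inside, decaying outside."""

    def __init__(self, point):
        self.lo = list(point)
        self.hi = list(point)

    def can_expand(self, point, theta):
        # expansion is allowed only if no side would exceed theta
        return all(max(h, p) - min(l, p) <= theta
                   for l, h, p in zip(self.lo, self.hi, point))

    def expand(self, point):
        self.lo = [min(l, p) for l, p in zip(self.lo, point)]
        self.hi = [max(h, p) for h, p in zip(self.hi, point)]

    def membership(self, point):
        gap = max(max(l - p, p - h, 0.0)
                  for l, h, p in zip(self.lo, self.hi, point))
        return max(0.0, 1.0 - gap)


def learn(boxes, labeled_points, theta):
    """One pass of incremental learning: expand a same-class box if
    possible, otherwise create a new box for the point."""
    for point, label in labeled_points:
        for box, box_label in boxes:
            if box_label == label and box.can_expand(point, theta):
                box.expand(point)
                break
        else:
            boxes.append((Hyperbox(point), label))
    return boxes
```

A smaller theta produces more, tighter boxes; the paper's study of the maximum hyperbox size is precisely about this trade-off.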
- Published
- 2020
- Full Text
- View/download PDF
35. Incremental Learning for Malware Classification in Small Datasets
- Author
-
Di Xue, Weifei Wu, Jingmei Li, and Jiaxiang Wang
- Subjects
Science (General) ,Article Subject ,Computer Networks and Communications ,Computer science ,020206 networking & telecommunications ,02 engineering and technology ,Information security ,computer.software_genre ,Data science ,Q1-390 ,Important research ,ComputingMethodologies_PATTERNRECOGNITION ,Incremental learning ,0202 electrical engineering, electronic engineering, information engineering ,T1-995 ,Malware ,020201 artificial intelligence & image processing ,computer ,Technology (General) ,Information Systems - Abstract
Malware classification plays an important role in information security. In the real world, malware datasets are open-ended and dynamic: new malware samples belonging to both old and new classes arrive continuously. This requires malware classification methods to support incremental learning, which can efficiently absorb the new knowledge. However, existing works mainly focus on feature engineering with machine learning as a tool. To solve this problem, we present an incremental malware classification framework, named “IMC,” which consists of opcode sequence extraction and selection, and an incremental learning method. We develop an incremental learning method based on a multiclass support vector machine (SVM) as the core component of IMC, named “IMCSVM,” which can incrementally improve its classification ability by learning new malware samples. In IMC, IMCSVM adds new classification planes (if new samples belong to a new class) and updates all old classification planes for new malware samples. As a result, IMC can improve the classification quality for known malware classes by minimizing the prediction error, and transfer the old model's knowledge to classify unknown malware classes. We apply the incremental learning method to malware classification, and the experimental results demonstrate the advantages and effectiveness of IMC.
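The update rule described above, add a classification plane when a new class appears and nudge all existing planes on each new sample, can be sketched with a one-vs-rest online perceptron standing in for the multiclass SVM (the real IMCSVM solves SVM optimization problems; this only illustrates the incremental bookkeeping).

```python
class IncrementalOvR:
    """Sketch of IMCSVM's structure with a perceptron stand-in: one linear
    'plane' per malware class; new classes add a plane, and every plane is
    updated online from each new sample."""

    def __init__(self, n_features, lr=0.1):
        self.n_features, self.lr = n_features, lr
        self.planes = {}  # class label -> weight vector

    def partial_fit(self, x, label):
        if label not in self.planes:          # new class: add a plane
            self.planes[label] = [0.0] * self.n_features
        for lbl, w in self.planes.items():    # update all old planes too
            target = 1.0 if lbl == label else -1.0
            score = sum(wi * xi for wi, xi in zip(w, x))
            if target * score <= 0:           # misclassified: nudge the plane
                for i in range(self.n_features):
                    w[i] += self.lr * target * x[i]

    def predict(self, x):
        return max(self.planes,
                   key=lambda lbl: sum(wi * xi
                                       for wi, xi in zip(self.planes[lbl], x)))
```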
- Published
- 2020
- Full Text
- View/download PDF
36. Prediction of blood glucose concentration for type 1 diabetes based on echo state networks embedded with incremental learning
- Author
-
Jianyong Tuo, Ning Li, Menghui Wang, and Youqing Wang
- Subjects
0209 industrial biotechnology ,Type 1 diabetes ,Computer science ,business.industry ,Cognitive Neuroscience ,Echo (computing) ,02 engineering and technology ,Hypoglycemia ,medicine.disease ,Machine learning ,computer.software_genre ,Artificial pancreas ,Computer Science Applications ,020901 industrial engineering & automation ,Artificial Intelligence ,Diabetes mellitus ,Incremental learning ,0202 electrical engineering, electronic engineering, information engineering ,medicine ,020201 artificial intelligence & image processing ,State (computer science) ,Artificial intelligence ,business ,computer - Abstract
Valid prediction of blood glucose concentration can help people manage diabetes mellitus, provide alerts for hypoglycemia/hyperglycemia, support the artificial pancreas, and inform treatment planning. With the development of continuous glucose monitoring systems (CGMS), the massive historical data require a new modeling framework from a data-driven perspective. Studies indicate that glucose time series (i.e., CGMS readings) exhibit chaotic properties; therefore, echo state networks (ESN) and improved variants are proposed to establish subject-specific prediction models, owing to their superiority in processing chaotic systems. This study makes two main contributions: (1) a novel combination of incremental learning and ESN is developed to obtain a suitable network structure through partial optimization of parameters; (2) a feedback ESN is proposed to exploit the relationship between different predictions. These methods are assessed on ten patients with diabetes mellitus. Experimental results substantiate that the proposed methods achieve superior prediction performance in terms of four evaluation metrics compared with three conventional methods.
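The ESN component named above rests on one mechanism: a fixed random recurrent reservoir turns a scalar time series (here, glucose readings) into rich state vectors, and only a linear readout is trained. A minimal sketch of the reservoir update is below; the trained readout, the incremental structure growth, and the feedback variant from the paper are all omitted, and the weight scales are arbitrary illustrative choices.

```python
import math
import random

def esn_states(inputs, n_res=20, seed=0):
    """Run a scalar input sequence through a small random reservoir and
    return the sequence of state vectors x_t = tanh(W_in*u_t + W*x_{t-1})."""
    rng = random.Random(seed)
    w_in = [rng.uniform(-0.5, 0.5) for _ in range(n_res)]
    # small recurrent weights to keep the reservoir dynamics stable
    w = [[rng.uniform(-0.05, 0.05) for _ in range(n_res)] for _ in range(n_res)]
    x = [0.0] * n_res
    states = []
    for u in inputs:
        x = [math.tanh(w_in[i] * u + sum(w[i][j] * x[j] for j in range(n_res)))
             for i in range(n_res)]
        states.append(x)
    return states
```

A prediction model would then regress the next glucose value on these states; that readout, not the reservoir, is what the paper updates incrementally.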
- Published
- 2020
- Full Text
- View/download PDF
37. Active and Incremental Learning with Weak Supervision
- Author
-
Clemens-Alexander Brust, Christoph Käding, and Joachim Denzler
- Subjects
FOS: Computer and information sciences ,Training set ,Computer science ,business.industry ,Computer Vision and Pattern Recognition (cs.CV) ,Computer Science - Computer Vision and Pattern Recognition ,02 engineering and technology ,Pascal (programming language) ,010501 environmental sciences ,Machine learning ,computer.software_genre ,01 natural sciences ,Object detection ,Artificial Intelligence ,Active learning ,Incremental learning ,0202 electrical engineering, electronic engineering, information engineering ,020201 artificial intelligence & image processing ,Artificial intelligence ,business ,computer ,0105 earth and related environmental sciences ,computer.programming_language - Abstract
Large amounts of labeled training data are one of the main contributors to the great success that deep models have achieved in the past. Label acquisition for tasks other than benchmarks can pose a challenge due to requirements of both funding and expertise. By selecting unlabeled examples that are promising in terms of model improvement and only asking for the respective labels, active learning can increase the efficiency of the labeling process in terms of time and cost. In this work, we describe combinations of an incremental learning scheme and methods of active learning. These allow for continuous exploration of newly observed unlabeled data. We describe selection criteria based on model uncertainty as well as expected model output change (EMOC). An object detection task is evaluated in a continuous exploration context on the PASCAL VOC dataset. We also validate a weakly supervised system based on active and incremental learning in a real-world biodiversity application where images from camera traps are analyzed. Labeling only 32 images by accepting or rejecting proposals generated by our method yields an increase in accuracy from 25.4% to 42.6%., Comment: Accepted for publication in KI - Künstliche Intelligenz
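The selection step in an active-learning loop like the one described can be sketched in a few lines. The criterion shown, one minus the maximum class probability ("least confidence"), is only one of the uncertainty measures the paper discusses alongside EMOC.

```python
def select_for_labeling(probs_per_sample, k):
    """Rank unlabeled samples by model uncertainty (1 - max class
    probability) and return indices of the k most uncertain samples,
    i.e., the ones worth sending to a human annotator."""
    uncertainty = [1.0 - max(p) for p in probs_per_sample]
    ranked = sorted(range(len(probs_per_sample)),
                    key=lambda i: uncertainty[i], reverse=True)
    return ranked[:k]
```

After the annotator labels the selected samples, an incremental update on just those samples closes the loop, which is the combination the paper studies.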
- Published
- 2020
- Full Text
- View/download PDF
38. MineCap: super incremental learning for detecting and blocking cryptocurrency mining on software-defined networking
- Author
-
Natalia Castro Fernandes, Martin Andreoni Lopez, Diogo M. F. Mattos, and Helio N. Cunha Neto
- Subjects
Cryptocurrency ,business.industry ,Computer science ,Online processing ,Machine learning ,computer.software_genre ,Flow network ,Network interface controller ,Covert ,Incremental learning ,Artificial intelligence ,Electrical and Electronic Engineering ,business ,Software-defined networking ,computer ,Classifier (UML) - Abstract
Covert mining of cryptocurrency consumes valuable computing resources and entails high energy consumption. In this paper, we propose MineCap, a dynamic online mechanism for detecting and blocking covert cryptocurrency-mining flows using machine learning on software-defined networking. The proposed mechanism relies on Spark Streaming for online processing of network flows and, when it identifies a mining flow, requests the network controller to block the flow. We also propose a learning technique called super incremental learning, a variant of the super learner applied to online learning, which takes the classification probabilities of an ensemble of classifiers as features for an incremental learning classifier. Hence, we design an accurate mechanism to classify mining flows that learns from incoming data, achieving on average 98% accuracy, 99% precision, 97% sensitivity, and 99.9% specificity, while avoiding concept drift-related issues.
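The "super incremental" construction above is a stacking idea: base classifiers' probability outputs become the feature vector of a final incremental classifier. A minimal sketch follows, with the base models as stand-in callables returning a mining probability and a simple online perceptron as the meta-learner (the paper's actual base and meta learners differ).

```python
class SuperIncremental:
    """Stacking sketch: meta-features are the ensemble's class
    probabilities; the meta-learner updates online, one flow at a time."""

    def __init__(self, base_models, lr=0.1):
        self.base_models = base_models          # each: flow -> P(mining)
        self.w = [0.0] * len(base_models)
        self.b = 0.0
        self.lr = lr

    def _meta_features(self, flow):
        return [m(flow) for m in self.base_models]

    def partial_fit(self, flow, label):         # label: 1 mining, 0 benign
        z = self._meta_features(flow)
        pred = 1 if sum(wi * zi for wi, zi in zip(self.w, z)) + self.b > 0 else 0
        err = label - pred
        if err:                                  # perceptron update on error
            for i, zi in enumerate(z):
                self.w[i] += self.lr * err * zi
            self.b += self.lr * err

    def predict(self, flow):
        z = self._meta_features(flow)
        return 1 if sum(wi * zi for wi, zi in zip(self.w, z)) + self.b > 0 else 0
```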
- Published
- 2020
- Full Text
- View/download PDF
39. Incremental Learning in Deep Convolutional Neural Networks Using Partial Network Sharing
- Author
-
Syed Shakib Sarwar, Aayush Ankit, and Kaushik Roy
- Subjects
FOS: Computer and information sciences ,Scheme (programming language) ,General Computer Science ,Computer science ,Computer Vision and Pattern Recognition (cs.CV) ,lifelong learning ,Computer Science - Computer Vision and Pattern Recognition ,02 engineering and technology ,Machine learning ,computer.software_genre ,Convolutional neural network ,energy-efficient learning ,Reduction (complexity) ,Set (abstract data type) ,0202 electrical engineering, electronic engineering, information engineering ,General Materials Science ,Incremental learning ,computer.programming_language ,Contextual image classification ,business.industry ,catastrophic forgetting ,020208 electrical & electronic engineering ,Supervised learning ,General Engineering ,020201 artificial intelligence & image processing ,lcsh:Electrical engineering. Electronics. Nuclear engineering ,Artificial intelligence ,network sharing ,Transfer of learning ,business ,lcsh:TK1-9971 ,computer ,Efficient energy use - Abstract
Deep convolutional neural network (DCNN) based supervised learning is a widely practiced approach for large-scale image classification. However, retraining these large networks to accommodate new, previously unseen data demands substantial computational time and energy. Moreover, previously seen training samples may not be available at the time of retraining. We propose an efficient training methodology and an incrementally growing DCNN that learns new tasks while sharing part of the base network. Our proposed methodology is inspired by transfer learning techniques, although it does not forget previously learned tasks. An updated network for learning a new set of classes is formed using the previously learned convolutional layers (shared from the initial part of the base network) with the addition of a few new convolutional kernels in the later layers of the network. We employ a 'clone-and-branch' technique which allows the network to learn new tasks one after another without any performance loss on old tasks. We evaluated the proposed scheme on several recognition applications. The classification accuracy achieved by our approach is comparable to the regular incremental learning approach (where networks are updated with new training samples only, without any network sharing), while achieving energy efficiency and reductions in storage requirements, memory accesses, and training time., Comment: 18 pages, 13 figures. IEEE Access 2019
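The structural idea above, frozen shared early layers plus a separately trained branch per task, can be sketched without any deep-learning framework. Here the shared backbone is a fixed feature function and each branch is a nearest-prototype head; this only illustrates why old tasks cannot be forgotten (their branches are never touched), not the paper's actual training procedure.

```python
class SharedBackboneNet:
    """Sketch of partial network sharing: one frozen backbone, one
    independently learned head per task."""

    def __init__(self, backbone):
        self.backbone = backbone   # frozen shared layers (a callable here)
        self.heads = {}            # task name -> {label: prototype vector}

    def learn_task(self, name, labeled_samples):
        feats_by_label = {}
        for x, label in labeled_samples:
            feats_by_label.setdefault(label, []).append(self.backbone(x))
        # head = per-class mean of backbone features; old heads untouched
        self.heads[name] = {
            lbl: [sum(col) / len(feats) for col in zip(*feats)]
            for lbl, feats in feats_by_label.items()
        }

    def predict(self, name, x):
        f = self.backbone(x)
        head = self.heads[name]
        dist = lambda p: sum((a - b) ** 2 for a, b in zip(p, f))
        return min(head, key=lambda lbl: dist(head[lbl]))
```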
- Published
- 2020
- Full Text
- View/download PDF
40. Complex Emotion Profiling: An Incremental Active Learning Based Approach With Sparse Annotations
- Author
-
Selvarajah Thuseethan, John Yearwood, and Sutharshan Rajasegarar
- Subjects
sparse data ,Active learning ,General Computer Science ,Computer science ,media_common.quotation_subject ,Emotion classification ,02 engineering and technology ,Anger ,Machine learning ,computer.software_genre ,020204 information systems ,Perception ,emotion recognition ,0202 electrical engineering, electronic engineering, information engineering ,complex emotions ,Profiling (information science) ,General Materials Science ,GeneralLiterature_REFERENCE(e.g.,dictionaries,encyclopedias,glossaries) ,media_common ,incremental learning ,business.industry ,General Engineering ,Disgust ,Sadness ,Surprise ,Incremental learning ,020201 artificial intelligence & image processing ,Artificial intelligence ,lcsh:Electrical engineering. Electronics. Nuclear engineering ,business ,computer ,lcsh:TK1-9971 - Abstract
Generally, in-the-wild emotions are complex in nature. They often occur as combinations of multiple basic emotions, such as fear, happiness, disgust, anger, sadness and surprise. Unlike the basic emotions, annotation of complex emotions, such as pain, is a time-consuming and expensive exercise. Moreover, there is an increasing demand for profiling such complex emotions, as they are useful in many real-world application domains, such as medicine, psychology, security and computer science. Traditional emotion recognition systems require a significant amount of annotated training samples to understand complex emotions, which limits the direct applicability of those methods to complex emotion detection from images and videos. It is therefore important to learn the profile of in-the-wild complex emotions accurately using limited annotated samples. In this paper, we propose a deep framework to incrementally and actively profile in-the-wild complex emotions from sparse data. Our approach consists of three major components, namely a pre-processing unit, an optimization unit and an active learning unit. The pre-processing unit removes the variations present in complex emotion images extracted from an uncontrolled environment. Our novel incremental active learning algorithm, together with the optimization unit, effectively predicts the complex emotions present in the wild. Evaluation on multiple complex-emotion benchmark datasets reveals that our proposed approach performs close to human perception capability in profiling complex emotions. Further, our proposed approach shows a significant performance enhancement in comparison with state-of-the-art deep networks and other benchmark complex emotion profiling approaches.
- Published
- 2020
41. Confidence Calibration for Incremental Learning
- Author
-
Yeongwoo Nam, Yeonsik Jo, Dongmin Kang, and Jonghyun Choi
- Subjects
General Computer Science ,Computer science ,Calibration (statistics) ,Sample (statistics) ,02 engineering and technology ,010501 environmental sciences ,Machine learning ,computer.software_genre ,01 natural sciences ,Task (project management) ,Margin (machine learning) ,0202 electrical engineering, electronic engineering, information engineering ,General Materials Science ,Set (psychology) ,continual learning ,Incremental learning ,0105 earth and related environmental sciences ,Class (computer programming) ,Forgetting ,business.industry ,General Engineering ,confidence calibration ,Memory management ,020201 artificial intelligence & image processing ,lcsh:Electrical engineering. Electronics. Nuclear engineering ,Artificial intelligence ,business ,lcsh:TK1-9971 ,computer - Abstract
Class incremental learning is an online learning paradigm wherein the classes to be recognized are gradually increased under limited memory, storing only a partial set of examples of past tasks. At a task transition, we observe an unintentional imbalance of confidence or likelihood between the classes of the past and the new task. We argue that this imbalance aggravates catastrophic forgetting in class incremental learning. We propose a simple yet effective learning objective to balance the confidence of the classes of old tasks and the new task in the class incremental learning setup. In addition, we compare various sample-memory configuration strategies and propose a novel sample memory management policy to further alleviate forgetting. The proposed method outperforms the state of the art on many evaluation metrics, including accuracy and forgetting $F$, by a large margin (up to 5.71% in $A_{10}$ and 17.1% in $F_{10}$) in extensive empirical validations on multiple visual recognition datasets such as CIFAR100, TinyImageNet and a subset of the ImageNet.
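The confidence imbalance described above can be made concrete with a post-hoc rescaling sketch: right after a task transition, new-task logits are systematically larger than old-task logits, and rescaling the new group by the ratio of the two groups' mean magnitudes removes that bias. Note this heuristic is only an illustration of the intuition; the paper instead balances confidence through its training objective, not by post-hoc scaling.

```python
import math

def softmax(zs):
    m = max(zs)
    exps = [math.exp(z - m) for z in zs]
    s = sum(exps)
    return [e / s for e in exps]

def balance_confidence(logits, old_idx, new_idx):
    """Rescale new-task logits so their mean magnitude matches the
    old-task logits, then return calibrated class probabilities."""
    mean_old = sum(abs(logits[i]) for i in old_idx) / len(old_idx)
    mean_new = sum(abs(logits[i]) for i in new_idx) / len(new_idx)
    scale = mean_old / mean_new if mean_new else 1.0
    out = list(logits)
    for i in new_idx:
        out[i] *= scale
    return softmax(out)
```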
- Published
- 2020
- Full Text
- View/download PDF
42. Class-Incremental Learning With Deep Generative Feature Replay for DNA Methylation-Based Cancer Classification
- Author
-
Erdenebileg Batbaatar, Van Huy Pham, Lkhagvadorj Munkhdalai, Kwang Ho Park, Tsatsral Amarbayasgalan, Khishigsuren Davagdorj, and Keun Ho Ryu
- Subjects
0301 basic medicine ,Information privacy ,General Computer Science ,Computer science ,Lifelong learning ,Feature extraction ,Feature selection ,010501 environmental sciences ,Machine learning ,computer.software_genre ,01 natural sciences ,Data modeling ,Computational biology ,03 medical and health sciences ,class-incremental learning ,Feature (machine learning) ,variational autoencoder ,General Materials Science ,continual learning ,0105 earth and related environmental sciences ,business.industry ,Deep learning ,General Engineering ,deep learning ,deep generative model ,Autoencoder ,Generative model ,030104 developmental biology ,Incremental learning ,lcsh:Electrical engineering. Electronics. Nuclear engineering ,Artificial intelligence ,business ,lcsh:TK1-9971 ,computer - Abstract
Developing lifelong learning algorithms is essential for computational systems biology. Recently, many studies have shown how to extract biologically relevant information from high-dimensional data to understand the complexity of cancer by leveraging deep learning (DL). Unfortunately, the number of cancer types has grown into the hundreds, which makes them difficult to classify efficiently. Moreover, the current state-of-the-art continual learning (CL) methods are not designed for the dynamic characteristics of high-dimensional data, and data security and privacy are among the main issues in the biomedical field. This article addresses three practical challenges for class-incremental learning (Class-IL): data privacy, high dimensionality, and incremental learning. To solve these, we propose a novel continual learning approach, called Deep Generative Feature Replay (DGFR), for cancer classification tasks. DGFR consists of an incremental feature selection (IFS) module and a scholar network (SN). IFS selects the most significant CpG sites from the high-dimensional data; we investigate different dimensions to find an optimal number of selected CpG sites. The SN employs a deep generative model for generating pseudo data without accessing past samples, and a neural network classifier for predicting cancer types. We use a variational autoencoder (VAE), which has been successfully applied to this research field in previous works. All networks are sequentially trained on multiple tasks in the Class-IL setting. We evaluated the proposed method on publicly available DNA methylation data. The experimental results show that the proposed DGFR achieves significantly superior cancer classification quality compared with various state-of-the-art methods in terms of accuracy.
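The generative-replay loop above, keep a generator per class instead of the patients' samples, then sample pseudo-features when learning a new class, can be sketched with a trivially simple generator. Here each class is modeled by independent per-dimension Gaussians; the paper uses a VAE, so this is only an illustration of the privacy-preserving replay mechanism, not of the model.

```python
import random

class FeatureReplay:
    """Keep a tiny per-class generative model over feature vectors and
    sample pseudo-features on demand; no raw past samples are stored."""

    def __init__(self, seed=0):
        self.rng = random.Random(seed)
        self.gens = {}  # class label -> (per-dim means, per-dim stds)

    def remember(self, label, feats):
        cols = list(zip(*feats))
        means = [sum(c) / len(c) for c in cols]
        stds = [max(1e-6, (sum((v - m) ** 2 for v in c) / len(c)) ** 0.5)
                for c, m in zip(cols, means)]
        self.gens[label] = (means, stds)   # raw feats can now be discarded

    def replay(self, label, n):
        means, stds = self.gens[label]
        return [[self.rng.gauss(m, s) for m, s in zip(means, stds)]
                for _ in range(n)]
```

When a new cancer class arrives, the classifier is trained on the new real samples mixed with `replay(...)` draws for every old class, which is what protects old-class accuracy without retaining patient data.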
- Published
- 2020
- Full Text
- View/download PDF
43. Visual focus of attention estimation based on improved hybrid incremental dynamic Bayesian network
- Author
-
Xue-feng Chen, Chen Xu, Yuan Luo, Xing-yao Liu, Ting-kai Fan, and Yi Zhang
- Subjects
Computer science ,media_common.quotation_subject ,02 engineering and technology ,Machine learning ,computer.software_genre ,01 natural sciences ,Adaptability ,010309 optics ,020210 optoelectronics & photonics ,Deflection (engineering) ,Robustness (computer science) ,0103 physical sciences ,0202 electrical engineering, electronic engineering, information engineering ,Electrical and Electronic Engineering ,Dynamic Bayesian network ,media_common ,business.industry ,Conditional probability ,Regression analysis ,Condensed Matter Physics ,Gaze ,Atomic and Molecular Physics, and Optics ,Electronic, Optical and Magnetic Materials ,Incremental learning ,Artificial intelligence ,business ,computer - Abstract
In this paper, a visual focus of attention (VFOA) detection method based on an improved hybrid incremental dynamic Bayesian network (IHIDBN), constructed by fusing head, gaze and prediction sub-models, is proposed to address the complexity and uncertainty of dynamic scenes. First, the gaze detection sub-model is improved over the traditional human eye model to enhance the recognition rate and robustness across different detected subjects. Second, the related sub-models are described, and conditional probability is used to establish a regression model for each. An incremental learning method is also used to dynamically update the parameters to improve the adaptability of the model. The method has been evaluated on two public datasets and in daily-life experiments. The results show that the proposed method can effectively estimate VFOA for a user, and is robust to free head deflection and changes in distance.
- Published
- 2020
- Full Text
- View/download PDF
44. Overcoming the Barriers That Obscure the Interlinking and Analysis of Clinical Data Through Harmonization and Incremental Learning
- Author
-
Salvatore De Vita, Fanis G. Kalatzis, Themis P. Exarchos, E. Zampeli, Dimitrios I. Fotiadis, Konstantina Kourou, Andreas V. Goules, Fotini N. Skopouli, Saviana Gandolfo, Chiara Baldini, Athanasios G. Tzioufas, and Vasileios C. Pezoulas
- Subjects
Computer science ,Computer applications to medicine. Medical informatics ,R858-859.7 ,data harmonization ,Machine learning ,computer.software_genre ,federated data analytics ,03 medical and health sciences ,0302 clinical medicine ,Medical technology ,R855-855.5 ,030304 developmental biology ,Semantic matching ,data curation ,incremental learning ,0303 health sciences ,Data curation ,business.industry ,Data sharing ,Workflow ,Analytics ,Data quality ,Hyperparameter optimization ,Data analysis ,Artificial intelligence ,business ,computer ,030217 neurology & neurosurgery - Abstract
Goal: To present a framework for data sharing, curation, harmonization and federated data analytics to solve open issues in healthcare, such as the development of robust disease prediction models. Methods: Data curation is applied to remove data inconsistencies. Lexical and semantic matching methods are used to align the structure of the heterogeneous, curated cohort data, along with incremental learning algorithms, including class-imbalance handling and hyperparameter optimization, to enable the development of disease prediction models. Results: The applicability of the framework is demonstrated in a case study of primary Sjögren's Syndrome, yielding harmonized data with increased quality and more than 85% agreement, along with lymphoma prediction models with more than 80% sensitivity and specificity. Conclusions: The framework provides data quality, harmonization and analytics workflows that can enhance the statistical power of heterogeneous clinical data and enables the development of robust models for disease prediction.
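The lexical-matching step of harmonization mentioned in Methods can be sketched with the standard library: align the column names of two cohorts by string similarity and keep only pairs above a threshold. The threshold value and the use of `difflib` are illustrative assumptions; the framework also applies semantic matching, which is not shown.

```python
import difflib

def lexical_match(cols_a, cols_b, threshold=0.6):
    """Map each column name in cohort A to its most lexically similar
    column in cohort B, keeping only matches above the threshold."""
    mapping = {}
    for a in cols_a:
        scored = [(difflib.SequenceMatcher(None, a.lower(), b.lower()).ratio(), b)
                  for b in cols_b]
        score, best = max(scored)
        if score >= threshold:
            mapping[a] = best
    return mapping
```

Unmatched columns would then fall through to semantic matching or manual curation, keeping the automated step conservative.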
- Published
- 2020
- Full Text
- View/download PDF
45. Exemplar-Supported Representation for Effective Class-Incremental Learning
- Author
-
Lei Guo, Gang Xie, Xinying Xu, and Jinchang Ren
- Subjects
Exemplar-based subspace clustering ,General Computer Science ,Computer science ,TK ,02 engineering and technology ,Machine learning ,computer.software_genre ,01 natural sciences ,010305 fluids & plasmas ,0103 physical sciences ,0202 electrical engineering, electronic engineering, information engineering ,General Materials Science ,incremental learning ,Forgetting ,business.industry ,General Engineering ,memory aware synapses ,image recognition ,ComputingMethodologies_PATTERNRECOGNITION ,Incremental learning ,020201 artificial intelligence & image processing ,lcsh:Electrical engineering. Electronics. Nuclear engineering ,Artificial intelligence ,business ,lcsh:TK1-9971 ,computer ,Classifier (UML) ,Feature learning - Abstract
Catastrophic forgetting is a key challenge for class-incremental learning with deep neural networks, where performance decreases considerably when dealing with long sequences of new classes. To tackle this issue, in this paper, we propose a new exemplar-supported representation for incremental learning (ESRIL) approach that consists of three components. First, we use memory aware synapses (MAS) pre-trained on ImageNet to retain the ability of robust representation learning and classification for old classes from the perspective of the model. Second, exemplar-based subspace clustering (ESC) is utilized to construct the exemplar set, which can maintain performance from various views of the data. Third, the nearest class multiple centroids (NCMC) is used as the classifier to save the training cost of the fully connected layer of MAS when the criterion is met. Extensive experiments and analyses are presented to show the influence of various backbone structures and the effectiveness of the different components of our model. Experiments on several general-purpose and fine-grained image recognition datasets have fully demonstrated the efficacy of the proposed methodology.
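The NCMC classifier named as the third component is straightforward to sketch: each class keeps several centroids, and a sample takes the label of its closest centroid. In the paper the centroids come from exemplar-based subspace clustering over deep features; here they are supplied directly, purely for illustration.

```python
class NCMC:
    """Nearest-class-multiple-centroids sketch: classification is a
    nearest-neighbor search over per-class centroids, so adding a class
    is just adding its centroids -- no classifier retraining."""

    def __init__(self):
        self.centroids = []  # list of (label, centroid vector)

    def add_class(self, label, centroids):
        self.centroids.extend((label, c) for c in centroids)

    def predict(self, x):
        def sq_dist(c):
            return sum((a - b) ** 2 for a, b in zip(c, x))
        return min(self.centroids, key=lambda lc: sq_dist(lc[1]))[0]
```

Because prediction depends only on stored centroids, incrementally adding a class cannot disturb decisions among the old classes' centroids, which is why it pairs well with an exemplar memory.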
- Published
- 2020
- Full Text
- View/download PDF
46. Challenges in Task Incremental Learning for Assistive Robotics
- Author
-
Rosa H. M. Chan, Qi She, Xuesong Shi, Yimin Zhang, and Fan Feng
- Subjects
0209 industrial biotechnology ,robotic vision systems ,Forgetting ,General Computer Science ,business.industry ,Computer science ,General Engineering ,Cognitive neuroscience of visual object recognition ,02 engineering and technology ,010502 geochemistry & geophysics ,Machine learning ,computer.software_genre ,01 natural sciences ,Task (project management) ,020901 industrial engineering & automation ,Machine intelligence ,Incremental learning ,General Materials Science ,Artificial intelligence ,lcsh:Electrical engineering. Electronics. Nuclear engineering ,business ,Set (psychology) ,computer ,lcsh:TK1-9971 ,0105 earth and related environmental sciences - Abstract
Recent breakthroughs in computer vision areas ranging from detection and segmentation to classification rely on the availability of large-scale representative training datasets. Yet robotic vision poses new challenges for applying visual algorithms developed from these datasets, because the latter implicitly assume a fixed set of categories and a time-invariant distribution of tasks. In practice, assistive robots should be able to operate in dynamic environments with everyday changes. Variations in four commonly observed factors, including illumination, occlusion, camera-object distance/angle and clutter, can make lifelong/continual learning in computer vision more challenging. Large-scale datasets previously made publicly available were relatively simple and rarely included such real-world challenges in data collection. Benefiting from the recently released OpenLORIS-Object dataset, which explicitly includes these real-world challenges in the lifelong object recognition task, we evaluate the three most widely adopted regularization methods in lifelong/continual learning (Learning without Forgetting, Elastic Weight Consolidation, and Synaptic Intelligence). Their performance was compared with the naive and cumulative training modes as the lower and upper bounds of performance, respectively. The experiments conducted on the dataset focused on task incremental learning, i.e., incremental difficulty based on the four environmental factors. However, all three algorithms failed as the number of encountered batches increased, with performance across various metrics indistinguishable from the naive training mode. Our results highlight the current challenges in lifelong object recognition for assistive robots operating in real-world dynamic scenes.
- Published
- 2020
47. Incremental Learning of Latent Forests
- Author
-
Pedro Larrañaga, Fernando Rodriguez-Sanchez, and Concha Bielza
- Subjects
General Computer Science ,Computer science ,Test data generation ,02 engineering and technology ,Latent variable ,Machine learning ,computer.software_genre ,hidden variables ,01 natural sciences ,010104 statistics & probability ,Cardinality ,0202 electrical engineering, electronic engineering, information engineering ,Code (cryptography) ,General Materials Science ,Fraction (mathematics) ,latent tree model ,0101 mathematics ,Latent variable model ,Informática ,variational Bayes ,business.industry ,General Engineering ,Process (computing) ,Tree (data structure) ,Incremental learning ,020201 artificial intelligence & image processing ,Artificial intelligence ,lcsh:Electrical engineering. Electronics. Nuclear engineering ,business ,computer ,lcsh:TK1-9971 - Abstract
In the analysis of real-world data, it is useful to learn a latent variable model that represents the data generation process. In this setting, latent tree models are useful because they are able to capture complex relationships while remaining easily interpretable. In this paper, we propose two incremental algorithms for learning forests of latent trees. Unlike current methods, the proposed algorithms are based on the variational Bayesian framework, which allows them to introduce uncertainty into the learning process and to work with mixed data. The first algorithm, incremental learner, determines the forest structure and the cardinality of its latent variables in an iterative search process. The second algorithm, constrained incremental learner, modifies the previous method by considering only a subset of the most prominent structures in each step of the search. Although restricting each iteration to a fixed number of candidate models limits the search space, we demonstrate that the second algorithm returns almost identical results at a small fraction of the computational cost. We compare our algorithms with existing methods in a comparative study using both discrete and continuous real-world data. In addition, we demonstrate the effectiveness of the proposed algorithms by applying them to data from the 2018 Spanish Living Conditions Survey. All code, data, and results are available at https://github.com/ferjorosa/incremental-latent-forests .
- Published
- 2020
48. Adaptive Online Learning With Regularized Kernel for One-Class Classification
- Author
-
Aruna Tiwari, Chandan Gautam, Sundaram Suresh, and Kapil Ahuja
- Subjects
Computer science ,02 engineering and technology ,Machine learning ,computer.software_genre ,Kernel (linear algebra) ,0202 electrical engineering, electronic engineering, information engineering ,One-class classification ,Electrical and Electronic Engineering ,Extreme learning machine ,business.industry ,020208 electrical & electronic engineering ,Computer Science Applications ,Human-Computer Interaction ,Support vector machine ,ComputingMethodologies_PATTERNRECOGNITION ,Hyperplane ,Control and Systems Engineering ,Kernel (statistics) ,Outlier ,Incremental learning ,Benchmark (computing) ,020201 artificial intelligence & image processing ,Anomaly detection ,Artificial intelligence ,business ,computer ,Software - Abstract
In the past few years, the kernel-based one-class extreme learning machine (ELM) has received considerable attention from researchers for offline/batch learning due to its noniterative and fast learning capability. This paper extends that concept to adaptive online learning with regularized kernel-based one-class ELM classifiers for outlier detection, collectively referred to as ORK-OCELM. Two frameworks, viz., boundary and reconstruction, are presented to detect the target class in ORK-OCELM. Whereas the kernel hyperplane-based baseline one-class ELM model considers the whole data in a single chunk, the proposed one-class classifiers are adapted in an online fashion from the stream of training samples. The performance of ORK-OCELM is evaluated on standard benchmark as well as synthetic datasets in both stationary and nonstationary environments. On stationary datasets, the classifiers are compared against batch-learning-based one-class classifiers; on nonstationary datasets, the comparison is done with incremental-learning-based online one-class classifiers. The results indicate that the proposed classifiers yield better or similar outcomes in both settings. The nonstationary evaluation also demonstrates the adaptability of the proposed classifiers to a changing environment. It is further shown that the proposed classifiers can handle large data streams even under limited system memory. Moreover, they achieve significant time improvements over traditional online one-class classifiers in all aspects of training and testing. This faster learning ability makes them more suitable for real-time anomaly detection.
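The online adaptation idea, updating a one-class boundary from a stream of target samples instead of a single batch, can be illustrated with a much-simplified stand-in: a running-mean detector with a distance threshold. This is only a sketch of the online-update pattern, not the paper's regularized kernel ELM, and the threshold is a made-up hyperparameter:

```python
import numpy as np

class OnlineOneClass:
    """Running-mean one-class detector: learns from a stream of
    target-class samples and flags points far from the mean. A
    deliberately simplified stand-in for online boundary-based
    one-class learning."""
    def __init__(self, threshold):
        self.threshold = threshold
        self.mean, self.n = None, 0

    def partial_fit(self, x):
        x = np.asarray(x, dtype=float)
        self.n += 1
        if self.mean is None:
            self.mean = x.copy()
        else:
            self.mean += (x - self.mean) / self.n  # incremental mean update
        return self

    def is_outlier(self, x):
        dist = np.linalg.norm(np.asarray(x, dtype=float) - self.mean)
        return bool(dist > self.threshold)

det = OnlineOneClass(threshold=2.0)
for sample in [[0, 0], [1, 0], [0, 1], [1, 1]]:
    det.partial_fit(sample)        # samples arrive one at a time
```

Because each update touches only the running statistics, memory stays constant regardless of stream length, which is the property the paper's classifiers exploit under limited system memory.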
- Published
- 2020
- Full Text
- View/download PDF
49. Class-Incremental Learning of Convolutional Neural Networks Based on Double Consolidation Mechanism
- Author
-
Hong Liang, Changsheng Yang, and Leilei Jin
- Subjects
Data stream ,General Computer Science ,Computer science ,Process (engineering) ,Knowledge engineering ,Class-incremental learning ,02 engineering and technology ,Machine learning ,computer.software_genre ,Convolutional neural network ,03 medical and health sciences ,0302 clinical medicine ,convolutional neural networks ,visual recognition ,0202 electrical engineering, electronic engineering, information engineering ,General Materials Science ,Forgetting ,business.industry ,General Engineering ,weight consolidation ,Class (biology) ,Statistical classification ,knowledge distillation ,Incremental learning ,Task analysis ,020201 artificial intelligence & image processing ,lcsh:Electrical engineering. Electronics. Nuclear engineering ,Artificial intelligence ,business ,lcsh:TK1-9971 ,computer ,030217 neurology & neurosurgery - Abstract
Class-incremental learning is a model-learning technique that helps classification models incrementally learn new target classes and accumulate knowledge. It has become a major concern of the machine learning and classification community. To overcome the catastrophic forgetting that occurs when a network is trained sequentially on a multi-class data stream, a double consolidation class-incremental learning (DCCIL) method is proposed. During incremental learning, the network parameters are adjusted by combining knowledge distillation and elastic weight consolidation, so that the network better maintains its recognition ability on the old classes while learning the new ones. Incremental learning experiments compare the proposed method with popular incremental learning methods such as EWC, LwF, and iCaRL. Experimental results show that the proposed DCCIL method achieves higher incremental accuracy than these popular algorithms, effectively improving the expansibility and intelligence of the classification model.
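The double consolidation idea combines two penalties on top of the usual cross-entropy: a knowledge-distillation term that keeps the new model's soft outputs close to the old model's, and an elastic-weight-consolidation term that anchors parameters the old tasks deemed important. A NumPy sketch of such a combined objective follows; the function name, weightings, temperature, and Fisher values are hypothetical, not the paper's settings:

```python
import numpy as np

def softmax(z, T=1.0):
    z = np.asarray(z, dtype=float) / T
    e = np.exp(z - z.max())
    return e / e.sum()

def double_consolidation_loss(logits_new, y_true, logits_old,
                              params, params_star, fisher,
                              T=2.0, lam_kd=1.0, lam_ewc=1.0):
    params = np.asarray(params, dtype=float)
    params_star = np.asarray(params_star, dtype=float)
    fisher = np.asarray(fisher, dtype=float)
    p = softmax(logits_new)
    ce = -np.log(p[y_true] + 1e-12)                 # new-class cross-entropy
    q_old = softmax(logits_old, T)                  # old model's soft targets
    q_new = softmax(logits_new, T)
    kd = -np.sum(q_old * np.log(q_new + 1e-12))     # knowledge distillation
    ewc = 0.5 * np.sum(fisher * (params - params_star) ** 2)  # EWC anchor
    return ce + lam_kd * kd + lam_ewc * ewc

theta_star = np.array([1.0, -0.5])   # weights after the old tasks
fisher = np.array([2.0, 0.5])        # made-up importance estimates
base = double_consolidation_loss([2.0, 0.5, 0.1], 0, [1.5, 0.7, 0.2],
                                 theta_star, theta_star, fisher)
drift = double_consolidation_loss([2.0, 0.5, 0.1], 0, [1.5, 0.7, 0.2],
                                  theta_star + 1.0, theta_star, fisher)
# Drifting every weight by 1.0 adds exactly 0.5 * sum(fisher) = 1.25
# to the loss, which is how consolidation discourages forgetting.
```

The EWC term rises quadratically as weights drift from their old-task values, while the distillation term penalizes output drift, so the two consolidations act on weights and predictions respectively.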
- Published
- 2020
- Full Text
- View/download PDF
50. An Open-Ended Continual Learning for Food Recognition Using Class Incremental Extreme Learning Machines
- Author
-
Ghalib Ahmed Tahir and Chu Kiong Loo
- Subjects
General Computer Science ,Computational complexity theory ,Computer science ,Feature extraction ,Inference ,Feature selection ,Machine learning ,computer.software_genre ,class incremental extreme learning machine ,open-ended continual learning ,General Materials Science ,adaptive class incremental extreme learning machine ,Forgetting ,business.industry ,Deep learning ,General Engineering ,deep learning ,adaptive reduced class incremental kernel extreme learning machine ,Catastrophic interference ,Incremental learning ,Food recognition ,Artificial intelligence ,lcsh:Electrical engineering. Electronics. Nuclear engineering ,business ,Transfer of learning ,computer ,lcsh:TK1-9971 - Abstract
State-of-the-art deep learning models for food recognition do not allow data-incremental learning and often suffer from catastrophic interference during class-incremental learning. This is an important issue in food recognition since real-world food datasets are open-ended and dynamic, involving a continuous increase in food samples and food classes. Model retraining is often carried out to cope with the dynamic nature of the data, but it demands high-end computational resources and significant time. This paper proposes a new open-ended continual learning framework that employs transfer learning on deep models for feature extraction, ReliefF for feature selection, and a novel adaptive reduced class incremental kernel extreme learning machine (ARCIKELM) for classification. Transfer learning is beneficial due to the high generalization ability of deep learning features. ReliefF reduces computational complexity by ranking and selecting the extracted features. The novel ARCIKELM classifier dynamically adjusts the network architecture to reduce catastrophic forgetting, and addresses domain adaptation when new samples of an existing class arrive. For comprehensive experiments, the model is evaluated against four standard food benchmarks and a recently collected Pakistani food dataset. Experimental results show that the proposed framework learns new classes incrementally with less catastrophic interference and adapts to domain changes while maintaining competitive classification performance.
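The classification stage of such a pipeline rests on a kernel extreme learning machine, which solves a regularized linear system over the kernel matrix rather than training iteratively; adding a class then amounts to appending a target column and re-solving. A toy sketch of that stage (far simpler than ARCIKELM, with made-up hyperparameters and toy features standing in for deep food features):

```python
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

class SimpleKELM:
    """Toy kernel ELM classifier: solve (K + I/C) beta = T once, with
    no iterative training. Adding a class appends a target column and
    re-solves; the real ARCIKELM adapts this far more carefully."""
    def __init__(self, C=10.0, gamma=1.0):
        self.C, self.gamma = C, gamma

    def fit(self, X, y, n_classes):
        self.X = np.asarray(X, dtype=float)
        n = len(y)
        T = -np.ones((n, n_classes))
        T[np.arange(n), y] = 1.0                    # one-vs-rest targets
        K = rbf_kernel(self.X, self.X, self.gamma)
        self.beta = np.linalg.solve(K + np.eye(n) / self.C, T)
        return self

    def predict(self, X):
        Kx = rbf_kernel(np.asarray(X, dtype=float), self.X, self.gamma)
        return (Kx @ self.beta).argmax(axis=1)

# Two well-separated toy clusters standing in for two food classes.
X = [[0.0, 0.0], [0.2, 0.1], [3.0, 3.0], [3.2, 2.9]]
y = [0, 0, 1, 1]
clf = SimpleKELM().fit(X, y, n_classes=2)
pred = clf.predict([[0.1, 0.0], [3.1, 3.0]])
```

Because training reduces to one linear solve, this family of classifiers avoids the costly full retraining the abstract criticizes, which is what makes it attractive for open-ended data.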
- Published
- 2020