20 results on '"Hoi, Steven C. H."'
Search Results
2. Distributed multi-task classification: a decentralized online learning approach
- Author
-
Zhang, Chi, Zhao, Peilin, Hao, Shuji, Soh, Yeng Chai, Lee, Bu Sung, Miao, Chunyan, and Hoi, Steven C. H.
- Published
- 2018
- Full Text
- View/download PDF
3. Collaborative topic regression for online recommender systems: an online and Bayesian approach
- Author
-
Liu, Chenghao, Jin, Tao, Hoi, Steven C. H., Zhao, Peilin, and Sun, Jianling
- Published
- 2017
- Full Text
- View/download PDF
4. BDUOL: Double Updating Online Learning on a Fixed Budget
- Author
-
Zhao, Peilin, and Hoi, Steven C. H. (editors: Hutchison, David, Kanade, Takeo, Kittler, Josef, Kleinberg, Jon M., Mattern, Friedemann, Mitchell, John C., Naor, Moni, Nierstrasz, Oscar, Pandu Rangan, C., Steffen, Bernhard, Sudan, Madhu, Terzopoulos, Demetri, Tygar, Doug, Vardi, Moshe Y., Weikum, Gerhard, Goebel, Randy, Siekmann, Jörg, Wahlster, Wolfgang, Flach, Peter A., De Bie, Tijl, and Cristianini, Nello)
- Published
- 2012
- Full Text
- View/download PDF
5. Online Passive-Aggressive Active learning
- Author
-
Lu, Jing, Zhao, Peilin, and Hoi, Steven C. H.
- Published
- 2016
- Full Text
- View/download PDF
6. Online Multiple Kernel Classification
- Author
-
Hoi, Steven C. H., Jin, Rong, Zhao, Peilin, and Yang, Tianbao
- Published
- 2013
- Full Text
- View/download PDF
7. PAMR: Passive aggressive mean reversion strategy for portfolio selection
- Author
-
Li, Bin, Zhao, Peilin, Hoi, Steven C. H., and Gopalkrishnan, Vivekanand
- Published
- 2012
- Full Text
- View/download PDF
8. A Unified Framework for Sparse Online Learning.
- Author
-
PEILIN ZHAO, DAYONG WANG, PENGCHENG WU, and HOI, STEVEN C. H.
- Subjects
ONLINE education, ONLINE algorithms, DATA mining, ANOMALY detection (Computer security), MACHINE learning, BIG data
- Abstract
The amount of data in our society has been exploding in the era of big data. This article aims to address several open challenges in big data stream classification. Many existing studies in data mining literature follow the batch learning setting, which suffers from low efficiency and poor scalability. To tackle these challenges, we investigate a unified online learning framework for the big data stream classification task. Different from the existing online data stream classification techniques, we propose a unified Sparse Online Classification (SOC) framework. Based on SOC, we derive a second-order online learning algorithm and a cost-sensitive sparse online learning algorithm, which could successfully handle online anomaly detection tasks with the extremely unbalanced class distribution. As the performance evaluation, we analyze the theoretical bounds of the proposed algorithms and conduct an extensive set of experiments. The encouraging experimental results demonstrate the efficacy of the proposed algorithms over the state-of-the-art techniques on multiple data stream classification tasks. [ABSTRACT FROM AUTHOR]
- Published
- 2020
- Full Text
- View/download PDF
9. Adaptive Cost-Sensitive Online Classification.
- Author
-
Zhao, Peilin, Zhang, Yifan, Wu, Min, Hoi, Steven C. H., Tan, Mingkui, and Huang, Junzhou
- Subjects
ADAPTIVE antennas, ANTENNAS (Electronics), ENVIRONMENTALLY sensitive areas, ONLINE banking research, TESTING
- Abstract
Cost-Sensitive Online Classification has drawn extensive attention in recent years, where the main approach is to directly optimize, online, two well-known cost-sensitive metrics: (i) weighted sum of sensitivity and specificity and (ii) weighted misclassification cost. However, existing methods consider only first-order information of the data stream. This is insufficient in practice, since many recent studies have proved that incorporating second-order information enhances the prediction performance of classification models. Thus, we propose a family of cost-sensitive online classification algorithms with adaptive regularization in this paper. We theoretically analyze the proposed algorithms and empirically validate their effectiveness and properties in extensive experiments. Then, for a better trade-off between performance and efficiency, we further introduce the sketching technique into our algorithms, which significantly accelerates computation with only a slight performance loss. Finally, we apply our algorithms to tackle several online anomaly detection tasks from the real world. Promising results prove that the proposed algorithms are effective and efficient in solving cost-sensitive online classification problems in various real-world domains. [ABSTRACT FROM AUTHOR]
- Published
- 2019
- Full Text
- View/download PDF
10. Large Scale Online Multiple Kernel Regression with Application to Time-Series Prediction.
- Author
-
SAHOO, DOYEN, HOI, STEVEN C. H., and BIN LI
- Subjects
FORECASTING, NONLINEAR regression, TIME series analysis, KERNEL functions, PRIOR learning
- Abstract
Kernel-based regression represents an important family of learning techniques for solving challenging regression tasks with non-linear patterns. Despite being studied extensively, most of the existing work suffers from two major drawbacks as follows: (i) they are often designed for solving regression tasks in a batch learning setting, making them not only computationally inefficient but also poorly scalable in real-world applications where data arrives sequentially; and (ii) they usually assume that a fixed kernel function is given prior to the learning task, which could result in poor performance if the chosen kernel is inappropriate. To overcome these drawbacks, this work presents a novel scheme of Online Multiple Kernel Regression (OMKR), which sequentially learns the kernel-based regressor in an online and scalable fashion, and dynamically explores a pool of multiple diverse kernels to avoid suffering from a single fixed poor kernel so as to remedy the drawback of manual/heuristic kernel selection. The OMKR problem is more challenging than regular kernel-based regression tasks since we have to determine on the fly both the optimal kernel-based regressor for each individual kernel and the best combination of the multiple kernel regressors. We propose a family of OMKR algorithms for regression and discuss their application to time series prediction tasks including application to AR, ARMA, and ARIMA time series. We develop novel approaches to make OMKR scalable for large datasets, to counter the problems arising from an unbounded number of support vectors. We also explore the effect of kernel combination at prediction level and at the representation level. Finally, we conduct extensive experiments to evaluate the empirical performance on both real-world regression and time series prediction tasks. [ABSTRACT FROM AUTHOR]
- Published
- 2019
- Full Text
- View/download PDF
11. Combination Forecasting Reversion Strategy for Online Portfolio Selection.
- Author
-
Huang, Dingjiang, Yu, Shunchang, Li, Bin, Hoi, Steven C. H., and Zhou, Shuigeng
- Subjects
PORTFOLIO management (Investments), MEAN reversion theory, ARTIFICIAL intelligence, MACHINE learning, DISTANCE education
- Abstract
Machine learning and artificial intelligence techniques have been applied to construct online portfolio selection strategies recently. A popular and state-of-the-art family of strategies is to explore the reversion phenomenon through online learning algorithms and statistical prediction models. Despite gaining promising results on some benchmark datasets, these strategies often adopt a single model based on a selection criterion (e.g., breakdown point) for predicting future price. However, such model selection is often unstable and may cause unnecessarily high variability in the final estimation, leading to poor prediction performance in real datasets and thus non-optimal portfolios. To overcome the drawbacks, in this article, we propose to exploit the reversion phenomenon by using combination forecasting estimators and design a novel online portfolio selection strategy, named Combination Forecasting Reversion (CFR), which outputs optimal portfolios based on the improved reversion estimator. We further present two efficient CFR implementations based on online Newton step (ONS) and online gradient descent (OGD) algorithms, respectively, and theoretically analyze their regret bounds, which guarantee that the online CFR model performs as well as the best CFR model in hindsight. We evaluate the proposed algorithms on various real markets with extensive experiments. Empirical results show that CFR can effectively overcome the drawbacks of existing reversion strategies and achieve the state-of-the-art performance. [ABSTRACT FROM AUTHOR]
- Published
- 2018
- Full Text
- View/download PDF
12. Online Active Learning with Expert Advice.
- Author
-
SHUJI HAO, PEIYING HU, PEILIN ZHAO, HOI, STEVEN C. H., and CHUNYAN MIAO
- Subjects
ACTIVE learning, DISTANCE education, DATA mining, MACHINE learning, COMPUTER security, CLOUD computing, SOCIAL media
- Abstract
In the literature, learning with expert advice methods usually assume that a learner always obtains the true label of every incoming training instance at the end of each trial. However, in many real-world applications, acquiring the true labels of all instances can be both costly and time consuming, especially for large-scale problems. For example, in social media, data streams usually arrive at high speed and volume, and it is nearly impossible and highly costly to label all of the instances. In this article, we address this problem with active learning with expert advice, where the ground truth of an instance is disclosed only when it is requested by the proposed active query strategies. Our goal is to minimize the number of requests while training an online learning model without sacrificing the performance. To address this challenge, we propose a framework of active forecasters, which attempts to extend two fully supervised forecasters, Exponentially Weighted Average Forecaster and Greedy Forecaster, to tackle the task of online active learning (OAL) with expert advice. Specifically, we propose two OAL with expert advice algorithms, named Active Exponentially Weighted Average Forecaster (AEWAF) and Active Greedy Forecaster (AGF), by considering the difference of expert advice. To further improve the robustness of the proposed AEWAF and AGF algorithms in noisy scenarios (where noisy experts exist), we also propose two robust active learning with expert advice algorithms, named Robust Active Exponentially Weighted Average Forecaster and Robust Active Greedy Forecaster. We validate the efficacy of the proposed algorithms by an extensive set of experiments in both normal scenarios (where all experts are comparably reliable) and noisy scenarios. [ABSTRACT FROM AUTHOR]
- Published
- 2018
- Full Text
- View/download PDF
13. Sparse Passive-Aggressive Learning for Bounded Online Kernel Methods.
- Author
-
Lu, Jing, Sahoo, Doyen, Zhao, Peilin, and Hoi, Steven C. H.
- Subjects
FACILITATED learning, MONTE Carlo method, INFORMATION storage & retrieval systems, INFORMATION science, INFORMATION organization
- Abstract
One critical deficiency of traditional online kernel learning methods is their unbounded and growing number of support vectors in the online learning process, making them inefficient and non-scalable for large-scale applications. Recent studies on scalable online kernel learning have attempted to overcome this shortcoming, e.g., by imposing a constant budget on the number of support vectors. Although they attempt to bound the number of support vectors at each online learning iteration, most of them fail to bound the number of support vectors for the final output hypothesis, which is often obtained by averaging the series of hypotheses over all the iterations. In this article, we propose a novel framework for bounded online kernel methods, named “Sparse Passive-Aggressive (SPA)” learning, which is able to yield a final output kernel-based hypothesis with a bounded number of support vectors. Unlike the common budget maintenance strategy used by many existing budget online kernel learning approaches, the idea of our approach is to attain the bounded number of support vectors using an efficient stochastic sampling strategy that samples an incoming training example as a new support vector with a probability proportional to its loss suffered. We theoretically prove that SPA achieves an optimal mistake bound in expectation, and we empirically show that it outperforms various budget online kernel learning algorithms. Finally, in addition to general online kernel learning tasks, we also apply SPA to derive bounded online multiple-kernel learning algorithms, which can significantly improve the scalability of traditional Online Multiple-Kernel Classification (OMKC) algorithms while achieving satisfactory learning accuracy as compared with the existing unbounded OMKC algorithms. [ABSTRACT FROM AUTHOR]
- Published
- 2018
- Full Text
- View/download PDF
14. Sparse Online Learning of Image Similarity.
- Author
-
Gao, Xingyu, Hoi, Steven C. H., Zhang, Yongdong, Zhou, Jianshe, Wan, Ji, Chen, Zhenyu, Li, Jintao, and Zhu, Jianke
- Subjects
*IMAGE analysis, *MULTIMEDIA systems, *CONTENT-based image retrieval, *BAG-of-words model (Computer science), *DATA analysis
- Abstract
Learning image similarity plays a critical role in real-world multimedia information retrieval applications, especially in Content-Based Image Retrieval (CBIR) tasks, in which an accurate retrieval of visually similar objects largely relies on an effective image similarity function. Crafting a good similarity function is very challenging because visual contents of images are often represented as feature vectors in high-dimensional spaces, for example, via bag-of-words (BoW) representations, and traditional rigid similarity functions, for example, cosine similarity, are often suboptimal for CBIR tasks. In this article, we address this fundamental problem, that is, learning to optimize image similarity with sparse and high-dimensional representations from large-scale training data, and propose a novel scheme of Sparse Online Learning of Image Similarity (SOLIS). In contrast to many existing image-similarity learning algorithms that are designed to work with low-dimensional data, SOLIS is able to learn image similarity from large-scale image data in sparse and high-dimensional spaces. Our encouraging results showed that the proposed new technique achieves highly competitive accuracy as compared to the state-of-the-art approaches but enjoys significant advantages in computational efficiency, model sparsity, and retrieval scalability, making it more practical for real-world multimedia retrieval applications. [ABSTRACT FROM AUTHOR]
- Published
- 2017
- Full Text
- View/download PDF
15. Robust Median Reversion Strategy for Online Portfolio Selection.
- Author
-
Dingjiang Huang, Junlong Zhou, Bin Li, Hoi, Steven C. H., and Shuigeng Zhou
- Subjects
MEDIAN (Mathematics), MEAN reversion theory, PORTFOLIO management (Investments), DATA mining, MACHINE learning, ELECTRONIC trading of securities
- Abstract
Online portfolio selection has attracted increasing attention from data mining and machine learning communities in recent years. An important theory in financial markets is mean reversion, which plays a critical role in some state-of-the-art portfolio selection strategies. Although existing mean reversion strategies have been shown to achieve good empirical performance on certain datasets, they seldom carefully deal with noise and outliers in the data, leading to suboptimal portfolios, and consequently yielding poor performance in practice. In this paper, we propose to exploit the reversion phenomenon by using robust $L_1$-median estimators, and design a novel online portfolio selection strategy named “Robust Median Reversion” (RMR), which constructs optimal portfolios based on the improved reversion estimator. We examine the performance of the proposed algorithms on various real markets with extensive experiments. Empirical results show that RMR can overcome the drawbacks of existing mean reversion algorithms and achieve significantly better results. Finally, RMR runs in linear time, and thus is suitable for large-scale real-time algorithmic trading applications. [ABSTRACT FROM PUBLISHER]
- Published
- 2016
- Full Text
- View/download PDF
16. Online Multi-Modal Distance Metric Learning with Application to Image Retrieval.
- Author
-
Wu, Pengcheng, Hoi, Steven C. H., Zhao, Peilin, Miao, Chunyan, and Liu, Zhi-Yong
- Subjects
*IMAGE retrieval, *SEARCH algorithms, *DISTANCE education, *FEATURE extraction, *MACHINE learning
- Abstract
Distance metric learning (DML) is an important technique to improve similarity search in content-based image retrieval. Despite being studied extensively, most existing DML approaches typically adopt a single-modal learning framework that learns the distance metric on either a single feature type or a combined feature space where multiple types of features are simply concatenated. Such single-modal DML methods suffer from some critical limitations: (i) some type of features may significantly dominate the others in the DML task due to diverse feature representations; and (ii) learning a distance metric on the combined high-dimensional feature space can be extremely time-consuming using the naive feature concatenation approach. To address these limitations, in this paper, we investigate a novel scheme of online multi-modal distance metric learning (OMDML), which explores a unified two-level online learning scheme: (i) it learns to optimize a distance metric on each individual feature space; and (ii) then it learns to find the optimal combination of diverse types of features. To further reduce the expensive cost of DML on high-dimensional feature space, we propose a low-rank OMDML algorithm which not only significantly reduces the computational cost but also retains highly competing or even better learning accuracy. We conduct extensive experiments to evaluate the performance of the proposed algorithms for multi-modal image retrieval, in which encouraging results validate the effectiveness of the proposed technique. [ABSTRACT FROM AUTHOR]
- Published
- 2016
- Full Text
- View/download PDF
17. LIBOL: A Library for Online Learning Algorithms.
- Author
-
Hoi, Steven C. H., Jialei Wang, and Peilin Zhao
- Subjects
*MACHINE learning, *ALGORITHMS, *OPEN source software, *ONLINE library catalogs, *COMPUTER users, *COMMAND-line interfaces
- Abstract
LIBOL is an open-source library for large-scale online learning, which consists of a large family of efficient and scalable state-of-the-art online learning algorithms for large-scale online classification tasks. We have offered easy-to-use command-line tools and examples for users and developers, and also have made comprehensive documents available for both beginners and advanced users. LIBOL is not only a machine learning toolbox, but also a comprehensive experimental platform for conducting online learning research. [ABSTRACT FROM AUTHOR]
- Published
- 2014
18. Confidence Weighted Mean Reversion Strategy for Online Portfolio Selection.
- Author
-
BIN LI, HOI, STEVEN C. H., PEILIN ZHAO, and GOPALKRISHNAN, VIVEKANAND
- Subjects
DATA mining, MACHINE learning, INFORMATION storage & retrieval systems, EMPIRICAL research, DISTANCE education, ALGORITHMS, CODING theory
- Abstract
Online portfolio selection has been attracting increasing attention from the data mining and machine learning communities. All existing online portfolio selection strategies focus on the first-order information of a portfolio vector, though the second-order information may also be beneficial to a strategy. Moreover, empirical evidence shows that relative stock prices may follow the mean reversion property, which has not been fully exploited by existing strategies. This article proposes a novel online portfolio selection strategy named Confidence Weighted Mean Reversion (CWMR). Inspired by the mean reversion principle in finance and the confidence weighted online learning technique in machine learning, CWMR models the portfolio vector as a Gaussian distribution, and sequentially updates the distribution by following the mean reversion trading principle. CWMR's closed-form updates clearly reflect the mean reversion trading idea. We also present several variants of CWMR algorithms, including a CWMR mixture algorithm that is theoretically universal. Empirically, the CWMR strategy is able to effectively exploit the power of mean reversion for online portfolio selection. Extensive experiments on various real markets show that the proposed strategy is superior to the state-of-the-art techniques. The experimental testbed including source codes and data sets is available online. [ABSTRACT FROM AUTHOR]
- Published
- 2013
- Full Text
- View/download PDF
19. Double Updating Online Learning.
- Author
-
Zhao, Peilin, Hoi, Steven C. H., and Jin, Rong
- Subjects
*DISTANCE education, *KERNEL functions, *ALGORITHMS, *SUPPORT vector machines, *MACHINE learning, *INFORMATION retrieval, *SET theory, *INFORMATION technology
- Abstract
In most kernel based online learning algorithms, when an incoming instance is misclassified, it will be added into the pool of support vectors and assigned with a weight, which often remains unchanged during the rest of the learning process. This is clearly insufficient since when a new support vector is added, we generally expect the weights of the other existing support vectors to be updated in order to reflect the influence of the added support vector. In this paper, we propose a new online learning method, termed Double Updating Online Learning, or DUOL for short, that explicitly addresses this problem. Instead of only assigning a fixed weight to the misclassified example received at the current trial, the proposed online learning algorithm also tries to update the weight for one of the existing support vectors. We show that the mistake bound can be improved by the proposed online learning method. We conduct an extensive set of empirical evaluations for both binary and multi-class online learning tasks. The experimental results show that the proposed technique is considerably more effective than the state-of-the-art online learning algorithms. The source code is available to public at http://www.cais.ntu.edu.sg/~chhoi/DUOL/. [ABSTRACT FROM AUTHOR]
- Published
- 2011
20. Detecting cyberattacks in industrial control systems using online learning algorithms.
- Author
-
Li, Guangxia, Shen, Yulong, Zhao, Peilin, Lu, Xiao, Liu, Jia, Liu, Yangyang, and Hoi, Steven C. H.
- Subjects
*ONLINE algorithms, *CYBERSPACE, *MACHINE learning, *ONLINE education, *CYBERTERRORISM, *PROCESS control systems
- Abstract
Industrial control systems are critical to the operation of industrial facilities, especially for critical infrastructures, such as refineries, power grids, and transportation systems. Similar to other information systems, a significant threat to industrial control systems is the attack from cyberspace—the offensive maneuvers launched by "anonymous" in the digital world that target computer-based assets with the goal of compromising a system's functions or probing for information. Owing to the importance of industrial control systems, and the possibly devastating consequences of being attacked, significant endeavors have been attempted to secure industrial control systems from cyberattacks. Among them are intrusion detection systems that serve as the first line of defense by monitoring and reporting potentially malicious activities. Classical machine-learning-based intrusion detection methods usually generate prediction models by learning modest-sized training samples all at once. Such approach is not always applicable to industrial control systems, as industrial control systems must process continuous control commands with limited computational resources in a nonstop way. To satisfy such requirements, we propose using online learning to learn prediction models from the controlling data stream. We introduce several state-of-the-art online learning algorithms categorically, and illustrate their efficacies on two typically used testbeds—power system and gas pipeline. Further, we explore a new cost-sensitive online learning algorithm to solve the class-imbalance problem that is pervasive in industrial intrusion detection systems. Our experimental results indicate that the proposed algorithm can achieve an overall improvement in the detection rate of cyberattacks in industrial control systems. [ABSTRACT FROM AUTHOR]
- Published
- 2019
- Full Text
- View/download PDF
Catalog
Discovery Service for Jio Institute Digital Library