Author: "Cheng, James" / Topic: machine learning (stat.ml) - Searchworks@Jio Institute Digital Library Search Results

1. Towards Understanding Feature Learning in Out-of-Distribution Generalization

Author: Chen, Yongqiang, Huang, Wei, Zhou, Kaiwen, Bian, Yatao, Han, Bo, and Cheng, James
Subjects: FOS: Computer and information sciences, Computer Science - Machine Learning, Statistics - Machine Learning, Machine Learning (stat.ML), Machine Learning (cs.LG)
Abstract: A common explanation for the failure of out-of-distribution (OOD) generalization is that the model trained with empirical risk minimization (ERM) learns spurious features instead of the desired invariant features. However, several recent studies challenged this explanation and found that deep networks may have already learned sufficiently good features for OOD generalization. The debate extends to the in-distribution and OOD performance correlations along with training or fine-tuning neural nets across a variety of OOD generalization tasks. To understand these seemingly contradicting phenomena, we conduct a theoretical investigation and find that ERM essentially learns both spurious features and invariant features. On the other hand, the quality of learned features during ERM pre-training significantly affects the final OOD performance, as OOD objectives rarely learn new features. Failing to capture all the underlying useful features during pre-training will further limit the final OOD performance. To remedy the issue, we propose Feature Augmented Training (FAT ), to enforce the model to learn all useful features by retaining the already learned features and augmenting new ones by multiple rounds. In each round, the retention and augmentation operations are performed on different subsets of the training data that capture distinct features. Extensive experiments show that FAT effectively learns richer features and consistently improves the OOD performance when applied to various objectives., Yongqiang Chen, Wei Huang, and Kaiwen Zhou contributed equally
Published: 2023

2. Pareto Invariant Risk Minimization: Towards Mitigating the Optimization Dilemma in Out-of-Distribution Generalization

Author: Chen, Yongqiang, Zhou, Kaiwen, Bian, Yatao, Xie, Binghui, Wu, Bingzhe, Zhang, Yonggang, Ma, Kaili, Yang, Han, Zhao, Peilin, Han, Bo, and Cheng, James
Subjects: FOS: Computer and information sciences, Computer Science - Machine Learning, Statistics - Machine Learning, Machine Learning (stat.ML), Machine Learning (cs.LG)
Abstract: Recently, there has been a growing surge of interest in enabling machine learning systems to generalize well to Out-of-Distribution (OOD) data. Most efforts are devoted to advancing optimization objectives that regularize models to capture the underlying invariance; however, there often are compromises in the optimization process of these OOD objectives: i) Many OOD objectives have to be relaxed as penalty terms of Empirical Risk Minimization (ERM) for the ease of optimization, while the relaxed forms can weaken the robustness of the original objective; ii) The penalty terms also require careful tuning of the penalty weights due to the intrinsic conflicts between ERM and OOD objectives. Consequently, these compromises could easily lead to suboptimal performance of either the ERM or OOD objective. To address these issues, we introduce a multi-objective optimization (MOO) perspective to understand the OOD optimization process, and propose a new optimization scheme called PAreto Invariant Risk Minimization (PAIR). PAIR improves the robustness of OOD objectives by cooperatively optimizing with other OOD objectives, thereby bridging the gaps caused by the relaxations. Then PAIR approaches a Pareto optimal solution that trades off the ERM and OOD objectives properly. Extensive experiments on challenging benchmarks, WILDS, show that PAIR alleviates the compromises and yields top OOD performances., ICLR 2023, 50 pages, 58 figures
Published: 2022

3. Understanding and Improving Graph Injection Attack by Promoting Unnoticeability

Author: Chen, Yongqiang, Yang, Han, Zhang, Yonggang, Ma, Kaili, Liu, Tongliang, Han, Bo, and Cheng, James
Subjects: FOS: Computer and information sciences, Computer Science - Machine Learning, Computer Science - Cryptography and Security, Statistics - Machine Learning, Machine Learning (stat.ML), Cryptography and Security (cs.CR), Machine Learning (cs.LG)
Abstract: Recently Graph Injection Attack (GIA) emerges as a practical attack scenario on Graph Neural Networks (GNNs), where the adversary can merely inject few malicious nodes instead of modifying existing nodes or edges, i.e., Graph Modification Attack (GMA). Although GIA has achieved promising results, little is known about why it is successful and whether there is any pitfall behind the success. To understand the power of GIA, we compare it with GMA and find that GIA can be provably more harmful than GMA due to its relatively high flexibility. However, the high flexibility will also lead to great damage to the homophily distribution of the original graph, i.e., similarity among neighbors. Consequently, the threats of GIA can be easily alleviated or even prevented by homophily-based defenses designed to recover the original homophily. To mitigate the issue, we introduce a novel constraint -- homophily unnoticeability that enforces GIA to preserve the homophily, and propose Harmonious Adversarial Objective (HAO) to instantiate it. Extensive experiments verify that GIA with HAO can break homophily-based defenses and outperform previous GIA attacks by a significant margin. We believe our methods can serve for a more reliable evaluation of the robustness of GNNs., Comment: ICLR2022, 42 pages, 22 figures
Published: 2022
Full Text: View/download PDF

4. Accelerating Perturbed Stochastic Iterates in Asynchronous Lock-Free Optimization

Author: Zhou, Kaiwen, So, Anthony Man-Cho, and Cheng, James
Subjects: FOS: Computer and information sciences, Computer Science - Machine Learning, Optimization and Control (math.OC), Statistics - Machine Learning, FOS: Mathematics, Physics::Accelerator Physics, Machine Learning (stat.ML), Mathematics - Optimization and Control, Machine Learning (cs.LG)
Abstract: We show that stochastic acceleration can be achieved under the perturbed iterate framework (Mania et al., 2017) in asynchronous lock-free optimization, which leads to the optimal incremental gradient complexity for finite-sum objectives. We prove that our new accelerated method requires the same linear speed-up condition as the existing non-accelerated methods. Our core algorithmic discovery is a new accelerated SVRG variant with sparse updates. Empirical results are presented to verify our theoretical findings., 21 pages, 22 figures
Published: 2021

5. Practical Schemes for Finding Near-Stationary Points of Convex Finite-Sums

Author: Zhou, Kaiwen, Tian, Lai, So, Anthony Man-Cho, and Cheng, James
Subjects: FOS: Computer and information sciences, Computer Science - Machine Learning, Optimization and Control (math.OC), Statistics - Machine Learning, FOS: Mathematics, Machine Learning (stat.ML), Mathematics - Optimization and Control, Machine Learning (cs.LG)
Abstract: In convex optimization, the problem of finding near-stationary points has not been adequately studied yet, unlike other optimality measures such as the function value. Even in the deterministic case, the optimal method (OGM-G, due to Kim and Fessler (2021)) has just been discovered recently. In this work, we conduct a systematic study of algorithmic techniques for finding near-stationary points of convex finite-sums. Our main contributions are several algorithmic discoveries: (1) we discover a memory-saving variant of OGM-G based on the performance estimation problem approach (Drori and Teboulle, 2014); (2) we design a new accelerated SVRG variant that can simultaneously achieve fast rates for minimizing both the gradient norm and function value; (3) we propose an adaptively regularized accelerated SVRG variant, which does not require the knowledge of some unknown initial constants and achieves near-optimal complexities. We put an emphasis on the simplicity and practicality of the new schemes, which could facilitate future work., 29 pages, 4 figures
Published: 2021

6. Understanding Graph Neural Networks from Graph Signal Denoising Perspectives

Author: Fu, Guoji, Hou, Yifan, Zhang, Jian, Ma, Kaili, Kamhoua, Barakeel Fanseu, and Cheng, James
Subjects: FOS: Computer and information sciences, Computer Science - Machine Learning, Statistics - Machine Learning, Machine Learning (stat.ML), Machine Learning (cs.LG)
Abstract: Graph neural networks (GNNs) have attracted much attention because of their excellent performance on tasks such as node classification. However, there is inadequate understanding on how and why GNNs work, especially for node representation learning. This paper aims to provide a theoretical framework to understand GNNs, specifically, spectral graph convolutional networks and graph attention networks, from graph signal denoising perspectives. Our framework shows that GNNs are implicitly solving graph signal denoising problems: spectral graph convolutions work as denoising node features, while graph attentions work as denoising edge weights. We also show that a linear self-attention mechanism is able to compete with the state-of-the-art graph attention methods. Our theoretical results further lead to two new models, GSDN-F and GSDN-EF, which work effectively for graphs with noisy node features and/or noisy edges. We validate our theoretical findings and also the effectiveness of our new models by experiments on benchmark datasets. The source code is available at \url{https://github.com/fuguoji/GSDN}., Comment: 19 pages, 8 figures
Published: 2020
Full Text: View/download PDF

7. Boosting First-Order Methods by Shifting Objective: New Schemes with Faster Worst-Case Rates

Author: Zhou, Kaiwen, So, Anthony Man-Cho, and Cheng, James
Subjects: FOS: Computer and information sciences, Computer Science - Machine Learning, Statistics - Machine Learning, Machine Learning (stat.ML), Machine Learning (cs.LG)
Abstract: We propose a new methodology to design first-order methods for unconstrained strongly convex problems. Specifically, instead of tackling the original objective directly, we construct a shifted objective function that has the same minimizer as the original objective and encodes both the smoothness and strong convexity of the original objective in an interpolation condition. We then propose an algorithmic template for tackling the shifted objective, which can exploit such a condition. Following this template, we derive several new accelerated schemes for problems that are equipped with various first-order oracles and show that the interpolation condition allows us to vastly simplify and tighten the analysis of the derived methods. In particular, all the derived methods have faster worst-case convergence rates than their existing counterparts. Experiments on machine learning tasks are conducted to evaluate the new methods., Comment: NeurIPS 2020, 29 pages, 7 figures
Published: 2020
Full Text: View/download PDF

8. Rethinking Graph Regularization for Graph Neural Networks

Author: Yang, Han, Ma, Kaili, and Cheng, James
Subjects: FOS: Computer and information sciences, Computer Science - Machine Learning, Statistics - Machine Learning, Machine Learning (stat.ML), General Medicine, MathematicsofComputing_DISCRETEMATHEMATICS, Machine Learning (cs.LG)
Abstract: The graph Laplacian regularization term is usually used in semi-supervised representation learning to provide graph structure information for a model $f(X)$. However, with the recent popularity of graph neural networks (GNNs), directly encoding graph structure $A$ into a model, i.e., $f(A, X)$, has become the more common approach. While we show that graph Laplacian regularization brings little-to-no benefit to existing GNNs, and propose a simple but non-trivial variant of graph Laplacian regularization, called Propagation-regularization (P-reg), to boost the performance of existing GNN models. We provide formal analyses to show that P-reg not only infuses extra information (that is not captured by the traditional graph Laplacian regularization) into GNNs, but also has the capacity equivalent to an infinite-depth graph convolutional network. We demonstrate that P-reg can effectively boost the performance of existing GNN models on both node-level and graph-level tasks across many different datasets., Comment: AAAI2021
Published: 2020
Full Text: View/download PDF

9. Hyper-Sphere Quantization: Communication-Efficient SGD for Federated Learning

Author: Dai, Xinyan, Yan, Xiao, Zhou, Kaiwen, Yang, Han, Ng, Kelvin K. W., Cheng, James, and Fan, Yu
Subjects: FOS: Computer and information sciences, Computer Science - Machine Learning, Statistics - Machine Learning, Machine Learning (stat.ML), Information Retrieval (cs.IR), Machine Learning (cs.LG), Computer Science - Information Retrieval
Abstract: The high cost of communicating gradients is a major bottleneck for federated learning, as the bandwidth of the participating user devices is limited. Existing gradient compression algorithms are mainly designed for data centers with high-speed network and achieve $O(\sqrt{d} \log d)$ per-iteration communication cost at best, where $d$ is the size of the model. We propose hyper-sphere quantization (HSQ), a general framework that can be configured to achieve a continuum of trade-offs between communication efficiency and gradient accuracy. In particular, at the high compression ratio end, HSQ provides a low per-iteration communication cost of $O(\log d)$, which is favorable for federated learning. We prove the convergence of HSQ theoretically and show by experiments that HSQ significantly reduces the communication cost of model training without hurting convergence accuracy.
Published: 2019

10. PMD: An Optimal Transportation-based User Distance for Recommender Systems

Author: Meng, Yitong, Dai, Xinyan, Yan, Xiao, Cheng, James, Liu, Weiwen, Liao, Benben, Guo, Jun, and Chen, Guangyong
Subjects: FOS: Computer and information sciences, Computer Science - Machine Learning, Artificial Intelligence (cs.AI), Computer Science - Artificial Intelligence, Statistics - Machine Learning, Optimal transport, Machine Learning (stat.ML), User similarity, Recommendation, Information Retrieval (cs.IR), Article, Machine Learning (cs.LG), Computer Science - Information Retrieval
Abstract: Collaborative filtering, a widely-used recommendation technique, predicts a user's preference by aggregating the ratings from similar users. As a result, these measures cannot fully utilize the rating information and are not suitable for real world sparse data. To solve these issues, we propose a novel user distance measure named Preference Mover's Distance (PMD) which makes full use of all ratings made by each user. Our proposed PMD can properly measure the distance between a pair of users even if they have no co-rated items. We show that this measure can be cast as an instance of the Earth Mover's Distance, a well-studied transportation problem for which several highly efficient solvers have been developed. Experimental results show that PMD can help achieve superior recommendation accuracy than state-of-the-art methods, especially when training data is very sparse., This paper is accepted by European Conference on Information Retrieval (ECIR 2020)
Published: 2019

11. Norm-Ranging LSH for Maximum Inner Product Search

Author: Yan, Xiao, Li, Jinfeng, Dai, Xinyan, Chen, Hongzhi, and Cheng, James
Subjects: FOS: Computer and information sciences, Computer Science - Machine Learning, Statistics - Machine Learning, InformationSystems_INFORMATIONSTORAGEANDRETRIEVAL, Machine Learning (stat.ML), Machine Learning (cs.LG)
Abstract: Neyshabur and Srebro proposed Simple-LSH, which is the state-of-the-art hashing method for maximum inner product search (MIPS) with performance guarantee. We found that the performance of Simple-LSH, in both theory and practice, suffers from long tails in the 2-norm distribution of real datasets. We propose Norm-ranging LSH, which addresses the excessive normalization problem caused by long tails in Simple-LSH by partitioning a dataset into multiple sub-datasets and building a hash index for each sub-dataset independently. We prove that Norm-ranging LSH has lower query time complexity than Simple-LSH. We also show that the idea of partitioning the dataset can improve other hashing based methods for MIPS. To support efficient query processing on the hash indexes of the sub-datasets, a novel similarity metric is formulated. Experiments show that Norm-ranging LSH achieves an order of magnitude speedup over Simple-LSH for the same recall, thus significantly benefiting applications that involve MIPS., NIPS2018
Published: 2018

12. A Simple Stochastic Variance Reduced Algorithm with Fast Convergence Rates

Author: Zhou, Kaiwen, Shang, Fanhua, and Cheng, James
Subjects: FOS: Computer and information sciences, Computer Science - Machine Learning, Statistics - Machine Learning, Machine Learning (stat.ML), Machine Learning (cs.LG)
Abstract: Recent years have witnessed exciting progress in the study of stochastic variance reduced gradient methods (e.g., SVRG, SAGA), their accelerated variants (e.g, Katyusha) and their extensions in many different settings (e.g., online, sparse, asynchronous, distributed). Among them, accelerated methods enjoy improved convergence rates but have complex coupling structures, which makes them hard to be extended to more settings (e.g., sparse and asynchronous) due to the existence of perturbation. In this paper, we introduce a simple stochastic variance reduced algorithm (MiG), which enjoys the best-known convergence rates for both strongly convex and non-strongly convex problems. Moreover, we also present its efficient sparse and asynchronous variants, and theoretically analyze its convergence rates in these settings. Finally, extensive experiments for various machine learning problems such as logistic regression are given to illustrate the practical improvement in both serial and asynchronous settings., ICML2018
Published: 2018

13. ASVRG: Accelerated Proximal SVRG

Author: Shang, Fanhua, Jiao, Licheng, Zhou, Kaiwen, Cheng, James, Ren, Yan, and Jin, Yufei
Subjects: FOS: Computer and information sciences, Computer Science - Machine Learning, Artificial Intelligence (cs.AI), Computer Science - Artificial Intelligence, Statistics - Machine Learning, Computer Vision and Pattern Recognition (cs.CV), Computer Science - Computer Vision and Pattern Recognition, Machine Learning (stat.ML), Machine Learning (cs.LG)
Abstract: This paper proposes an accelerated proximal stochastic variance reduced gradient (ASVRG) method, in which we design a simple and effective momentum acceleration trick. Unlike most existing accelerated stochastic variance reduction methods such as Katyusha, ASVRG has only one additional variable and one momentum parameter. Thus, ASVRG is much simpler than those methods, and has much lower per-iteration complexity. We prove that ASVRG achieves the best known oracle complexities for both strongly convex and non-strongly convex objectives. In addition, we extend ASVRG to mini-batch and non-smooth settings. We also empirically verify our theoretical results and show that the performance of ASVRG is comparable with, and sometimes even better than that of the state-of-the-art stochastic methods., Comment: 32 pages, 3 figures
Published: 2018
Full Text: View/download PDF

14. Fast Stochastic Variance Reduced Gradient Method with Momentum Acceleration for Machine Learning

Author: Shang, Fanhua, Liu, Yuanyuan, Cheng, James, and Zhuo, Jiacheng
Subjects: FOS: Computer and information sciences, Computer Science - Learning, Artificial Intelligence (cs.AI), Optimization and Control (math.OC), Computer Science - Artificial Intelligence, Statistics - Machine Learning, FOS: Mathematics, Machine Learning (stat.ML), Mathematics - Optimization and Control, Machine Learning (cs.LG)
Abstract: Recently, research on accelerated stochastic gradient descent methods (e.g., SVRG) has made exciting progress (e.g., linear convergence for strongly convex problems). However, the best-known methods (e.g., Katyusha) requires at least two auxiliary variables and two momentum parameters. In this paper, we propose a fast stochastic variance reduction gradient (FSVRG) method, in which we design a novel update rule with the Nesterov's momentum and incorporate the technique of growing epoch size. FSVRG has only one auxiliary variable and one momentum weight, and thus it is much simpler and has much lower per-iteration complexity. We prove that FSVRG achieves linear convergence for strongly convex problems and the optimal $\mathcal{O}(1/T^2)$ convergence rate for non-strongly convex problems, where $T$ is the number of outer-iterations. We also extend FSVRG to directly solve the problems with non-smooth component functions, such as SVM. Finally, we empirically study the performance of FSVRG for solving various machine learning problems such as logistic regression, ridge regression, Lasso and SVM. Our results show that FSVRG outperforms the state-of-the-art stochastic methods, including Katyusha., Corrected a few typos in this version
Published: 2017

15. Accelerated Variance Reduced Stochastic ADMM

Author: Liu, Yuanyuan, Shang, Fanhua, and Cheng, James
Subjects: FOS: Computer and information sciences, Computer Science - Learning, Statistics::Machine Learning, Statistics - Machine Learning, Computer Science::Systems and Control, Computer Science::Computer Vision and Pattern Recognition, Mathematics::Optimization and Control, Machine Learning (stat.ML), General Medicine, Computer Science::Numerical Analysis, Machine Learning (cs.LG)
Abstract: Recently, many variance reduced stochastic alternating direction method of multipliers (ADMM) methods (e.g.\ SAG-ADMM, SDCA-ADMM and SVRG-ADMM) have made exciting progress such as linear convergence rates for strongly convex problems. However, the best known convergence rate for general convex problems is O(1/T) as opposed to O(1/T^2) of accelerated batch algorithms, where $T$ is the number of iterations. Thus, there still remains a gap in convergence rates between existing stochastic ADMM and batch algorithms. To bridge this gap, we introduce the momentum acceleration trick for batch optimization into the stochastic variance reduced gradient based ADMM (SVRG-ADMM), which leads to an accelerated (ASVRG-ADMM) method. Then we design two different momentum term update rules for strongly convex and general convex cases. We prove that ASVRG-ADMM converges linearly for strongly convex problems. Besides having a low per-iteration complexity as existing stochastic ADMM methods, ASVRG-ADMM improves the convergence rate on general convex problems from O(1/T) to O(1/T^2). Our experimental results show the effectiveness of ASVRG-ADMM., Comment: 16 pages, 5 figures, Appears in Proceedings of the 31th AAAI Conference on Artificial Intelligence (AAAI), San Francisco, California, USA, pp. 2287--2293, 2017
Published: 2017
Full Text: View/download PDF

16. Guaranteed Sufficient Decrease for Variance Reduced Stochastic Gradient Descent

Author: Shang, Fanhua, Liu, Yuanyuan, Cheng, James, Ng, Kelvin Kai Wing, and Yoshida, Yuichi
Subjects: FOS: Computer and information sciences, Computer Science - Learning, Statistics - Machine Learning, Optimization and Control (math.OC), FOS: Mathematics, Machine Learning (stat.ML), Mathematics - Optimization and Control, Machine Learning (cs.LG)
Abstract: In this paper, we propose a novel sufficient decrease technique for variance reduced stochastic gradient descent methods such as SAG, SVRG and SAGA. In order to make sufficient decrease for stochastic optimization, we design a new sufficient decrease criterion, which yields sufficient decrease versions of variance reduction algorithms such as SVRG-SD and SAGA-SD as a byproduct. We introduce a coefficient to scale current iterate and satisfy the sufficient decrease property, which takes the decisions to shrink, expand or move in the opposite direction, and then give two specific update rules of the coefficient for Lasso and ridge regression. Moreover, we analyze the convergence properties of our algorithms for strongly convex problems, which show that both of our algorithms attain linear convergence rates. We also provide the convergence guarantees of our algorithms for non-strongly convex problems. Our experimental results further verify that our algorithms achieve significantly better performance than their counterparts., Comment: 25 pages, 8 figures
Published: 2017
Full Text: View/download PDF

17. Unified Scalable Equivalent Formulations for Schatten Quasi-Norms

Author: Shang, Fanhua, Liu, Yuanyuan, and Cheng, James
Subjects: FOS: Computer and information sciences, Statistics - Machine Learning, Optimization and Control (math.OC), Computer Science - Information Theory, Information Theory (cs.IT), FOS: Mathematics, Machine Learning (stat.ML), Mathematics - Optimization and Control
Abstract: The Schatten quasi-norm can be used to bridge the gap between the nuclear norm and rank function, and is the tighter approximation to matrix rank. However, most existing Schatten quasi-norm minimization (SQNM) algorithms, as well as for nuclear norm minimization, are too slow or even impractical for large-scale problems, due to the SVD or EVD of the whole matrix in each iteration. In this paper, we rigorously prove that for any p, p1, p2>0 satisfying 1/p=1/p1+1/p2, the Schatten-p quasi-norm of any matrix is equivalent to minimizing the product of the Schatten-p1 norm (or quasi-norm) and Schatten-p2 norm (or quasi-norm) of its two factor matrices. Then we present and prove the equivalence relationship between the product formula of the Schatten quasi-norm and its weighted sum formula for the two cases of p1 and p2: p1=p2 and p1\neq p2. In particular, when p>1/2, there is an equivalence between the Schatten-p quasi-norm of any matrix and the Schatten-2p norms of its two factor matrices, where the widely used equivalent formulation of the nuclear norm can be viewed as a special case. That is, various SQNM problems with p>1/2 can be transformed into the one only involving smooth, convex norms of two factor matrices, which can lead to simpler and more efficient algorithms than conventional methods. We further extend the theoretical results of two factor matrices to the cases of three and more factor matrices, from which we can see that for any 0, Comment: 21 pages. CUHK Technical Report CSE-ShangLC20160307, March 7, 2016
Published: 2016
Full Text: View/download PDF

18. Scalable Algorithms for Tractable Schatten Quasi-Norm Minimization

Author: Shang, Fanhua, Liu, Yuanyuan, and Cheng, James
Subjects: FOS: Computer and information sciences, Artificial Intelligence (cs.AI), Computer Science - Artificial Intelligence, Statistics - Machine Learning, Optimization and Control (math.OC), Computer Science - Numerical Analysis, FOS: Mathematics, Machine Learning (stat.ML), General Medicine, Numerical Analysis (math.NA), Mathematics - Optimization and Control
Abstract: The Schatten-p quasi-norm $(0, Comment: 16 pages, 5 figures, Appears in Proceedings of the 30th AAAI Conference on Artificial Intelligence (AAAI), Phoenix, Arizona, USA, pp. 2016--2022, 2016
Published: 2016
Full Text: View/download PDF

19. Structured Low-Rank Matrix Factorization with Missing and Grossly Corrupted Observations

Author: Shang, Fanhua, Liu, Yuanyuan, Tong, Hanghang, Cheng, James, and Cheng, Hong
Subjects: FOS: Computer and information sciences, Computer Science - Learning, Statistics - Machine Learning, Computer Vision and Pattern Recognition (cs.CV), Computer Science - Computer Vision and Pattern Recognition, Machine Learning (stat.ML), Machine Learning (cs.LG)
Abstract: Recovering low-rank and sparse matrices from incomplete or corrupted observations is an important problem in machine learning, statistics, bioinformatics, computer vision, as well as signal and image processing. In theory, this problem can be solved by the natural convex joint/mixed relaxations (i.e., l_{1}-norm and trace norm) under certain conditions. However, all current provable algorithms suffer from superlinear per-iteration cost, which severely limits their applicability to large-scale problems. In this paper, we propose a scalable, provable structured low-rank matrix factorization method to recover low-rank and sparse matrices from missing and grossly corrupted data, i.e., robust matrix completion (RMC) problems, or incomplete and grossly corrupted measurements, i.e., compressive principal component pursuit (CPCP) problems. Specifically, we first present two small-scale matrix trace norm regularized bilinear structured factorization models for RMC and CPCP problems, in which repetitively calculating SVD of a large-scale matrix is replaced by updating two much smaller factor matrices. Then, we apply the alternating direction method of multipliers (ADMM) to efficiently solve the RMC problems. Finally, we provide the convergence analysis of our algorithm, and extend it to address general CPCP problems. Experimental results verified both the efficiency and effectiveness of our method compared with the state-of-the-art methods., 28 pages, 9 figures
Published: 2014

Searchworks

Select search scope, currently: Articles

Catalog

books, media & more in Jio Institute collections

Articles

journal articles & other e-resources

Refine your results

19 results on '"Cheng, James"'

1. Towards Understanding Feature Learning in Out-of-Distribution Generalization

2. Pareto Invariant Risk Minimization: Towards Mitigating the Optimization Dilemma in Out-of-Distribution Generalization

3. Understanding and Improving Graph Injection Attack by Promoting Unnoticeability

4. Accelerating Perturbed Stochastic Iterates in Asynchronous Lock-Free Optimization

5. Practical Schemes for Finding Near-Stationary Points of Convex Finite-Sums

6. Understanding Graph Neural Networks from Graph Signal Denoising Perspectives

7. Boosting First-Order Methods by Shifting Objective: New Schemes with Faster Worst-Case Rates

8. Rethinking Graph Regularization for Graph Neural Networks

9. Hyper-Sphere Quantization: Communication-Efficient SGD for Federated Learning

10. PMD: An Optimal Transportation-based User Distance for Recommender Systems

11. Norm-Ranging LSH for Maximum Inner Product Search

12. A Simple Stochastic Variance Reduced Algorithm with Fast Convergence Rates

13. ASVRG: Accelerated Proximal SVRG

14. Fast Stochastic Variance Reduced Gradient Method with Momentum Acceleration for Machine Learning

15. Accelerated Variance Reduced Stochastic ADMM

16. Guaranteed Sufficient Decrease for Variance Reduced Stochastic Gradient Descent

17. Unified Scalable Equivalent Formulations for Schatten Quasi-Norms

18. Scalable Algorithms for Tractable Schatten Quasi-Norm Minimization

19. Structured Low-Rank Matrix Factorization with Missing and Grossly Corrupted Observations

Catalog

Searchworks

Select search scope, currently: Articles Catalog books, media & more in Jio Institute collections Articles journal articles & other e-resources

Search

Search Constraints

Refine your results

Search Limiters

Topic

Publication Year Range

Language

Database

19 results on '"Cheng, James"'

Search Results

Catalog

Select search scope, currently: Articles

Catalog

books, media & more in Jio Institute collections

Articles

journal articles & other e-resources