39 results for "Yao, Quanming"
Search Results
2. Accurate and interpretable drug-drug interaction prediction enabled by knowledge subgraph learning
- Author
-
Wang, Yaqing, Yang, Zaifei, and Yao, Quanming
- Published
- 2024
3. Emerging drug interaction prediction enabled by a flow-based graph neural network with biomedical network
- Author
-
Zhang, Yongqi, Yao, Quanming, Yue, Ling, Wu, Xian, Zhang, Ziheng, Lin, Zhenxi, and Zheng, Yefeng
- Published
- 2023
4. Codabench: Flexible, easy-to-use, and reproducible meta-benchmark platform
- Author
-
Xu, Zhen, Escalera, Sergio, Pavão, Adrien, Richard, Magali, Tu, Wei-Wei, Yao, Quanming, Zhao, Huan, and Guyon, Isabelle
- Published
- 2022
5. WRTRe: Weighted relative position transformer for joint entity and relation extraction
- Author
-
Zheng, Wei, Wang, Zhen, Yao, Quanming, and Li, Xuelong
- Published
- 2021
6. Simple and automated negative sampling for knowledge graph embedding
- Author
-
Zhang, Yongqi, Yao, Quanming, and Chen, Lei
- Published
- 2021
7. VISTopic: A visual analytics system for making sense of large document collections using hierarchical topic modeling
- Author
-
Yang, Yi, Yao, Quanming, and Qu, Huamin
- Published
- 2017
8. On Strengthening and Defending Graph Reconstruction Attack with Markov Chain Approximation
- Author
-
Zhou, Zhanke, Zhou, Chenyu, Li, Xuan, Yao, Jiangchao, Yao, Quanming, and Han, Bo
- Subjects
FOS: Computer and information sciences, Computer Science - Machine Learning, Computer Science - Cryptography and Security, Cryptography and Security (cs.CR), Machine Learning (cs.LG)
- Abstract
Although powerful graph neural networks (GNNs) have boosted numerous real-world applications, their potential privacy risk is still underexplored. To close this gap, we perform the first comprehensive study of the graph reconstruction attack, which aims to reconstruct the adjacency of nodes. We show that a range of factors in GNNs can lead to the surprising leakage of private links. In particular, by viewing a GNN as a Markov chain and attacking it via a flexible chain approximation, we systematically explore the underlying principles of the graph reconstruction attack and propose two information-theory-guided mechanisms: (1) a chain-based attack method with adaptive designs for extracting more private information; (2) a chain-based defense method that sharply reduces the attack fidelity with moderate accuracy loss. These two objectives reveal a critical insight: to recover better in attack, one must extract more multi-aspect knowledge from the trained GNN; to learn more safely in defense, one must forget more link-sensitive information when training GNNs. Empirically, we achieve state-of-the-art results on six datasets and three common GNNs. The code is publicly available at: https://github.com/tmlr-group/MC-GRA. Accepted by ICML 2023.
- Published
- 2023
9. Automated 3D Pre-Training for Molecular Property Prediction
- Author
-
Wang, Xu, Zhao, Huan, Tu, Weiwei, and Yao, Quanming
- Subjects
FOS: Computer and information sciences, Computer Science - Machine Learning, Artificial Intelligence (cs.AI), Computer Science - Artificial Intelligence, FOS: Biological sciences, Quantitative Biology - Quantitative Methods, Quantitative Methods (q-bio.QM), Machine Learning (cs.LG)
- Abstract
Molecular property prediction is an important problem in drug discovery and materials science. As geometric structures have been demonstrated to be necessary for molecular property prediction, 3D information has been combined with various graph learning methods to boost prediction performance. However, obtaining the geometric structure of molecules is not feasible in many real-world applications due to the high computational cost. In this work, we propose a novel 3D pre-training framework (dubbed 3D PGT), which pre-trains a model on 3D molecular graphs and then fine-tunes it on molecular graphs without 3D structures. Based on the fact that bond length, bond angle, and dihedral angle are three basic geometric descriptors that together correspond to a complete molecular 3D conformer, we first develop a multi-task generative pre-training framework based on these three attributes. Next, to automatically fuse these three generative tasks, we design a surrogate metric based on the total energy to search for the weight distribution of the three pretext tasks, since the total energy corresponds to the quality of the 3D conformer. Extensive experiments on 2D molecular graphs are conducted to demonstrate the accuracy, efficiency, and generalization ability of the proposed 3D PGT compared to various pre-training baselines.
- Published
- 2023
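The fusion step this abstract describes, searching for a weight distribution over three pretext tasks against a surrogate metric, can be sketched as follows. The surrogate function and the simplex grid below are hypothetical stand-ins for the paper's total-energy criterion and actual search procedure.

```python
# Sketch: choose weights for three generative pretext tasks (bond length,
# bond angle, dihedral angle) by minimizing a surrogate metric over a
# grid on the probability simplex.
def surrogate_energy(weights):
    # Hypothetical surrogate: pretend the best mixture is (0.5, 0.3, 0.2);
    # in the paper this would come from evaluating conformer quality.
    w_len, w_ang, w_dih = weights
    return (w_len - 0.5) ** 2 + (w_ang - 0.3) ** 2 + (w_dih - 0.2) ** 2

# All weight triples on a 0.1-spaced simplex grid (each triple sums to 1).
grid = [(i / 10, j / 10, (10 - i - j) / 10)
        for i in range(11) for j in range(11 - i)]
best_weights = min(grid, key=surrogate_energy)
```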
10. Enhancing Intra-class Information Extraction for Heterophilous Graphs: One Neural Architecture Search Approach
- Author
-
Wei, Lanning, He, Zhiqiang, Zhao, Huan, and Yao, Quanming
- Subjects
FOS: Computer and information sciences, Computer Science - Machine Learning, Machine Learning (cs.LG)
- Abstract
In recent years, Graph Neural Networks (GNNs) have been popular in graph representation learning, which assumes the homophily property, i.e., that connected nodes have the same label or similar features. However, they may fail to generalize to heterophilous graphs, which have low or medium levels of homophily. Existing methods tend to address this problem by enhancing intra-class information extraction, i.e., either by designing better GNNs to improve model effectiveness, or by re-designing the graph structure to incorporate more potential intra-class nodes from distant hops. Despite their success, we observe two aspects that can be further improved: (a) enhancing ego feature extraction from the node itself, which is more reliable for extracting intra-class information; (b) designing node-wise GNNs, which can better adapt to nodes with different homophily ratios. In this paper, we propose a novel method, IIE-GNN (Intra-class Information Enhanced Graph Neural Networks), to achieve these two improvements. A unified framework is proposed based on the literature, in which the intra-class information from the node itself and its neighbors can be extracted through seven carefully designed blocks. With the help of neural architecture search (NAS), we propose a novel search space based on this framework, and then provide an architecture predictor to design GNNs for each node. We further conduct experiments to show that IIE-GNN can improve model performance by designing node-wise GNNs that enhance intra-class information extraction.
- Published
- 2022
11. Search to Pass Messages for Temporal Knowledge Graph Completion
- Author
-
Wang, Zhen, Du, Haotong, Yao, Quanming, and Li, Xuelong
- Subjects
FOS: Computer and information sciences, Computer Science - Machine Learning, Artificial Intelligence (cs.AI), Computer Science - Artificial Intelligence, Machine Learning (cs.LG)
- Abstract
Completing missing facts is a fundamental task for temporal knowledge graphs (TKGs). Recently, graph neural network (GNN) based methods, which can simultaneously explore topological and temporal information, have become the state of the art (SOTA) for completing TKGs. However, these studies rely on hand-designed architectures and fail to explore the diverse topological and temporal properties of TKGs. To address this issue, we propose to use neural architecture search (NAS) to design data-specific message-passing architectures for TKG completion. In particular, we develop a generalized framework to explore topological and temporal information in TKGs. Based on this framework, we design an expressive search space to fully capture the various properties of different TKGs. Meanwhile, we adopt a search algorithm that trains a supernet structure by sampling a single path, for efficient search at lower cost. We further conduct extensive experiments on three benchmark datasets. The results show that the architectures searched by our method achieve SOTA performance. Besides, the searched models can also implicitly reveal diverse properties of different TKGs. Our code is released at https://github.com/striderdu/SPA. Accepted by Findings of EMNLP 2022.
- Published
- 2022
12. Searching a High-Performance Feature Extractor for Text Recognition Network
- Author
-
Zhang, Hui, Yao, Quanming, Kwok, James T., and Bai, Xiang
- Subjects
FOS: Computer and information sciences, Artificial Intelligence (cs.AI), Computer Science - Artificial Intelligence, Computer Vision and Pattern Recognition (cs.CV), Computer Science - Computer Vision and Pattern Recognition
- Abstract
The feature extractor plays a critical role in text recognition (TR), but customizing its architecture is relatively underexplored due to expensive manual tweaking. In this work, inspired by the success of neural architecture search (NAS), we propose to search for suitable feature extractors. We design a domain-specific search space by exploring principles for good feature extractors. The space includes a 3D-structured space for the spatial model and a transformer-based space for the sequential model. As the space is huge and complexly structured, no existing NAS algorithms can be applied. We propose a two-stage algorithm to search the space effectively. In the first stage, we cut the space into several blocks and progressively train each block with the help of an auxiliary head. In the second stage, we introduce a latency constraint and search for a sub-network from the trained supernet via natural gradient descent. In experiments, a series of ablation studies is performed to better understand the designed space, the search algorithm, and the searched architectures. We also compare the proposed method with various state-of-the-art ones on both handwritten and scene TR tasks. Extensive results show that our approach achieves better recognition performance with lower latency.
- Published
- 2022
13. AutoWeird: Weird Translational Scoring Function Identified by Random Search
- Author
-
Yang, Hansi, Zhang, Yongqi, and Yao, Quanming
- Subjects
FOS: Computer and information sciences, Artificial Intelligence (cs.AI), Computer Science - Computation and Language, Computer Science - Artificial Intelligence, Computation and Language (cs.CL)
- Abstract
A scoring function (SF) measures the plausibility of triplets in knowledge graphs. Different scoring functions can lead to huge differences in link prediction performance on different knowledge graphs. In this report, we describe a weird scoring function found by random search on the Open Graph Benchmark (OGB). This scoring function, called AutoWeird, uses only the tail entity and the relation in a triplet to compute its plausibility score. Experimental results show that AutoWeird achieves top-1 performance on the ogbl-wikikg2 dataset, but performs much worse than other methods on the ogbl-biokg dataset. By analyzing the tail entity distribution and the evaluation protocol of these two datasets, we attribute the unexpected success of AutoWeird on ogbl-wikikg2 to inappropriate evaluation and a concentrated tail entity distribution. Such results may motivate further research on how to accurately evaluate the performance of different link prediction methods for knowledge graphs.
- Published
- 2022
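The abstract's key point, a scoring function that ignores the head entity entirely, can be illustrated with a toy sketch. The distance-based form below is an assumption for illustration; the exact function found by random search is not given in the abstract.

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 8
ent = rng.normal(size=(5, dim))  # toy entity embeddings
rel = rng.normal(size=(3, dim))  # toy relation embeddings

def transe_score(h, r, t):
    # A standard translational score for comparison: uses head, relation, tail.
    return -np.linalg.norm(ent[h] + rel[r] - ent[t])

def head_free_score(h, r, t):
    # In the spirit of AutoWeird: the head entity h is ignored entirely
    # (illustrative form, not the function from the paper).
    return -np.linalg.norm(rel[r] - ent[t])

# The head-free score is identical no matter which head entity is used:
same = head_free_score(0, 1, 2) == head_free_score(4, 1, 2)
```

Ranking tails with such a score depends only on how often a tail co-occurs with a relation, which is why a concentrated tail distribution can make it look deceptively strong.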
14. Bridging the Gap of AutoGraph Between Academia and Industry: Analyzing AutoGraph Challenge at KDD Cup 2020
- Author
-
Xu, Zhen, Wei, Lanning, Zhao, Huan, Ying, Rex, Yao, Quanming, Tu, Wei-Wei, and Guyon, Isabelle
- Subjects
FOS: Computer and information sciences, Automated Machine Learning, Computer Science - Machine Learning, Artificial Intelligence (cs.AI), [INFO.INFO-LG]Computer Science [cs]/Machine Learning [cs.LG], Computer Science - Artificial Intelligence, Data Challenge, Artificial Intelligence, Graph Neural Networks, Node Classification, Machine Learning (cs.LG)
- Abstract
Graph-structured data is ubiquitous in daily life and scientific areas and has attracted increasing attention. Graph Neural Networks (GNNs) have proven effective in modeling graph-structured data, and many variants of GNN architectures have been proposed. However, much human effort is often needed to tune the architecture for different datasets. Researchers naturally adopt Automated Machine Learning for graph learning, aiming to reduce human effort and obtain generally top-performing GNNs, but existing methods focus mostly on architecture search. To understand GNN practitioners' automated solutions, we organized the AutoGraph Challenge at KDD Cup 2020, emphasizing automated graph neural networks for node classification. We received top solutions, especially from industrial tech companies like Meituan, Alibaba, and Twitter, which are already open-sourced on GitHub. After detailed comparisons with solutions from academia, we quantify the gaps between academia and industry in modeling scope, effectiveness, and efficiency, and show that (1) academic AutoML-for-graph solutions focus on GNN architecture search, while industrial solutions, especially the winning ones in the KDD Cup, tend to build an overall pipeline; (2) with neural architecture search alone, academic solutions achieve on average 97.3% of the accuracy of industrial solutions; (3) academic solutions are cheap to obtain, taking several GPU hours, while industrial solutions take a few months of labor. Academic solutions also contain far fewer parameters.
- Published
- 2022
15. AdaProp: Learning Adaptive Propagation for Graph Neural Network based Knowledge Graph Reasoning
- Author
-
Zhang, Yongqi, Zhou, Zhanke, Yao, Quanming, Chu, Xiaowen, and Han, Bo
- Subjects
Computer Science - Machine Learning, Computer Science - Artificial Intelligence
- Abstract
Due to the popularity of Graph Neural Networks (GNNs), various GNN-based methods have been designed to reason over knowledge graphs (KGs). An important design component of GNN-based KG reasoning methods is the propagation path, which contains the set of entities involved in each propagation step. Existing methods use hand-designed propagation paths, ignoring the correlation between the entities and the query relation. In addition, the number of involved entities grows explosively at larger propagation steps. In this work, we are motivated to learn an adaptive propagation path that filters out irrelevant entities while preserving promising targets. First, we design an incremental sampling mechanism in which the nearby targets and layer-wise connections can be preserved with linear complexity. Second, we design a learning-based sampling distribution to identify semantically related entities. Extensive experiments show that our method is powerful, efficient, and semantics-aware. The code is available at https://github.com/LARS-research/AdaProp. Accepted by KDD 2023.
- Published
- 2022
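The propagation-path idea in this abstract, expanding neighbors at each step but keeping only a bounded set of promising entities, can be sketched as follows. The fixed scoring heuristic is a placeholder for the learned, query-dependent sampling distribution in the paper.

```python
def adaptive_propagate(adj, start, steps, k, score):
    # At each propagation step, expand the frontier's neighbors but keep
    # only the k highest-scoring entities, so the involved-entity set
    # grows linearly rather than explosively.
    frontier, visited = {start}, {start}
    for _ in range(steps):
        candidates = {v for u in frontier for v in adj.get(u, [])}
        frontier = set(sorted(candidates, key=score, reverse=True)[:k])
        visited |= frontier
    return visited

# Toy graph; the score prefers lower entity ids (a stand-in for a learned score).
adj = {0: [1, 2, 3], 1: [4], 2: [5], 3: [6]}
kept = adaptive_propagate(adj, start=0, steps=2, k=2, score=lambda v: -v)
```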
16. KGTuner: Efficient Hyper-parameter Search for Knowledge Graph Learning
- Author
-
Zhang, Yongqi, Zhou, Zhanke, Yao, Quanming, and Li, Yong
- Subjects
FOS: Computer and information sciences, Computer Science - Machine Learning, Machine Learning (cs.LG)
- Abstract
While hyper-parameters (HPs) are important for knowledge graph (KG) learning, existing methods fail to search for them efficiently. To solve this problem, we first analyze the properties of different HPs and measure how well configurations transfer from a small subgraph to the full graph. Based on this analysis, we propose an efficient two-stage search algorithm, KGTuner, which explores HP configurations on a small subgraph in the first stage and transfers the top-performing configurations for fine-tuning on the large full graph in the second stage. Experiments show that our method can consistently find better HPs than the baseline algorithms within the same time budget, achieving a 9.1% average relative improvement for four embedding models on the large-scale KGs in the Open Graph Benchmark. Accepted to ACL 2022 (long paper).
- Published
- 2022
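The two-stage procedure this abstract describes can be sketched as below; `evaluate` is a synthetic stand-in for training a KG embedding model with a given HP configuration and returning validation performance on either a subgraph or the full graph.

```python
def evaluate(config, graph):
    # Synthetic stand-in for "train with this HP configuration on `graph`
    # and return validation MRR"; higher is better by construction, and the
    # toy optimum is (lr=0.01, dim=256).
    lr, dim = config
    return -(lr - 0.01) ** 2 - (dim - 256) ** 2 / 1e6

def two_stage_search(candidates, k=3):
    # Stage 1: cheaply rank all configurations on a small subgraph.
    ranked = sorted(candidates, key=lambda c: evaluate(c, "subgraph"), reverse=True)
    # Stage 2: re-evaluate only the top-k configurations on the full graph.
    return max(ranked[:k], key=lambda c: evaluate(c, "full"))

candidates = [(lr, dim) for lr in (0.001, 0.01, 0.1) for dim in (64, 256, 512)]
best = two_stage_search(candidates)
```

The payoff of the real method comes from stage 1 being orders of magnitude cheaper than full-graph training, which this toy score does not model.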
17. Knowledge Graph Reasoning with Relational Digraph
- Author
-
Zhang, Yongqi and Yao, Quanming
- Subjects
FOS: Computer and information sciences, Artificial Intelligence (cs.AI), Computer Science - Computation and Language, Computer Science - Artificial Intelligence, Computation and Language (cs.CL)
- Abstract
Reasoning on a knowledge graph (KG) aims to infer new facts from existing ones. Methods based on relational paths have shown strong, interpretable, and transferable reasoning ability. However, paths are naturally limited in capturing local evidence in graphs. In this paper, we introduce a novel relational structure, the relational directed graph (r-digraph), which is composed of overlapping relational paths, to capture the KG's local evidence. Since r-digraphs are more complex than paths, efficiently constructing and effectively learning from them is challenging: directly encoding r-digraphs does not scale well, and capturing query-dependent information in them is hard. We propose a variant of graph neural network, RED-GNN, to address these challenges. Specifically, RED-GNN uses dynamic programming to recursively encode multiple r-digraphs with shared edges, and a query-dependent attention mechanism to select the strongly correlated edges. We demonstrate that RED-GNN is not only efficient but also achieves significant performance gains over existing methods in both inductive and transductive reasoning tasks. Besides, the learned attention weights in RED-GNN exhibit interpretable evidence for KG reasoning.
- Published
- 2021
18. Efficient Low-Rank Semidefinite Programming With Robust Loss Functions.
- Author
-
Yao, Quanming, Yang, Hansi, Hu, En-Liang, and Kwok, James T.
- Subjects
ROBUST programming, SEMIDEFINITE programming, DATA corruption, SPARSE matrices, ALGORITHMS, MACHINE learning
- Abstract
In real-world applications, it is important for machine learning algorithms to be robust against data outliers or corruptions. In this paper, we focus on improving the robustness of a large class of learning algorithms that are formulated as low-rank semidefinite programming (SDP) problems. Traditional formulations use the square loss, which is notorious for being sensitive to outliers. We propose to replace it with more robust noise models, including the ℓ1-loss and other nonconvex losses. However, the resulting optimization problem becomes difficult, as the objective is no longer convex or smooth. To alleviate this problem, we design an efficient algorithm based on majorization-minimization. The crux is constructing a good optimization surrogate, and we show that this surrogate can be efficiently obtained by the alternating direction method of multipliers (ADMM). By properly monitoring ADMM's convergence, the proposed algorithm is empirically efficient and also theoretically guaranteed to converge to a critical point. Extensive experiments are performed on four machine learning applications using both synthetic and real-world datasets. Results show that the proposed algorithm is not only fast but also outperforms the state of the art.
- Published
- 2022
19. Tensorizing Subgraph Search in the Supernet
- Author
-
Yang, Hansi, Yao, Quanming, and Kwok, James
- Subjects
FOS: Computer and information sciences, Computer Science - Machine Learning, Machine Learning (cs.LG)
- Abstract
Recently, a special kind of graph, the supernet, which allows two nodes to be connected by multi-choice edges, has exhibited its power in neural architecture search (NAS) by finding better architectures for computer vision (CV) and natural language processing (NLP) tasks. In this paper, we observe that the design of such discrete architectures also appears in many other important learning tasks, e.g., logical chain inference in knowledge graphs (KGs) and meta-path discovery in heterogeneous information networks (HINs). Thus, we are motivated to generalize the supernet search problem to a broader horizon. However, none of the existing works are effective here, since the supernet topology is highly task-dependent and diverse. To address this issue, we propose to tensorize the supernet, i.e., to unify the subgraph search problems by a tensor formulation and encode the topology inside the supernet by a tensor network. We further propose an efficient algorithm that admits both stochastic and deterministic objectives to solve the search problem. Finally, we perform extensive experiments on diverse learning tasks: architecture design for CV, logic inference for KGs, and meta-path discovery for HINs. Empirical results demonstrate that our method leads to better performance and architectures.
- Published
- 2021
20. A Survey of Label-noise Representation Learning: Past, Present and Future
- Author
-
Han, Bo, Yao, Quanming, Liu, Tongliang, Niu, Gang, Tsang, Ivor W., Kwok, James T., and Sugiyama, Masashi
- Subjects
FOS: Computer and information sciences, Computer Science - Machine Learning, Machine Learning (cs.LG)
- Abstract
Classical machine learning implicitly assumes that the labels of the training data are sampled from a clean distribution, which can be too restrictive for real-world scenarios. However, statistical-learning-based methods may not train deep learning models robustly with such noisy labels. It is therefore urgent to design Label-Noise Representation Learning (LNRL) methods for robustly training deep models with noisy labels. To fully understand LNRL, we conduct a survey. We first give a formal definition of LNRL from the perspective of machine learning. Then, through the lens of learning theory and empirical studies, we examine why noisy labels affect deep models' performance. Guided by the theory, we categorize different LNRL methods into three directions. Under this unified taxonomy, we provide a thorough discussion of the pros and cons of the different categories. More importantly, we summarize the essential components of robust LNRL, which can spark new directions. Lastly, we propose possible research directions within LNRL, such as new datasets, instance-dependent LNRL, and adversarial LNRL. We also envision potential directions beyond LNRL, such as learning with feature noise, preference noise, domain noise, similarity noise, graph noise, and demonstration noise. The draft is kept updated; comments and suggestions are welcome.
- Published
- 2020
21. Non-Local Meets Global: An Iterative Paradigm for Hyperspectral Image Restoration.
- Author
-
He, Wei, Yao, Quanming, Li, Chao, Yokoya, Naoto, Zhao, Qibin, Zhang, Hongyan, and Zhang, Liangpei
- Subjects
IMAGE reconstruction, LIE groups, INPAINTING, NOISE control, TASK analysis
- Abstract
Non-local low-rank tensor approximation has been developed as a state-of-the-art method for hyperspectral image (HSI) restoration, which includes the tasks of denoising, compressed HSI reconstruction, and inpainting. Unfortunately, while its restoration performance benefits from more spectral bands, its runtime also substantially increases. In this paper, we claim that the HSI lies in a global spectral low-rank subspace, and that the spectral subspaces of each full-band patch group should lie in this global low-rank subspace. This motivates us to propose a unified paradigm combining spatial and spectral properties for HSI restoration. The proposed paradigm gains its performance from non-local spatial denoising and its low computational complexity from the exploration of a low-rank orthogonal basis. An efficient alternating minimization algorithm with rank adaptation is developed. It first solves a fidelity-term-related problem to update a latent input image, and then learns a low-dimensional orthogonal basis and the related reduced image from the latent input image. Subsequently, non-local low-rank denoising is used to refine the reduced image and the orthogonal basis iteratively. Finally, experiments on HSI denoising, compressed reconstruction, and inpainting tasks, with both simulated and real datasets, demonstrate its superiority with respect to state-of-the-art HSI restoration methods.
- Published
- 2022
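The global spectral low-rank claim in this abstract can be checked numerically: project a noisy unfolded HSI onto a learned low-dimensional orthogonal spectral basis. This toy example keeps only that step and omits the paper's non-local patch refinement; the sizes, rank, and noise level are made up.

```python
import numpy as np

rng = np.random.default_rng(0)
bands, pixels, k = 20, 500, 4

# Toy "clean" HSI unfolded to (bands x pixels) with spectral rank k, plus noise.
clean = rng.normal(size=(bands, k)) @ rng.normal(size=(k, pixels))
noisy = clean + 0.01 * rng.normal(size=clean.shape)

# Learn an orthogonal spectral basis E and the reduced image R, so that
# noisy ≈ E @ R; the paper refines R iteratively by non-local denoising.
U, s, Vt = np.linalg.svd(noisy, full_matrices=False)
E = U[:, :k]          # global low-rank orthogonal basis
R = E.T @ noisy       # reduced image
restored = E @ R
```

Even this bare projection removes the noise components outside the k-dimensional spectral subspace, which is where the claimed light computation comes from: the expensive non-local step only ever touches the small reduced image R.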
22. Guest Editorial: Automated Machine Learning.
- Author
-
Escalante, Hugo Jair, Yao, Quanming, Tu, Wei-Wei, Pillay, Nelishia, Qu, Rong, Yu, Yang, and Houlsby, Neil
- Subjects
MACHINE learning, ARTIFICIAL intelligence, DECISION support systems, DEEP learning, COMPUTER vision, SCIENCE journalism
- Abstract
An introduction is presented in which the editors discuss articles in the issue on topics including cutting-edge AutoML research, tree-structured covariance functions, and a comprehensive study on predicting runtime performance across different AutoML systems.
- Published
- 2021
23. FasTer: Fast Tensor Completion with Nonconvex Regularization
- Author
-
Yao, Quanming, Kwok, James T., and Han, Bo
- Subjects
FOS: Computer and information sciences, Computer Science - Machine Learning, Statistics - Machine Learning, Machine Learning (stat.ML), Machine Learning (cs.LG)
- Abstract
The low-rank tensor completion problem aims to recover a tensor from limited observations and has many real-world applications. Because it is easy to optimize, the convex overlapping nuclear norm has been popular for tensor completion. However, it over-penalizes the top singular values and leads to biased estimates. In this paper, we propose to use a nonconvex regularizer, which penalizes large singular values less, instead of the convex one for tensor completion. However, as the new regularizer is nonconvex and its components overlap with each other, existing algorithms are either too slow or suffer from huge memory costs. To address these issues, we develop an efficient and scalable algorithm for real-world problems, based on the proximal average (PA) algorithm. Compared with direct use of the PA algorithm, the proposed algorithm runs orders of magnitude faster and needs orders of magnitude less space. We further speed up the proposed algorithm with an acceleration technique, and show that convergence to critical points is still guaranteed. Experimental comparisons are made with various other tensor completion approaches. Empirical results show that the proposed algorithm is very fast and produces much better recovery performance.
- Published
- 2018
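The proximal average idea underlying this abstract can be sketched in one line: approximate the prox of an average of regularizers by the average of their individual proxes. The ℓ1 penalties below are a simple stand-in for the paper's overlapping nuclear norms on tensor unfoldings, whose individual proxes are similarly cheap.

```python
import numpy as np

def soft(x, lam):
    # Soft-thresholding: the proximal operator of lam * ||x||_1.
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)

def proximal_average(z, lams, step):
    # Proximal average approximation: instead of the (hard) prox of the
    # averaged regularizer, average the (easy) individual proxes.
    return np.mean([soft(z, step * lam) for lam in lams], axis=0)

out = proximal_average(np.array([2.0, 0.1]), lams=[1.0, 2.0], step=0.5)
```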
24. Online Convolutional Sparse Coding with Sample-Dependent Dictionary
- Author
-
Wang, Yaqing, Yao, Quanming, Kwok, James T., and Ni, Lionel M.
- Subjects
FOS: Computer and information sciences, Computer Vision and Pattern Recognition (cs.CV), Computer Science - Computer Vision and Pattern Recognition
- Abstract
Convolutional sparse coding (CSC) has been popularly used for learning shift-invariant dictionaries in image and signal processing. However, existing methods have limited scalability. In this paper, instead of convolving with a dictionary shared by all samples, we propose the use of a sample-dependent dictionary in which filters are obtained as linear combinations of a small set of base filters learned from the data. This added flexibility allows a large number of sample-dependent patterns to be captured, while the resulting model can still be efficiently learned by online learning. Extensive experimental results show that the proposed method outperforms existing CSC algorithms with significantly reduced time and space requirements. Accepted by ICML 2018.
- Published
- 2018
25. Fast Learning with Nonconvex L1-2 Regularization
- Author
-
Yao, Quanming, Kwok, James T., and Guo, Xiawei
- Subjects
FOS: Computer and information sciences, Statistics::Machine Learning, Computer Science - Learning, FOS: Mathematics, Computer Science - Numerical Analysis, Numerical Analysis (math.NA), Mathematics - Numerical Analysis, Machine Learning (cs.LG)
- Abstract
Convex regularizers are often used for sparse learning. They are easy to optimize, but can lead to inferior prediction performance. The difference of the $\ell_1$ and $\ell_2$ norms ($\ell_{1-2}$) has recently been proposed as a nonconvex regularizer. It yields better recovery than both the $\ell_0$ and $\ell_1$ regularizers on compressed sensing. However, how to efficiently optimize its learning problem has remained challenging. The main difficulty is that both the $\ell_1$ and $\ell_2$ norms in $\ell_{1-2}$ are non-differentiable, so existing optimization algorithms cannot be applied. In this paper, we show that a closed-form solution can be derived for the proximal step associated with this regularizer. We further extend the result to low-rank matrix learning and the total variation model. Experiments on both synthetic and real datasets show that the resulting accelerated proximal gradient algorithm is more efficient than other nonconvex optimization algorithms.
- Published
- 2016
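The regularizer itself is simple to state, and a quick numerical check shows why it favors sparse solutions: the penalty vanishes on 1-sparse vectors but is strictly positive on dense ones. The paper's closed-form proximal step is not reproduced here.

```python
import numpy as np

def l1_minus_l2(x):
    # The nonconvex l1-2 regularizer: ||x||_1 - ||x||_2.
    return np.abs(x).sum() - np.sqrt((x ** 2).sum())

one_sparse = np.array([0.0, 3.0, 0.0])   # penalty is exactly zero
dense = np.array([1.0, 1.0, 1.0])        # penalty is 3 - sqrt(3) > 0
```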
26. Large-Scale Low-Rank Matrix Learning with Nonconvex Regularizers.
- Author
-
Yao, Quanming, Kwok, James T., Wang, Taifeng, and Liu, Tie-Yan
- Subjects
LOW-rank matrices, COMPUTER vision, MULTIPLE correspondence analysis (Statistics), SPARSE matrices, NONCONVEX programming, APPLICATION software
- Abstract
Low-rank modeling has many important applications in computer vision and machine learning. While the matrix rank is often approximated by the convex nuclear norm, the use of nonconvex low-rank regularizers has demonstrated better empirical performance. However, the resulting optimization problem is much more challenging. Recent state-of-the-art methods require an expensive full SVD in each iteration. In this paper, we show that for many commonly used nonconvex low-rank regularizers, the singular values obtained from the proximal operator can be automatically thresholded. This allows the proximal operator to be efficiently approximated by the power method. We then develop a fast proximal algorithm and its accelerated variant with an inexact proximal step. It can be guaranteed that the squared distance between consecutive iterates converges at a rate of O(1/T), where T is the number of iterations. Furthermore, we show that the proposed algorithm can be parallelized, and the resulting algorithm achieves nearly linear speedup with respect to the number of threads. Extensive experiments are performed on matrix completion and robust principal component analysis. Significant speedup over the state of the art is observed.
- Published
- 2019
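For contrast with the nonconvex regularizers in this abstract, the convex nuclear-norm baseline has a well-known proximal operator, singular value thresholding, which shrinks every singular value by the same amount; the nonconvex regularizers studied in the paper shrink large singular values less. A minimal sketch of the baseline:

```python
import numpy as np

def svt(X, lam):
    # Singular value thresholding: the proximal operator of lam * ||X||_*.
    # Every singular value is shrunk by lam and clipped at zero, which is
    # the uniform shrinkage that nonconvex regularizers avoid.
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return U @ np.diag(np.maximum(s - lam, 0.0)) @ Vt

rng = np.random.default_rng(0)
X = rng.normal(size=(6, 4)) @ rng.normal(size=(4, 5))  # rank at most 4
Y = svt(X, lam=1.0)
```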
27. Learning of Generalized Low-Rank Models: A Greedy Approach
- Author
-
Yao, Quanming and Kwok, James T.
- Subjects
FOS: Computer and information sciences, Computer Science - Learning, Optimization and Control (math.OC), FOS: Mathematics, Computer Science - Numerical Analysis, Numerical Analysis (math.NA), Mathematics - Optimization and Control, Machine Learning (cs.LG)
- Abstract
Learning of low-rank matrices is fundamental to many machine learning applications. A state-of-the-art algorithm is the rank-one matrix pursuit (R1MP). However, it can only be used in matrix completion problems with the square loss. In this paper, we develop a more flexible greedy algorithm for generalized low-rank models whose optimization objective can be smooth or nonsmooth, general convex or strongly convex. The proposed algorithm has low per-iteration time complexity and fast convergence rate. Experimental results show that it is much faster than the state-of-the-art, with comparable or even better prediction performance.
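The rank-one matrix pursuit baseline that this abstract generalizes can be sketched for the square-loss matrix completion case. Everything below (function name, exact line search) is a simplified illustration, not the generalized algorithm of the paper:

```python
import numpy as np

def rank_one_pursuit(Y, mask, n_atoms=10):
    """Simplified greedy rank-one pursuit for matrix completion.

    At each step, add the top singular pair of the masked residual,
    with an exact line search for the step size (square loss only;
    illustrative sketch, not the paper's generalized algorithm).
    """
    X = np.zeros_like(Y)
    for _ in range(n_atoms):
        R = mask * (Y - X)                      # residual on observed entries
        U, s, Vt = np.linalg.svd(R, full_matrices=False)
        A = np.outer(U[:, 0], Vt[0])            # best rank-one atom
        Am = mask * A
        step = (R * Am).sum() / max((Am * Am).sum(), 1e-12)
        X = X + step * A
    return X
```

Each iteration only needs the leading singular pair of the residual, which is what keeps the per-iteration cost low.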
- Published
- 2016
28. Efficient Learning with a Family of Nonconvex Regularizers by Redistributing Nonconvexity
- Author
-
Yao, Quanming and Kwok, James T.
- Subjects
FOS: Computer and information sciences ,Computer Science - Learning ,Optimization and Control (math.OC) ,Statistics - Machine Learning ,FOS: Mathematics ,MathematicsofComputing_NUMERICALANALYSIS ,Machine Learning (stat.ML) ,Mathematics - Optimization and Control ,Machine Learning (cs.LG) - Abstract
The use of convex regularizers allows for easy optimization, though they often produce biased estimation and inferior prediction performance. Recently, nonconvex regularizers have attracted a lot of attention and outperformed convex ones. However, the resultant optimization problem is much harder. In this paper, for a large class of nonconvex regularizers, we propose to move the nonconvexity from the regularizer to the loss. The nonconvex regularizer is then transformed to a familiar convex regularizer, while the resultant loss function can still be guaranteed to be smooth. Learning with the convexified regularizer can be performed by existing efficient algorithms originally designed for convex regularizers (such as the proximal algorithm, Frank-Wolfe algorithm, alternating direction method of multipliers and stochastic gradient descent). Extensions are made when the convexified regularizer does not have a closed-form proximal step, and when the loss function is nonconvex and nonsmooth. Extensive experiments on a variety of machine learning application scenarios show that optimizing the transformed problem is much faster than running the state-of-the-art on the original problem., Journal version of previous conference paper appeared at ICML-2016 with same title
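The redistribution idea can be illustrated numerically with the capped-l1 regularizer. The split below is only a sketch under that one choice of regularizer (the paper covers a broader family and gives the conditions under which the augmented loss stays well-behaved):

```python
import numpy as np

# Illustration of "redistributing nonconvexity" for the capped-l1
# regularizer r(x) = lam * min(|x|, theta).  Writing
#   r(x) = lam*|x| + (r(x) - lam*|x|),
# the second (concave) part is folded into the loss, leaving a plain
# convex l1 regularizer that standard solvers handle directly.
lam, theta = 0.5, 1.0

def r(x):                 # nonconvex capped-l1 regularizer
    return lam * np.minimum(np.abs(x), theta)

def loss(x):              # some smooth loss; a quadratic for illustration
    return 0.5 * (x - 2.0) ** 2

def loss_aug(x):          # loss absorbing the concave part of r
    return loss(x) + (r(x) - lam * np.abs(x))

x = np.linspace(-3, 3, 7)
orig = loss(x) + r(x)                        # original nonconvex objective
transformed = loss_aug(x) + lam * np.abs(x)  # augmented loss + convex l1
```

By construction `orig` and `transformed` agree everywhere, so any algorithm for l1-regularized problems can be run on the transformed objective.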
- Published
- 2016
29. Accelerated and Inexact Soft-Impute for Large-Scale Matrix and Tensor Completion.
- Author
-
Yao, Quanming and Kwok, James T.
- Subjects
- *
LOW-rank matrices, *RECOMMENDER systems, *DATA mining, *SPARSE matrices, *MATRICES (Mathematics), *TECHNOLOGY convergence - Abstract
Matrix and tensor completion aim to recover a low-rank matrix / tensor from limited observations and have been commonly used in applications such as recommender systems and multi-relational data mining. A state-of-the-art matrix completion algorithm is Soft-Impute, which exploits the special "sparse plus low-rank" structure of the matrix iterates to allow efficient SVD in each iteration. Though Soft-Impute is a proximal algorithm, it is generally believed that acceleration destroys the special structure and is thus not useful. In this paper, we show that Soft-Impute can indeed be accelerated without compromising this structure. To further reduce the iteration time complexity, we propose an approximate singular value thresholding scheme based on the power method. Theoretical analysis shows that the proposed algorithm still enjoys the fast $O(1/T^2)$ convergence rate of accelerated proximal algorithms. We also extend the proposed algorithm to tensor completion with the scaled latent nuclear norm regularizer. We show that a similar "sparse plus low-rank" structure also exists, leading to low iteration complexity and a fast $O(1/T^2)$ convergence rate. Besides, the proposed algorithm can be further extended to nonconvex low-rank regularizers, which have better empirical performance than the convex nuclear norm regularizer. Extensive experiments demonstrate that the proposed algorithm is much faster than Soft-Impute and other state-of-the-art matrix and tensor completion algorithms.
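The "sparse plus low-rank" structure matters because it gives a fast matrix-vector product, which is all an iterative SVD routine such as the power method needs. A minimal numpy sketch (the COO layout and names here are assumptions, not the paper's code):

```python
import numpy as np

def splr_matvec(S_idx, S_val, U, V, b):
    """Multiply (S + U @ V.T) @ b without forming the dense matrix.

    The sparse part S is given in COO form: row/col index arrays plus
    a value array.  Cost is O((m+n)k) for the low-rank part plus
    O(nnz) for the sparse part, instead of O(mn) for a dense product.
    """
    rows, cols = S_idx
    out = U @ (V.T @ b)                     # low-rank part
    np.add.at(out, rows, S_val * b[cols])   # accumulate sparse part
    return out
```

Feeding this product into a power method yields the approximate singular value thresholding step without ever materializing the m-by-n iterate.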
- Published
- 2019
- Full Text
- View/download PDF
30. Millionaire: a hint-guided approach for crowdsourcing.
- Author
-
Han, Bo, Yao, Quanming, Pan, Yuangang, Tsang, Ivor W., Xiao, Xiaokui, Yang, Qiang, and Sugiyama, Masashi
- Subjects
CROWDSOURCING ,MACHINE learning ,GAME theory ,QUALITY control ,ENVIRONMENTAL psychology - Abstract
Modern machine learning is migrating to the era of complex models, which requires a plethora of well-annotated data. While crowdsourcing is a promising tool to achieve this goal, existing crowdsourcing approaches barely acquire a sufficient amount of high-quality labels. In this paper, motivated by the "Guess-with-Hints" answer strategy from the Millionaire game show, we introduce the hint-guided approach into crowdsourcing to deal with this challenge. Our approach encourages workers to get help from hints when they are unsure of questions. Specifically, we propose a hybrid-stage setting, consisting of the main stage and the hint stage. When workers face any uncertain question on the main stage, they are allowed to enter the hint stage and look up hints before making any answer. A unique payment mechanism that meets two important design principles for crowdsourcing is developed. Besides, the proposed mechanism further encourages high-quality workers to use hints less, which helps identify them and assign them larger payments. Experiments are performed on Amazon Mechanical Turk, which show that our approach ensures a sufficient number of high-quality labels with low expenditure and detects high-quality workers.
- Published
- 2019
- Full Text
- View/download PDF
31. Fast Low-Rank Matrix Learning with Nonconvex Regularization.
- Author
-
Yao, Quanming, Kwok, James T., and Zhong, Wenliang
- Published
- 2015
- Full Text
- View/download PDF
32. Efficient Group Learning with Hypergraph Partition in Multi-task Learning.
- Author
-
Yao, Quanming, Jiang, Xiubao, Gong, Mingming, You, Xinge, Liu, Yu, and Xu, Duanquan
- Published
- 2012
- Full Text
- View/download PDF
33. Property-Aware Relation Networks for Few-Shot Molecular Property Prediction.
- Author
-
Yao Q, Shen Z, Wang Y, and Dou D
- Abstract
Molecular property prediction plays a fundamental role in AI-aided drug discovery to identify candidate molecules, which is also essentially a few-shot problem due to lack of labeled data. In this paper, we propose Property-Aware Relation networks (PAR) to handle this problem. We first introduce a property-aware molecular encoder to transform the generic molecular embeddings to property-aware ones. Then, we design a query-dependent relation graph learning module to estimate the molecular relation graph and refine molecular embeddings w.r.t. the target property. Thus, the facts that both property-related information and relationships among molecules change across different properties are utilized to better learn and propagate molecular embeddings. Generally, PAR can be regarded as a combination of metric-based and optimization-based few-shot learning methods. We further extend PAR to Transferable PAR (T-PAR) to handle the distribution shift, which is common in drug discovery. The keys are joint sampling and relation graph learning schemes, which simultaneously learn molecular embeddings from both source and target domains. Extensive results on benchmark datasets show that PAR and T-PAR consistently outperform existing methods on few-shot and transferable few-shot molecular property prediction tasks, respectively. Besides, ablation and case studies are conducted to validate the rationality of our designs in PAR and T-PAR.
- Published
- 2024
- Full Text
- View/download PDF
34. Flow to Candidate: Temporal Knowledge Graph Reasoning With Candidate-Oriented Relational Graph.
- Author
-
Fan S, Fan G, Nie H, Yao Q, Liu Y, Li X, and Wang Z
- Abstract
Reasoning over temporal knowledge graphs (TKGs) is a challenging task that requires models to infer future events based on past facts. Currently, subgraph-based methods have become the state-of-the-art (SOTA) techniques for this task due to their superior capability to explore local information in knowledge graphs (KGs). However, while previous methods have been effective in capturing semantic patterns in TKGs, they struggle to capture more complex topological patterns. In contrast, path-based methods can efficiently capture relation paths between nodes and obtain relation patterns based on the order of relation connections, but subgraphs can retain much more information than a single path. Motivated by this observation, we propose a new subgraph-based approach to capture complex relational patterns. The method constructs candidate-oriented relational graphs to capture the local structure of TKGs and introduces a variant of a graph neural network model to learn the graph structure information between query-candidate pairs. In particular, we first design a prior directed temporal edge sampling method, which starts from the query node and generates multiple candidate-oriented relational graphs simultaneously. Next, we propose a recursive propagation architecture that can encode all relational graphs in the local structures in parallel. Additionally, we introduce a self-attention mechanism in the propagation architecture to capture the query's preference. Finally, we design a simple scoring function to calculate the candidate nodes' scores and generate the model's predictions. To validate our approach, we conduct extensive experiments on four benchmark datasets (ICEWS14, ICEWS18, ICEWS0515, and YAGO), which demonstrate that our proposed approach possesses stronger inference and faster convergence than the SOTA methods. In addition, our method provides a relational graph for each query-candidate pair, which offers interpretable evidence for TKG prediction results.
- Published
- 2024
- Full Text
- View/download PDF
35. Deep Factor Learning for Accurate Brain Neuroimaging Data Analysis on Discrimination for Structural MRI and Functional MRI.
- Author
-
Ke H, Chen D, Yao Q, Tang Y, Wu J, Monaghan J, Sowman P, and McAlpine D
- Subjects
- Humans, Parkinson Disease diagnostic imaging, Neural Networks, Computer, Algorithms, Image Interpretation, Computer-Assisted methods, Magnetic Resonance Imaging methods, Brain diagnostic imaging, Deep Learning, Neuroimaging methods, Attention Deficit Disorder with Hyperactivity diagnostic imaging, Attention Deficit Disorder with Hyperactivity physiopathology
- Abstract
Analysis of neuroimaging data (e.g., Magnetic Resonance Imaging, structural and functional MRI) plays an important role in monitoring brain dynamics and probing brain structures. Neuroimaging data are multi-featured and non-linear by nature, and it is a natural way to organise these data as tensors prior to performing automated analyses such as discrimination of neurological disorders like Parkinson's Disease (PD) and Attention Deficit Hyperactivity Disorder (ADHD). However, the existing approaches are often subject to performance bottlenecks (e.g., conventional feature extraction and deep learning based feature construction), as these can lose the structural information that correlates multiple data dimensions and/or demand excessive empirical and application-specific settings. This study proposes a Deep Factor Learning model on a Hilbert Basis tensor (namely, HB-DFL) to automatically derive latent low-dimensional and concise factors of tensors. This is achieved through the application of multiple Convolutional Neural Networks (CNNs) in a non-linear manner along all possible dimensions with no assumed a priori knowledge. HB-DFL leverages the Hilbert basis tensor to enhance the stability of the solution by regularizing the core tensor to allow any component in a certain domain to interact with any component in the other dimensions. The final multi-domain features are handled through another multi-branch CNN to achieve reliable classification, exemplified here using MRI discrimination as a typical case. A case study of MRI discrimination has been performed on public MRI datasets for discrimination of PD and ADHD. Results indicate that 1) HB-DFL outperforms the counterparts in terms of FIT, mSIR and stability (mSC and umSC) of factor learning; 2) HB-DFL identifies PD and ADHD with an accuracy significantly higher than state-of-the-art methods do. Overall, HB-DFL has significant potential for neuroimaging data analysis applications thanks to its stable, automatic construction of structural features.
- Published
- 2024
- Full Text
- View/download PDF
36. Searching to Exploit Memorization Effect in Deep Learning with Noisy Labels.
- Author
-
Yang H, Yao Q, Han B, and Kwok JT
- Abstract
Sample selection approaches are popular in robust learning from noisy labels. However, how to control the selection process properly so that deep networks can benefit from the memorization effect is a hard problem. In this paper, motivated by the success of automated machine learning (AutoML), we propose to control the selection process by bi-level optimization. Specifically, we parameterize the selection process by exploiting the general patterns of the memorization effect in the upper level, and then update these parameters using the prediction accuracy obtained from model training in the lower level. We further introduce semi-supervised learning algorithms to utilize noisy-labeled data as unlabeled data. To solve the bi-level optimization problem efficiently, we consider more information from the validation curvature by the Newton method and the cubic regularization method. We provide convergence analysis for both optimization methods. Results show that while both methods can converge to an (approximately) stationary point, the cubic regularization method can find better local optima than the Newton method in less time. Experiments on both benchmark and real-world data sets demonstrate that the proposed searching method can lead to significant improvements upon existing methods. Compared with existing AutoML approaches, our method is much more efficient at finding a good selection schedule.
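The memorization-effect heuristic being automated here can be sketched as small-loss sample selection: clean labels tend to incur small loss early in training, so only a fraction of low-loss samples is kept each step. The function below is an illustration (names invented); the keep-ratio schedule is what the paper actually parameterizes and searches:

```python
import numpy as np

def select_small_loss(losses, keep_ratio):
    """Return indices of the keep_ratio fraction with smallest loss.

    A minimal sketch of small-loss sample selection under noisy
    labels; in practice keep_ratio follows a schedule over training
    epochs, which is the object of the bi-level search.
    """
    n_keep = max(1, int(round(keep_ratio * len(losses))))
    return np.argsort(losses)[:n_keep]
```

A typical schedule starts with `keep_ratio` near 1 and decays it as training proceeds, since networks fit clean patterns before memorizing noise.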
- Published
- 2024
- Full Text
- View/download PDF
37. Searching a High Performance Feature Extractor for Text Recognition Network.
- Author
-
Zhang H, Yao Q, Kwok JT, and Bai X
- Abstract
The feature extractor plays a critical role in text recognition (TR), but customizing its architecture is relatively less explored due to expensive manual tweaking. In this article, inspired by the success of neural architecture search (NAS), we propose to search for suitable feature extractors. We design a domain-specific search space by exploring principles for having good feature extractors. The space includes a 3D-structured space for the spatial model and a transformer-based space for the sequential model. As the space is huge and complexly structured, no existing NAS algorithms can be applied. We propose a two-stage algorithm to effectively search in the space. In the first stage, we cut the space into several blocks and progressively train each block with the help of an auxiliary head. We introduce the latency constraint into the second stage and search for a sub-network from the trained supernet via natural gradient descent. In experiments, a series of ablation studies are performed to better understand the designed space, search algorithm, and searched architectures. We also compare the proposed method with various state-of-the-art ones on both hand-written and scene TR tasks. Extensive results show that our approach can achieve better recognition performance with less latency. Code is available at https://github.com/AutoML-Research/TREFE.
- Published
- 2023
- Full Text
- View/download PDF
38. Bilinear Scoring Function Search for Knowledge Graph Learning.
- Author
-
Zhang Y, Yao Q, and Kwok JT
- Abstract
Learning embeddings for entities and relations in knowledge graphs (KGs) has benefited many downstream tasks. In recent years, scoring functions, the crux of KG learning, have been human-designed to measure the plausibility of triples and capture different kinds of relations in KGs. However, as relations exhibit intricate patterns that are hard to infer before training, none of them consistently performs the best on benchmark tasks. In this paper, inspired by the recent success of automated machine learning (AutoML), we search for bilinear scoring functions for different KG tasks through AutoML techniques. However, it is non-trivial to explore domain-specific information here. We first set up a search space for AutoBLM by analyzing existing scoring functions. Then, we propose a progressive algorithm (AutoBLM) and an evolutionary algorithm (AutoBLM+), which are further accelerated by a filter and a predictor to deal with the domain-specific properties of KG learning. Finally, we perform extensive experiments on benchmarks in KG completion, multi-hop query, and entity classification tasks. Empirical results show that the searched scoring functions are KG-dependent, new to the literature, and outperform the existing scoring functions. AutoBLM+ is better than AutoBLM as the evolutionary algorithm can flexibly explore better structures within the same budget.
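A bilinear scoring function, and one familiar point in the search space (DistMult's diagonal relation matrix), can be sketched as follows. This is an illustration of the general form being searched over, not the paper's search space encoding:

```python
import numpy as np

def bilinear_score(h, R, t):
    """Generic bilinear score s(h, r, t) = h^T R t.

    The structure of the relation matrix R is what distinguishes
    known models: a diagonal R gives DistMult, while richer block
    patterns recover other scoring functions in the literature.
    """
    return h @ R @ t

def distmult_score(h, r, t):
    """DistMult: R = diag(r), one special case of the bilinear form."""
    return np.sum(h * r * t)
```

Searching over structured parameterizations of `R` is what lets the method interpolate between, and go beyond, such hand-designed special cases.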
- Published
- 2023
- Full Text
- View/download PDF
39. Scalable Online Convolutional Sparse Coding.
- Author
-
Wang Y, Yao Q, Kwok JT, and Ni LM
- Abstract
Convolutional sparse coding (CSC) improves sparse coding by learning a shift-invariant dictionary from the data. However, most existing CSC algorithms operate in the batch mode and are computationally expensive. In this paper, we alleviate this problem by online learning. The key is a reformulation of the CSC objective so that convolution can be handled easily in the frequency domain, and much smaller history matrices are needed. To solve the resultant optimization problem, we use the alternating direction method of multipliers (ADMM), and its subproblems have efficient closed-form solutions. Theoretical analysis shows that the learned dictionary converges to a stationary point of the optimization problem. Extensive experiments are performed on both the standard CSC benchmark data sets and much larger data sets such as ImageNet. Results show that the proposed algorithm outperforms the state-of-the-art batch and online CSC methods. It is more scalable, converges faster, and achieves better reconstruction performance.
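The frequency-domain reformulation rests on a standard fact: circular convolution becomes elementwise multiplication after an FFT, so per-frequency subproblems decouple. A minimal 1-D numpy sketch of that building block (the paper works on images, inside ADMM updates):

```python
import numpy as np

def circ_conv_fft(d, z):
    """Circular convolution of filter d with signal z via the FFT.

    Zero-pads d to the length of z; in the frequency domain the
    convolution is a simple elementwise product, which is what makes
    the CSC subproblems cheap to solve per frequency.
    """
    n = len(z)
    return np.real(np.fft.ifft(np.fft.fft(d, n) * np.fft.fft(z, n)))
```

In the full algorithm this trick is applied per filter and per frequency, so each ADMM subproblem reduces to small independent linear systems.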
- Published
- 2018
- Full Text
- View/download PDF