Author: "Cheng, James" - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Cheng, James"' showing total 752 results

1. HIGHT: Hierarchical Graph Tokenization for Graph-Language Alignment

Author: Chen, Yongqiang, Yao, Quanming, Zhang, Juzheng, Cheng, James, and Bian, Yatao
Subjects: Computer Science - Computation and Language, Computer Science - Machine Learning, Quantitative Biology - Quantitative Methods
Abstract: Recently there has been a surge of interest in extending the success of large language models (LLMs) to the graph modality, such as social networks and molecules. As LLMs are predominantly trained with 1D text data, most existing approaches adopt a graph neural network to represent a graph as a series of node tokens and feed these tokens to LLMs for graph-language alignment. Despite achieving some successes, existing approaches have overlooked the hierarchical structures that are inherent in graph data. In molecular graphs especially, the high-order structural information contains rich semantics of molecular functional groups, which encode crucial biochemical functionalities of the molecules. We establish a simple benchmark showing that neglecting the hierarchical information in graph tokenization will lead to subpar graph-language alignment and severe hallucination in generated outputs. To address this problem, we propose a novel strategy called HIerarchical GrapH Tokenization (HIGHT). HIGHT employs a hierarchical graph tokenizer that extracts and encodes the hierarchy of node, motif, and graph levels of informative tokens to improve the graph perception of LLMs. HIGHT also adopts an augmented graph-language supervised fine-tuning dataset, enriched with the hierarchical graph information, to further enhance the graph-language alignment. Extensive experiments on 7 molecule-centric benchmarks confirm the effectiveness of HIGHT in reducing hallucination by 40%, as well as yielding significant improvements in various molecule-language downstream tasks.
Comment: Preliminary version of an ongoing project: https://higraphllm.github.io/
Published: 2024

2. How Interpretable Are Interpretable Graph Neural Networks?

Author: Chen, Yongqiang, Bian, Yatao, Han, Bo, and Cheng, James
Subjects: Computer Science - Machine Learning, Statistics - Machine Learning
Abstract: Interpretable graph neural networks (XGNNs) are widely adopted in various scientific applications involving graph-structured data. Existing XGNNs predominantly adopt the attention-based mechanism to learn edge or node importance for extracting and making predictions with the interpretable subgraph. However, the representational properties and limitations of these methods remain inadequately explored. In this work, we present a theoretical framework that formulates interpretable subgraph learning with the multilinear extension of the subgraph distribution, coined as subgraph multilinear extension (SubMT). Extracting the desired interpretable subgraph requires an accurate approximation of SubMT, yet we find that the existing XGNNs can have a huge gap in fitting SubMT. Consequently, the SubMT approximation failure will lead to degenerated interpretability of the extracted subgraphs. To mitigate the issue, we design a new XGNN architecture called Graph Multilinear neT (GMT), which is provably more powerful in approximating SubMT. We empirically validate our theoretical findings on a number of graph classification benchmarks. The results demonstrate that GMT outperforms the state of the art by up to 10% in terms of both interpretability and generalizability across 12 regular and geometric graph benchmarks.
Comment: ICML 2024, 44 pages, 21 figures, 12 tables
Published: 2024

3. Beyond microroughness: novel approaches to navigate osteoblast activity on implant surfaces.

Author: Matsuura, Takanori, Komatsu, Keiji, Cheng, James, Park, Gunwoo, and Ogawa, Takahiro
Subjects: Bone and implant integration, Meso-structuring, Nanotechnology, Osseointegration, UV photofunctionalization, Osteoblasts, Humans, Surface Properties, Osseointegration, Dental Implants, Cell Differentiation, Cell Proliferation, Titanium, Osteogenesis
Abstract: Considering the biological activity of osteoblasts is crucial when devising new approaches to enhance the osseointegration of implant surfaces, as their behavior profoundly influences clinical outcomes. An established inverse correlation exists between osteoblast proliferation and their functional differentiation, which constrains the rapid generation of a significant amount of bone. Examining the surface morphology of implants reveals that roughened titanium surfaces facilitate rapid but thin bone formation, whereas smooth, machined surfaces promote greater volumes of bone formation albeit at a slower pace. Consequently, osteoblasts differentiate faster on roughened surfaces but at the expense of proliferation speed. Moreover, the attachment and initial spreading behavior of osteoblasts are notably compromised on microrough surfaces. This review delves into our current understanding and recent advances in nanonodular texturing, meso-scale texturing, and UV photofunctionalization as potential strategies to address the biological dilemma of osteoblast kinetics, aiming to improve the quality and quantity of osseointegration. We discuss how these topographical and physicochemical strategies effectively mitigate and even overcome the dichotomy of osteoblast behavior and the biological challenges posed by microrough surfaces. Indeed, surfaces modified with these strategies exhibit enhanced recruitment, attachment, spread, and proliferation of osteoblasts compared to smooth surfaces, while maintaining or amplifying the inherent advantage of cell differentiation. These technology platforms suggest promising avenues for the development of future implants.
Published: 2024

4. Nanofeatured surfaces in dental implants: contemporary insights and impending challenges.

Author: Komatsu, Keiji, Matsuura, Takanori, Cheng, James, Kido, Daisuke, Park, Wonhee, and Ogawa, Takahiro
Subjects: Bone-titanium integration, Dental and orthopedic implants, Microrough surface, Osseointegration, Osteoblasts, Dental Implants, Surface Properties, Humans, Osseointegration, Titanium, Nanostructures, Osteoblasts, Dental Prosthesis Design
Abstract: Dental implant therapy, established as standard-of-care nearly three decades ago with the advent of microrough titanium surfaces, revolutionized clinical outcomes through enhanced osseointegration. However, despite this pivotal advancement, challenges persist, including prolonged healing times, restricted clinical indications, plateauing success rates, and a notable incidence of peri-implantitis. This review explores the biological merits and constraints of microrough surfaces and evaluates the current landscape of nanofeatured dental implant surfaces, aiming to illuminate strategies for addressing existing impediments in implant therapy. Currently available nanofeatured dental implants incorporated nano-structures onto their predecessor microrough surfaces. While nanofeature integration into microrough surfaces demonstrates potential for enhancing early-stage osseointegration, it falls short of surpassing its predecessors in terms of osseointegration capacity. This discrepancy may be attributed, in part, to the inherent dichotomy kinetics of osteoblasts, wherein increased surface roughness by nanofeatures enhances osteoblast differentiation but concomitantly impedes cell attachment and proliferation. We also showcase a controllable, hybrid micro-nano titanium model surface and contrast it with commercially-available nanofeatured surfaces. Unlike the commercial nanofeatured surfaces, the controllable micro-nano hybrid surface exhibits superior potential for enhancing both cell differentiation and proliferation. Hence, present nanofeatured dental implants represent an evolutionary step from conventional microrough implants, yet they presently lack transformative capacity to surmount existing limitations. Further research and development endeavors are imperative to devise optimized surfaces rooted in fundamental science, thereby propelling technological progress in the field.
Published: 2024

5. Efficient private SCO for heavy-tailed data via averaged clipping

Author: Jin, Chenhan, Zhou, Kaiwen, Han, Bo, Cheng, James, and Zeng, Tieyong
Published: 2024

6. Discovery of the Hidden World with Large Language Models

Author: Liu, Chenxi, Chen, Yongqiang, Liu, Tongliang, Gong, Mingming, Cheng, James, Han, Bo, and Zhang, Kun
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Statistics - Methodology
Abstract: Revealing the underlying causal mechanisms in the real world is the key to the development of science. Despite the progress in the past decades, traditional causal discovery approaches (CDs) mainly rely on high-quality measured variables, usually given by human experts, to find causal relations. The lack of well-defined high-level variables in many real-world applications has already been a longstanding roadblock to a broader application of CDs. To this end, this paper presents Causal representatiOn AssistanT (COAT) that introduces large language models (LLMs) to bridge the gap. LLMs are trained on massive observations of the world and have demonstrated great capability in extracting key information from unstructured data. Therefore, it is natural to employ LLMs to assist with proposing useful high-level factors and crafting their measurements. Meanwhile, COAT also adopts CDs to find causal relations among the identified variables as well as to provide feedback to LLMs to iteratively refine the proposed factors. We show that LLMs and CDs are mutually beneficial and the constructed feedback provably also helps with the factor proposal. We construct and curate several synthetic and real-world benchmarks including analysis of human reviews and diagnosis of neuropathic and brain tumors, to comprehensively evaluate COAT. Extensive empirical results confirm the effectiveness and reliability of COAT with significant improvements.
Comment: NeurIPS 2024; Chenxi and Yongqiang contributed equally; 59 pages, 72 figures; Project page: https://causalcoat.github.io/
Published: 2024

7. Enhancing Neural Subset Selection: Integrating Background Information into Set Representations

Author: Xie, Binghui, Bian, Yatao, Zhou, Kaiwen, Chen, Yongqiang, Zhao, Peilin, Han, Bo, Meng, Wei, and Cheng, James
Subjects: Computer Science - Machine Learning
Abstract: Learning neural subset selection tasks, such as compound selection in AI-aided drug discovery, has become increasingly pivotal across diverse applications. The existing methodologies in the field primarily concentrate on constructing models that capture the relationship between utility function values and subsets within their respective supersets. However, these approaches tend to overlook the valuable information contained within the superset when utilizing neural networks to model set functions. In this work, we address this oversight by adopting a probabilistic perspective. Our theoretical findings demonstrate that when the target value is conditioned on both the input set and subset, it is essential to incorporate an invariant sufficient statistic of the superset into the subset of interest for effective learning. This ensures that the output value remains invariant to permutations of the subset and its corresponding superset, enabling identification of the specific superset from which the subset originated. Motivated by these insights, we propose a simple yet effective information aggregation module designed to merge the representations of subsets and supersets from a permutation invariance perspective. Comprehensive empirical evaluations across diverse tasks and datasets validate the enhanced efficacy of our approach over conventional methods, underscoring the practicality and potency of our proposed strategies in real-world contexts.
Published: 2024

8. SPT: Fine-Tuning Transformer-based Language Models Efficiently with Sparsification

Author: Gui, Yuntao, Yan, Xiao, Yin, Peiqi, Yang, Han, and Cheng, James
Subjects: Computer Science - Distributed, Parallel, and Cluster Computing, Computer Science - Artificial Intelligence
Abstract: Transformer-based large language models (e.g., BERT and GPT) achieve great success, and fine-tuning, which tunes a pre-trained model on a task-specific dataset, is the standard practice to utilize these models for downstream tasks. However, Transformer fine-tuning has a long running time and high memory consumption due to the large size of the models. We propose the SPT system to fine-tune Transformer-based models efficiently by introducing sparsity. We observe that the memory consumption of a Transformer mainly comes from storing attention weights for multi-head attention (MHA), and the majority of running time is spent on the feed-forward network (FFN). Thus, we design the sparse MHA module, which computes and stores only large attention weights to reduce memory consumption, and the routed FFN module, which dynamically activates a subset of model parameters for each token to reduce computation cost. We implement SPT on PyTorch and customize CUDA kernels to run sparse MHA and routed FFN efficiently. Specifically, we use product quantization to identify the large attention weights and compute attention via sparse matrix multiplication for sparse MHA. For routed FFN, we batch the tokens according to their activated model parameters for efficient computation. We conduct extensive experiments to evaluate SPT on various model configurations. The results show that SPT consistently outperforms well-optimized baselines, reducing the peak memory consumption by up to 50% and accelerating fine-tuning by up to 2.2x.
Comment: First submitted to VLDB November 1, 2023, rejection received on December 15, 2023
Published: 2023

9. Comparing the Effectiveness of Physical Exercise Intervention and Melatonin Supplement in Improving Sleep Quality in Children with ASD

Author: Tse, Andy C. Y., Lee, Paul H., Sit, Cindy H. P., Poon, Eric Tsz-chun, Sun, F., Pang, Chi-Ling, and Cheng, James C. H.
Published: 2024

10. Does Invariant Graph Learning via Environment Augmentation Learn Invariance?

Author: Chen, Yongqiang, Bian, Yatao, Zhou, Kaiwen, Xie, Binghui, Han, Bo, and Cheng, James
Subjects: Computer Science - Machine Learning, Statistics - Machine Learning
Abstract: Invariant graph representation learning aims to learn the invariance among data from different environments for out-of-distribution generalization on graphs. As the graph environment partitions are usually expensive to obtain, augmenting the environment information has become the de facto approach. However, the usefulness of the augmented environment information has never been verified. In this work, we find that it is fundamentally impossible to learn invariant graph representations via environment augmentation without additional assumptions. Therefore, we develop a set of minimal assumptions, including variation sufficiency and variation consistency, for feasible invariant graph learning. We then propose a new framework Graph invAriant Learning Assistant (GALA). GALA incorporates an assistant model that needs to be sensitive to graph environment changes or distribution shifts. The correctness of the proxy predictions by the assistant model hence can differentiate the variations in spurious subgraphs. We show that extracting the maximally invariant subgraph to the proxy predictions provably identifies the underlying invariant subgraph for successful OOD generalization under the established minimal assumptions. Extensive experiments on datasets including DrugOOD with various graph distribution shifts confirm the effectiveness of GALA.
Comment: NeurIPS 2023, 34 pages, 35 figures
Published: 2023

11. Towards out-of-distribution generalizable predictions of chemical kinetics properties

Author: Wang, Zihao, Chen, Yongqiang, Duan, Yang, Li, Weijiang, Han, Bo, Cheng, James, and Tong, Hanghang
Subjects: Computer Science - Machine Learning
Abstract: Machine Learning (ML) techniques have found applications in estimating chemical kinetic properties. With the accumulated drug molecules identified through "AI4drug discovery", the next imperative lies in AI-driven design for high-throughput chemical synthesis processes, with the estimation of properties of unseen reactions with unexplored molecules. To this end, the existing ML approaches for kinetics property prediction are required to be Out-Of-Distribution (OOD) generalizable. In this paper, we categorize the OOD kinetic property prediction into three levels (structure, condition, and mechanism), revealing unique aspects of such problems. Under this framework, we create comprehensive datasets to benchmark (1) the state-of-the-art ML approaches for reaction prediction in the OOD setting and (2) the state-of-the-art graph OOD methods in kinetics property prediction problems. Our results demonstrate the challenges and opportunities in OOD kinetics property prediction. Our datasets and benchmarks can further support research in this direction.
Comment: NeurIPS 2023 Workshop in AI for Scientific Discovery: From Theory to Practice. 11 pages, 1 figure, and 5 tables. Data and code can be found at https://github.com/zihao-wang/ReactionOOD
Published: 2023

12. Disparity in the Influence of Implant Provisional Materials on Human Gingival Fibroblasts with Different Phases of Cell Settlement: An In Vitro Study.

Author: Matsuura, Takanori, Stavrou, Stella, Komatsu, Keiji, Cheng, James, Pham, Alisa, Ferreira, Stephany, Baba, Tomomi, Chang, Ting-Ling, Chao, Denny, and Ogawa, Takahiro
Subjects: cytocompatibility, cytotoxicity, gingival fibroblasts, implant provisional materials, peri-implant soft tissue, Humans, Dental Materials, Research Design, Alloys, Drug-Related Side Effects and Adverse Reactions, Fibroblasts, Collagen
Abstract: The development of healthy peri-implant soft tissues is critical to achieving the esthetic and biological success of implant restorations throughout all stages of healing and tissue maturation, starting with provisionalization. The purpose of this study was to investigate the effects of eight different implant provisional materials on human gingival fibroblasts at various stages of cell settlement by examining initial cell attachment, growth, and function. Eight different specimens (bis-acrylic 1 and 2, flowable and bulk-fill composites, self-curing acrylic 1 and 2, milled acrylic, and titanium (Ti) alloy as a control) were fabricated in rectangular plates (n = 3). The condition of human gingival fibroblasts was divided into two groups: those in direct contact with test materials (contact experiment) and those in close proximity to test materials (proximity experiment). The proximity experiment was further divided into three phases: pre-settlement, early settlement, and late settlement. A cell culture insert containing each test plate was placed into a well where the cells were pre-cultured. The number of attached cells, cell proliferation, resistance to detachment, and collagen production were evaluated. In the contact experiment, bis-acrylics and composites showed detrimental effects on cells. The number of cells attached to milled acrylic and self-curing acrylic was relatively high, being approximately 70% and 20-30%, respectively, of that on Ti alloy. There was a significant difference between self-curing acrylic 1 and 2, even with the same curing modality. The cell retention ability also varied considerably among the materials. Although the detrimental effects were mitigated in the proximity experiment compared to the contact experiment, adverse effects on cell growth and collagen production remained significant during all phases of cell settlement for bis-acrylics and flowable composite. Specifically, the early settlement phase was not sufficient to significantly mitigate the material cytotoxicity. The flowable composite was consistently more cytotoxic than the bulk-fill composite. The harmful effects of the provisional materials on gingival fibroblasts vary considerably depending on the curing modality and compositions. Pre-settlement of cells mitigated the harmful effects, implying the susceptibility to material toxicity varies depending on the progress of wound healing and tissue condition. However, cell pre-settlement was not sufficient to fully restore the fibroblastic function to the normal level. Particularly, the adverse effects of bis-acrylics and flowable composite remained significant. Milled and self-curing acrylic exhibited excellent and acceptable biocompatibility, respectively, compared to other materials.
Published: 2023

13. Influence of Surface Contaminants and Hydrocarbon Pellicle on the Results of Wettability Measurements of Titanium.

Author: Cheng, James, Kim, Jeong, Park, Wonhee, Ogawa, Takahiro, Kido, Daisuke, Komatsu, Keiji, Suzumura, Toshikatsu, and Matsuura, Takanori
Subjects: UV photofunctionalization, bone integration, osseointegration, titanium implants, wettability
Abstract: Hydrophilicity/hydrophobicity, or wettability, is a key surface characterization metric for titanium used in dental and orthopedic implants. However, the effects of hydrophilicity/hydrophobicity on biological capability remain uncertain, and the relationships between surface wettability and other surface parameters, such as topography and chemistry, are poorly understood. The objective of this study was to identify determinants of surface wettability of titanium and establish the reliability and validity of the assessment. Wettability was evaluated as the contact angle of ddH2O. The age of titanium specimens significantly affected the contact angle, with acid-etched, microrough titanium surfaces becoming superhydrophilic immediately after surface processing, hydrophobic after 7 days, and hydrorepellent after 90 days. Similar age-related loss of hydrophilicity was also confirmed on sandblasted supra-micron rough surfaces so, regardless of surface topography, titanium surfaces eventually become hydrophobic or hydrorepellent with time. On age-standardized titanium, surface roughness increased the contact angle and hydrophobicity. UV treatment of titanium regenerated the superhydrophilicity regardless of age or surface roughness, with rougher surfaces becoming more superhydrophilic than machined surfaces after UV treatment. Conditioning titanium surfaces by autoclaving increased the hydrophobicity of already-hydrophobic surfaces, whereas conditioning with 70% alcohol and hydrating with water or saline attenuated pre-existing hydrophobicity. Conversely, when titanium surfaces were superhydrophilic like UV-treated ones, autoclaving and alcohol cleaning turned the surfaces hydrorepellent and hydrophobic, respectively. UV treatment recovered hydrophilicity without exception. In conclusion, surface roughness accentuates existing wettability and can either increase or decrease the contact angle. Titanium must be age-standardized when evaluating surface wettability. Surface conditioning techniques significantly but unpredictably affect existing wettability. These results imply that titanium wettability is significantly influenced by the hydrocarbon pellicle and other contaminants that inevitably accumulate. UV treatment may be an effective strategy to standardize wettability by making all titanium surfaces superhydrophilic, thereby allowing the characterization of individual surface topography and chemistry parameters in future studies.
Published: 2023

14. Understanding and Improving Feature Learning for Out-of-Distribution Generalization

Author: Chen, Yongqiang, Huang, Wei, Zhou, Kaiwen, Bian, Yatao, Han, Bo, and Cheng, James
Subjects: Computer Science - Machine Learning, Statistics - Machine Learning
Abstract: A common explanation for the failure of out-of-distribution (OOD) generalization is that the model trained with empirical risk minimization (ERM) learns spurious features instead of invariant features. However, several recent studies challenged this explanation and found that deep networks may have already learned sufficiently good features for OOD generalization. Despite the contradictions at first glance, we theoretically show that ERM essentially learns both spurious and invariant features, while ERM tends to learn spurious features faster if the spurious correlation is stronger. Moreover, when the ERM-learned features are fed to the OOD objectives, the invariant feature learning quality significantly affects the final OOD performance, as OOD objectives rarely learn new features. Therefore, ERM feature learning can be a bottleneck to OOD generalization. To alleviate the reliance, we propose Feature Augmented Training (FeAT), to force the model to learn richer features ready for OOD generalization. FeAT iteratively augments the model to learn new features while retaining the already learned features. In each round, the retention and augmentation operations are performed on different subsets of the training data that capture distinct features. Extensive experiments show that FeAT effectively learns richer features, thus boosting the performance of various OOD objectives.
Comment: Yongqiang Chen, Wei Huang, and Kaiwen Zhou contributed equally; NeurIPS 2023, 55 pages, 64 figures
Published: 2023

15. Follower Agnostic Methods for Stackelberg Games

Author: Maheshwari, Chinmay, Cheng, James, Sastry, S. Shankar, Ratliff, Lillian, and Mazumdar, Eric
Subjects: Mathematics - Optimization and Control, Computer Science - Artificial Intelligence, Computer Science - Computer Science and Game Theory, Mathematics - Dynamical Systems, 91A65
Abstract: In this paper, we present an efficient algorithm to solve online Stackelberg games, featuring multiple followers, in a follower-agnostic manner. Unlike previous works, our approach works even when the leader has no knowledge about the followers' utility functions or strategy space. Our algorithm introduces a unique gradient estimator, leveraging specially designed strategies to probe followers. In a departure from traditional assumptions of optimal play, we model followers' responses using a convergent adaptation rule, allowing for realistic and dynamic interactions. The leader constructs the gradient estimator solely based on observations of followers' actions. We provide non-asymptotic convergence rates to stationary points of the leader's objective and demonstrate asymptotic convergence to a local Stackelberg equilibrium. To validate the effectiveness of our algorithm, we use it to solve the problem of incentive design on a large-scale transportation network, showcasing its robustness even when the leader lacks access to followers' demand.
Comment: 31 pages
Published: 2023

16. DGI: Easy and Efficient Inference for GNNs

Author: Yin, Peiqi, Yan, Xiao, Zhou, Jinjing, Fu, Qiang, Cai, Zhenkun, Cheng, James, Tang, Bo, and Wang, Minjie
Subjects: Computer Science - Machine Learning
Abstract: While many systems have been developed to train Graph Neural Networks (GNNs), efficient model inference and evaluation remain to be addressed. For instance, using the widely adopted node-wise approach, model evaluation can account for up to 94% of the time in the end-to-end training process due to neighbor explosion, which means that a node accesses its multi-hop neighbors. On the other hand, layer-wise inference avoids the neighbor explosion problem by conducting inference layer by layer such that the nodes only need their one-hop neighbors in each layer. However, implementing layer-wise inference requires substantial engineering efforts because users need to manually decompose a GNN model into layers for computation and split workload into batches to fit into device memory. In this paper, we develop Deep Graph Inference (DGI) -- a system for easy and efficient GNN model inference, which automatically translates the training code of a GNN model for layer-wise execution. DGI is general for various GNN models and different kinds of inference requests, and supports out-of-core execution on large graphs that cannot fit in CPU memory. Experimental results show that DGI consistently outperforms layer-wise inference across different datasets and hardware settings, and the speedup can be over 1,000x.
Comment: 10 pages, 10 figures
Published: 2022

17. A Representation Learning Framework for Property Graphs

Author: Hou, Yifan, Chen, Hongzhi, Li, Changji, Cheng, James, and Yang, Ming-Chang
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence
Abstract: Representation learning on graphs, also called graph embedding, has demonstrated its significant impact on a series of machine learning applications such as classification, prediction and recommendation. However, existing work has largely ignored the rich information contained in the properties (or attributes) of both nodes and edges of graphs in modern applications, e.g., those represented by property graphs. To date, most existing graph embedding methods either focus on plain graphs with only the graph topology, or consider properties on nodes only. We propose PGE, a graph representation learning framework that incorporates both node and edge properties into the graph embedding procedure. PGE uses node clustering to assign biases to differentiate neighbors of a node and leverages multiple data-driven matrices to aggregate the property information of neighbors sampled based on a biased strategy. PGE adopts the popular inductive model for neighborhood aggregation. We provide detailed analyses on the efficacy of our method and validate the performance of PGE by showing how PGE achieves better embedding results than the state-of-the-art graph embedding methods on benchmark applications such as node classification and link prediction over real-world datasets.
Comment: This paper is published in KDD 2019. Code can be found here: https://github.com/yifan-h/PGE
Published: 2022

18. Measuring and Improving the Use of Graph Information in Graph Neural Networks

Author: Hou, Yifan, Zhang, Jian, Cheng, James, Ma, Kaili, Ma, Richard T. B., Chen, Hongzhi, and Yang, Ming-Chang
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence
Abstract: Graph neural networks (GNNs) have been widely used for representation learning on graph data. However, there is limited understanding of how much performance GNNs actually gain from graph data. This paper introduces a context-surrounding GNN framework and proposes two smoothness metrics to measure the quantity and quality of information obtained from graph data. A new GNN model, called CS-GNN, is then designed to improve the use of graph information based on the smoothness values of a graph. CS-GNN is shown to achieve better performance than existing methods in different types of real graphs.
Comment: This paper has been published in ICLR 2020. Code and Dataset can be found here: https://github.com/yifan-h/CS-GNN
Published: 2022

19. Efficient Private SCO for Heavy-Tailed Data via Averaged Clipping

Author: Jin, Chenhan, Zhou, Kaiwen, Han, Bo, Cheng, James, and Zeng, Tieyong
Subjects: Computer Science - Machine Learning, Mathematics - Optimization and Control
Abstract: We consider stochastic convex optimization for heavy-tailed data with the guarantee of being differentially private (DP). Most prior works on differentially private stochastic convex optimization for heavy-tailed data are either restricted to gradient descent (GD) or perform clipping multiple times on stochastic gradient descent (SGD), which is inefficient for large-scale problems. In this paper, we consider a one-time clipping strategy and provide principled analyses of its bias and private mean estimation. We establish new convergence results and improved complexity bounds for the proposed algorithm called AClipped-dpSGD for constrained and unconstrained convex problems. We also extend our convergence analysis to the strongly convex case and non-smooth case (which works for generalized smooth objectives with Hölder-continuous gradients). All the above results are guaranteed with a high probability for heavy-tailed data. Numerical experiments are conducted to justify the theoretical improvement.
Published: 2022

20. Pareto Invariant Risk Minimization: Towards Mitigating the Optimization Dilemma in Out-of-Distribution Generalization

Author: Chen, Yongqiang, Zhou, Kaiwen, Bian, Yatao, Xie, Binghui, Wu, Bingzhe, Zhang, Yonggang, Ma, Kaili, Yang, Han, Zhao, Peilin, Han, Bo, and Cheng, James
Subjects: Computer Science - Machine Learning, Statistics - Machine Learning
Abstract: Recently, there has been a growing surge of interest in enabling machine learning systems to generalize well to Out-of-Distribution (OOD) data. Most efforts are devoted to advancing optimization objectives that regularize models to capture the underlying invariance; however, there often are compromises in the optimization process of these OOD objectives: i) Many OOD objectives have to be relaxed as penalty terms of Empirical Risk Minimization (ERM) for the ease of optimization, while the relaxed forms can weaken the robustness of the original objective; ii) The penalty terms also require careful tuning of the penalty weights due to the intrinsic conflicts between ERM and OOD objectives. Consequently, these compromises could easily lead to suboptimal performance of either the ERM or OOD objective. To address these issues, we introduce a multi-objective optimization (MOO) perspective to understand the OOD optimization process, and propose a new optimization scheme called PAreto Invariant Risk Minimization (PAIR). PAIR improves the robustness of OOD objectives by cooperatively optimizing with other OOD objectives, thereby bridging the gaps caused by the relaxations. Then PAIR approaches a Pareto optimal solution that trades off the ERM and OOD objectives properly. Extensive experiments on challenging benchmarks, WILDS, show that PAIR alleviates the compromises and yields top OOD performances., Comment: ICLR 2023, 50 pages, 58 figures
Published: 2022

21. Fast and Reliable Evaluation of Adversarial Robustness with Minimum-Margin Attack

Author: Gao, Ruize, Wang, Jiongxiao, Zhou, Kaiwen, Liu, Feng, Xie, Binghui, Niu, Gang, Han, Bo, and Cheng, James
Subjects: Computer Science - Machine Learning, Computer Science - Cryptography and Security
Abstract: The AutoAttack (AA) has been the most reliable method to evaluate adversarial robustness when considerable computational resources are available. However, the high computational cost (e.g., 100 times more than that of the projected gradient descent attack) makes AA infeasible for practitioners with limited computational resources, and also hinders applications of AA in adversarial training (AT). In this paper, we propose a novel method, minimum-margin (MM) attack, to evaluate adversarial robustness quickly and reliably. Compared with AA, our method achieves comparable performance but only costs 3% of the computational time in extensive experiments. The reliability of our method lies in that we evaluate the quality of adversarial examples using the margin between two targets that can precisely identify the most adversarial example. The computational efficiency of our method lies in an effective Sequential TArget Ranking Selection (STARS) method, ensuring that the cost of the MM attack is independent of the number of classes. The MM attack opens a new way for evaluating adversarial robustness and provides a feasible and reliable way to generate high-quality adversarial examples in AT.
Published: 2022

22. Conditional Mitigation of Dental-Composite Material-Induced Cytotoxicity by Increasing the Cure Time.

Author: Matsuura, Takanori, Komatsu, Keiji, Choi, Kimberly, Suzumura, Toshikatsu, Cheng, James, Chang, Ting-Ling, Chao, Denny, and Ogawa, Takahiro
Subjects: composite, curing time, cytotoxicity, fibroblast, light-curing, Dental/Oral and Craniofacial Disease, Biomedical Engineering, Medical Biotechnology
Abstract: Light-cured composite resins are widely used in dental restorations to fill cavities and fabricate temporary crowns. After curing, the residual monomer is known to be cytotoxic, but increasing the curing time should improve biocompatibility. However, a biologically optimized cure time has not been determined through systematic experimentation. The objective of this study was to examine the behavior and function of human gingival fibroblasts cultured with flowable and bulk-fill composites cured for different periods of time, while considering the physical location of the cells with regard to the materials. Biological effects were separately evaluated for cells in direct contact with, and in close proximity to, the two composite materials. Curing time varied from the recommended 20 s to 40, 60, and 80 s. Pre-cured, milled-acrylic resin was used as a control. No cell survived and attached to or around the flowable composite, regardless of curing time. Some cells survived and attached close to (but not on) the bulk-fill composite, with survival increasing with a longer curing time, albeit to
Published: 2023

23. An Adaptive Incremental Gradient Method With Support for Non-Euclidean Norms

Author: Xie, Binghui, Jin, Chenhan, Zhou, Kaiwen, Cheng, James, and Meng, Wei
Subjects: Mathematics - Optimization and Control, Computer Science - Machine Learning
Abstract: Stochastic variance reduced methods have shown strong performance in solving finite-sum problems. However, these methods usually require the users to manually tune the step-size, which is time-consuming or even infeasible for some large-scale optimization tasks. To overcome the problem, we propose and analyze several novel adaptive variants of the popular SAGA algorithm. Eventually, we design a variant of Barzilai-Borwein step-size which is tailored for the incremental gradient method to ensure memory efficiency and fast convergence. We establish its convergence guarantees under general settings that allow non-Euclidean norms in the definition of smoothness and the composite objectives, which cover a broad range of applications in machine learning. We improve the analysis of SAGA to support non-Euclidean norms, which fills the void of existing work. Numerical experiments on standard datasets demonstrate a competitive performance of the proposed algorithm compared with existing variance-reduced methods and their adaptive variants.
Published: 2022

24. Understanding and Improving Graph Injection Attack by Promoting Unnoticeability

Author: Chen, Yongqiang, Yang, Han, Zhang, Yonggang, Ma, Kaili, Liu, Tongliang, Han, Bo, and Cheng, James
Subjects: Computer Science - Machine Learning, Computer Science - Cryptography and Security, Statistics - Machine Learning
Abstract: Recently Graph Injection Attack (GIA) emerges as a practical attack scenario on Graph Neural Networks (GNNs), where the adversary can merely inject few malicious nodes instead of modifying existing nodes or edges, i.e., Graph Modification Attack (GMA). Although GIA has achieved promising results, little is known about why it is successful and whether there is any pitfall behind the success. To understand the power of GIA, we compare it with GMA and find that GIA can be provably more harmful than GMA due to its relatively high flexibility. However, the high flexibility will also lead to great damage to the homophily distribution of the original graph, i.e., similarity among neighbors. Consequently, the threats of GIA can be easily alleviated or even prevented by homophily-based defenses designed to recover the original homophily. To mitigate the issue, we introduce a novel constraint -- homophily unnoticeability that enforces GIA to preserve the homophily, and propose Harmonious Adversarial Objective (HAO) to instantiate it. Extensive experiments verify that GIA with HAO can break homophily-based defenses and outperform previous GIA attacks by a significant margin. We believe our methods can serve for a more reliable evaluation of the robustness of GNNs., Comment: ICLR2022, 42 pages, 22 figures
Published: 2022

25. Learning Causally Invariant Representations for Out-of-Distribution Generalization on Graphs

Author: Chen, Yongqiang, Zhang, Yonggang, Bian, Yatao, Yang, Han, Ma, Kaili, Xie, Binghui, Liu, Tongliang, Han, Bo, and Cheng, James
Subjects: Computer Science - Machine Learning
Abstract: Despite recent success in using the invariance principle for out-of-distribution (OOD) generalization on Euclidean data (e.g., images), studies on graph data are still limited. Different from images, the complex nature of graphs poses unique challenges to adopting the invariance principle. In particular, distribution shifts on graphs can appear in a variety of forms such as attributes and structures, making it difficult to identify the invariance. Moreover, domain or environment partitions, which are often required by OOD methods on Euclidean data, could be highly expensive to obtain for graphs. To bridge this gap, we propose a new framework, called Causality Inspired Invariant Graph LeArning (CIGA), to capture the invariance of graphs for guaranteed OOD generalization under various distribution shifts. Specifically, we characterize potential distribution shifts on graphs with causal models, concluding that OOD generalization on graphs is achievable when models focus only on subgraphs containing the most information about the causes of labels. Accordingly, we propose an information-theoretic objective to extract the desired subgraphs that maximally preserve the invariant intra-class information. Learning with these subgraphs is immune to distribution shifts. Extensive experiments on 16 synthetic or real-world datasets, including a challenging setting -- DrugOOD, from AI-aided drug discovery, validate the superior OOD performance of CIGA., Comment: NeurIPS2022, 46 pages, 72 figures
Published: 2022

26. Human Gingival Fibroblast Growth and Function in Response to Laser-Induced Meso- and Microscale Hybrid Topography on Dental Implant Healing Abutments.

Author: Chao, Denny, Komatsu, Keiji, Matsuura, Takanori, Cheng, James, Stavrou, Stella C., Jayanetti, Jay, Chang, Ting-Ling, and Ogawa, Takahiro
Subjects: DENTAL implants, WOUND healing, IN vitro studies, DENTAL abutments, RESEARCH funding, GINGIVA, CELL proliferation, REVERSE transcriptase polymerase chain reaction, GLYCOPROTEINS, FIBROBLASTS, LASER therapy, CELL culture, FIBRONECTINS, GENES, COLLAGEN
Abstract: Purpose: To examine the behavior and function of human gingival fibroblasts growing on healing abutments with or without laser-textured topography. Materials and Methods: Human primary gingival connective tissue fibroblasts were cultured on healing abutments with machined or laser-textured (Laser-Lok, BioHorizons) surfaces. Cellular and molecular responses were evaluated by a variety of tests, including cell density assay (WST-1), fluorescence microscopy, real-time quantitative reverse-transcription polymerase chain reaction (qRT-PCR), and detachment tests. Results: The machined surface showed monodirectional traces and scratches from milling, whereas the laser-textured surface showed a distinct morphology consisting of monodirectional mesoscale channels (15-µm pitch) and woven oblique microridges formed within the channels. There were no differences in initial fibroblast attachment, subsequent fibroblast proliferation, or collagen production between the machined and laser-textured surfaces. Fibroblasts growing on a laser-textured surface were found to spread in one direction along the mesochannels, while cells growing on machined surfaces tended to spread randomly. Fibroblasts on laser-textured surfaces were 1.8 times more resistant to detachment than those on machined surfaces. An adhesive glycoprotein (fibronectin) and transmembrane adhesion linker gene (integrin β-1) were upregulated on laser-textured surfaces. Conclusions: The increased fibroblast retention, uniform growth, and increased transcription of cell adhesion proteins compellingly explain the enhanced tissue-level response to laser-created and hybrid-textured titanium surfaces. These results provide a cellular and molecular rationale for the tissue reaction to this unique surface; in addition, they support its extended use, from implants and healing abutments to diverse prosthetic components where enhanced soft tissue responses would be desirable. [ABSTRACT FROM AUTHOR]
Published: 2024

27. Accelerating Perturbed Stochastic Iterates in Asynchronous Lock-Free Optimization

Author: Zhou, Kaiwen, So, Anthony Man-Cho, and Cheng, James
Subjects: Mathematics - Optimization and Control, Computer Science - Machine Learning, Statistics - Machine Learning
Abstract: We show that stochastic acceleration can be achieved under the perturbed iterate framework (Mania et al., 2017) in asynchronous lock-free optimization, which leads to the optimal incremental gradient complexity for finite-sum objectives. We prove that our new accelerated method requires the same linear speed-up condition as the existing non-accelerated methods. Our core algorithmic discovery is a new accelerated SVRG variant with sparse updates. Empirical results are presented to verify our theoretical findings., Comment: 21 pages, 22 figures
Published: 2021

28. Solving the non-submodular network collapse problems via Decision Transformer

Author: Ma, Kaili, Yang, Han, Yang, Shanchao, Zhao, Kangfei, Li, Lanqing, Chen, Yongqiang, Huang, Junzhou, Cheng, James, and Rong, Yu
Published: 2024

29. PPS: Fair and efficient black-box scheduling for multi-tenant GPU clusters

Author: Ma, Kaihao, Cai, Zhenkun, Yan, Xiao, Zhang, Yang, Liu, Zhi, Feng, Yihui, Li, Chao, Lin, Wei, and Cheng, James
Published: 2024

30. Local Reweighting for Adversarial Training

Author: Gao, Ruize, Liu, Feng, Zhou, Kaiwen, Niu, Gang, Han, Bo, and Cheng, James
Subjects: Computer Science - Machine Learning
Abstract: Instances-reweighted adversarial training (IRAT) can significantly boost the robustness of trained models, where data being less/more vulnerable to the given attack are assigned smaller/larger weights during training. However, when tested on attacks different from the given attack simulated in training, the robustness may drop significantly (e.g., even worse than no reweighting). In this paper, we study this problem and propose our solution--locally reweighted adversarial training (LRAT). The rationale behind IRAT is that we do not need to pay much attention to an instance that is already safe under the attack. We argue that the safeness should be attack-dependent, so that for the same instance, its weight can change given different attacks based on the same model. Thus, if the attack simulated in training is mis-specified, the weights of IRAT are misleading. To this end, LRAT pairs each instance with its adversarial variants and performs local reweighting inside each pair, while performing no global reweighting--the rationale is to fit the instance itself if it is immune to the attack, but not to skip the pair, in order to passively defend different attacks in future. Experiments show that LRAT works better than both IRAT (i.e., global reweighting) and the standard AT (i.e., no reweighting) when trained with an attack and tested on different attacks.
Published: 2021

31. Real-World Sensitization and Tolerance Pattern to Seafood in Fish-Allergic Individuals

Author: Leung, Agnes S.Y., Wai, Christine Y.Y., Leung, Nicki Y.H., Ngai, Noelle Anne, Chua, Gilbert T., Ho, Po Ki, Lam, Ivan C.S., Cheng, James W.C.H., Chan, Oi Man, Li, Pui Fung, Au, Ann W.S., Leung, Chloris H.W., Cheng, Nam Sze, Tang, Man Fung, Fong, Brian L.Y., Rosa Duque, Jaime S., Wong, Joshua S.C., Luk, David C.K., Ho, Marco H.K., Kwan, Mike Y.W., Yau, Yat Sun, Lee, Qun Ui, Chan, Wai Hung, Wong, Gary W.K., and Leung, Ting Fan
Published: 2024

32. Cell Type-Specific Effects of Implant Provisional Restoration Materials on the Growth and Function of Human Fibroblasts and Osteoblasts

Author: Matsuura, Takanori, Komatsu, Keiji, Chao, Denny, Lin, Yu-Chun, Oberoi, Nimish, McCulloch, Kalie, Cheng, James, Orellana, Daniela, and Ogawa, Takahiro
Subjects: Biomedical and Clinical Sciences, Engineering, Dentistry, Biomedical Engineering, peri-implant tissue, provisional restoration, fibroblast, osteoblast, cytotoxicity
Abstract: Implant provisional restorations should ideally be nontoxic to the contacting and adjacent tissues, create anatomical and biophysiological stability, and establish a soft tissue seal through interactions between prosthesis, soft tissue, and alveolar bone. However, there is a lack of robust, systematic, and fundamental data to inform clinical decision making. Here we systematically explored the biocompatibility of fibroblasts and osteoblasts in direct contact with, or close proximity to, provisional restoration materials. Human gingival fibroblasts and osteoblasts were cultured on (the "contact" effect) and around (the "proximity" effect) various provisional materials: bis-acrylic, composite, self-curing acrylic, and milled acrylic, with titanium alloy as a bioinert control. The number of fibroblasts and osteoblasts surviving and attaching to and around the materials varied considerably depending on the material, with milled acrylic the most biocompatible and similar to titanium alloy, followed by self-curing acrylic, and little to no attachment on or around bis-acrylic and composite materials. Milled and self-curing acrylics similarly favored subsequent cellular proliferation and physiological functions such as collagen production in fibroblasts and alkaline phosphatase activity in osteoblasts. Neither fibroblasts nor osteoblasts showed a functional phenotype when cultured with bis-acrylic or composite. By calculating a biocompatibility index for each material, we established that fibroblasts were more resistant to the cytotoxicity induced by most materials in direct contact; however, the osteoblasts were more resistant when the materials were in close proximity. In conclusion, there was a wide variation in the cytotoxicity of implant provisional restoration materials ranging from lethal and tolerant to near inert, and this cytotoxicity may be received differently between the different cell types and depending on their physical interrelationships.
Published: 2022

33. Practical Schemes for Finding Near-Stationary Points of Convex Finite-Sums

Author: Zhou, Kaiwen, Tian, Lai, So, Anthony Man-Cho, and Cheng, James
Subjects: Mathematics - Optimization and Control, Computer Science - Machine Learning, Statistics - Machine Learning
Abstract: In convex optimization, the problem of finding near-stationary points has not been adequately studied yet, unlike other optimality measures such as the function value. Even in the deterministic case, the optimal method (OGM-G, due to Kim and Fessler (2021)) has just been discovered recently. In this work, we conduct a systematic study of algorithmic techniques for finding near-stationary points of convex finite-sums. Our main contributions are several algorithmic discoveries: (1) we discover a memory-saving variant of OGM-G based on the performance estimation problem approach (Drori and Teboulle, 2014); (2) we design a new accelerated SVRG variant that can simultaneously achieve fast rates for minimizing both the gradient norm and function value; (3) we propose an adaptively regularized accelerated SVRG variant, which does not require the knowledge of some unknown initial constants and achieves near-optimal complexities. We put an emphasis on the simplicity and practicality of the new schemes, which could facilitate future work., Comment: 29 pages, 4 figures
Published: 2021

34. G-Tran: Making Distributed Graph Transactions Fast

Author: Chen, Hongzhi, Li, Changji, Zheng, Chenguang, Huang, Chenghuan, Fang, Juncheng, Cheng, James, and Zhang, Jian
Subjects: Computer Science - Databases, Computer Science - Distributed, Parallel, and Cluster Computing
Abstract: Graph transaction processing raises many unique challenges such as random data access due to the irregularity of graph structures, low throughput and high abort rate due to the relatively large read/write sets in graph transactions. To address these challenges, we present G-Tran -- an RDMA-enabled distributed in-memory graph database with serializable and snapshot isolation support. First, we propose a graph-native data store to achieve good data locality and fast data access for transactional updates and queries. Second, G-Tran adopts a fully decentralized architecture that leverages RDMA to process distributed transactions with the MPP model, which can achieve high performance by utilizing all computing resources. In addition, we propose a new MV-OCC implementation with two optimizations to address the issue of large read/write sets in graph transactions. Extensive experiments show that G-Tran achieves competitive performance compared with other popular graph databases on benchmark workloads.
Published: 2021

35. Calibrating and Improving Graph Contrastive Learning

Author: Ma, Kaili, Yang, Haochen, Yang, Han, Chen, Yongqiang, and Cheng, James
Subjects: Computer Science - Machine Learning
Abstract: Graph contrastive learning algorithms have demonstrated remarkable success in various applications such as node classification, link prediction, and graph clustering. However, in unsupervised graph contrastive learning, some contrastive pairs may contradict the truths in downstream tasks and thus the decrease of losses on these pairs undesirably harms the performance in the downstream tasks. To assess the discrepancy between the prediction and the ground-truth in the downstream tasks for these contrastive pairs, we adapt the expected calibration error (ECE) to graph contrastive learning. The analysis of ECE motivates us to propose a novel regularization method, Contrast-Reg, to ensure that decreasing the contrastive loss leads to better performance in the downstream tasks. As a plug-in regularizer, Contrast-Reg effectively improves the performance of existing graph contrastive learning algorithms. We provide both theoretical and empirical results to demonstrate the effectiveness of Contrast-Reg in enhancing the generalizability of the Graph Neural Network(GNN) model and improving the performance of graph contrastive algorithms with different similarity definitions and encoder backbones across various downstream tasks.
Published: 2021

36. The item selection problem for user cold-start recommendation

Author: Meng, Yitong, Liu, Jie, Yan, Xiao, and Cheng, James
Subjects: Computer Science - Information Retrieval, Computer Science - Machine Learning, Computer Science - Social and Information Networks
Abstract: When a new user just signs up on a website, we usually have no information about him/her, i.e. no interaction with items, no user profile and no social links with other users. Under such circumstances, we still expect our recommender systems could attract the users at the first time so that the users decide to stay on the website and become active users. This problem falls into new user cold-start category and it is crucial to the development and even survival of a company. Existing works on user cold-start recommendation either require additional user efforts, e.g. setting up an interview process, or make use of side information [10] such as user demographics, locations, social relations, etc. However, users may not be willing to take the interview and side information on cold-start users is usually not available. Therefore, we consider a pure cold-start scenario where neither interaction nor side information is available and no user effort is required. Studying this setting is also important for the initialization of other cold-start solutions, such as initializing the first few questions of an interview.
Published: 2020

37. Rethinking Graph Regularization for Graph Neural Networks

Author: Yang, Han, Ma, Kaili, and Cheng, James
Subjects: Computer Science - Machine Learning, Statistics - Machine Learning
Abstract: The graph Laplacian regularization term is usually used in semi-supervised representation learning to provide graph structure information for a model $f(X)$. However, with the recent popularity of graph neural networks (GNNs), directly encoding graph structure $A$ into a model, i.e., $f(A, X)$, has become the more common approach. We show that graph Laplacian regularization brings little-to-no benefit to existing GNNs, and propose a simple but non-trivial variant of graph Laplacian regularization, called Propagation-regularization (P-reg), to boost the performance of existing GNN models. We provide formal analyses to show that P-reg not only infuses extra information (that is not captured by the traditional graph Laplacian regularization) into GNNs, but also has the capacity equivalent to an infinite-depth graph convolutional network. We demonstrate that P-reg can effectively boost the performance of existing GNN models on both node-level and graph-level tasks across many different datasets., Comment: AAAI2021
Published: 2020

38. Hierarchical Graph Matching Network for Graph Similarity Computation

Author: Xiu, Haibo, Yan, Xiao, Wang, Xiaoqiang, Cheng, James, and Cao, Lei
Subjects: Computer Science - Databases
Abstract: Graph edit distance / similarity is widely used in many tasks, such as graph similarity search, binary function analysis, and graph clustering. However, computing the exact graph edit distance (GED) or maximum common subgraph (MCS) between two graphs is known to be NP-hard. In this paper, we propose the hierarchical graph matching network (HGMN), which learns to compute graph similarity from data. HGMN is motivated by the observation that two similar graphs should also be similar when they are compressed into more compact graphs. HGMN utilizes multiple stages of hierarchical clustering to organize a graph into successively more compact graphs. At each stage, the earth mover distance (EMD) is adopted to obtain a one-to-one mapping between the nodes in two graphs (on which graph similarity is to be computed), and a correlation matrix is also derived from the embeddings of the nodes in the two graphs. The correlation matrices from all stages are used as input for a convolutional neural network (CNN), which is trained to predict graph similarity by minimizing the mean squared error (MSE). Experimental evaluation on 4 datasets in different domains and 4 performance metrics shows that HGMN consistently outperforms existing baselines in the accuracy of graph similarity approximation.
Published: 2020

39. Understanding Graph Neural Networks from Graph Signal Denoising Perspectives

Author: Fu, Guoji, Hou, Yifan, Zhang, Jian, Ma, Kaili, Kamhoua, Barakeel Fanseu, and Cheng, James
Subjects: Computer Science - Machine Learning, Statistics - Machine Learning
Abstract: Graph neural networks (GNNs) have attracted much attention because of their excellent performance on tasks such as node classification. However, there is inadequate understanding on how and why GNNs work, especially for node representation learning. This paper aims to provide a theoretical framework to understand GNNs, specifically, spectral graph convolutional networks and graph attention networks, from graph signal denoising perspectives. Our framework shows that GNNs are implicitly solving graph signal denoising problems: spectral graph convolutions work as denoising node features, while graph attentions work as denoising edge weights. We also show that a linear self-attention mechanism is able to compete with the state-of-the-art graph attention methods. Our theoretical results further lead to two new models, GSDN-F and GSDN-EF, which work effectively for graphs with noisy node features and/or noisy edges. We validate our theoretical findings and also the effectiveness of our new models by experiments on benchmark datasets. The source code is available at https://github.com/fuguoji/GSDN., Comment: 19 pages, 8 figures
Published: 2020

40. Boosting First-Order Methods by Shifting Objective: New Schemes with Faster Worst-Case Rates

Author: Zhou, Kaiwen, So, Anthony Man-Cho, and Cheng, James
Subjects: Computer Science - Machine Learning, Statistics - Machine Learning
Abstract: We propose a new methodology to design first-order methods for unconstrained strongly convex problems. Specifically, instead of tackling the original objective directly, we construct a shifted objective function that has the same minimizer as the original objective and encodes both the smoothness and strong convexity of the original objective in an interpolation condition. We then propose an algorithmic template for tackling the shifted objective, which can exploit such a condition. Following this template, we derive several new accelerated schemes for problems that are equipped with various first-order oracles and show that the interpolation condition allows us to vastly simplify and tighten the analysis of the derived methods. In particular, all the derived methods have faster worst-case convergence rates than their existing counterparts. Experiments on machine learning tasks are conducted to evaluate the new methods., Comment: NeurIPS 2020, 29 pages, 7 figures
Published: 2020

41. TensorOpt: Exploring the Tradeoffs in Distributed DNN Training with Auto-Parallelism

Author: Cai, Zhenkun, Ma, Kaihao, Yan, Xiao, Wu, Yidi, Huang, Yuzhen, Cheng, James, Su, Teng, and Yu, Fan
Subjects: Computer Science - Distributed, Parallel, and Cluster Computing, Computer Science - Machine Learning, Statistics - Machine Learning
Abstract: A good parallelization strategy can significantly improve the efficiency or reduce the cost for the distributed training of deep neural networks (DNNs). Recently, several methods have been proposed to find efficient parallelization strategies but they all optimize a single objective (e.g., execution time, memory consumption) and produce only one strategy. We propose FT, an efficient algorithm that searches for an optimal set of parallelization strategies to allow the trade-off among different objectives. FT can adapt to different scenarios by minimizing the memory consumption when the number of devices is limited and fully utilizing additional resources to reduce the execution time. For popular DNN models (e.g., vision, language), an in-depth analysis is conducted to understand the trade-offs among different objectives and their influence on the parallelization strategies. We also develop a user-friendly system, called TensorOpt, which allows users to run their distributed DNN training jobs without caring about the details of parallelization strategies. Experimental results show that FT runs efficiently and provides accurate estimation of runtime costs, and TensorOpt is more flexible in adapting to resource availability compared with existing frameworks.
Published: 2020

42. Self-Enhanced GNN: Improving Graph Neural Networks Using Model Outputs

Author: Yang, Han, Yan, Xiao, Dai, Xinyan, Chen, Yongqiang, and Cheng, James
Subjects: Computer Science - Machine Learning, Statistics - Machine Learning
Abstract: Graph neural networks (GNNs) have received much attention recently because of their excellent performance on graph-based tasks. However, existing research on GNNs focuses on designing more effective models without considering much about the quality of the input data. In this paper, we propose self-enhanced GNN (SEG), which improves the quality of the input data using the outputs of existing GNN models for better performance on semi-supervised node classification. As graph data consist of both topology and node labels, we improve input data quality from both perspectives. For topology, we observe that higher classification accuracy can be achieved when the ratio of inter-class edges (connecting nodes from different classes) is low and propose topology update to remove inter-class edges and add intra-class edges. For node labels, we propose training node augmentation, which enlarges the training set using the labels predicted by existing GNN models. SEG is a general framework that can be easily combined with existing GNN models. Experimental results validate that SEG consistently improves the performance of well-known GNN models such as GCN, GAT and SGC across different datasets.
Published: 2020

43. Convolutional Embedding for Edit Distance

Author: Dai, Xinyan, Yan, Xiao, Zhou, Kaiwen, Wang, Yuxuan, Yang, Han, and Cheng, James
Subjects: Computer Science - Databases, Computer Science - Machine Learning
Abstract: Edit-distance-based string similarity search has many applications such as spell correction, data de-duplication, and sequence alignment. However, computing edit distance is known to have high complexity, which makes string similarity search challenging for large datasets. In this paper, we propose a deep learning pipeline (called CNN-ED) that embeds edit distance into Euclidean distance for fast approximate similarity search. A convolutional neural network (CNN) is used to generate fixed-length vector embeddings for a dataset of strings and the loss function is a combination of the triplet loss and the approximation error. To justify our choice of using CNN instead of other structures (e.g., RNN) as the model, theoretical analysis is conducted to show that some basic operations in our CNN model preserve edit distance. Experimental results show that CNN-ED outperforms data-independent CGK embedding and RNN-based GRU embedding in terms of both accuracy and efficiency by a large margin. We also show that string similarity search can be significantly accelerated using CNN-based embeddings, sometimes by orders of magnitude., Comment: Accepted by the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval, 2020
Published: 2020
Full Text: View/download PDF

44. Norm-Explicit Quantization: Improving Vector Quantization for Maximum Inner Product Search

Author: Dai, Xinyan, Yan, Xiao, Ng, Kelvin K. W., Liu, Jie, and Cheng, James
Subjects: Computer Science - Information Retrieval, Computer Science - Machine Learning
Abstract: Vector quantization (VQ) techniques are widely used in similarity search for data compression, fast metric computation, etc. Originally designed for Euclidean distance, existing VQ techniques (e.g., PQ, AQ) explicitly or implicitly minimize the quantization error. In this paper, we present a new angle to analyze the quantization error, which decomposes the quantization error into norm error and direction error. We show that quantization errors in norm have a much higher influence on inner products than quantization errors in direction, and that a small quantization error does not necessarily lead to good performance in maximum inner product search (MIPS). Based on this observation, we propose norm-explicit quantization (NEQ), a general paradigm that improves existing VQ techniques for MIPS. NEQ quantizes the norms of items in a dataset explicitly to reduce errors in norm, which is crucial for MIPS. For the direction vectors, NEQ can simply reuse an existing VQ technique to quantize them without modification. We conducted extensive experiments on a variety of datasets and parameter configurations. The experimental results show that NEQ improves the performance of various VQ techniques for MIPS, including PQ, OPQ, RQ and AQ.
Published: 2019

45. Hyper-Sphere Quantization: Communication-Efficient SGD for Federated Learning

Author: Dai, Xinyan, Yan, Xiao, Zhou, Kaiwen, Yang, Han, Ng, Kelvin K. W., Cheng, James, and Fan, Yu
Subjects: Computer Science - Machine Learning, Computer Science - Information Retrieval, Statistics - Machine Learning
Abstract: The high cost of communicating gradients is a major bottleneck for federated learning, as the bandwidth of the participating user devices is limited. Existing gradient compression algorithms are mainly designed for data centers with high-speed networks and achieve $O(\sqrt{d} \log d)$ per-iteration communication cost at best, where $d$ is the size of the model. We propose hyper-sphere quantization (HSQ), a general framework that can be configured to achieve a continuum of trade-offs between communication efficiency and gradient accuracy. In particular, at the high compression ratio end, HSQ provides a low per-iteration communication cost of $O(\log d)$, which is favorable for federated learning. We prove the convergence of HSQ theoretically and show by experiments that HSQ significantly reduces the communication cost of model training without hurting convergence accuracy.
Published: 2019

46. Understanding and Improving Proximity Graph based Maximum Inner Product Search

Author: Liu, Jie, Yan, Xiao, Dai, Xinyan, Li, Zhirong, Cheng, James, and Yang, Ming-Chang
Subjects: Computer Science - Information Retrieval, Computer Science - Data Structures and Algorithms, Computer Science - Machine Learning
Abstract: The inner-product navigable small world graph (ip-NSW) represents the state-of-the-art method for approximate maximum inner product search (MIPS) and it can achieve an order of magnitude speedup over the fastest baseline. However, to date it is still unclear where its exceptional performance comes from. In this paper, we show that there is a strong norm bias in the MIPS problem, which means that the large norm items are very likely to become the result of MIPS. Then we explain the good performance of ip-NSW as matching the norm bias of the MIPS problem: large norm items have big in-degrees in the ip-NSW proximity graph and a walk on the graph spends the majority of computation on these items, thus effectively avoiding unnecessary computation on small norm items. Furthermore, we propose the ip-NSW+ algorithm, which improves ip-NSW by introducing an additional angular proximity graph. Search is first conducted on the angular graph to find the angular neighbors of a query and then the MIPS neighbors of these angular neighbors are used to initialize the candidate pool for search on the inner-product proximity graph. Experimental results show that ip-NSW+ consistently and significantly outperforms ip-NSW and provides more robust performance under different data distributions., Comment: 8 pages, 8 figures
Published: 2019

47. Elastic deep learning in multi-tenant GPU cluster

Author: Wu, Yidi, Ma, Kaihao, Yan, Xiao, Liu, Zhi, Cai, Zhenkun, Huang, Yuzhen, Cheng, James, Yuan, Han, and Yu, Fan
Subjects: Computer Science - Distributed, Parallel, and Cluster Computing
Abstract: We study how to support elasticity, i.e., the ability to dynamically adjust the parallelism (number of GPUs), for deep neural network (DNN) training. Elasticity can benefit multi-tenant GPU cluster management in many ways, e.g., achieving various scheduling objectives (e.g., job throughput, job completion time, GPU efficiency) according to cluster load variations, maximizing the use of transient idle resources, performance profiling, job migration, and straggler mitigation. However, existing parallelism adjustment strategies incur high overheads, which hinder many applications from making effective use of elasticity. We propose EDL to enable low-overhead elastic deep learning with a simple API. We present techniques that are necessary to reduce the overhead of parallelism adjustments, such as stop-free scaling and dynamic data pipeline. We also demonstrate that EDL can indeed bring significant benefits to the above-listed applications in GPU cluster management.
Published: 2019

48. Development and validation of assessment tools for food allergy–related knowledge and management confidence

Author: Leung, Agnes Sze Yin, Cheng, Nam Sze, Cheng, James Wesley Ching-hei, Pun, Jack, and Leung, Ting Fan
Published: 2023
Full Text: View/download PDF

49. PMD: An Optimal Transportation-based User Distance for Recommender Systems

Author: Meng, Yitong, Dai, Xinyan, Yan, Xiao, Cheng, James, Liu, Weiwen, Liao, Benben, Guo, Jun, and Chen, Guangyong
Subjects: Computer Science - Information Retrieval, Computer Science - Artificial Intelligence, Computer Science - Machine Learning, Statistics - Machine Learning
Abstract: Collaborative filtering, a widely-used recommendation technique, predicts a user's preference by aggregating the ratings from similar users. As a result, these measures cannot fully utilize the rating information and are not suitable for real world sparse data. To solve these issues, we propose a novel user distance measure named Preference Mover's Distance (PMD) which makes full use of all ratings made by each user. Our proposed PMD can properly measure the distance between a pair of users even if they have no co-rated items. We show that this measure can be cast as an instance of the Earth Mover's Distance, a well-studied transportation problem for which several highly efficient solvers have been developed. Experimental results show that PMD can help achieve higher recommendation accuracy than state-of-the-art methods, especially when training data is very sparse., Comment: This paper is accepted by European Conference on Information Retrieval (ECIR 2020)
Published: 2019

50. Pyramid: A General Framework for Distributed Similarity Search

Author: Deng, Shiyuan, Yan, Xiao, Ng, Kelvin K. W., Jiang, Chenyu, and Cheng, James
Subjects: Computer Science - Distributed, Parallel, and Cluster Computing
Abstract: Similarity search is a core component in various applications such as image matching, product recommendation and low-shot classification. However, single machine solutions are usually insufficient due to the large cardinality of modern datasets and the stringent latency requirements of on-line query processing. We present Pyramid, a general and efficient framework for distributed similarity search. Pyramid supports search with popular similarity functions including Euclidean distance, angular distance and inner product. Different from existing distributed solutions that are based on KD-tree or locality sensitive hashing (LSH), Pyramid is based on Hierarchical Navigable Small World graph (HNSW), which is the state-of-the-art similarity search algorithm on a single machine. To achieve high query processing throughput, Pyramid partitions a dataset into sub-datasets containing similar items for index building and assigns a query to only some of the sub-datasets for query processing. To provide the robustness required by production deployment, Pyramid also supports failure recovery and straggler mitigation. Pyramid offers a concise API such that users can easily use Pyramid without knowing the details of distributed execution. Experiments on large-scale datasets show that Pyramid produces quality results for similarity search, achieves high query processing throughput and is robust under node failures and stragglers.
Published: 2019
