Author: "Guan, Chaoyu" - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Guan, Chaoyu"' showing total 16 results

1. Evaluating the Generalization Ability of Quantized LLMs: Benchmark, Analysis, and Toolbox

Author: Liu, Yijun, Meng, Yuan, Wu, Fang, Peng, Shenhao, Yao, Hang, Guan, Chaoyu, Tang, Chen, Ma, Xinzhu, Wang, Zhi, and Zhu, Wenwu
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Computer Science - Computation and Language
Abstract: Large language models (LLMs) have exhibited exciting progress in multiple scenarios, while the huge computational demands hinder their deployments in lots of real-world applications. As an effective means to reduce memory footprint and inference cost, quantization also faces challenges in performance degradation at low bit-widths. Understanding the impact of quantization on LLM capabilities, especially the generalization ability, is crucial. However, the community's main focus remains on the algorithms and models of quantization, with insufficient attention given to whether the quantized models can retain the strong generalization abilities of LLMs. In this work, we fill this gap by providing a comprehensive benchmark suite for this research topic, including an evaluation system, detailed analyses, and a general toolbox. Specifically, based on the dominant pipeline in LLM quantization, we primarily explore the impact of calibration data distribution on the generalization of quantized LLMs and conduct the benchmark using more than 40 datasets within two main scenarios. Based on this benchmark, we conduct extensive experiments with two well-known LLMs (English and Chinese) and four quantization algorithms to investigate this topic in-depth, yielding several counter-intuitive and valuable findings, e.g., models quantized using a calibration set with the same distribution as the test data are not necessarily optimal. Besides, to facilitate future research, we also release a modular-designed toolbox, which decouples the overall pipeline into several separate components, e.g., base LLM module, dataset module, quantizer module, etc. and allows subsequent researchers to easily assemble their methods through a simple configuration. Our benchmark suite is publicly available at https://github.com/TsingmaoAI/MI-optimize
Published: 2024

2. Post-training Quantization for Text-to-Image Diffusion Models with Progressive Calibration and Activation Relaxing

Author: Tang, Siao, Wang, Xin, Chen, Hong, Guan, Chaoyu, Wu, Zewen, Tang, Yansong, and Zhu, Wenwu
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Machine Learning
Abstract: High computational overhead is a troublesome problem for diffusion models. Recent studies have leveraged post-training quantization (PTQ) to compress diffusion models. However, most of them only focus on unconditional models, leaving the quantization of widely-used pretrained text-to-image models, e.g., Stable Diffusion, largely unexplored. In this paper, we propose a novel post-training quantization method PCR (Progressive Calibration and Relaxing) for text-to-image diffusion models, which consists of a progressive calibration strategy that considers the accumulated quantization error across timesteps, and an activation relaxing strategy that improves the performance with negligible cost. Additionally, we demonstrate the previous metrics for text-to-image diffusion model quantization are not accurate due to the distribution gap. To tackle the problem, we propose a novel QDiffBench benchmark, which utilizes data in the same domain for more accurate evaluation. Besides, QDiffBench also considers the generalization performance of the quantized model outside the calibration dataset. Extensive experiments on Stable Diffusion and Stable Diffusion XL demonstrate the superiority of our method and benchmark. Moreover, we are the first to achieve quantization for Stable Diffusion XL while maintaining the performance. Comment: Accepted by ECCV 2024
Published: 2023

3. Lightweight Diffusion Models with Distillation-Based Block Neural Architecture Search

Author: Tang, Siao, Wang, Xin, Chen, Hong, Guan, Chaoyu, Tang, Yansong, and Zhu, Wenwu
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Machine Learning
Abstract: Diffusion models have recently shown remarkable generation ability, achieving state-of-the-art performance in many tasks. However, the high computational cost is still a troubling problem for diffusion models. To tackle this problem, we propose to automatically remove the structural redundancy in diffusion models with our proposed Diffusion Distillation-based Block-wise Neural Architecture Search (DiffNAS). Specifically, given a larger pretrained teacher, we leverage DiffNAS to search for the smallest architecture which can achieve on-par or even better performance than the teacher. Considering current diffusion models are based on UNet which naturally has a block-wise structure, we perform neural architecture search independently in each block, which largely reduces the search space. Different from previous block-wise NAS methods, DiffNAS contains a block-wise local search strategy and a retraining strategy with a joint dynamic loss. Concretely, during the search process, we select the best subnet block by block to avoid the unfairness brought by the global search strategy used in previous works. When retraining the searched architecture, we adopt a dynamic joint loss to maintain the consistency between supernet training and subnet retraining, which also provides informative objectives for each block and shortens the paths of gradient propagation. We demonstrate this joint loss can effectively improve model performance. We also prove the necessity of the dynamic adjustment of this loss. The experiments show that our method can achieve significant computational reduction, especially on latent diffusion models, with about 50% reduction in MACs and parameters.
Published: 2023

4. NeurIPS'22 Cross-Domain MetaDL competition: Design and baseline results

Author: Carrión-Ojeda, Dustin, Chen, Hong, Baz, Adrian El, Escalera, Sergio, Guan, Chaoyu, Guyon, Isabelle, Ullah, Ihsan, Wang, Xin, and Zhu, Wenwu
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Computer Science - Computer Vision and Pattern Recognition, Computer Science - Neural and Evolutionary Computing
Abstract: We present the design and baseline results for a new challenge in the ChaLearn meta-learning series, accepted at NeurIPS'22, focusing on "cross-domain" meta-learning. Meta-learning aims to leverage experience gained from previous tasks to solve new tasks efficiently (i.e., with better performance, little training data, and/or modest computational resources). While previous challenges in the series focused on within-domain few-shot learning problems, with the aim of learning efficiently N-way k-shot tasks (i.e., N class classification problems with k training examples), this competition challenges the participants to solve "any-way" and "any-shot" problems drawn from various domains (healthcare, ecology, biology, manufacturing, and others), chosen for their humanitarian and societal impact. To that end, we created Meta-Album, a meta-dataset of 40 image classification datasets from 10 domains, from which we carve out tasks with any number of "ways" (within the range 2-20) and any number of "shots" (within the range 1-20). The competition is with code submission, fully blind-tested on the CodaLab challenge platform. The code of the winners will be open-sourced, enabling the deployment of automated machine learning solutions for few-shot image classification across several domains. Comment: Meta-Knowledge Transfer/Communication in Different Systems, Sep 2022, Grenoble, France
Published: 2022

5. Lessons learned from the NeurIPS 2021 MetaDL challenge: Backbone fine-tuning without episodic meta-learning dominates for few-shot learning image classification

Author: Baz, Adrian El, Ullah, Ihsan, Alcobaça, Edesio, Carvalho, André C. P. L. F., Chen, Hong, Ferreira, Fabio, Gouk, Henry, Guan, Chaoyu, Guyon, Isabelle, Hospedales, Timothy, Hu, Shell, Huisman, Mike, Hutter, Frank, Liu, Zhengying, Mohr, Felix, Öztürk, Ekrem, van Rijn, Jan N., Sun, Haozhe, Wang, Xin, and Zhu, Wenwu
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Computer Science - Computer Vision and Pattern Recognition, Computer Science - Neural and Evolutionary Computing
Abstract: Although deep neural networks are capable of achieving performance superior to humans on various tasks, they are notorious for requiring large amounts of data and computing resources, restricting their success to domains where such resources are available. Metalearning methods can address this problem by transferring knowledge from related tasks, thus reducing the amount of data and computing resources needed to learn new tasks. We organize the MetaDL competition series, which provides opportunities for research groups all over the world to create and experimentally assess new meta-(deep)learning solutions for real problems. In this paper, authored collaboratively between the competition organizers and the top-ranked participants, we describe the design of the competition, the datasets, the best experimental results, as well as the top-ranked methods in the NeurIPS 2021 challenge, which attracted 15 active teams who made it to the final phase (by outperforming the baseline), making over 100 code submissions during the feedback phase. The solutions of the top participants have been open-sourced. The lessons learned include that learning good representations is essential for effective transfer learning. Comment: version 2 is the correct version, including supplementary material at the end
Published: 2022

6. AutoGL: A Library for Automated Graph Learning

Author: Zhang, Ziwei, Qin, Yijian, Zhang, Zeyang, Guan, Chaoyu, Cai, Jie, Chang, Heng, Jiang, Jiyan, Li, Haoyang, Sun, Zixin, Xie, Beini, Yao, Yang, Zhang, Yipeng, Wang, Xin, and Zhu, Wenwu
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence
Abstract: Recent years have witnessed an upsurge in research interests and applications of machine learning on graphs. However, manually designing the optimal machine learning algorithms for different graph datasets and tasks is inflexible, labor-intensive, and requires expert knowledge, limiting its adaptivity and applicability. Automated machine learning (AutoML) on graphs, aiming to automatically design the optimal machine learning algorithm for a given graph dataset and task, has received considerable attention. However, none of the existing libraries can fully support AutoML on graphs. To fill this gap, we present Automated Graph Learning (AutoGL), the first dedicated library for automated machine learning on graphs. AutoGL is open-source, easy to use, and flexible to be extended. Specifically, we propose a three-layer architecture, consisting of backends to interface with devices, a complete automated graph learning pipeline, and supported graph applications. The automated machine learning pipeline further contains five functional modules: auto feature engineering, neural architecture search, hyper-parameter optimization, model training, and auto ensemble, covering the majority of existing AutoML methods on graphs. For each module, we provide numerous state-of-the-art methods and flexible base classes and APIs, which allow easy usage and customization. We further provide experimental results to showcase the usage of our AutoGL library. We also present AutoGL-light, a lightweight version of AutoGL to facilitate customizing pipelines and enriching applications, as well as benchmarks for graph neural architecture search. The codes of AutoGL are publicly available at https://github.com/THUMNLab/AutoGL. Comment: Extended version; initial version published at ICLR 2021 Workshop on Geometrical and Topological Representation Learning
Published: 2021

7. MetaDelta: A Meta-Learning System for Few-shot Image Classification

Author: Chen, Yudong, Guan, Chaoyu, Wei, Zhikun, Wang, Xin, and Zhu, Wenwu
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence
Abstract: Meta-learning aims at learning quickly on novel tasks with limited data by transferring generic experience learned from previous tasks. Naturally, few-shot learning has been one of the most popular applications for meta-learning. However, existing meta-learning algorithms rarely consider the time and resource efficiency or the generalization capacity for unknown datasets, which limits their applicability in real-world scenarios. In this paper, we propose MetaDelta, a novel practical meta-learning system for few-shot image classification. MetaDelta consists of two core components: i) multiple meta-learners supervised by a central controller to ensure efficiency, and ii) a meta-ensemble module in charge of integrated inference and better generalization. In particular, each meta-learner in MetaDelta is composed of a unique pretrained encoder fine-tuned by batch training and a parameter-free decoder used for prediction. MetaDelta ranks first in the final phase in the AAAI 2021 MetaDL Challenge (https://competitions.codalab.org/competitions/26638), demonstrating the advantages of our proposed system. The codes are publicly available at https://github.com/Frozenmad/MetaDelta.
Published: 2021

8. Semantic Role Labeling with Associated Memory Network

Author: Guan, Chaoyu, Cheng, Yuhao, and Zhao, Hai
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence, Computer Science - Machine Learning
Abstract: Semantic role labeling (SRL) is a task to recognize all the predicate-argument pairs of a sentence, which has been in a performance improvement bottleneck after a series of latest works were presented. This paper proposes a novel syntax-agnostic SRL model enhanced by the proposed associated memory network (AMN), which makes use of inter-sentence attention of label-known associated sentences as a kind of memory to further enhance dependency-based SRL. In detail, we use sentences and their labels from the training dataset as an associated memory cue to help label the target sentence. Furthermore, we compare several strategies for selecting associated sentences and several label merging methods in AMN to find and utilize the labels of associated sentences while attending to them. By leveraging the attentive memory from known training data, our full model reaches state-of-the-art on CoNLL-2009 benchmark datasets for the syntax-agnostic setting, showing a new effective research line of SRL enhancement other than exploiting external resources such as well pre-trained language models. Comment: Published at NAACL 2019; this is the camera-ready version; code is available at https://github.com/Frozenmad/AMN_SRL
Published: 2019
Full Text: View/download PDF

9. Post-training Quantization with Progressive Calibration and Activation Relaxing for Text-to-Image Diffusion Models

Author: Tang, Siao, Wang, Xin, Chen, Hong, Guan, Chaoyu, Wu, Zewen, Tang, Yansong, and Zhu, Wenwu
Abstract: Diffusion models have achieved great success due to their remarkable generation ability. However, their high computational overhead is still a troublesome problem. Recent studies have leveraged post-training quantization (PTQ) to compress diffusion models. However, most of them only focus on unconditional models, leaving the quantization of widely used large pretrained text-to-image models, e.g., Stable Diffusion, largely unexplored. In this paper, we propose a novel post-training quantization method PCR (Progressive Calibration and Relaxing) for text-to-image diffusion models, which consists of a progressive calibration strategy that considers the accumulated quantization error across timesteps, and an activation relaxing strategy that improves the performance with negligible cost. Additionally, we demonstrate the previous metrics for text-to-image diffusion model quantization are not accurate due to the distribution gap. To tackle the problem, we propose a novel QDiffBench benchmark, which utilizes data in the same domain for more accurate evaluation. Besides, QDiffBench also considers the generalization performance of the quantized model outside the calibration dataset. Extensive experiments on Stable Diffusion and Stable Diffusion XL demonstrate the superiority of our method and benchmark. Moreover, we are the first to achieve quantization for Stable Diffusion XL while maintaining the performance.
Published: 2023

10. Curriculum-NAS: Curriculum Weight-Sharing Neural Architecture Search

Author: Zhou, Yuwei, Wang, Xin, Chen, Hong, Duan, Xuguang, Guan, Chaoyu, and Zhu, Wenwu
Published: 2022
Full Text: View/download PDF

11. Hardware-Aware Neural Architecture Search

Author: Wang, Xin, Yao, Yang, Jiang, Yuhang, Guan, Chaoyu, and Zhu, Wenwu
Published: 2022
Full Text: View/download PDF

12. Multimodal Continual Graph Learning with Neural Architecture Search

Author: Cai, Jie, Wang, Xin, Guan, Chaoyu, Tang, Yateng, Xu, Jin, Zhong, Bin, and Zhu, Wenwu
Published: 2022
Full Text: View/download PDF

13. Lessons learned from the NeurIPS 2021 MetaDL challenge: Backbone fine-tuning without episodic meta-learning dominates for few-shot learning image classification

Author: Baz, Adrian El, Ullah, Ihsan, Alcobaça, Edesio, Carvalho, André C. P. L. F., Chen, Hong, Ferreira, Fabio, Gouk, Henry, Guan, Chaoyu, Guyon, Isabelle, Hospedales, Timothy, Hu, Shell, Huisman, Mike, Hutter, Frank, Liu, Zhengying, Mohr, Felix, Öztürk, Ekrem, van Rijn, Jan N., Sun, Haozhe, Wang, Xin, and Zhu, Wenwu
Affiliations: Laboratoire Interdisciplinaire des Sciences du Numérique (LISN), Institut National de Recherche en Informatique et en Automatique (Inria), CentraleSupélec, Université Paris-Saclay, Centre National de la Recherche Scientifique (CNRS), TAckling the Underspecified (TAU), and Inria Saclay - Ile de France
Subjects: FOS: Computer and information sciences, Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Computer Vision and Pattern Recognition (cs.CV), Computer Science - Computer Vision and Pattern Recognition, Computer Science - Neural and Evolutionary Computing, [INFO.INFO-CV]Computer Science [cs]/Computer Vision and Pattern Recognition [cs.CV], [INFO.INFO-NE]Computer Science [cs]/Neural and Evolutionary Computing [cs.NE], Machine Learning (cs.LG), [INFO.INFO-AI]Computer Science [cs]/Artificial Intelligence [cs.AI], Automated Machine Learning, Artificial Intelligence (cs.AI), meta-learning, [INFO.INFO-LG]Computer Science [cs]/Machine Learning [cs.LG], challenge, few-shot learning, Neural and Evolutionary Computing (cs.NE), competition
Abstract: Although deep neural networks are capable of achieving performance superior to humans on various tasks, they are notorious for requiring large amounts of data and computing resources, restricting their success to domains where such resources are available. Metalearning methods can address this problem by transferring knowledge from related tasks, thus reducing the amount of data and computing resources needed to learn new tasks. We organize the MetaDL competition series, which provides opportunities for research groups all over the world to create and experimentally assess new meta-(deep)learning solutions for real problems. In this paper, authored collaboratively between the competition organizers and the top-ranked participants, we describe the design of the competition, the datasets, the best experimental results, as well as the top-ranked methods in the NeurIPS 2021 challenge, which attracted 15 active teams who made it to the final phase (by outperforming the baseline), making over 100 code submissions during the feedback phase. The solutions of the top participants have been open-sourced. The lessons learned include that learning good representations is essential for effective transfer learning. Comment: version 2 is the correct version, including supplementary material at the end
Published: 2021

14. AutoGL: A Library for Automated Graph Learning

Author: Guan, Chaoyu, Zhang, Ziwei, Li, Haoyang, Chang, Heng, Zhang, Zeyang, Qin, Yijian, Jiang, Jiyan, Wang, Xin, and Zhu, Wenwu
Subjects: FOS: Computer and information sciences, Computer Science - Machine Learning, Artificial Intelligence (cs.AI), Computer Science - Artificial Intelligence, Machine Learning (cs.LG)
Abstract: Recent years have witnessed an upsurge of research interests and applications of machine learning on graphs. Automated machine learning (AutoML) on graphs is on the horizon to automatically design the optimal machine learning algorithm for a given graph task. However, none of the existing libraries can fully support AutoML on graphs. To fill this gap, we present Automated Graph Learning (AutoGL), the first library for automated machine learning on graphs. AutoGL is open-source, easy to use, and flexible to be extended. Specifically, we propose an automated machine learning pipeline for graph data containing four modules: auto feature engineering, model training, hyper-parameter optimization, and auto ensemble. For each module, we provide numerous state-of-the-art methods and flexible base classes and APIs, which allow easy customization. We further provide experimental results to showcase the usage of our AutoGL library. Comment: *Equal contributions. 8 pages, 1 figure, accepted at ICLR 2021 Workshop on Geometrical and Topological Representation Learning
Published: 2021
Full Text: View/download PDF

15. Memory Network for Linguistic Structure Parsing

Author: Li, Zuchao, Guan, Chaoyu, Zhao, Hai, Wang, Rui, Parnow, Kevin, and Zhang, Zhuosheng
Published: 2020
Full Text: View/download PDF

16. Semantic Role Labeling with Associated Memory Network

Author: Guan, Chaoyu, Cheng, Yuhao, and Zhao, Hai
Published: 2019
Full Text: View/download PDF
