Author: "Guo, Zhicheng" / Database: arXiv - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Guo, Zhicheng"' showing total 16 results

Start Over Author "Guo, Zhicheng" Database arXiv

16 results on '"Guo, Zhicheng"'

1. StepTool: A Step-grained Reinforcement Learning Framework for Tool Learning in LLMs

Author: Yu, Yuanqing, Wang, Zhefan, Ma, Weizhi, Guo, Zhicheng, Zhan, Jingtao, Wang, Shuai, Wu, Chuhan, Guo, Zhiqiang, and Zhang, Min
Subjects: Computer Science - Computation and Language
Abstract: Despite having powerful reasoning and inference capabilities, Large Language Models (LLMs) still need external tools to acquire real-time information retrieval or domain-specific expertise to solve complex tasks, which is referred to as tool learning. Existing tool learning methods primarily rely on tuning with expert trajectories, focusing on token-sequence learning from a linguistic perspective. However, there are several challenges: 1) imitating static trajectories limits their ability to generalize to new tasks. 2) even expert trajectories can be suboptimal, and better solution paths may exist. In this work, we introduce StepTool, a novel step-grained reinforcement learning framework to improve tool learning in LLMs. It consists of two components: Step-grained Reward Shaping, which assigns rewards at each tool interaction based on tool invocation success and its contribution to the task, and Step-grained Optimization, which uses policy gradient methods to optimize the model in a multi-step manner. Experimental results demonstrate that StepTool significantly outperforms existing methods in multi-step, tool-based tasks, providing a robust solution for complex task environments. Codes are available at https://github.com/yuyq18/StepTool., Comment: Ongoning Work
Published: 2024

2. Is It Good Data for Multilingual Instruction Tuning or Just Bad Multilingual Evaluation for Large Language Models?

Author: Chen, Pinzhen, Yu, Simon, Guo, Zhicheng, and Haddow, Barry
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence
Abstract: Multilingual large language models are designed, claimed, and expected to cater to speakers of varied languages. We hypothesise that the current practices of fine-tuning and evaluating these models may not perfectly align with this objective owing to a heavy reliance on translation, which cannot cover language-specific knowledge but can introduce translation defects. It remains unknown whether the nature of the instruction data has an impact on the model output; conversely, it is questionable whether translated test sets can capture such nuances. Due to the often coupled practices of using translated data in both stages, such imperfections could have been overlooked. This work investigates these issues using controlled native or translated data during the instruction tuning and evaluation stages. We show that native or generation benchmarks reveal a notable difference between native and translated instruction data especially when model performance is high, whereas other types of test sets cannot. The comparison between round-trip and single-pass translations reflects the importance of knowledge from language-native resources. Finally, we demonstrate that regularization is beneficial to bridging this gap on structured but not generative tasks., Comment: EMNLP 2024
Published: 2024

3. SiamQuality: A ConvNet-Based Foundation Model for Imperfect Physiological Signals

Author: Ding, Cheng, Guo, Zhicheng, Chen, Zhaoliang, Lee, Randall J, Rudin, Cynthia, and Hu, Xiao
Subjects: Electrical Engineering and Systems Science - Signal Processing, Computer Science - Machine Learning
Abstract: Foundation models, especially those using transformers as backbones, have gained significant popularity, particularly in language and language-vision tasks. However, large foundation models are typically trained on high-quality data, which poses a significant challenge, given the prevalence of poor-quality real-world data. This challenge is more pronounced for developing foundation models for physiological data; such data are often noisy, incomplete, or inconsistent. The present work aims to provide a toolset for developing foundation models on physiological data. We leverage a large dataset of photoplethysmography (PPG) signals from hospitalized intensive care patients. For this data, we propose SimQuality, a novel self-supervised learning task based on convolutional neural networks (CNNs) as the backbone to enforce representations to be similar for good and poor quality signals that are from similar physiological states. We pre-trained the SimQuality on over 36 million 30-second PPG pairs and then fine-tuned and tested on six downstream tasks using external datasets. The results demonstrate the superiority of the proposed approach on all the downstream tasks, which are extremely important for heart monitoring on wearable devices. Our method indicates that CNNs can be an effective backbone for foundation models that are robust to training data quality.
Published: 2024

4. StableToolBench: Towards Stable Large-Scale Benchmarking on Tool Learning of Large Language Models

Author: Guo, Zhicheng, Cheng, Sijie, Wang, Hao, Liang, Shihao, Qin, Yujia, Li, Peng, Liu, Zhiyuan, Sun, Maosong, and Liu, Yang
Subjects: Computer Science - Computation and Language
Abstract: Large Language Models (LLMs) have witnessed remarkable advancements in recent years, prompting the exploration of tool learning, which integrates LLMs with external tools to address diverse real-world challenges. Assessing the capability of LLMs to utilise tools necessitates large-scale and stable benchmarks. However, previous works relied on either hand-crafted online tools with limited scale, or large-scale real online APIs suffering from instability of API status. To address this problem, we introduce StableToolBench, a benchmark evolving from ToolBench, proposing a virtual API server and stable evaluation system. The virtual API server contains a caching system and API simulators which are complementary to alleviate the change in API status. Meanwhile, the stable evaluation system designs solvable pass and win rates using GPT-4 as the automatic evaluator to eliminate the randomness during evaluation. Experimental results demonstrate the stability of StableToolBench, and further discuss the effectiveness of API simulators, the caching system, and the evaluator system.
Published: 2024

5. What is different between these datasets?

Author: Babbar, Varun, Guo, Zhicheng, and Rudin, Cynthia
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence
Abstract: The performance of machine learning models heavily depends on the quality of input data, yet real-world applications often encounter various data-related challenges. One such challenge could arise when curating training data or deploying the model in the real world - two comparable datasets in the same domain may have different distributions. While numerous techniques exist for detecting distribution shifts, the literature lacks comprehensive approaches for explaining dataset differences in a human-understandable manner. To address this gap, we propose a suite of interpretable methods (toolbox) for comparing two datasets. We demonstrate the versatility of our approach across diverse data modalities, including tabular data, language, images, and signals in both low and high-dimensional settings. Our methods not only outperform comparable and related approaches in terms of explanation quality and correctness, but also provide actionable, complementary insights to understand and mitigate dataset differences effectively.
Published: 2024

6. Towards Unified Alignment Between Agents, Humans, and Environment

Author: Yang, Zonghan, Liu, An, Liu, Zijun, Liu, Kaiming, Xiong, Fangzhou, Wang, Yile, Yang, Zeyuan, Hu, Qingyuan, Chen, Xinrui, Zhang, Zhenhe, Luo, Fuwen, Guo, Zhicheng, Li, Peng, and Liu, Yang
Subjects: Computer Science - Artificial Intelligence, Computer Science - Computation and Language, Computer Science - Machine Learning
Abstract: The rapid progress of foundation models has led to the prosperity of autonomous agents, which leverage the universal capabilities of foundation models to conduct reasoning, decision-making, and environmental interaction. However, the efficacy of agents remains limited when operating in intricate, realistic environments. In this work, we introduce the principles of $\mathbf{U}$nified $\mathbf{A}$lignment for $\mathbf{A}$gents ($\mathbf{UA}^2$), which advocate for the simultaneous alignment of agents with human intentions, environmental dynamics, and self-constraints such as the limitation of monetary budgets. From the perspective of $\mathbf{UA}^2$, we review the current agent research and highlight the neglected factors in existing agent benchmarks and method candidates. We also conduct proof-of-concept studies by introducing realistic features to WebShop, including user profiles to demonstrate intentions, personalized reranking for complex environmental dynamics, and runtime cost statistics to reflect self-constraints. We then follow the principles of $\mathbf{UA}^2$ to propose an initial design of our agent, and benchmark its performance with several candidate baselines in the retrofitted WebShop. The extensive experimental results further prove the importance of the principles of $\mathbf{UA}^2$. Our research sheds light on the next steps of autonomous agent research with improved general problem-solving abilities., Comment: Project webpage: https://agent-force.github.io/unified-alignment-for-agents.html
Published: 2024

7. Reconsideration on evaluation of machine learning models in continuous monitoring using wearables

Author: Ding, Cheng, Guo, Zhicheng, Rudin, Cynthia, Xiao, Ran, Nahab, Fadi B, and Hu, Xiao
Subjects: Computer Science - Machine Learning, Electrical Engineering and Systems Science - Signal Processing
Abstract: This paper explores the challenges in evaluating machine learning (ML) models for continuous health monitoring using wearable devices beyond conventional metrics. We state the complexities posed by real-world variability, disease dynamics, user-specific characteristics, and the prevalence of false notifications, necessitating novel evaluation strategies. Drawing insights from large-scale heart studies, the paper offers a comprehensive guideline for robust ML model evaluation on continuous health monitoring.
Published: 2023

8. EgoThink: Evaluating First-Person Perspective Thinking Capability of Vision-Language Models

Author: Cheng, Sijie, Guo, Zhicheng, Wu, Jingwen, Fang, Kechen, Li, Peng, Liu, Huaping, and Liu, Yang
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Computation and Language
Abstract: Vision-language models (VLMs) have recently shown promising results in traditional downstream tasks. Evaluation studies have emerged to assess their abilities, with the majority focusing on the third-person perspective, and only a few addressing specific tasks from the first-person perspective. However, the capability of VLMs to "think" from a first-person perspective, a crucial attribute for advancing autonomous agents and robotics, remains largely unexplored. To bridge this research gap, we introduce EgoThink, a novel visual question-answering benchmark that encompasses six core capabilities with twelve detailed dimensions. The benchmark is constructed using selected clips from egocentric videos, with manually annotated question-answer pairs containing first-person information. To comprehensively assess VLMs, we evaluate eighteen popular VLMs on EgoThink. Moreover, given the open-ended format of the answers, we use GPT-4 as the automatic judge to compute single-answer grading. Experimental results indicate that although GPT-4V leads in numerous dimensions, all evaluated VLMs still possess considerable potential for improvement in first-person perspective tasks. Meanwhile, enlarging the number of trainable parameters has the most significant impact on model performance on EgoThink. In conclusion, EgoThink serves as a valuable addition to existing evaluation benchmarks for VLMs, providing an indispensable resource for future research in the realm of embodied artificial intelligence and robotics.
Published: 2023

9. SiamAF: Learning Shared Information from ECG and PPG Signals for Robust Atrial Fibrillation Detection

Author: Guo, Zhicheng, Ding, Cheng, Do, Duc H., Shah, Amit, Lee, Randall J., Hu, Xiao, and Rudin, Cynthia
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence
Abstract: Atrial fibrillation (AF) is the most common type of cardiac arrhythmia. It is associated with an increased risk of stroke, heart failure, and other cardiovascular complications, but can be clinically silent. Passive AF monitoring with wearables may help reduce adverse clinical outcomes related to AF. Detecting AF in noisy wearable data poses a significant challenge, leading to the emergence of various deep learning techniques. Previous deep learning models learn from a single modality, either electrocardiogram (ECG) or photoplethysmography (PPG) signals. However, deep learning models often struggle to learn generalizable features and rely on features that are more susceptible to corruption from noise, leading to sub-optimal performances in certain scenarios, especially with low-quality signals. Given the increasing availability of ECG and PPG signal pairs from wearables and bedside monitors, we propose a new approach, SiamAF, leveraging a novel Siamese network architecture and joint learning loss function to learn shared information from both ECG and PPG signals. At inference time, the proposed model is able to predict AF from either PPG or ECG and outperforms baseline methods on three external test sets. It learns medically relevant features as a result of our novel architecture design. The proposed model also achieves comparable performance to traditional learning regimes while requiring much fewer training labels, providing a potential approach to reduce future reliance on manual labeling.
Published: 2023

10. Sparse learned kernels for interpretable and efficient medical time series processing

Author: Chen, Sully F., Guo, Zhicheng, Ding, Cheng, Hu, Xiao, and Rudin, Cynthia
Subjects: Electrical Engineering and Systems Science - Signal Processing, Computer Science - Artificial Intelligence, Computer Science - Machine Learning
Abstract: Rapid, reliable, and accurate interpretation of medical time-series signals is crucial for high-stakes clinical decision-making. Deep learning methods offered unprecedented performance in medical signal processing but at a cost: they were compute-intensive and lacked interpretability. We propose Sparse Mixture of Learned Kernels (SMoLK), an interpretable architecture for medical time series processing. SMoLK learns a set of lightweight flexible kernels that form a single-layer sparse neural network, providing not only interpretability, but also efficiency, robustness, and generalization to unseen data distributions. We introduce a parameter reduction techniques to reduce the size of SMoLK's networks while maintaining performance. We test SMoLK on two important tasks common to many consumer wearables: photoplethysmography (PPG) artifact detection and atrial fibrillation detection from single-lead electrocardiograms (ECGs). We find that SMoLK matches the performance of models orders of magnitude larger. It is particularly suited for real-time applications using low-power devices, and its interpretability benefits high-stakes situations., Comment: Published as an article in Nature Machine Intelligence (https://doi.org/10.1038/s42256-024-00898-4). 23 pages, 9 figures
Published: 2023
Full Text: View/download PDF

11. Iterative Translation Refinement with Large Language Models

Author: Chen, Pinzhen, Guo, Zhicheng, Haddow, Barry, and Heafield, Kenneth
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence
Abstract: We propose iteratively prompting a large language model to self-correct a translation, with inspiration from their strong language understanding and translation capability as well as a human-like translation approach. Interestingly, multi-turn querying reduces the output's string-based metric scores, but neural metrics suggest comparable or improved quality. Human evaluations indicate better fluency and naturalness compared to initial translations and even human references, all while maintaining quality. Ablation studies underscore the importance of anchoring the refinement to the source and a reasonable seed translation for quality considerations. We also discuss the challenges in evaluation and relation to human performance and translationese.
Published: 2023

12. Prompt-Guided Retrieval Augmentation for Non-Knowledge-Intensive Tasks

Author: Guo, Zhicheng, Cheng, Sijie, Wang, Yile, Li, Peng, and Liu, Yang
Subjects: Computer Science - Computation and Language
Abstract: Retrieval-augmented methods have received increasing attention to support downstream tasks by leveraging useful information from external resources. Recent studies mainly focus on exploring retrieval to solve knowledge-intensive (KI) tasks. However, the potential of retrieval for most non-knowledge-intensive (NKI) tasks remains under-explored. There are two main challenges to leveraging retrieval-augmented methods for NKI tasks: 1) the demand for diverse relevance score functions and 2) the dilemma between training cost and task performance. To address these challenges, we propose a two-stage framework for NKI tasks, named PGRA. In the first stage, we adopt a task-agnostic retriever to build a shared static index and select candidate evidence efficiently. In the second stage, we design a prompt-guided reranker to rerank the nearest evidence according to task-specific relevance for the reader. Experimental results show that PGRA outperforms other state-of-the-art retrieval-augmented methods. Our analyses further investigate the influence factors to model performance and demonstrate the generality of PGRA. Codes are available at https://github.com/THUNLP-MT/PGRA.
Published: 2023

13. Improving Clinician Performance in Classification of EEG Patterns on the Ictal-Interictal-Injury Continuum using Interpretable Machine Learning

Author: Barnett, Alina Jade, Guo, Zhicheng, Jing, Jin, Ge, Wendong, Kaplan, Peter W., Kong, Wan Yee, Karakis, Ioannis, Herlopian, Aline, Jayagopal, Lakshman Arcot, Taraschenko, Olga, Selioutski, Olga, Osman, Gamaleldin, Goldenholz, Daniel, Rudin, Cynthia, and Westover, M. Brandon
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence, Computer Science - Machine Learning, I.2.6, I.4.9, I.5.4
Abstract: In intensive care units (ICUs), critically ill patients are monitored with electroencephalograms (EEGs) to prevent serious brain injury. The number of patients who can be monitored is constrained by the availability of trained physicians to read EEGs, and EEG interpretation can be subjective and prone to inter-observer variability. Automated deep learning systems for EEG could reduce human bias and accelerate the diagnostic process. However, black box deep learning models are untrustworthy, difficult to troubleshoot, and lack accountability in real-world applications, leading to a lack of trust and adoption by clinicians. To address these challenges, we propose a novel interpretable deep learning model that not only predicts the presence of harmful brainwave patterns but also provides high-quality case-based explanations of its decisions. Our model performs better than the corresponding black box model, despite being constrained to be interpretable. The learned 2D embedded space provides the first global overview of the structure of ictal-interictal-injury continuum brainwave patterns. The ability to understand how our model arrived at its decisions will not only help clinicians to diagnose and treat harmful brain activities more accurately but also increase their trust and adoption of machine learning models in clinical practice; this could be an integral component of the ICU neurologists' standard workflow., Comment: 24 pages including appendices, 9 figures, published at NEJM AI
Published: 2022
Full Text: View/download PDF

14. Learning From Alarms: A Robust Learning Approach for Accurate Photoplethysmography-Based Atrial Fibrillation Detection using Eight Million Samples Labeled with Imprecise Arrhythmia Alarms

Author: Ding, Cheng, Guo, Zhicheng, Rudin, Cynthia, Xiao, Ran, Shah, Amit, Do, Duc H., Lee, Randall J, Clifford, Gari, Nahab, Fadi B, and Hu, Xiao
Subjects: Electrical Engineering and Systems Science - Signal Processing
Abstract: Atrial fibrillation (AF) is a common cardiac arrhythmia with serious health consequences if not detected and treated early. Detecting AF using wearable devices with photoplethysmography (PPG) sensors and deep neural networks has demonstrated some success using proprietary algorithms in commercial solutions. However, further advancement of this paradigm of continuous AF detection in ambulatory settings, towards a population-wide screening use case, still faces several challenges, one of which is the lack of large-scale labeled training data. To address this challenge, in this study, we propose to leverage AF alarms from bedside patient monitors to label concurrent PPG signals, resulting in the largest PPG-AF dataset so far (8.5M 30-second records from 24100 patients) and demonstrating a practical approach to build large labeled PPG datasets. Furthermore, we recognize that the AF labels thus obtained contain errors because of false AF alarms generated from imperfect built-in algorithms from bedside monitors. Dealing with label noise with unknown distribution characteristics in this case requires advanced algorithms. We, therefore, introduce and open source a novel loss design, the cluster membership consistency (CMC) loss, to mitigate label errors. By comparing CMC with state-of-the-art methods selected from a noisy label competition, we demonstrate its superiority in multiple aspects including handling label noise in PPG data, resilience to poor-quality signals, and computational efficiency.
Published: 2022

15. The Multi-Modal Video Reasoning and Analyzing Competition

Author: Peng, Haoran, Huang, He, Xu, Li, Li, Tianjiao, Liu, Jun, Rahmani, Hossein, Ke, Qiuhong, Guo, Zhicheng, Wu, Cong, Li, Rongchang, Ye, Mang, Wang, Jiahao, Zhang, Jiaxu, Liu, Yuanzhong, He, Tao, Zhang, Fuwei, Liu, Xianbin, and Lin, Tao
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence, I.2.10, I.2.6
Abstract: In this paper, we introduce the Multi-Modal Video Reasoning and Analyzing Competition (MMVRAC) workshop in conjunction with ICCV 2021. This competition is composed of four different tracks, namely, video question answering, skeleton-based action recognition, fisheye video-based action recognition, and person re-identification, which are based on two datasets: SUTD-TrafficQA and UAV-Human. We summarize the top-performing methods submitted by the participants in this competition and show their results achieved in the competition., Comment: Accepted to ICCV 2021 Workshops
Published: 2021

16. Balanced Coarsening for Multilevel Hypergraph Partitioning via Wasserstein Discrepancy

Author: Guo, Zhicheng, Zhao, Jiaxuan, Jiao, Licheng, and Liu, Xu
Subjects: Computer Science - Machine Learning
Abstract: We propose a balanced coarsening scheme for multilevel hypergraph partitioning. In addition, an initial partitioning algorithm is designed to improve the quality of k-way hypergraph partitioning. By assigning vertex weights through the LPT algorithm, we generate a prior hypergraph under a relaxed balance constraint. With the prior hypergraph, we have defined the Wasserstein discrepancy to coordinate the optimal transport of coarsening process. And the optimal transport matrix is solved by Sinkhorn algorithm. Our coarsening scheme fully takes into account the minimization of connectivity metric (objective function). For the initial partitioning stage, we define a normalized cut function induced by Fiedler vector, which is theoretically proved to be a concave function. Thereby, a three-point algorithm is designed to find the best cut under the balance constraint., Comment: There are some errors in the paper
Published: 2021

Catalog

Books, media, physical & digital resources

See catalog results

Searchworks

Select search scope, currently: Articles

Catalog

books, media & more in Jio Institute collections

Articles

journal articles & other e-resources

Refine your results

16 results on '"Guo, Zhicheng"'

1. StepTool: A Step-grained Reinforcement Learning Framework for Tool Learning in LLMs

2. Is It Good Data for Multilingual Instruction Tuning or Just Bad Multilingual Evaluation for Large Language Models?

3. SiamQuality: A ConvNet-Based Foundation Model for Imperfect Physiological Signals

4. StableToolBench: Towards Stable Large-Scale Benchmarking on Tool Learning of Large Language Models

5. What is different between these datasets?

6. Towards Unified Alignment Between Agents, Humans, and Environment

7. Reconsideration on evaluation of machine learning models in continuous monitoring using wearables

8. EgoThink: Evaluating First-Person Perspective Thinking Capability of Vision-Language Models

9. SiamAF: Learning Shared Information from ECG and PPG Signals for Robust Atrial Fibrillation Detection

10. Sparse learned kernels for interpretable and efficient medical time series processing

11. Iterative Translation Refinement with Large Language Models

12. Prompt-Guided Retrieval Augmentation for Non-Knowledge-Intensive Tasks

13. Improving Clinician Performance in Classification of EEG Patterns on the Ictal-Interictal-Injury Continuum using Interpretable Machine Learning

14. Learning From Alarms: A Robust Learning Approach for Accurate Photoplethysmography-Based Atrial Fibrillation Detection using Eight Million Samples Labeled with Imprecise Arrhythmia Alarms

15. The Multi-Modal Video Reasoning and Analyzing Competition

16. Balanced Coarsening for Multilevel Hypergraph Partitioning via Wasserstein Discrepancy

Catalog

Searchworks

Select search scope, currently: Articles Catalog books, media & more in Jio Institute collections Articles journal articles & other e-resources

Search

Search Constraints

Refine your results

Search Limiters

Topic

Publication Year Range

Publication Type

Database

16 results on '"Guo, Zhicheng"'

Search Results

Catalog

Select search scope, currently: Articles

Catalog

books, media & more in Jio Institute collections

Articles

journal articles & other e-resources