Author: "Hong, Ilgee" - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Hong, Ilgee"' showing total 5 results

Start Over Author "Hong, Ilgee"

5 results on '"Hong, Ilgee"'

1. Robust Reinforcement Learning from Corrupted Human Feedback

Author: Bukharin, Alexander, Hong, Ilgee, Jiang, Haoming, Li, Zichong, Zhang, Qingru, Zhang, Zixuan, and Zhao, Tuo
Subjects: Computer Science - Machine Learning
Abstract: Reinforcement learning from human feedback (RLHF) provides a principled framework for aligning AI systems with human preference data. For various reasons, e.g., personal bias, context ambiguity, lack of training, etc, human annotators may give incorrect or inconsistent preference labels. To tackle this challenge, we propose a robust RLHF approach -- $R^3M$, which models the potentially corrupted preference label as sparse outliers. Accordingly, we formulate the robust reward learning as an $\ell_1$-regularized maximum likelihood estimation problem. Computationally, we develop an efficient alternating optimization algorithm, which only incurs negligible computational overhead compared with the standard RLHF approach. Theoretically, we prove that under proper regularity conditions, $R^3M$ can consistently learn the underlying reward and identify outliers, provided that the number of outlier labels scales sublinearly with the preference sample size. Furthermore, we remark that $R^3M$ is versatile and can be extended to various preference optimization methods, including direct preference optimization (DPO). Our experiments on robotic control and natural language generation with large language models (LLMs) show that $R^3M$ improves robustness of the reward against several types of perturbations to the preference data., Comment: 22 pages, 7 figures
Published: 2024

2. Adaptive Preference Scaling for Reinforcement Learning with Human Feedback

Author: Hong, Ilgee, Li, Zichong, Bukharin, Alexander, Li, Yixiao, Jiang, Haoming, Yang, Tianbao, and Zhao, Tuo
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence
Abstract: Reinforcement learning from human feedback (RLHF) is a prevalent approach to align AI systems with human values by learning rewards from human preference data. Due to various reasons, however, such data typically takes the form of rankings over pairs of trajectory segments, which fails to capture the varying strengths of preferences across different pairs. In this paper, we propose a novel adaptive preference loss, underpinned by distributionally robust optimization (DRO), designed to address this uncertainty in preference strength. By incorporating an adaptive scaling parameter into the loss for each pair, our method increases the flexibility of the reward function. Specifically, it assigns small scaling parameters to pairs with ambiguous preferences, leading to more comparable rewards, and large scaling parameters to those with clear preferences for more distinct rewards. Computationally, our proposed loss function is strictly convex and univariate with respect to each scaling parameter, enabling its efficient optimization through a simple second-order algorithm. Our method is versatile and can be readily adapted to various preference optimization frameworks, including direct preference optimization (DPO). Our experiments with robotic control and natural language generation with large language models (LLMs) show that our method not only improves policy performance but also aligns reward function selection more closely with policy optimization, simplifying the hyperparameter tuning process.
Published: 2024

3. Constrained Optimization via Exact Augmented Lagrangian and Randomized Iterative Sketching

Author: Hong, Ilgee, Na, Sen, Mahoney, Michael W., and Kolar, Mladen
Subjects: Mathematics - Optimization and Control, Computer Science - Machine Learning, Mathematics - Numerical Analysis, Statistics - Machine Learning
Abstract: We consider solving equality-constrained nonlinear, nonconvex optimization problems. This class of problems appears widely in a variety of applications in machine learning and engineering, ranging from constrained deep neural networks, to optimal control, to PDE-constrained optimization. We develop an adaptive inexact Newton method for this problem class. In each iteration, we solve the Lagrangian Newton system inexactly via a randomized iterative sketching solver, and select a suitable stepsize by performing line search on an exact augmented Lagrangian merit function. The randomized solvers have advantages over deterministic linear system solvers by significantly reducing per-iteration flops complexity and storage cost, when equipped with suitable sketching matrices. Our method adaptively controls the accuracy of the randomized solver and the penalty parameters of the exact augmented Lagrangian, to ensure that the inexact Newton direction is a descent direction of the exact augmented Lagrangian. This allows us to establish a global almost sure convergence. We also show that a unit stepsize is admissible locally, so that our method exhibits a local linear convergence. Furthermore, we prove that the linear convergence can be strengthened to superlinear convergence if we gradually sharpen the adaptive accuracy condition on the randomized solver. We demonstrate the superior performance of our method on benchmark nonlinear problems in CUTEst test set, constrained logistic regression with data from LIBSVM, and a PDE-constrained problem., Comment: 25 pages, 4 figures
Published: 2023

4. Who Wrote this Code? Watermarking for Code Generation

Author: Lee, Taehyun, Hong, Seokhee, Ahn, Jaewoo, Hong, Ilgee, Lee, Hwaran, Yun, Sangdoo, Shin, Jamin, and Kim, Gunhee
Subjects: Computer Science - Computation and Language
Abstract: Since the remarkable generation performance of large language models raised ethical and legal concerns, approaches to detect machine-generated text by embedding watermarks are being developed. However, we discover that the existing works fail to function appropriately in code generation tasks due to the task's nature of having low entropy. Extending a logit-modifying watermark method, we propose Selective WatErmarking via Entropy Thresholding (SWEET), which enhances detection ability and mitigates code quality degeneration by removing low-entropy segments at generating and detecting watermarks. Our experiments show that SWEET significantly improves code quality preservation while outperforming all baselines, including post-hoc detection methods, in detecting machine-generated code text. Our code is available in https://github.com/hongcheki/sweet-watermark., Comment: To be presented at ACL 2024
Published: 2023

5. A Simplified Framework for Contrastive Learning for Node Representations

Author: Hong, Ilgee, Tran, Huy, and Donnat, Claire
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence
Abstract: Contrastive learning has recently established itself as a powerful self-supervised learning framework for extracting rich and versatile data representations. Broadly speaking, contrastive learning relies on a data augmentation scheme to generate two versions of the input data and learns low-dimensional representations by maximizing a normalized temperature-scaled cross entropy loss (NT-Xent) to identify augmented samples corresponding to the same original entity. In this paper, we investigate the potential of deploying contrastive learning in combination with Graph Neural Networks for embedding nodes in a graph. Specifically, we show that the quality of the resulting embeddings and training time can be significantly improved by a simple column-wise postprocessing of the embedding matrix, instead of the row-wise postprocessing via multilayer perceptrons (MLPs) that is adopted by the majority of peer methods. This modification yields improvements in downstream classification tasks of up to 1.5% and even beats existing state-of-the-art approaches on 6 out of 8 different benchmarks. We justify our choices of postprocessing by revisiting the "alignment vs. uniformity paradigm", and show that column-wise post-processing improves both "alignment" and "uniformity" of the embeddings.
Published: 2023

Catalog

Books, media, physical & digital resources

See catalog results

Searchworks

Select search scope, currently: Articles

Catalog

books, media & more in Jio Institute collections

Articles

journal articles & other e-resources

Refine your results

5 results on '"Hong, Ilgee"'

1. Robust Reinforcement Learning from Corrupted Human Feedback

2. Adaptive Preference Scaling for Reinforcement Learning with Human Feedback

3. Constrained Optimization via Exact Augmented Lagrangian and Randomized Iterative Sketching

4. Who Wrote this Code? Watermarking for Code Generation

5. A Simplified Framework for Contrastive Learning for Node Representations

Catalog

Searchworks

Select search scope, currently: Articles Catalog books, media & more in Jio Institute collections Articles journal articles & other e-resources

Search

Search Constraints

Refine your results

Search Limiters

Topic

Publication Year Range

Language

Publication Type

Database

5 results on '"Hong, Ilgee"'

Search Results

Catalog

Select search scope, currently: Articles

Catalog

books, media & more in Jio Institute collections

Articles

journal articles & other e-resources