Your search for "Zheng, Rui" returned 36 results

Search Constraints

Author: "Zheng, Rui"; Topic: computer science - computation and language

Search Results

1. RedAgent: Red Teaming Large Language Models with Context-aware Autonomous Language Agent

2. What's Wrong with Your Code Generated by Large Language Models? An Extensive Study

3. SafeAligner: Safety Alignment against Jailbreak Attacks via Response Disparity Guidance

4. SPA-VL: A Comprehensive Safety Preference Alignment Dataset for Vision Language Model

5. Aligning Large Language Models from Self-Reference AI Feedback with one General Principle

6. Toward Optimal LLM Alignments Using Two-Player Games

7. Uncertainty Aware Learning for Language Model Alignment

8. AgentGym: Evolving Large Language Model-based Agents across Diverse Environments

9. MetaRM: Shifted Distributions Alignment via Meta-Learning

10. Subspace Defense: Discarding Adversarial Perturbations by Learning a Subspace for Clean Signals

11. EasyJailbreak: A Unified Framework for Jailbreaking Large Language Models

12. Improving Reinforcement Learning from Human Feedback Using Contrastive Rewards

13. DACO: Towards Application-Driven and Comprehensive Data Analysis via Code Generation

14. Advancing Translation Preference Modeling with RLHF: A Step Towards Cost-Effective Solution

15. Training Large Language Models for Reasoning through Reverse Curriculum Reinforcement Learning

16. StepCoder: Improve Code Generation with Reinforcement Learning from Compiler Feedback

17. MouSi: Poly-Visual-Expert Vision-Language Models

18. Linear Alignment: A Closed-form Solution for Aligning Human Preferences without Tuning and Feedback

19. Rethinking Jailbreaking through the Lens of Representation Engineering

20. LoRAMoE: Alleviate World Knowledge Forgetting in Large Language Models via MoE-Style Plugin

21. Rescue: Ranking LLM Responses with Partial Ordering to Improve Response Generation

22. Orthogonal Subspace Learning for Language Model Continual Learning

23. RealBehavior: A Framework for Faithfully Characterizing Foundation Models' Human-like Behavior Mechanisms

24. TRACE: A Comprehensive Benchmark for Continual Learning in Large Language Models

25. Loose lips sink ships: Mitigating Length Bias in Reinforcement Learning from Human Feedback

26. The Rise and Potential of Large Language Model Based Agents: A Survey

27. Secrets of RLHF in Large Language Models Part I: PPO

28. Self-Polish: Enhance Reasoning in Large Language Models via Problem Refinement

29. Modeling the Q-Diversity in a Min-max Play Game for Robust Optimization

30. Enhancing Contrastive Learning with Noise-Guided Attack: Towards Continual Relation Extraction in the Wild

31. InstructUIE: Multi-task Instruction Tuning for Unified Information Extraction

32. How Robust is GPT-3.5 to Predecessors? A Comprehensive Study on Language Understanding Tasks

33. Efficient Adversarial Training with Robust Early-Bird Tickets

34. Robust Lottery Tickets for Pre-trained Language Models

35. Decorrelate Irrelevant, Purify Relevant: Overcome Textual Spurious Correlations from a Feature Perspective

36. TextFlint: Unified Multilingual Robustness Evaluation Toolkit for Natural Language Processing
