16 results for "Ruiqi Zhang"
Search Results
2. Fast Best-of-N Decoding via Speculative Rejection.
3. Simplicity Prevails: Rethinking Negative Preference Optimization for LLM Unlearning.
4. ProxFly: Robust Control for Close Proximity Quadcopter Flight via Residual Reinforcement Learning.
5. Multi-Agent Reinforcement Learning for Autonomous Driving: A Survey.
6. Is Offline Decision Making Possible with Only Few Samples? Reliable Decisions in Data-Starved Bandits via Trust Region Enhancement.
7. Negative Preference Optimization: From Catastrophic Collapse to Effective Unlearning.
8. In-Context Learning of a Linear Transformer Block: Benefits of the MLP Component and One-Step GD Initialization.
9. AutoPRM: Automating Procedural Supervision for Multi-Step Reasoning via Controllable Question Decomposition.
10. Spreeze: High-Throughput Parallel Reinforcement Learning Framework.
11. Explicifying Neural Implicit Fields for Efficient Dynamic Human Avatar Modeling via a Neural Explicit Surface.
12. Policy Finetuning in Reinforcement Learning via Design of Experiments using Offline Data.
13. Trained Transformers Learn Linear Models In-Context.
14. Optimal Estimation of Off-Policy Policy Gradient via Double Fitted Iteration.
15. NDF: Neural Deformable Fields for Dynamic Human Modelling.
16. Off-Policy Fitted Q-Evaluation with Differentiable Function Approximators: Z-Estimation and Inference Theory.
Discovery Service for Jio Institute Digital Library