Search

Your search keyword '"Zhao, Tuo"' showing total 495 results

Search Results

1. Robust Reinforcement Learning from Corrupted Human Feedback

2. RoseLoRA: Row and Column-wise Sparse Low-rank Adaptation of Pre-trained Language Model for Knowledge Editing and Fine-tuning

3. Adaptive Preference Scaling for Reinforcement Learning with Human Feedback

4. To Cool or not to Cool? Temperature Network Meets Large Foundation Models via DRO

5. Stochastic Constrained Decentralized Optimization for Machine Learning with Fewer Data Oracles: a Gradient Sliding Approach

6. GEAR: An Efficient KV Cache Compression Recipe for Near-Lossless Generative Inference of LLM

7. BlendFilter: Advancing Retrieval-Augmented Large Language Models via Query Generation Blending and Knowledge Filtering

8. Data Diversity Matters for Robust Instruction Tuning

9. Tell Your Model Where to Attend: Post-hoc Attention Steering for LLMs

10. Model-Based Reparameterization Policy Gradient Methods: Theory and Practical Algorithms

11. Good regularity creates large learning rate implicit biases: edge of stability, balancing, and catapult

12. SMURF-THP: Score Matching-based UnceRtainty quantiFication for Transformer Hawkes Process

13. Score Matching-based Pseudolikelihood Estimation of Neural Marked Spatio-Temporal Point Process with Uncertainty Quantification

14. Efficient Long-Range Transformers: You Need to Attend More, but Not Necessarily at Every Layer

15. Robust Multi-Agent Reinforcement Learning via Adversarial Regularization: Theoretical Foundation and Stable Algorithms

16. LoftQ: LoRA-Fine-Tuning-Aware Quantization for Large Language Models

17. Module-wise Adaptive Distillation for Multimodality Foundation Models

18. Sample Complexity of Neural Policy Mirror Descent for Policy Optimization on Low-Dimensional Manifolds

19. Pivotal Estimation of Linear Discriminant Analysis in High Dimensions

20. Deep Reinforcement Learning from Hierarchical Preference Design

22. Provable Benefits of Policy Learning from Human Preferences in Contextual Bandit Problems

23. Nonparametric Classification on Low Dimensional Manifolds using Overparameterized Convolutional Residual Networks

24. Effective Minkowski Dimension of Deep Nonparametric Regression: Function Approximation and Statistical Theories

25. LoSparse: Structured Compression of Large Language Models based on Low-Rank and Sparse Approximation

26. Machine Learning Force Fields with Data Cost Aware Training

27. AdaLoRA: Adaptive Budget Allocation for Parameter-Efficient Fine-Tuning

28. Hybrid Deep Generative and Sequential Learning Approach for Stock Market Prediction

29. On Deep Generative Models for Approximation and Estimation of Distributions on Manifolds

30. HomoDistil: Homotopic Task-Agnostic Distillation of Pre-trained Transformers

31. Score Approximation, Estimation and Distribution Recovery of Diffusion Models on Low-Dimensional Data

32. Efficient Long Sequence Modeling via State Space Augmented Transformer

33. High Dimensional Binary Classification under Label Shift: Phase Transition and Regularization

34. Less is More: Task-aware Layer-wise Distillation for Language Model Compression

35. First-order Policy Optimization for Robust Markov Decision Process

36. Context-Aware Query Rewriting for Improving Users' Search Experience on E-commerce Websites

37. DiP-GNN: Discriminative Pre-Training of Graph Neural Networks

38. Differentially Private Estimation of Hawkes Process

39. PLATON: Pruning Large Transformer Models with Upper Confidence Bound of Weight Importance

40. Benefits of Overparameterized Convolutional Residual Networks: Function Approximation under Smoothness Constraint

41. Sample Complexity of Nonparametric Off-Policy Evaluation on Low-Dimensional Manifolds using Deep Networks

42. A Manifold Two-Sample Test Study: Integral Probability Metric with Neural Networks

43. MoEBERT: from BERT to Mixture-of-Experts via Importance-Guided Adaptation

44. CAMERO: Consistency Regularized Ensemble of Perturbed Language Models with Weight Sharing

45. CERES: Pretraining of Graph-Conditioned Transformer for Semi-Structured Session Data

48. Noise Regularizes Over-parameterized Rank One Matrix Recovery, Provably

49. No Parameters Left Behind: Sensitivity Guided Adaptive Learning Rate for Training Large Transformer Models

50. Homotopic Policy Mirror Descent: Policy Convergence, Implicit Regularization, and Improved Sample Complexity
