Your search for "Zhao, Tuo" returned 68 results in total.

Search Constraints

You searched for: Author "Zhao, Tuo"; Topic: machine learning (cs.lg)

Search Results

1. Nonparametric Classification on Low Dimensional Manifolds using Overparameterized Convolutional Residual Networks

2. Adaptive Budget Allocation for Parameter-Efficient Fine-Tuning

3. HomoDistil: Homotopic Task-Agnostic Distillation of Pre-trained Transformers

4. Score Approximation, Estimation and Distribution Recovery of Diffusion Models on Low-Dimensional Data

5. LoSparse: Structured Compression of Large Language Models based on Low-Rank and Sparse Approximation

6. Machine Learning Force Fields with Data Cost Aware Training

7. On Deep Generative Models for Approximation and Estimation of Distributions on Manifolds

8. Effective Minkowski Dimension of Deep Nonparametric Regression: Function Approximation and Statistical Theories

9. First-order Policy Optimization for Robust Markov Decision Process

10. DiP-GNN: Discriminative Pre-Training of Graph Neural Networks

11. Sample Complexity of Nonparametric Off-Policy Evaluation on Low-Dimensional Manifolds using Deep Networks

12. Efficient Long Sequence Modeling via State Space Augmented Transformer

13. High Dimensional Binary Classification under Label Shift: Phase Transition and Regularization

14. Benefits of Overparameterized Convolutional Residual Networks: Function Approximation under Smoothness Constraint

15. PLATON: Pruning Large Transformer Models with Upper Confidence Bound of Weight Importance

16. CAMERO: Consistency Regularized Ensemble of Perturbed Language Models with Weight Sharing

17. Block Policy Mirror Descent

18. No Parameters Left Behind: Sensitivity Guided Adaptive Learning Rate for Training Large Transformer Models

19. Deep Nonparametric Estimation of Operators between Infinite Dimensional Spaces

20. Differentially Private Estimation of Hawkes Process

21. Less is More: Task-aware Layer-wise Distillation for Language Model Compression

22. Noise Regularizes Over-parameterized Rank One Matrix Recovery, Provably

23. Homotopic Policy Mirror Descent: Policy Convergence, Implicit Regularization, and Improved Sample Complexity

24. Large Learning Rate Tames Homogeneity: Convergence and Balancing Effect

25. Adversarially Regularized Policy Learning Guided by Trajectory Optimization

26. Implicit Regularization of Bregman Proximal Point Algorithm and Mirror Descent on Separable Data

27. Reinforcement Learning for Adaptive Mesh Refinement

28. Noisy Gradient Descent Converges to Flat Minima for Nonconvex Matrix Factorization

29. Adversarial Regularization as Stackelberg Game: An Unrolled Optimization Approach

30. Frequency-aware SGD for Efficient Embedding Learning with Provable Benefits

31. Towards Automatic Evaluation of Dialog Systems: A Model-Free Off-Policy Evaluation Approach

32. Taming Sparsely Activated Transformer with Stochastic Experts

33. Besov Function Approximation and Binary Classification on Low-Dimensional Manifolds Using Convolutional Residual Networks

34. A Hypergradient Approach to Robust Regression without Correspondence

35. Picasso: A Sparse Learning Library for High Dimensional Data Analysis in R and Python

36. Towards Understanding Hierarchical Learning: Benefits of Neural Representations

37. Differentiable Top-k Operator with Optimal Transport

38. Why Do Deep Residual Networks Generalize Better than Deep Feedforward Networks? -- A Neural Tangent Kernel Perspective

39. Doubly Robust Off-Policy Learning on Low-Dimensional Manifolds by Deep Neural Networks

40. Transformer Hawkes Process

41. The flare Package for High Dimensional Linear Regression and Precision Matrix Estimation in R

42. Deep Reinforcement Learning with Robust and Smooth Policy

43. On Computation and Generalization of Generative Adversarial Imitation Learning

44. How Important is the Train-Validation Split in Meta-Learning?

45. Distribution Approximation and Statistical Estimation Guarantees of Generative Adversarial Networks

46. On Generalization Bounds of a Family of Recurrent Neural Networks

47. Towards Understanding the Importance of Shortcut Connections in Residual Networks

48. Meta Learning with Relational Information for Short Sequences

49. Inductive Bias of Gradient Descent based Adversarial Training on Separable Data

50. Nonparametric Regression on Low-Dimensional Manifolds using Deep ReLU Networks: Function Approximation and Statistical Recovery
