40 results for author "Zhao, Tuo", publisher arxiv

Search Results

1. LoSparse: Structured Compression of Large Language Models based on Low-Rank and Sparse Approximation

2. Machine Learning Force Fields with Data Cost Aware Training

3. On Deep Generative Models for Approximation and Estimation of Distributions on Manifolds

4. Effective Minkowski Dimension of Deep Nonparametric Regression: Function Approximation and Statistical Theories

5. Efficient Long Sequence Modeling via State Space Augmented Transformer

6. High Dimensional Binary Classification under Label Shift: Phase Transition and Regularization

7. Benefits of Overparameterized Convolutional Residual Networks: Function Approximation under Smoothness Constraint

8. PLATON: Pruning Large Transformer Models with Upper Confidence Bound of Weight Importance

9. CAMERO: Consistency Regularized Ensemble of Perturbed Language Models with Weight Sharing

10. Block Policy Mirror Descent

11. No Parameters Left Behind: Sensitivity Guided Adaptive Learning Rate for Training Large Transformer Models

12. Deep Nonparametric Estimation of Operators between Infinite Dimensional Spaces

13. Differentially Private Estimation of Hawkes Process

14. Less is More: Task-aware Layer-wise Distillation for Language Model Compression

15. Noise Regularizes Over-parameterized Rank One Matrix Recovery, Provably

16. Homotopic Policy Mirror Descent: Policy Convergence, Implicit Regularization, and Improved Sample Complexity

17. Frequency-aware SGD for Efficient Embedding Learning with Provable Benefits

18. Learning Generalizable Vision-Tactile Robotic Grasping Strategy for Deformable Objects via Transformer

19. Towards Automatic Evaluation of Dialog Systems: A Model-Free Off-Policy Evaluation Approach

20. Taming Sparsely Activated Transformer with Stochastic Experts

21. Besov Function Approximation and Binary Classification on Low-Dimensional Manifolds Using Convolutional Residual Networks

22. Doubly Robust Off-Policy Learning on Low-Dimensional Manifolds by Deep Neural Networks

23. Transformer Hawkes Process

24. The flare Package for High Dimensional Linear Regression and Precision Matrix Estimation in R

25. Deep Reinforcement Learning with Robust and Smooth Policy

26. On Computation and Generalization of Generative Adversarial Imitation Learning

27. How Important is the Train-Validation Split in Meta-Learning?

28. Distribution Approximation and Statistical Estimation Guarantees of Generative Adversarial Networks

29. Inductive Bias of Gradient Descent based Adversarial Training on Separable Data

30. Nonparametric Regression on Low-Dimensional Manifolds using Deep ReLU Networks: Function Approximation and Statistical Recovery

31. On Scalable and Efficient Computation of Large Scale Optimal Transport

32. Towards Understanding the Importance of Noise in Training Neural Networks

33. On Tighter Generalization Bound for Deep Neural Networks: CNNs, ResNets, and Beyond

34. Provable Gaussian Embedding with One Observation

35. Learning to Defend by Learning to Attack

36. Towards Understanding Acceleration Tradeoff between Momentum and Asynchrony in Nonconvex Stochastic Optimization

37. A Diffusion Approximation Theory of Momentum SGD in Nonconvex Optimization

38. Deep Hyperspherical Learning

39. Nonconvex Sparse Learning via Stochastic Optimization with Progressive Variance Reduction

40. On Fast Convergence of Proximal Algorithms for SQRT-Lasso Optimization: Don't Worry About Its Nonsmooth Loss Function
