Search

Search for author "Allen Zhu": 369 results

Search Results

1. Epitranscriptomic reader YTHDF2 regulates SEK1(MAP2K4)-JNK-cJUN inflammatory signaling in astrocytes during neurotoxic stress

2. REPIC: a database for exploring the N6-methyladenosine methylome

3. RADAR: differential analysis of MeRIP-seq data with a random effect model

4. Physics of Language Models: Part 2.2, How to Learn From Mistakes on Grade-School Math Problems

5. Physics of Language Models: Part 2.1, Grade-School Math and the Hidden Reasoning Process

6. Physics of Language Models: Part 3.3, Knowledge Capacity Scaling Laws

7. Reverse Training to Nurse the Reversal Curse

8. Egr2 overexpression in Schwann cells increases myelination frequency in vitro

9. Carbon trading, co-pollutants, and environmental equity: Evidence from California's cap-and-trade program (2011-2015)

10. Physics of Language Models: Part 3.2, Knowledge Manipulation

11. Physics of Language Models: Part 3.1, Knowledge Storage and Extraction

12. SALSA VERDE: a machine learning attack on Learning With Errors with sparse small secrets

13. Physics of Language Models: Part 1, Learning Hierarchical Language Structures

14. LoRA: Low-Rank Adaptation of Large Language Models

15. Forward Super-Resolution: How Can GANs Learn Hierarchical Generative Models for Real-World Distributions

17. Byzantine-Resilient Non-Convex Stochastic Gradient Descent

18. Towards Understanding Ensemble, Knowledge Distillation and Self-Distillation in Deep Learning

19. Feature Purification: How Adversarial Training Performs Robust Deep Learning

24. Backward Feature Correction: How Deep Learning Performs Deep (Hierarchical) Learning

25. What Can ResNet Learn Efficiently, Going Beyond Kernels?

26. Can SGD Learn Recurrent Neural Networks with Provable Generalization?

27. The Lingering of Gradients: Theory and Applications

28. Learning and Generalization in Overparameterized Neural Networks, Going Beyond Two Layers

29. A Convergence Theory for Deep Learning via Over-Parameterization

30. On the Convergence Rate of Training Recurrent Neural Networks

31. Is Q-learning Provably Efficient?

32. Operator Scaling via Geodesically Convex Optimization, Invariant Theory and Polynomial Identity Testing

33. Byzantine Stochastic Gradient Descent

37. Katyusha X: Practical Momentum Method for Stochastic Sum-of-Nonconvex Optimization

38. Make the Minority Great Again: First-Order Regret Bound for Contextual Bandits

39. How To Make the Gradients Small Stochastically: Even Faster Convex and Nonconvex SGD

40. Neon2: Finding Local Minima via First-Order Oracles

41. Near-Optimal Discrete Optimization for Experimental Design: A Regret Minimization Approach

42. Natasha 2: Faster Non-Convex Optimization Than SGD

43. Linear Convergence of a Frank-Wolfe Type Algorithm over Trace-Norm Balls

44. Much Faster Algorithms for Matrix Scaling

50. Natasha: Faster Non-Convex Stochastic Optimization Via Strongly Non-Convex Parameter
