Search

Your search for "Jaggi, Martin" returned 425 results

Search Constraints

You searched for: Author: "Jaggi, Martin"; Publication Year Range: Last 10 years

Search Results

1. Improving Stochastic Cubic Newton with Momentum

2. HyperINF: Unleashing the HyperPower of the Schulz's Method for Data Influence Estimation

3. On-Device Collaborative Language Modeling via a Mixture of Generalists and Specialists

4. CoBo: Collaborative Learning via Bilevel Optimization

5. A New First-Order Meta-Learning Algorithm with Convergence Guarantees

6. Could ChatGPT get an Engineering Degree? Evaluating Higher Education Vulnerability to AI Assistants

7. Effective Interplay between Sparsity and Quantization: From Theory to Practice

8. Deep Grokking: Would Deep Neural Networks Generalize Better?

9. Scaling Laws and Compute-Optimal Training Beyond Fixed Training Durations

10. The Privacy Power of Correlated Noise in Decentralized Learning

11. Personalized Collaborative Fine-Tuning for On-Device Large Language Models

12. QuaRot: Outlier-Free 4-Bit Inference in Rotated LLMs

13. Towards an empirical understanding of MoE design choices

14. Spectral Preconditioning for Gradient Methods on Graded Non-convex Functions

15. Attention with Markov: A Framework for Principled Analysis of Transformers via Markov Chains

16. InterpretCC: Intrinsic User-Centric Interpretability through Global Mixture of Experts

17. DenseFormer: Enhancing Information Flow in Transformers via Depth Weighted Averaging

18. Controllable Topic-Focused Abstractive Summarization

19. DoGE: Domain Reweighting with Generalization Estimation

20. Irreducible Curriculum for Language Model Pretraining

21. LASER: Linear Compression in Wireless Distributed Optimization

22. CoTFormer: A Chain-of-Thought Driven Architecture with Budget-Adaptive Computation Cost at Inference

23. MultiModN- Multimodal, Multi-Task, Interpretable Modular Networks

24. Layer-wise Linear Mode Connectivity

25. Provably Personalized and Robust Federated Learning

26. Faster Causal Attention Over Large Sequences Through Sparse Flash Attention

27. On Convergence of Incremental Gradient for Non-Convex Smooth Functions

28. Collaborative Learning via Prediction Consensus

29. Rotational Equilibrium: How Weight Decay Balances Learning Across Neural Networks

30. Ghost Noise for Regularizing Deep Neural Networks

31. Multiplication-Free Transformer Training via Piecewise Affine Operations

32. Landmark Attention: Random-Access Infinite Context Length for Transformers

33. Linearization Algorithms for Fully Composite Optimization

34. Unified Convergence Theory of Stochastic and Variance-Reduced Cubic Newton Methods

35. Beyond spectral gap (extended): The role of the topology in decentralized learning

36. Second-order optimization with lazy Hessians

37. Scalable Collaborative Learning via Representation Sharing

38. Accuracy Booster: Enabling 4-bit Fixed-point Arithmetic for DNN Training

39. Modular Clinical Decision Support Networks (MoDN) -- Updatable, Interpretable, and Portable Predictions for Evolving Clinical Environments

40. FLamby: Datasets and Benchmarks for Cross-Silo Federated Learning in Realistic Healthcare Settings

41. Sharper Convergence Guarantees for Asynchronous SGD for Distributed and Federated Learning

42. Beyond spectral gap: The role of the topology in decentralized learning

43. Special Properties of Gradient Descent with Large Learning Rates

44. SKILL: Structured Knowledge Infusion for Large Language Models

45. Data-heterogeneity-aware Mixing for Decentralized Learning

46. Improving Generalization via Uncertainty Driven Perturbations

47. Agree to Disagree: Diversity through Disagreement for Better Transferability

48. Characterizing & Finding Good Data Orderings for Fast Convergence of Sequential Gradient Methods

49. Byzantine-Robust Decentralized Learning via ClippedGossip

50. Understanding Memorization from the Perspective of Optimization via Efficient Influence Estimation
