
Your search for "Ghavamzadeh, Mohammad" returned 261 results

Search Constraints

Author: "Ghavamzadeh, Mohammad"
Publication Year Range: Last 10 years

Search Results

1. Conservative Contextual Bandits: Beyond Linear Representations

2. Q-learning for Quantile MDPs: A Decomposition, Performance, and Convergence Analysis

3. Confidence-aware Reward Optimization for Fine-tuning Text-to-Image Models

4. Bridging Distributionally Robust Learning and Offline RL: An Approach to Mitigate Distribution Shift and Partial Data Coverage

5. Preference Elicitation with Soft Attributes in Interactive Recommendation

6. Factual and Personalized Recommendations using Language Models and Reinforcement Learning

7. Bayesian Regret Minimization in Offline Bandits

8. DPOK: Reinforcement Learning for Fine-tuning Text-to-Image Diffusion Models

9. Private and Communication-Efficient Algorithms for Entropy Estimation

10. On Dynamic Programming Decompositions of Static Risk Measures in Markov Decision Processes

11. A Review of Deep Learning for Video Captioning

12. Aligning Text-to-Image Models using Human Feedback

13. Offline Reinforcement Learning for Mixture-of-Expert Dialogue Management

14. Multi-Task Off-Policy Learning from Bandit Feedback

15. Operator Splitting Value Iteration

16. RASR: Risk-Averse Soft-Robust MDPs with EVaR and Entropic Risk

17. Robust Reinforcement Learning using Offline Data

18. Reinforcement Learning of Multi-Domain Dialog Policies Via Action Embeddings

19. A Mixture-of-Expert Approach to RL-based Dialogue Management

20. Collaborative Multi-agent Stochastic Linear Bandits

21. Multi-Environment Meta-Learning in Stochastic Linear Bandits

22. Efficient Risk-Averse Reinforcement Learning

23. Non-stationary Bandits and Meta-Learning with a Small Set of Optimal Arms

24. Meta-Learning for Simple Regret Minimization

25. Discovering Personalized Semantics for Soft Attributes in Recommender Systems using Concept Activation Vectors

26. Deep Hierarchy in Bandits

27. Hierarchical Bayesian Bandits

28. Thompson Sampling with a Mixture Prior

29. Feature and Parameter Selection in Stochastic Linear Bandits

30. Fixed-Budget Best-Arm Identification in Structured Bandits

31. Adaptive Sampling for Minimax Fair Classification

32. Non-Stationary Latent Bandits

33. Soft-Robust Algorithms for Batch Reinforcement Learning

34. A Review of Uncertainty Quantification in Deep Learning: Techniques, Applications and Challenges

35. Variance-Reduced Off-Policy Memory-Efficient Policy Search

36. Deep Bayesian Quadrature Policy Optimization

37. Control-Aware Representations for Model-based Reinforcement Learning

38. Stochastic Bandits with Linear Constraints

39. Variational Model-based Policy Optimization

40. Proximal Gradient Temporal Difference Learning: Stable Reinforcement Learning with Polynomial Sample Complexity

41. Finite-Sample Analysis of Proximal Gradient TD Algorithms

42. Neural Lyapunov Redesign

43. Mirror Descent Policy Optimization

44. Active Model Estimation in Markov Decision Processes

45. Predictive Coding for Locally-Linear Control

46. Policy-Aware Model Learning for Policy Gradient Methods

47. Improved Algorithms for Conservative Exploration in Bandits

48. Conservative Exploration in Reinforcement Learning

49. Adaptive Sampling for Estimating Multiple Probability Distributions

50. Multi-step Greedy Reinforcement Learning Algorithms
