Search

Your search for "Sutton, Richard" returned 2,690 results

Search Constraints

You searched for: Author "Sutton, Richard"
Search Results

1. Asynchronous Stochastic Approximation and Average-Reward Reinforcement Learning

2. On Convergence of Average-Reward Q-Learning in Weakly Communicating Markov Decision Processes

3. An Idiosyncrasy of Time-discretization in Reinforcement Learning

4. Reward Centering

5. MetaOptimize: A Framework for Optimizing Step Sizes and Other Meta-parameters

7. Step-size Optimization for Continual Learning

8. A Note on Stability in Asynchronous Stochastic Approximation without Communication Delays

10. Iterative Option Discovery for Planning, by Planning

11. Reinforcement learning : an introduction.

12. Reinforcement learning : an introduction.

13. Value-aware Importance Weighting for Off-policy Reinforcement Learning

14. Maintaining Plasticity in Deep Continual Learning

16. Toward Efficient Gradient-Based Value Estimation

18. Auxiliary task discovery through generate-and-test

19. On Convergence of Average-Reward Off-Policy Control Algorithms in Weakly Communicating MDPs

20. The Alberta Plan for AI Research

23. Doubly-Asynchronous Value Iteration: Making Value Iteration Asynchronous in Actions

24. Toward Discovering Options that Achieve Faster Planning

25. The Quest for a Common Model of the Intelligent Decision Maker

26. A History of Meta-gradient: Gradient Methods for Meta-learning

27. Reward-Respecting Subtasks for Model-Based Reinforcement Learning

31. Learning Agent State Online with Recurrent Generate-and-Test

32. Average-Reward Learning and Planning with Options

33. An Empirical Comparison of Off-policy Prediction Learning Algorithms in the Four Rooms Environment

34. Continual Backprop: Stochastic Gradient Descent with Persistent Randomness

35. An Empirical Comparison of Off-policy Prediction Learning Algorithms on the Collision Task

36. Planning with Expectation Models for Control

37. Does the Adam Optimizer Exacerbate Catastrophic Forgetting?

38. Average-Reward Off-Policy Policy Evaluation with Function Approximation

39. From Eye-blinks to State Construction: Diagnostic Benchmarks for Online Representation Learning

40. Understanding the Pathologies of Approximate Policy Evaluation when Combined with Greedification in Reinforcement Learning

41. Document-editing Assistants and Model-based Reinforcement Learning as a Path to Conversational AI

42. Inverse Policy Evaluation for Value-based Sequential Decision-making

43. Learning and Planning in Average-Reward Markov Decision Processes

44. Learning Sparse Representations Incrementally in Deep Reinforcement Learning

46. Discounted Reinforcement Learning Is Not an Optimization Problem

47. Fixed-Horizon Temporal Difference Methods for Stable Reinforcement Learning

48. Behaviour Suite for Reinforcement Learning

49. Planning with Expectation Models

50. Learning Feature Relevance Through Step Size Adaptation in Temporal-Difference Learning
