13 results for "Huizhen Yu"
Search Results
2. On Convergence of Average-Reward Q-Learning in Weakly Communicating Markov Decision Processes.
3. A Note on Stability in Asynchronous Stochastic Approximation without Communication Delays.
4. Two geometric input transformation methods for fast online reinforcement learning with neural nets.
5. On Generalized Bellman Equations and Temporal-Difference Learning.
6. Multi-step Off-policy Learning Without Importance Sampling Ratios.
7. On Convergence of some Gradient-based Temporal-Differences Algorithms for Off-Policy Learning.
8. Some Simulation Results for Emphatic Temporal-Difference Learning Algorithms.
9. Emphatic Temporal-Difference Learning.
10. On Convergence of Emphatic Temporal-Difference Learning.
11. Weak Convergence Properties of Constrained Emphatic Temporal-difference Learning with Constant and Slowly Diminishing Stepsize.
12. A Function Approximation Approach to Estimation of Policy Gradient for POMDP with Structured Policies.
13. Discretized Approximations for POMDP with Average Cost.
Discovery Service for Jio Institute Digital Library