Policy Iteration for Exploratory Hamilton--Jacobi--Bellman Equations

Authors :
Tran, Hung Vinh
Wang, Zhenhua
Zhang, Yuming Paul
Publication Year :
2024

Abstract

We study the policy iteration algorithm (PIA) for entropy-regularized stochastic control problems on an infinite time horizon with a large discount rate, focusing on two main scenarios. First, we analyze PIA with bounded coefficients where the controls applied to the diffusion term satisfy a smallness condition. We demonstrate the convergence of PIA based on a uniform $\mathcal{C}^{2,\alpha}$ estimate for the value sequence generated by PIA, and provide a quantitative convergence analysis for this scenario. Second, we investigate PIA with unbounded coefficients but no control over the diffusion term. In this scenario, we first establish the well-posedness of the exploratory Hamilton--Jacobi--Bellman equation with linear growth coefficients and a polynomial growth reward function. Using this well-posedness result, we obtain the convergence of PIA by establishing quantitative locally uniform $\mathcal{C}^{1,\alpha}$ estimates for the generated value sequence.

Comment: 21 pages
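
The alternation that PIA performs (evaluate the current policy, then update it) can be illustrated in a much simpler setting than the one studied in the paper. The following is a minimal sketch, assuming a finite-state, finite-action, discrete-time Markov decision process in place of the controlled diffusion; it is not the authors' algorithm or analysis. The function name `soft_policy_iteration`, the arrays `P` and `r`, the discount factor `gamma`, and the entropy weight `lam` are all hypothetical names introduced for illustration. A small `gamma` loosely plays the role of a large discount rate.

```python
import numpy as np


def soft_policy_iteration(P, r, gamma=0.5, lam=1.0, tol=1e-10, max_iter=1000):
    """Entropy-regularized policy iteration on a finite MDP (illustrative only).

    P   : array of shape (S, A, S), transition probabilities (assumed input)
    r   : array of shape (S, A), rewards (assumed input)
    lam : weight of the entropy term (assumed parameter name)
    """
    n_states, n_actions = r.shape
    # Start from the uniform policy.
    pi = np.full((n_states, n_actions), 1.0 / n_actions)
    V = np.zeros(n_states)
    for _ in range(max_iter):
        # Policy evaluation: solve the linear system for the value of pi,
        # where the reward is augmented by lam times the policy entropy.
        log_pi = np.log(np.clip(pi, 1e-300, None))
        r_pi = np.sum(pi * (r - lam * log_pi), axis=1)
        P_pi = np.einsum("sa,sat->st", pi, P)
        V = np.linalg.solve(np.eye(n_states) - gamma * P_pi, r_pi)

        # Policy improvement: Gibbs (softmax) policy built from the Q-values.
        Q = r + gamma * (P @ V)
        logits = Q / lam
        logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
        new_pi = np.exp(logits)
        new_pi /= new_pi.sum(axis=1, keepdims=True)

        # Stop when the policy no longer changes.
        if np.max(np.abs(new_pi - pi)) < tol:
            return V, new_pi
        pi = new_pi
    return V, pi
```

At a fixed point of this discrete sketch, the value satisfies the log-sum-exp ("soft") Bellman equation $V(s) = \lambda \log \sum_a \exp(Q(s,a)/\lambda)$, which is the finite-MDP counterpart of an entropy-regularized optimality equation. The regularity estimates and growth conditions discussed in the abstract concern the continuous problem and have no counterpart here.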

Details

Database :
arXiv
Publication Type :
Report
Accession number :
edsarx.2406.00612
Document Type :
Working Paper