37 results for "Han, Jiequn"
Search Results
2. Reinforcement Learning with Function Approximation: From Linear to Nonlinear
- Author
- Long, Jihao and Han, Jiequn
- Abstract
Function approximation has been an indispensable component in modern reinforcement learning algorithms designed to tackle problems with large state spaces in high dimensions. This paper reviews recent results on error analysis for these reinforcement learning algorithms in linear or nonlinear approximation settings, emphasizing approximation error and estimation error/sample complexity. We discuss various properties related to approximation error and present concrete conditions on transition probability and reward function under which these properties hold true. Sample complexity analysis in reinforcement learning is more complicated than in supervised learning, primarily due to the distribution mismatch phenomenon. With assumptions on the linear structure of the problem, numerous algorithms in the literature achieve polynomial sample complexity with respect to the number of features, episode length, and accuracy, although the minimax rate has not been achieved yet. These results rely on the $L^\infty$ and UCB estimation of estimation error, which can handle the distribution mismatch phenomenon. The problem and analysis become substantially more challenging in the setting of nonlinear function approximation, as both $L^\infty$ and UCB estimation are inadequate for bounding the error with a favorable rate in high dimensions. We discuss additional assumptions necessary to address the distribution mismatch and derive meaningful results for nonlinear RL problems.
- Published
- 2023
- Full Text
- View/download PDF
3. Improving Gradient Computation for Differentiable Physics Simulation with Contacts
- Author
- Zhong, Yaofeng Desmond, Han, Jiequn, Dey, Biswadip, and Brikis, Georgia Olympia
- Abstract
Differentiable simulation enables gradients to be back-propagated through physics simulations. In this way, one can learn the dynamics and properties of a physics system by gradient-based optimization or embed the whole differentiable simulation as a layer in a deep learning model for downstream tasks, such as planning and control. However, differentiable simulation at its current stage is not perfect and might provide wrong gradients that deteriorate its performance in learning tasks. In this paper, we study differentiable rigid-body simulation with contacts. We find that existing differentiable simulation methods provide inaccurate gradients when the contact normal direction is not fixed - a general situation when the contacts are between two moving objects. We propose to improve gradient computation by continuous collision detection and leverage the time-of-impact (TOI) to calculate the post-collision velocities. We demonstrate our proposed method, referred to as TOI-Velocity, on two optimal control problems. We show that with TOI-Velocity, we are able to learn an optimal control sequence that matches the analytical solution, while without TOI-Velocity, existing differentiable simulation methods fail to do so., Comment: 5th Annual Conference on Learning for Dynamics and Control
- Published
- 2023
4. Stochastic Optimal Control Matching
- Author
- Domingo-Enrich, Carles, Han, Jiequn, Amos, Brandon, Bruna, Joan, and Chen, Ricky T. Q.
- Abstract
Stochastic optimal control, which has the goal of driving the behavior of noisy systems, is broadly applicable in science, engineering and artificial intelligence. Our work introduces Stochastic Optimal Control Matching (SOCM), a novel Iterative Diffusion Optimization (IDO) technique for stochastic optimal control that stems from the same philosophy as the conditional score matching loss for diffusion models. That is, the control is learned via a least squares problem by trying to fit a matching vector field. The training loss, which is closely connected to the cross-entropy loss, is optimized with respect to both the control function and a family of reparameterization matrices which appear in the matching vector field. The optimization with respect to the reparameterization matrices aims at minimizing the variance of the matching vector field. Experimentally, our algorithm achieves lower error than all the existing IDO techniques for stochastic optimal control for three out of four control problems, in some cases by an order of magnitude. The key idea underlying SOCM is the path-wise reparameterization trick, a novel technique that may be of independent interest. Code at https://github.com/facebookresearch/SOC-matching
- Published
- 2023
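A minimal sketch of the least-squares matching idea described in entry 4: a control network is fit by regression onto a target vector field along simulated trajectories. The dynamics, the placeholder matching_field, and all hyper-parameters below are illustrative assumptions, not the actual SOCM construction (which builds the field from the cost functional and the reparameterization matrices).

    import torch

    dim, n_paths, n_steps, dt = 2, 256, 50, 0.02
    u_theta = torch.nn.Sequential(torch.nn.Linear(dim + 1, 64), torch.nn.Tanh(),
                                  torch.nn.Linear(64, dim))
    opt = torch.optim.Adam(u_theta.parameters(), lr=1e-3)

    def matching_field(x):
        # Placeholder target field; SOCM derives it from the cost functional
        # and the learned reparameterization matrices.
        return -x

    for it in range(200):
        x = torch.zeros(n_paths, dim)
        loss = torch.zeros(())
        for k in range(n_steps):
            t = torch.full((n_paths, 1), k * dt)
            u = u_theta(torch.cat([t, x], dim=1))       # control at (t, x)
            loss = loss + ((u - matching_field(x)) ** 2).mean() * dt
            with torch.no_grad():                       # trajectories themselves are not differentiated through
                x = x + u * dt + (dt ** 0.5) * torch.randn_like(x)
        opt.zero_grad()
        loss.backward()
        opt.step()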
5. Learning Free Terminal Time Optimal Closed-loop Control of Manipulators
- Author
- Hu, Wei, Zhao, Yue, E, Weinan, Han, Jiequn, and Long, Jihao
- Abstract
This paper presents a novel approach to learning free terminal time closed-loop control for robotic manipulation tasks, enabling dynamic adjustment of task duration and control inputs to enhance performance. We extend the supervised learning approach, namely solving selected optimal open-loop problems and utilizing them as training data for a policy network, to the free terminal time scenario. Three main challenges are addressed in this extension. First, we introduce a marching scheme that enhances the solution quality and increases the success rate of the open-loop solver by gradually refining time discretization. Second, we extend the QRnet in Nakamura-Zimmerer et al. (2021b) to the free terminal time setting to address discontinuity and improve stability at the terminal state. Third, we present a more automated version of the initial value problem (IVP) enhanced sampling method from previous work (Zhang et al., 2022) to adaptively update the training dataset, significantly improving its quality. By integrating these techniques, we develop a closed-loop policy that operates effectively over a broad domain with varying optimal time durations, achieving near globally optimal total costs.
- Published
- 2023
6. A PDE-free, neural network-based eddy viscosity model coupled with RANS equations
- Author
- Xu, R., Zhou, Xu Hui, Han, Jiequn, Dwight, R.P., and Xiao, Heng
- Abstract
In fluid dynamics, constitutive models are often used to describe the unresolved turbulence and to close the Reynolds averaged Navier–Stokes (RANS) equations. Traditional PDE-based constitutive models are usually too rigid to calibrate with a large set of high-fidelity data. Moreover, commonly used turbulence models are based on the weak equilibrium assumption, which cannot adequately capture the nonlocal physics of turbulence. In this work, we propose using a vector-cloud neural network (VCNN) to learn the nonlocal constitutive model, which maps a regional mean flow field to the local turbulence quantities without solving the transport PDEs. The network is strictly invariant to coordinate translation, rotation, and uniform motion, as well as to the ordering of the input points. The VCNN-based nonlocal constitutive model is trained and evaluated on flows over a family of parameterized periodic hills. Numerical results demonstrate its predictive capability on the target turbulence quantities, the turbulent kinetic energy k and dissipation ɛ. More importantly, we investigate the robustness and stability of the method by coupling the trained model back to the RANS solver. The solver shows good convergence, with the simulated velocity field comparable to that based on the k–ɛ model when starting from a reasonable initial condition. This study, as a proof of concept, highlights the feasibility of using a nonlocal, frame-independent, neural network-based constitutive model to close the RANS equations, paving the way for the further emulation of Reynolds stress transport models.
- Published
- 2022
- Full Text
- View/download PDF
7. A Neural Network Warm-Start Approach for the Inverse Acoustic Obstacle Scattering Problem
- Author
- Zhou, Mo, Han, Jiequn, Rachh, Manas, and Borges, Carlos
- Abstract
We consider the inverse acoustic obstacle problem for sound-soft star-shaped obstacles in two dimensions wherein the boundary of the obstacle is determined from measurements of the scattered field at a collection of receivers outside the object. One of the standard approaches for solving this problem is to reformulate it as an optimization problem: finding the boundary of the domain that minimizes the $L^2$ distance between computed values of the scattered field and the given measurement data. The optimization problem is computationally challenging since the local set of convexity shrinks with increasing frequency and results in an increasing number of local minima in the vicinity of the true solution. In many practical experimental settings, low frequency measurements are unavailable due to limitations of the experimental setup or the sensors used for measurement. Thus, obtaining a good initial guess for the optimization problem plays a vital role in this environment. We present a neural network warm-start approach for solving the inverse scattering problem, where an initial guess for the optimization problem is obtained using a trained neural network. We demonstrate the effectiveness of our method with several numerical examples. For high frequency problems, this approach outperforms traditional iterative methods such as Gauss-Newton initialized without any prior (i.e., initialized using a unit circle), or initialized using the solution of a direct method such as the linear sampling method. The algorithm remains robust to noise in the scattered field measurements and also converges to the true solution for limited aperture data. However, the number of training samples required to train the neural network scales exponentially in frequency and the complexity of the obstacles considered. We conclude with a discussion of this phenomenon and potential directions for future research.
- Published
- 2022
- Full Text
- View/download PDF
8. Offline Supervised Learning V.S. Online Direct Policy Optimization: A Comparative Study and A Unified Training Paradigm for Neural Network-Based Optimal Feedback Control
- Author
- Zhao, Yue and Han, Jiequn
- Abstract
This work is concerned with solving neural network-based feedback controllers efficiently for optimal control problems. We first conduct a comparative study of two prevalent approaches: offline supervised learning and online direct policy optimization. Although the training part of the supervised learning approach is relatively easy, the success of the method heavily depends on the optimal control dataset generated by open-loop optimal control solvers. In contrast, direct policy optimization turns the optimal control problem into an optimization problem directly, without requiring any pre-computation, but the dynamics-related objective can be hard to optimize when the problem is complicated. Our results underscore the superiority of offline supervised learning in terms of both optimality and training time. To overcome the main challenges of the two approaches, namely the dataset and the optimization respectively, we complement them and propose the Pre-train and Fine-tune strategy as a unified training paradigm for optimal feedback control, which further improves the performance and robustness significantly. Our code is accessible at https://github.com/yzhao98/DeepOptimalControl., Comment: Accepted for publication in Physica D
- Published
- 2022
- Full Text
- View/download PDF
9. Pandemic Control, Game Theory and Machine Learning
- Author
- Xuan, Yao, Balkin, Robert, Han, Jiequn, Hu, Ruimeng, and Ceniceros, Hector D.
- Abstract
Game theory has been an effective tool in the control of disease spread and in suggesting optimal policies at both individual and area levels. In this AMS Notices article, we focus on the development of decision-making tools for COVID-19 interventions, aiming to provide mathematical models and efficient machine learning methods, to justify related policies that have been implemented in the past, and to explain how the authorities' decisions affect their neighboring regions from a game-theory viewpoint.
- Published
- 2022
10. Differentiable Physics Simulations with Contacts: Do They Have Correct Gradients w.r.t. Position, Velocity and Control?
- Author
- Zhong, Yaofeng Desmond, Han, Jiequn, and Brikis, Georgia Olympia
- Abstract
In recent years, an increasing amount of work has focused on differentiable physics simulation and has produced a set of open source projects such as Tiny Differentiable Simulator, Nimble Physics, diffTaichi, Brax, Warp, Dojo and DiffCoSim. By making physics simulations end-to-end differentiable, we can perform gradient-based optimization and learning tasks. A majority of differentiable simulators consider collisions and contacts between objects, but they use different contact models for differentiability. In this paper, we overview four kinds of differentiable contact formulations - linear complementarity problems (LCP), convex optimization models, compliant models and position-based dynamics (PBD). We analyze and compare the gradients calculated by these models and show that the gradients are not always correct. We also demonstrate their ability to learn an optimal control strategy by comparing the learned strategies with the optimal strategy in an analytical form. The codebase to reproduce the experiment results is available at https://github.com/DesmondZhong/diff_sim_grads., Comment: 2nd AI4Science Workshop at ICML 2022
- Published
- 2022
11. Learning High-Dimensional McKean-Vlasov Forward-Backward Stochastic Differential Equations with General Distribution Dependence
- Author
- Han, Jiequn, Hu, Ruimeng, and Long, Jihao
- Abstract
One of the core problems in mean-field control and mean-field games is to solve the corresponding McKean-Vlasov forward-backward stochastic differential equations (MV-FBSDEs). Most existing methods are tailored to special cases in which the mean-field interaction only depends on expectation or other moments, and are thus inadequate for problems in which the mean-field interaction has full distribution dependence. In this paper, we propose a novel deep learning method for computing MV-FBSDEs with a general form of mean-field interactions. Specifically, built on fictitious play, we recast the problem into repeatedly solving standard FBSDEs with explicit coefficient functions. These coefficient functions are used to approximate the MV-FBSDEs' model coefficients with full distribution dependence, and are updated by solving another supervised learning problem using training data simulated from the last iteration's FBSDE solutions. We use deep neural networks to solve standard BSDEs and approximate coefficient functions in order to solve high-dimensional MV-FBSDEs. Under proper assumptions on the learned functions, we prove that the convergence of the proposed method is free of the curse of dimensionality (CoD) by using a class of integral probability metrics previously developed in [Han, Hu and Long, arXiv:2104.12036]. The proved theorem shows the advantage of the method in high dimensions. We present the numerical performance in high-dimensional MV-FBSDE problems, including a mean-field game example of the well-known Cucker-Smale model whose cost depends on the full distribution of the forward process.
- Published
- 2022
12. Frame-independent vector-cloud neural network for nonlocal constitutive modeling on arbitrary grids
- Author
- Zhou, Xu-Hui, Han, Jiequn, and Xiao, Heng
- Abstract
Constitutive models are widely used for modeling complex systems in science and engineering, where first-principle-based, well-resolved simulations are often prohibitively expensive. For example, in fluid dynamics, constitutive models are required to describe nonlocal, unresolved physics such as turbulence and laminar–turbulent transition. However, traditional constitutive models based on partial differential equations (PDEs) often lack robustness and are too rigid to accommodate diverse calibration datasets. We propose a frame-independent, nonlocal constitutive model based on a vector-cloud neural network that can be learned with data. The model predicts the closure variable at a point based on the flow information in its neighborhood. Such nonlocal information is represented by a group of points, each having a feature vector attached to it, and thus the input is referred to as a vector cloud. The cloud is mapped to the closure variable through a frame-independent neural network, invariant both to coordinate translation and rotation and to the ordering of points in the cloud. As such, the network can deal with any number of arbitrarily arranged grid points and thus is suitable for unstructured meshes in fluid simulations. The merits of the proposed network are demonstrated for scalar transport PDEs on a family of parameterized periodic hill geometries. The vector-cloud neural network is a promising tool not only as nonlocal constitutive models but also as general surrogate models for PDEs on irregular domains.
- Published
- 2022
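A minimal sketch of the invariances that the vector-cloud input in entry 12 is built around: per-point features are embedded and mean-pooled, so the prediction is independent of the ordering and number of points, and using coordinates relative to the query point removes translation dependence. The feature dimensions and network sizes are illustrative assumptions; the paper additionally enforces rotation invariance (via pairwise projections), which is omitted here.

    import torch

    embed = torch.nn.Sequential(torch.nn.Linear(5, 32), torch.nn.Tanh())
    head = torch.nn.Sequential(torch.nn.Linear(32, 32), torch.nn.Tanh(),
                               torch.nn.Linear(32, 1))

    def predict(query_xy, cloud_xy, cloud_feats):
        # cloud_xy: (n_points, 2) coordinates; cloud_feats: (n_points, 3) flow features
        rel = cloud_xy - query_xy                  # translation invariance via relative coordinates
        per_point = embed(torch.cat([rel, cloud_feats], dim=1))
        pooled = per_point.mean(dim=0)             # permutation- and count-invariant pooling
        return head(pooled)

    out = predict(torch.zeros(2), torch.randn(7, 2), torch.randn(7, 3))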
13. Frame invariance and scalability of neural operators for partial differential equations
- Author
- Zafar, Muhammad I., Han, Jiequn, Zhou, Xu-Hui, and Xiao, Heng
- Abstract
Partial differential equations (PDEs) play a dominant role in the mathematical modeling of many complex dynamical processes. Solving these PDEs often requires prohibitively high computational costs, especially when multiple evaluations must be made for different parameters or conditions. After training, neural operators can provide PDE solutions significantly faster than traditional PDE solvers. In this work, invariance properties and computational complexity of two neural operators are examined for the transport PDE of a scalar quantity. The neural operator based on a graph kernel network (GKN) operates on graph-structured data to incorporate nonlocal dependencies. Here we propose a modified formulation of GKN to achieve frame invariance. The vector-cloud neural network (VCNN) is an alternative neural operator with embedded frame invariance that operates on point-cloud data. The GKN-based neural operator demonstrates slightly better predictive performance compared to VCNN. However, GKN requires an excessively high computational cost that increases quadratically with the number of discretized objects, as compared to a linear increase for VCNN.
- Published
- 2021
- Full Text
- View/download PDF
14. DeepHAM: A Global Solution Method for Heterogeneous Agent Models with Aggregate Shocks
- Author
- Han, Jiequn, Yang, Yucheng, and E, Weinan
- Abstract
An efficient, reliable, and interpretable global solution method, the Deep learning-based algorithm for Heterogeneous Agent Models (DeepHAM), is proposed for solving high dimensional heterogeneous agent models with aggregate shocks. The state distribution is approximately represented by a set of optimal generalized moments. Deep neural networks are used to approximate the value and policy functions, and the objective is optimized over directly simulated paths. In addition to being an accurate global solver, this method has three additional features. First, it is computationally efficient in solving complex heterogeneous agent models, and it does not suffer from the curse of dimensionality. Second, it provides a general and interpretable representation of the distribution over individual states, which is crucial in addressing the classical question of whether and how heterogeneity matters in macroeconomics. Third, it solves the constrained efficiency problem as easily as it solves the competitive equilibrium, which opens up new possibilities for studying optimal monetary and fiscal policies in heterogeneous agent models with aggregate shocks., Comment: Slides available at https://users.flatironinstitute.org/~jhan/files/DeepHAM_slides.pdf
- Published
- 2021
15. Perturbational Complexity by Distribution Mismatch: A Systematic Analysis of Reinforcement Learning in Reproducing Kernel Hilbert Space
- Author
- Long, Jihao and Han, Jiequn
- Abstract
Most existing theoretical analysis of reinforcement learning (RL) is limited to the tabular setting or linear models due to the difficulty in dealing with function approximation in high dimensional space with an uncertain environment. This work offers a fresh perspective into this challenge by analyzing RL in a general reproducing kernel Hilbert space (RKHS). We consider a family of Markov decision processes $\mathcal{M}$ of which the reward functions lie in the unit ball of an RKHS and transition probabilities lie in a given arbitrary set. We define a quantity called perturbational complexity by distribution mismatch $\Delta_{\mathcal{M}}(\epsilon)$ to characterize the complexity of the admissible state-action distribution space in response to a perturbation in the RKHS with scale $\epsilon$. We show that $\Delta_{\mathcal{M}}(\epsilon)$ gives both the lower bound of the error of all possible algorithms and the upper bound of two specific algorithms (fitted reward and fitted Q-iteration) for the RL problem. Hence, the decay of $\Delta_\mathcal{M}(\epsilon)$ with respect to $\epsilon$ measures the difficulty of the RL problem on $\mathcal{M}$. We further provide some concrete examples and discuss whether $\Delta_{\mathcal{M}}(\epsilon)$ decays fast or not in these examples. As a byproduct, we show that when the reward functions lie in a high dimensional RKHS, even if the transition probability is known and the action space is finite, it is still possible for RL problems to suffer from the curse of dimensionality.
- Published
- 2021
16. A Class of Dimension-free Metrics for the Convergence of Empirical Measures
- Author
- Han, Jiequn, Hu, Ruimeng, and Long, Jihao
- Abstract
This paper concerns the convergence of empirical measures in high dimensions. We propose a new class of probability metrics and show that under such metrics, the convergence is free of the curse of dimensionality (CoD). Such a feature is critical for high-dimensional analysis and stands in contrast to classical metrics ({\it e.g.}, the Wasserstein metric). The proposed metrics fall into the category of integral probability metrics, for which we specify criteria of test function spaces to guarantee the property of being free of CoD. Examples of the selected test function spaces include the reproducing kernel Hilbert spaces, Barron space, and flow-induced function spaces. Three applications of the proposed metrics are presented: 1. The convergence of empirical measure in the case of random variables; 2. The convergence of $n$-particle system to the solution to McKean-Vlasov stochastic differential equation; 3. The construction of an $\varepsilon$-Nash equilibrium for a homogeneous $n$-player game by its mean-field limit. As a byproduct, we prove that, given a distribution close to the target distribution measured by our metric and a certain representation of the target distribution, we can generate a distribution close to the target one in terms of the Wasserstein metric and relative entropy. Overall, we show that the proposed class of metrics is a powerful tool to analyze the convergence of empirical measures in high dimensions without CoD.
- Published
- 2021
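For context on entry 16, the proposed metrics are integral probability metrics (IPMs); in standard notation (which may differ slightly from the paper's), an IPM indexed by a test-function class $\mathcal{F}$ is

    $d_{\mathcal{F}}(\mu, \nu) = \sup_{f \in \mathcal{F}} \left| \int f \, d\mu - \int f \, d\nu \right|.$

Choosing $\mathcal{F}$ to be, e.g., the unit ball of an RKHS or of Barron space is what yields empirical-measure convergence rates free of the curse of dimensionality, in contrast to the Wasserstein-1 metric, which corresponds to taking $\mathcal{F}$ to be the class of 1-Lipschitz functions.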
17. An $L^2$ Analysis of Reinforcement Learning in High Dimensions with Kernel and Neural Network Approximation
- Author
- Long, Jihao, Han, Jiequn, and E, Weinan
- Abstract
Reinforcement learning (RL) algorithms based on high-dimensional function approximation have achieved tremendous empirical success in large-scale problems with an enormous number of states. However, most analysis of such algorithms gives rise to error bounds that involve either the number of states or the number of features. This paper considers the situation where the function approximation is made either using the kernel method or the two-layer neural network model, in the context of a fitted Q-iteration algorithm with explicit regularization. We establish an $\tilde{O}(H^3|\mathcal {A}|^{\frac14}n^{-\frac14})$ bound for the optimal policy with $Hn$ samples, where $H$ is the length of each episode and $|\mathcal {A}|$ is the size of action space. Our analysis hinges on analyzing the $L^2$ error of the approximated Q-function using $n$ data points. Even though this result still requires a finite-sized action space, the error bound is independent of the dimensionality of the state space.
- Published
- 2021
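A compact sketch of fitted Q-iteration with a kernel regressor on a toy one-dimensional problem with two actions (a finite action space, as in entry 17); the environment, kernel, and regularization below are illustrative stand-ins and say nothing about the error bound itself.

    import numpy as np
    from sklearn.kernel_ridge import KernelRidge

    rng = np.random.default_rng(0)
    n, H = 500, 10
    actions = np.array([-1.0, 1.0])
    s = rng.uniform(-1, 1, size=(n, 1))
    a = rng.choice(actions, size=(n, 1))
    s_next = np.clip(s + 0.1 * a + 0.05 * rng.normal(size=(n, 1)), -1, 1)
    r = -(s ** 2).ravel()                           # reward: stay near the origin

    target = r                                      # terminal step regresses on the reward only
    for h in range(H - 1, -1, -1):                  # backward induction over the episode
        q_h = KernelRidge(kernel="rbf", alpha=1e-3, gamma=10.0)
        q_h.fit(np.hstack([s, a]), target)
        # Bellman target for the previous step: r + max over a' of Q_h(s', a')
        q_next = np.column_stack([
            q_h.predict(np.hstack([s_next, np.full_like(s_next, act)]))
            for act in actions
        ])
        target = r + q_next.max(axis=1)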
18. Frame-independent vector-cloud neural network for nonlocal constitutive modeling on arbitrary grids
- Author
- Zhou, Xu-Hui, Han, Jiequn, and Xiao, Heng
- Abstract
Constitutive models are widely used for modeling complex systems in science and engineering, where first-principle-based, well-resolved simulations are often prohibitively expensive. For example, in fluid dynamics, constitutive models are required to describe nonlocal, unresolved physics such as turbulence and laminar-turbulent transition. However, traditional constitutive models based on partial differential equations (PDEs) often lack robustness and are too rigid to accommodate diverse calibration datasets. We propose a frame-independent, nonlocal constitutive model based on a vector-cloud neural network that can be learned with data. The model predicts the closure variable at a point based on the flow information in its neighborhood. Such nonlocal information is represented by a group of points, each having a feature vector attached to it, and thus the input is referred to as a vector cloud. The cloud is mapped to the closure variable through a frame-independent neural network, invariant both to coordinate translation and rotation and to the ordering of points in the cloud. As such, the network can deal with any number of arbitrarily arranged grid points and thus is suitable for unstructured meshes in fluid simulations. The merits of the proposed network are demonstrated for scalar transport PDEs on a family of parameterized periodic hill geometries. The vector-cloud neural network is a promising tool not only as nonlocal constitutive models but also as general surrogate models for PDEs on irregular domains.
- Published
- 2021
- Full Text
- View/download PDF
19. Actor-Critic Method for High Dimensional Static Hamilton--Jacobi--Bellman Partial Differential Equations based on Neural Networks
- Author
- Zhou, Mo, Han, Jiequn, and Lu, Jianfeng
- Abstract
We propose a novel numerical method for high dimensional Hamilton--Jacobi--Bellman (HJB) type elliptic partial differential equations (PDEs). The HJB PDEs, reformulated as optimal control problems, are tackled by the actor-critic framework inspired by reinforcement learning, based on neural network parametrization of the value and control functions. Within the actor-critic framework, we employ a policy gradient approach to improve the control, while for the value function, we derive a variance reduced least-squares temporal difference method using stochastic calculus. To numerically discretize the stochastic control problem, we employ an adaptive step size scheme to improve the accuracy near the domain boundary. Numerical examples up to $20$ spatial dimensions including the linear quadratic regulators, the stochastic Van der Pol oscillators, the diffusive Eikonal equations, and fully nonlinear elliptic PDEs derived from a regulator problem are presented to validate the effectiveness of our proposed method., Comment: 23 pages, 4 figures. Add a fully nonlinear example
- Published
- 2021
- Full Text
- View/download PDF
20. Recurrent Neural Networks for Stochastic Control Problems with Delay
- Author
- Han, Jiequn and Hu, Ruimeng
- Abstract
Stochastic control problems with delay are challenging due to the path-dependent feature of the system and thus its intrinsic high dimensions. In this paper, we propose and systematically study deep neural network-based algorithms to solve stochastic control problems with delay features. Specifically, we employ neural networks for sequence modeling (\emph{e.g.}, recurrent neural networks such as long short-term memory) to parameterize the policy and optimize the objective function. The proposed algorithms are tested on three benchmark examples: a linear-quadratic problem, optimal consumption with fixed finite delay, and portfolio optimization with complete memory. In particular, we notice that the architecture of recurrent neural networks naturally captures the path-dependent feature with much flexibility and yields better performance with more efficient and stable training of the network compared to feedforward networks. The superiority is especially evident in the case of portfolio optimization with complete memory, which features infinite delay.
- Published
- 2021
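A minimal sketch of the recurrent-policy idea in entry 20: an LSTM consumes the state history (capturing the path dependence induced by delay) and outputs the control, and the Monte-Carlo estimate of the total cost is minimized directly. The dynamics and cost below are illustrative placeholders, not any of the paper's benchmark problems.

    import torch

    dim, hidden, n_paths, n_steps, dt = 1, 32, 128, 40, 0.025
    cell = torch.nn.LSTMCell(dim, hidden)
    head = torch.nn.Linear(hidden, dim)
    opt = torch.optim.Adam(list(cell.parameters()) + list(head.parameters()), lr=1e-3)

    for it in range(100):
        x = torch.zeros(n_paths, dim)
        h = torch.zeros(n_paths, hidden)
        c = torch.zeros(n_paths, hidden)
        cost = torch.zeros(())
        for k in range(n_steps):
            h, c = cell(x, (h, c))                 # hidden state summarizes the path so far
            u = head(h)
            cost = cost + ((x ** 2).sum(1) + (u ** 2).sum(1)).mean() * dt
            x = x + u * dt + (dt ** 0.5) * 0.1 * torch.randn_like(x)
        opt.zero_grad()
        cost.backward()
        opt.step()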
21. Convergence of Deep Fictitious Play for Stochastic Differential Games
- Author
- Han, Jiequn, Hu, Ruimeng, and Long, Jihao
- Abstract
Stochastic differential games have been used extensively to model agents' competitions in Finance, for instance, in P2P lending platforms from the Fintech industry, the banking system for systemic risk, and insurance markets. The recently proposed machine learning algorithm, deep fictitious play, provides a novel efficient tool for finding Markovian Nash equilibrium of large $N$-player asymmetric stochastic differential games [J. Han and R. Hu, Mathematical and Scientific Machine Learning Conference, pages 221-245, PMLR, 2020]. By incorporating the idea of fictitious play, the algorithm decouples the game into $N$ sub-optimization problems, and identifies each player's optimal strategy with the deep backward stochastic differential equation (BSDE) method in parallel and repeatedly. In this paper, we prove the convergence of deep fictitious play (DFP) to the true Nash equilibrium. We can also show that the strategy based on DFP forms an $\varepsilon$-Nash equilibrium. We generalize the algorithm by proposing a new approach to decouple the games, and present numerical results of large population games showing the empirical convergence of the algorithm beyond the technical assumptions in the theorems.
- Published
- 2020
22. Integrating Machine Learning with Physics-Based Modeling
- Author
- E, Weinan, Han, Jiequn, and Zhang, Linfeng
- Abstract
Machine learning is poised as a very powerful tool that can drastically improve our ability to carry out scientific research. However, many issues need to be addressed before this becomes a reality. This article focuses on one particular issue of broad interest: How can we integrate machine learning with physics-based modeling to develop new interpretable and truly reliable physical models? After introducing the general guidelines, we discuss the two most important issues for developing machine learning-based physical models: Imposing physical constraints and obtaining optimal datasets. We also provide a simple and intuitive explanation for the fundamental reasons behind the success of modern machine learning, as well as an introduction to the concurrent machine learning framework needed for integrating machine learning with physics-based modeling. Molecular dynamics and moment closure of kinetic equations are used as examples to illustrate the main issues discussed. We end with a general discussion on where this integration will lead us to, and where the new frontier will be after machine learning is successfully integrated into scientific modeling.
- Published
- 2020
23. Escaping Saddle Points Efficiently with Occupation-Time-Adapted Perturbations
- Author
- Guo, Xin, Han, Jiequn, Tajrobehkar, Mahan, and Tang, Wenpin
- Abstract
Motivated by the super-diffusivity of self-repelling random walk, which has roots in statistical physics, this paper develops a new perturbation mechanism for optimization algorithms. In this mechanism, perturbations are adapted to the history of states via the notion of occupation time. After integrating this mechanism into the framework of perturbed gradient descent (PGD) and perturbed accelerated gradient descent (PAGD), two new algorithms are proposed: perturbed gradient descent adapted to occupation time (PGDOT) and its accelerated version (PAGDOT). PGDOT and PAGDOT are shown to converge to second-order stationary points at least as fast as PGD and PAGD, respectively, and thus they are guaranteed to avoid getting stuck at non-degenerate saddle points. The theoretical analysis is corroborated by empirical studies in which the new algorithms consistently escape saddle points and outperform not only their counterparts, PGD and PAGD, but also other popular alternatives including stochastic gradient descent, Adam, AMSGrad, and RMSProp., Comment: 17 pages, 6 figures
- Published
- 2020
24. Solving high-dimensional eigenvalue problems using deep neural networks: A diffusion Monte Carlo like approach
- Author
- Han, Jiequn, Lu, Jianfeng, and Zhou, Mo
- Abstract
We propose a new method to solve eigenvalue problems for linear and semilinear second order differential operators in high dimensions based on deep neural networks. The eigenvalue problem is reformulated as a fixed point problem of the semigroup flow induced by the operator, whose solution can be represented by the Feynman-Kac formula in terms of forward-backward stochastic differential equations. The method shares a similar spirit with diffusion Monte Carlo but augments a direct approximation to the eigenfunction through a neural-network ansatz. The criterion of fixed point provides a natural loss function to search for parameters via optimization. Our approach is able to provide accurate eigenvalue and eigenfunction approximations in several numerical examples, including the Fokker-Planck operator and the linear and nonlinear Schrödinger operators in high dimensions., Comment: 18 pages, 6 figures, 5 tables
- Published
- 2020
- Full Text
- View/download PDF
25. Optimal Policies for a Pandemic: A Stochastic Game Approach and a Deep Learning Algorithm
- Author
- Xuan, Yao, Balkin, Robert, Han, Jiequn, Hu, Ruimeng, and Ceniceros, Hector D.
- Abstract
Game theory has been an effective tool in the control of disease spread and in suggesting optimal policies at both individual and area levels. In this paper, we propose a multi-region SEIR model based on stochastic differential game theory, aiming to formulate optimal regional policies for infectious diseases. Specifically, we enhance the standard epidemic SEIR model by taking into account the social and health policies issued by multiple region planners. This enhancement makes the model more realistic and powerful. However, it also introduces a formidable computational challenge due to the high dimensionality of the solution space brought by the presence of multiple regions. This significant numerical difficulty of the model structure motivates us to generalize the deep fictitious algorithm introduced in [Han and Hu, MSML2020, pp.221--245, PMLR, 2020] and develop an improved algorithm to overcome the curse of dimensionality. We apply the proposed model and algorithm to study the COVID-19 pandemic in three states: New York, New Jersey, and Pennsylvania. The model parameters are estimated from real data posted by the Centers for Disease Control and Prevention (CDC). We are able to show the effects of the lockdown/travel ban policy on the spread of COVID-19 for each state and how their policies affect each other.
- Published
- 2020
26. On the Curse of Memory in Recurrent Neural Networks: Approximation and Optimization Analysis
- Author
- Li, Zhong, Han, Jiequn, E, Weinan, and Li, Qianxiao
- Abstract
We study the approximation properties and optimization dynamics of recurrent neural networks (RNNs) when applied to learn input-output relationships in temporal data. We consider the simple but representative setting of using continuous-time linear RNNs to learn from data generated by linear relationships. Mathematically, the latter can be understood as a sequence of linear functionals. We prove a universal approximation theorem of such linear functionals, and characterize the approximation rate and its relation with memory. Moreover, we perform a fine-grained dynamical analysis of training linear RNNs, which further reveals the intricate interactions between memory and learning. A unifying theme uncovered is the non-trivial effect of memory, a notion that can be made precise in our framework, on approximation and optimization: when there is long-term memory in the target, it takes a large number of neurons to approximate it. Moreover, the training process will suffer from slowdowns. In particular, both of these effects become exponentially more pronounced with memory - a phenomenon we call the "curse of memory". These analyses represent a basic step towards a concrete mathematical understanding of new phenomena that may arise in learning temporal relationships using recurrent architectures., Comment: Published version
- Published
- 2020
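For orientation on entry 26, the continuous-time linear RNN underlying this kind of analysis can be written (in notation that may differ from the paper's) as

    $\frac{dh_t}{dt} = W h_t + U x_t, \qquad \hat{y}_t = c^\top h_t,$

so that, starting from a zero hidden state in the infinite past, the model realizes the linear functional

    $\hat{y}_t = \int_0^{\infty} c^\top e^{W s} U \, x_{t-s} \, ds,$

and the decay of the kernel $c^\top e^{W s} U$ quantifies how much memory the model can express.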
27. Algorithms for Solving High Dimensional PDEs: From Nonlinear Monte Carlo to Machine Learning
- Author
- E, Weinan, Han, Jiequn, and Jentzen, Arnulf
- Abstract
In recent years, tremendous progress has been made on numerical algorithms for solving partial differential equations (PDEs) in very high dimensions, using ideas from either nonlinear (multilevel) Monte Carlo or deep learning. They are potentially free of the curse of dimensionality for many different applications and have been proven to be so in the case of some nonlinear Monte Carlo methods for nonlinear parabolic PDEs. In this paper, we review these numerical and theoretical advances. In addition to algorithms based on stochastic reformulations of the original problem, such as the multilevel Picard iteration and the Deep BSDE method, we also discuss algorithms based on the more traditional Ritz, Galerkin, and least-squares formulations. We hope to demonstrate to the reader that studying PDEs as well as control and variational problems in very high dimensions might very well be among the most promising new directions in mathematics and scientific computing in the near future.
- Published
- 2020
- Full Text
- View/download PDF
28. Deep Fictitious Play for Finding Markovian Nash Equilibrium in Multi-Agent Games
- Author
- Han, Jiequn and Hu, Ruimeng
- Abstract
We propose a deep neural network-based algorithm to identify the Markovian Nash equilibrium of general large $N$-player stochastic differential games. Following the idea of fictitious play, we recast the $N$-player game into $N$ decoupled decision problems (one for each player) and solve them iteratively. The individual decision problem is characterized by a semilinear Hamilton-Jacobi-Bellman equation, which we solve with the recently developed deep BSDE method. The resulting algorithm can solve large $N$-player games for which conventional numerical methods would suffer from the curse of dimensionality. Multiple numerical examples involving identical or heterogeneous agents, with risk-neutral or risk-sensitive objectives, are tested to validate the accuracy of the proposed algorithm in large group games. Even for a fifty-player game with the presence of common noise, the proposed algorithm still finds the approximate Nash equilibrium accurately, which, to the best of our knowledge, is difficult to achieve by other numerical algorithms.
- Published
- 2019
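A structural sketch of the fictitious-play decoupling in entry 28: at each stage, every player best-responds while the other players' policies are frozen at the previous stage. The toy linear policies, dynamics, and costs are placeholders, and each sub-problem is solved here by a plain differentiable rollout rather than by the deep BSDE method used in the paper.

    import torch

    N, dim, n_paths, n_steps, dt = 3, 1, 256, 20, 0.05
    policies = [torch.nn.Linear(N * dim, dim) for _ in range(N)]

    def simulated_cost(i, nets):
        # Roll out the joint dynamics and accumulate player i's running cost.
        x = torch.randn(n_paths, N * dim)
        cost = torch.zeros(())
        for _ in range(n_steps):
            u = [nets[j](x) for j in range(N)]
            dev = (x[:, i:i + 1] - x.mean(dim=1, keepdim=True)) ** 2
            cost = cost + ((u[i] ** 2).sum(1) + dev.sum(1)).mean() * dt
            x = x + torch.cat(u, dim=1) * dt + (dt ** 0.5) * 0.1 * torch.randn_like(x)
        return cost

    for stage in range(5):                          # fictitious-play stages
        new_policies = [torch.nn.Linear(N * dim, dim) for _ in range(N)]
        for i in range(N):                          # N decoupled best-response problems
            new_policies[i].load_state_dict(policies[i].state_dict())
            nets = [policies[j] for j in range(N)]
            nets[i] = new_policies[i]               # only player i's policy is trainable here
            opt = torch.optim.Adam(new_policies[i].parameters(), lr=1e-2)
            for _ in range(50):
                loss = simulated_cost(i, nets)
                opt.zero_grad()
                loss.backward()
                opt.step()
        policies = new_policies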
29. Universal approximation of symmetric and anti-symmetric functions
- Author
- Han, Jiequn, Li, Yingzhou, Lin, Lin, Lu, Jianfeng, Zhang, Jiefu, and Zhang, Linfeng
- Abstract
We consider universal approximations of symmetric and anti-symmetric functions, which are important for applications in quantum physics, as well as other scientific and engineering computations. We give constructive approximations with explicit bounds on the number of parameters with respect to the dimension and the target accuracy $\epsilon$. While the approximation still suffers from the curse of dimensionality, to the best of our knowledge, these are the first results in the literature with explicit error bounds for functions with symmetry or anti-symmetry constraints.
- Published
- 2019
30. Convergence of the Deep BSDE Method for Coupled FBSDEs
- Author
- Han, Jiequn and Long, Jihao
- Abstract
The recently proposed numerical algorithm, deep BSDE method, has shown remarkable performance in solving high-dimensional forward-backward stochastic differential equations (FBSDEs) and parabolic partial differential equations (PDEs). This article lays a theoretical foundation for the deep BSDE method in the general case of coupled FBSDEs. In particular, a posteriori error estimation of the solution is provided and it is proved that the error converges to zero given the universal approximation capability of neural networks. Numerical results are presented to demonstrate the accuracy of the analyzed algorithm in solving high-dimensional coupled FBSDEs.
- Published
- 2018
- Full Text
- View/download PDF
31. A Mean-Field Optimal Control Formulation of Deep Learning
- Author
- E, Weinan, Han, Jiequn, and Li, Qianxiao
- Abstract
Recent work linking deep neural networks and dynamical systems opened up new avenues to analyze deep learning. In particular, it is observed that new insights can be obtained by recasting deep learning as an optimal control problem on difference or differential equations. However, the mathematical aspects of such a formulation have not been systematically explored. This paper introduces the mathematical formulation of the population risk minimization problem in deep learning as a mean-field optimal control problem. Mirroring the development of classical optimal control, we state and prove optimality conditions of both the Hamilton-Jacobi-Bellman type and the Pontryagin type. These mean-field results reflect the probabilistic nature of the learning problem. In addition, by appealing to the mean-field Pontryagin's maximum principle, we establish some quantitative relationships between population and empirical learning problems. This serves to establish a mathematical foundation for investigating the algorithmic and theoretical connections between optimal control and deep learning., Comment: 44 pages
- Published
- 2018
- Full Text
- View/download PDF
32. Deep Potential Molecular Dynamics: a scalable model with the accuracy of quantum mechanics
- Author
- Zhang, Linfeng, Han, Jiequn, Wang, Han, Car, Roberto, and E, Weinan
- Abstract
We introduce a scheme for molecular simulations, the Deep Potential Molecular Dynamics (DeePMD) method, based on a many-body potential and interatomic forces generated by a carefully crafted deep neural network trained with ab initio data. The neural network model preserves all the natural symmetries in the problem. It is "first principle-based" in the sense that there are no ad hoc components aside from the network model. We show that the proposed scheme provides an efficient and accurate protocol in a variety of systems, including bulk materials and molecules. In all these cases, DeePMD gives results that are essentially indistinguishable from the original data, at a cost that scales linearly with system size.
- Published
- 2017
- Full Text
- View/download PDF
33. Solving high-dimensional partial differential equations using deep learning
- Author
- Han, Jiequn, Jentzen, Arnulf, and E, Weinan
- Abstract
Developing algorithms for solving high-dimensional partial differential equations (PDEs) has been an exceedingly difficult task for a long time, due to the notoriously difficult problem known as the "curse of dimensionality". This paper introduces a deep learning-based approach that can handle general high-dimensional parabolic PDEs. To this end, the PDEs are reformulated using backward stochastic differential equations and the gradient of the unknown solution is approximated by neural networks, very much in the spirit of deep reinforcement learning with the gradient acting as the policy function. Numerical results on examples including the nonlinear Black-Scholes equation, the Hamilton-Jacobi-Bellman equation, and the Allen-Cahn equation suggest that the proposed algorithm is quite effective in high dimensions, in terms of both accuracy and cost. This opens up new possibilities in economics, finance, operational research, and physics, by considering all participating agents, assets, resources, or particles together at the same time, instead of making ad hoc assumptions on their inter-relationships., Comment: 13 pages, 6 figures
- Published
- 2017
- Full Text
- View/download PDF
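A minimal sketch of the deep BSDE idea from entry 33 for a semilinear parabolic PDE $u_t + \tfrac{1}{2}\Delta u + f(u) = 0$ with terminal condition $u(T,\cdot) = g$: the value at the start point and the gradient along the path are parameterized by trainable objects, the BSDE is rolled forward, and the mismatch with the terminal condition is penalized. The choices of g, f, and all hyper-parameters are toy assumptions, not the paper's examples, and the paper uses one sub-network per time step rather than a single time-dependent network.

    import torch

    d, n_paths, n_steps, T = 10, 512, 20, 1.0
    dt = T / n_steps
    g = lambda x: torch.log(0.5 * (1 + (x ** 2).sum(1, keepdim=True)))  # terminal condition
    f = lambda y: -y ** 2                                               # toy nonlinearity

    y0 = torch.nn.Parameter(torch.zeros(1))             # u(0, x0), learned
    z_net = torch.nn.Sequential(torch.nn.Linear(d + 1, 64), torch.nn.Tanh(),
                                torch.nn.Linear(64, d))  # approximates grad u(t, x)
    opt = torch.optim.Adam([y0] + list(z_net.parameters()), lr=1e-3)

    for it in range(500):
        x = torch.zeros(n_paths, d)
        y = y0.expand(n_paths, 1)
        for k in range(n_steps):
            t = torch.full((n_paths, 1), k * dt)
            z = z_net(torch.cat([t, x], dim=1))
            dw = (dt ** 0.5) * torch.randn(n_paths, d)
            y = y - f(y) * dt + (z * dw).sum(1, keepdim=True)   # BSDE step
            x = x + dw                                          # forward SDE step (sigma = identity)
        loss = ((y - g(x)) ** 2).mean()                # terminal-condition mismatch
        opt.zero_grad()
        loss.backward()
        opt.step()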
34. Deep learning-based numerical methods for high-dimensional parabolic partial differential equations and backward stochastic differential equations
- Author
- E, Weinan, Han, Jiequn, and Jentzen, Arnulf
- Abstract
We propose a new algorithm for solving parabolic partial differential equations (PDEs) and backward stochastic differential equations (BSDEs) in high dimension, by making an analogy between the BSDE and reinforcement learning with the gradient of the solution playing the role of the policy function, and the loss function given by the error between the prescribed terminal condition and the solution of the BSDE. The policy function is then approximated by a neural network, as is done in deep reinforcement learning. Numerical results using TensorFlow illustrate the efficiency and accuracy of the proposed algorithms for several 100-dimensional nonlinear PDEs from physics and finance such as the Allen-Cahn equation, the Hamilton-Jacobi-Bellman equation, and a nonlinear pricing model for financial derivatives., Comment: 39 pages, 15 figures
- Published
- 2017
- Full Text
- View/download PDF
35. DeePMD-kit: A deep learning package for many-body potential energy representation and molecular dynamics
- Author
- Wang, Han, Zhang, Linfeng, Han, Jiequn, and E, Weinan
- Abstract
Recent developments in many-body potential energy representation via deep learning have brought new hopes to addressing the accuracy-versus-efficiency dilemma in molecular simulations. Here we describe DeePMD-kit, a package written in Python/C++ that has been designed to minimize the effort required to build deep learning-based representations of the potential energy and force field and to perform molecular dynamics. Potential applications of DeePMD-kit span from finite molecules to extended systems and from metallic systems to chemically bonded systems. DeePMD-kit is interfaced with TensorFlow, one of the most popular deep learning frameworks, making the training process highly automatic and efficient. On the other hand, DeePMD-kit is interfaced with high-performance classical molecular dynamics and quantum (path-integral) molecular dynamics packages, i.e., LAMMPS and i-PI, respectively. Thus, upon training, the potential energy and force field models can be used to perform efficient molecular simulations for different purposes. As an example of the many potential applications of the package, we use DeePMD-kit to learn the interatomic potential energy and forces of a water model using data obtained from density functional theory. We demonstrate that the resulting molecular dynamics model accurately reproduces the structural information contained in the original model.
- Published
- 2017
- Full Text
- View/download PDF
36. Deep Learning Approximation for Stochastic Control Problems
- Author
- Han, Jiequn and E, Weinan
- Abstract
Many real world stochastic control problems suffer from the "curse of dimensionality". To overcome this difficulty, we develop a deep learning approach that directly solves high-dimensional stochastic control problems based on Monte-Carlo sampling. We approximate the time-dependent controls as feedforward neural networks and stack these networks together through model dynamics. The objective function for the control problem plays the role of the loss function for the deep neural network. We test this approach using examples from the areas of optimal trading and energy storage. Our results suggest that the algorithm presented here achieves satisfactory accuracy and at the same time, can handle rather high dimensional problems.
- Published
- 2016
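The construction described in entry 36 (one feedforward network per time step, stacked through the model dynamics, with the control objective as the loss) can be sketched on a toy problem; the dynamics, cost, and network sizes below are illustrative assumptions, not the paper's trading or energy-storage examples.

    import torch

    dim, n_paths, n_steps, dt = 2, 256, 20, 0.05
    controls = torch.nn.ModuleList(
        torch.nn.Sequential(torch.nn.Linear(dim, 32), torch.nn.ReLU(),
                            torch.nn.Linear(32, dim))
        for _ in range(n_steps))                      # one control network per time step
    opt = torch.optim.Adam(controls.parameters(), lr=1e-3)

    for it in range(300):
        x = torch.randn(n_paths, dim)
        cost = torch.zeros(())
        for k in range(n_steps):
            u = controls[k](x)
            cost = cost + ((x ** 2).sum(1) + (u ** 2).sum(1)).mean() * dt
            x = x + u * dt + (dt ** 0.5) * 0.1 * torch.randn_like(x)
        cost = cost + (x ** 2).sum(1).mean()          # terminal cost
        opt.zero_grad()
        cost.backward()
        opt.step()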
37. Income and wealth distribution in macroeconomics: a continuous-time approach
- Author
- Achdou, Yves, Han, Jiequn, Lasry, Jean Michel, Lions, Pierre Louis, and Moll, Ben
- Abstract
We recast the Aiyagari–Bewley–Huggett model of income and wealth distribution in continuous time. This workhorse model—as well as heterogeneous agent models more generally—then boils down to a system of partial differential equations, a fact we take advantage of to make two types of contributions. First, a number of new theoretical results: (1) an analytic characterization of the consumption and saving behaviour of the poor, particularly their marginal propensities to consume; (2) a closed-form solution for the wealth distribution in a special case with two income types; (3) a proof that there is a unique stationary equilibrium if the intertemporal elasticity of substitution is weakly greater than one. Second, we develop a simple, efficient and portable algorithm for numerically solving for equilibria in a wide class of heterogeneous agent models, including—but not limited to—the Aiyagari–Bewley–Huggett model.
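For reference, in the two-income-type case the system that this class of models boils down to is a stationary Hamilton-Jacobi-Bellman equation coupled with a Kolmogorov forward equation; in notation close to (but not necessarily identical with) the paper's,

    $\rho v_j(a) = \max_{c} \; u(c) + v_j'(a)\,(z_j + r a - c) + \lambda_j\,(v_{-j}(a) - v_j(a)), \qquad j = 1, 2,$

    $0 = -\frac{d}{da}\,[\, s_j(a)\, g_j(a) \,] - \lambda_j\, g_j(a) + \lambda_{-j}\, g_{-j}(a),$

where $s_j(a) = z_j + r a - c_j(a)$ is the optimal saving policy and the interest rate $r$ is pinned down by a market-clearing condition (e.g., $\int a\,(g_1(a) + g_2(a))\, da = 0$ for a bond in zero net supply in the Huggett version).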