11 results for "Fairbank, Michael"
Search Results
2. Deep Learning in Target Space.
- Author
- Fairbank, Michael, Samothrakis, Spyridon, and Citi, Luca
- Subjects
- DEEP learning; WEIGHT training; GENERALIZATION
- Abstract
Deep learning uses neural networks which are parameterised by their weights. The neural networks are usually trained by tuning the weights to directly minimise a given loss function. In this paper we propose to re-parameterise the weights into targets for the firing strengths of the individual nodes in the network. Given a set of targets, it is possible to calculate the weights which make the firing strengths best meet those targets. It is argued that using targets for training addresses the problem of exploding gradients, by a process which we call cascade untangling, and makes the loss-function surface smoother to traverse, and so leads to easier, faster training, and also potentially better generalisation, of the neural network. It also allows for easier learning of deeper and recurrent network structures. The necessary conversion of targets to weights comes at an extra computational expense, which is in many cases manageable. Learning in target space can be combined with existing neural-network optimisers, for extra gain. Experimental results show the speed of using target space, and examples of improved generalisation, for fully-connected networks and convolutional networks, and the ability to recall and process long time sequences and perform natural-language processing with recurrent networks. [ABSTRACT FROM AUTHOR]
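The targets-to-weights conversion this abstract describes can be sketched for a single layer as a least-squares problem (an illustrative NumPy toy with all shapes and names assumed; this is not the paper's implementation):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(32, 5))   # batch of 32 inputs with 5 features
T = rng.normal(size=(32, 3))   # targets for the firing strengths of 3 nodes

# "Targets to weights" conversion: solve min_W ||X @ W - T||^2 by least
# squares, i.e. find the weights that best meet the given targets.
W, *_ = np.linalg.lstsq(X, T, rcond=None)

residual = float(np.linalg.norm(X @ W - T))  # how closely the targets are met
print(W.shape, round(residual, 3))
```

The extra computational expense mentioned in the abstract corresponds to this per-layer solve, which replaces a direct gradient step on the weights.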
- Published
- 2022
3. Back to optimality: a formal framework to express the dynamics of learning optimal behavior.
- Author
- Alonso, Eduardo, Fairbank, Michael, and Mondragón, Esther
- Subjects
- REINFORCEMENT learning; ALGORITHMS; CONTROL theory (Engineering); MATHEMATICAL models; MATHEMATICAL optimization
- Abstract
Whether animals behave optimally is an open question of great importance, both theoretically and in practice. Attempts to answer this question focus on two aspects of the optimization problem, the quantity to be optimized and the optimization process itself. In this paper, we assume the abstract concept of cost as the quantity to be minimized and propose a reinforcement learning algorithm, called Value-Gradient Learning (VGL), as a computational model of behavior optimality. We prove that, unlike standard models of Reinforcement Learning, Temporal Difference in particular, VGL is guaranteed to converge to optimality under certain conditions. The core of the proof is the mathematical equivalence of VGL and Pontryagin’s Minimum Principle, a well-known optimization technique in systems and control theory. Given the similarity between VGL’s formulation and regulatory models of behavior, we argue that our algorithm may provide psychologists with a tool to formulate such models in optimization terms. [ABSTRACT FROM AUTHOR]
- Published
- 2015
4. An adaptive recurrent neural-network controller using a stabilization matrix and predictive inputs to solve a tracking problem under disturbances.
- Author
- Fairbank, Michael, Li, Shuhui, Fu, Xingang, Alonso, Eduardo, and Wunsch, Donald
- Subjects
- RECURRENT neural networks; CONTROLLER area network (Computer network); ERROR analysis in mathematics; ARTIFICIAL satellite tracking; GRID computing; INTEGRATORS; STOCHASTIC convergence
- Abstract
We present a recurrent neural-network (RNN) controller designed to solve the tracking problem for control systems. We demonstrate that a major difficulty in training any RNN is the problem of exploding gradients, and we propose a solution to this in the case of tracking problems, by introducing a stabilization matrix and by using carefully constrained context units. This solution allows us to achieve consistently lower training errors, and hence allows us to more easily introduce adaptive capabilities. The resulting RNN is one that has been trained off-line to be rapidly adaptive to changing plant conditions and changing tracking targets. The case study we use is a renewable-energy generator application; that of producing an efficient controller for a three-phase grid-connected converter. The controller we produce can cope with the random variation of system parameters and fluctuating grid voltages. It produces tracking control with almost instantaneous response to changing reference states, and virtually zero oscillation. This compares very favorably to the classical proportional integrator (PI) controllers, which we show produce a much slower response and settling time. In addition, the RNN we propose exhibits better learning stability and convergence properties, and can exhibit faster adaptation, than has been achieved with adaptive critic designs. [Copyright Elsevier]
- Published
- 2014
5. Clipping in Neurocontrol by Adaptive Dynamic Programming.
- Author
- Fairbank, Michael, Prokhorov, Danil, and Alonso, Eduardo
- Subjects
- DYNAMIC programming; REINFORCEMENT learning; HEURISTIC programming; MACHINE learning; BACK propagation
- Abstract
In adaptive dynamic programming, neurocontrol, and reinforcement learning, the objective is for an agent to learn to choose actions so as to minimize a total cost function. In this paper, we show that when discretized time is used to model the motion of the agent, it can be very important to do clipping on the motion of the agent in the final time step of the trajectory. By clipping, we mean that the final time step of the trajectory is to be truncated such that the agent stops exactly at the first terminal state reached, and no distance further. We demonstrate that when clipping is omitted, learning performance can fail to reach the optimum, and when clipping is done properly, learning performance can improve significantly. The clipping problem we describe affects algorithms that use explicit derivatives of the model functions of the environment to calculate a learning gradient. These include backpropagation through time for control and methods based on dual heuristic programming. However, the clipping problem does not significantly affect methods based on heuristic dynamic programming, temporal differences learning, or policy-gradient learning algorithms. [ABSTRACT FROM AUTHOR]
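The clipping idea this abstract describes can be sketched in a few lines (a minimal one-dimensional toy with assumed names and dynamics, not the paper's code): the final discretised step is truncated so the agent stops exactly at the terminal state, and the final step's cost is scaled by the fraction of the time step actually used.

```python
def step_with_clipping(x, v, dt, terminal_x, cost_rate):
    """Advance position x by velocity v for time dt, clipping at terminal_x."""
    x_next = x + v * dt
    if x < terminal_x <= x_next:       # terminal state crossed mid-step
        tau = (terminal_x - x) / v     # fraction of dt actually used
        return terminal_x, cost_rate * tau, True
    return x_next, cost_rate * dt, False

# Run a trajectory toward the terminal state at x = 1.0.
x, total_cost, done = 0.0, 0.0, False
while not done:
    x, c, done = step_with_clipping(x, v=0.3, dt=1.0, terminal_x=1.0, cost_rate=2.0)
    total_cost += c

print(x, total_cost)  # stops exactly at 1.0; cost charged for 10/3 time units
```

Without the clipping branch, the last step would overshoot to x = 1.2 and charge a full step's cost, which is exactly the kind of end-of-trajectory error the paper argues can bias gradient-based learning.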
- Published
- 2014
6. Efficient Calculation of the Gauss-Newton Approximation of the Hessian Matrix in Neural Networks.
- Author
- Fairbank, Michael, Alonso, Eduardo, and Schraudolph, Nicol
- Subjects
- MATRICES (Mathematics); GAUSS-Newton method; APPROXIMATION theory; BIOLOGICAL neural networks; MACHINE learning; COMPUTATIONAL neuroscience; MULTIPLICATION
- Abstract
The Levenberg-Marquardt (LM) learning algorithm is a popular algorithm for training neural networks; however, for large neural networks, it becomes prohibitively expensive in terms of running time and memory requirements. The most time-critical step of the algorithm is the calculation of the Gauss-Newton matrix, which is formed by multiplying two large Jacobian matrices together. We propose a method that uses back-propagation to reduce the time of this matrix-matrix multiplication. This reduces the overall asymptotic running time of the LM algorithm by a factor of the order of the number of output nodes in the neural network. [ABSTRACT FROM AUTHOR]
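To illustrate the structure involved (an assumed NumPy toy, not the paper's algorithm): for a sum-of-squares loss the Gauss-Newton matrix is G = JᵀJ, where J is the Jacobian of network outputs with respect to weights, and G can be applied to a vector via two matrix-vector products instead of forming the full matrix product — the kind of rearrangement that backpropagation-style passes make cheap.

```python
import numpy as np

rng = np.random.default_rng(1)
n_outputs, n_weights = 4, 10
J = rng.normal(size=(n_outputs, n_weights))  # stand-in Jacobian
v = rng.normal(size=n_weights)

G_explicit = J.T @ J            # explicit Jacobian-Jacobian multiplication
Gv_matrix_free = J.T @ (J @ v)  # same result via two matrix-vector products

print(np.allclose(G_explicit @ v, Gv_matrix_free))
```

Forming `G_explicit` costs on the order of n_outputs × n_weights², whereas each matrix-free product costs only n_outputs × n_weights, which is why avoiding the explicit multiplication matters for large networks.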
- Published
- 2012
7. Artificial Neural Networks for Control of a Grid-Connected Rectifier/Inverter Under Disturbance, Dynamic and Power Converter Switching Conditions.
- Author
- Li, Shuhui, Fairbank, Michael, Johnson, Cameron, Wunsch, Donald C., Alonso, Eduardo, and Proaño, Julio L.
- Subjects
- ARTIFICIAL neural networks; ELECTRIC inverters; ELECTRIC power systems; VECTOR control; DYNAMICAL systems; DYNAMIC programming
- Abstract
Three-phase grid-connected converters are widely used in renewable and electric power system applications. Traditionally, grid-connected converters are controlled with standard decoupled d-q vector control mechanisms. However, recent studies indicate that such mechanisms show limitations in their applicability to dynamic systems. This paper investigates how to mitigate such restrictions using a neural network to control a grid-connected rectifier/inverter. The neural network implements a dynamic programming algorithm and is trained by using backpropagation through time. To enhance performance and stability under disturbance, additional strategies are adopted, including the use of integrals of error signals to the network inputs and the introduction of grid disturbance voltage to the outputs of a well-trained network. The performance of the neural-network controller is studied under typical vector control conditions and compared against conventional vector control methods, which demonstrates that the neural vector control strategy proposed in this paper is effective. Even in dynamic and power converter switching environments, the neural vector controller shows strong ability to trace rapidly changing reference commands, tolerate system disturbances, and satisfy control requirements for a faulted power system. [ABSTRACT FROM AUTHOR]
- Published
- 2014
8. An Equivalence Between Adaptive Dynamic Programming With a Critic and Backpropagation Through Time.
- Author
- Fairbank, Michael, Alonso, Eduardo, and Prokhorov, Danil
- Subjects
- DYNAMIC programming; BACK propagation; HEURISTIC programming; ARTIFICIAL neural networks; APPROXIMATION algorithms; MATHEMATICAL optimization
- Abstract
We consider the adaptive dynamic programming technique called Dual Heuristic Programming (DHP), which is designed to learn a critic function, when using learned model functions of the environment. DHP is designed for optimizing control problems in large and continuous state spaces. We extend DHP into a new algorithm that we call Value-Gradient Learning, VGL(λ), and prove equivalence of an instance of the new algorithm to Backpropagation Through Time for Control with a greedy policy. Not only does this equivalence provide a link between these two different approaches, but it also enables our variant of DHP to have guaranteed convergence, under certain smoothness conditions and a greedy policy, when using a general smooth nonlinear function approximator for the critic. We consider several experimental scenarios including some that prove divergence of DHP under a greedy policy, which contrasts against our proven-convergent algorithm. [ABSTRACT FROM PUBLISHER]
- Published
- 2013
9. Shooting Tigers.
- Author
- Fairbank, Michael
- Subjects
- LETTERS to the editor; TIGERS
- Abstract
A letter to the editor in response to an article by Susie Green about tigers is presented.
- Published
- 2006
10. Training Recurrent Neural Networks With the Levenberg–Marquardt Algorithm for Optimal Control of a Grid-Connected Converter.
- Author
- Fu, Xingang, Li, Shuhui, Fairbank, Michael, Wunsch, Donald C., and Alonso, Eduardo
- Subjects
- NEURAL circuitry; ARTIFICIAL neural networks; ARTIFICIAL intelligence; MULTILAYER perceptrons; NEURAL chips
- Abstract
This paper investigates how to train a recurrent neural network (RNN) using the Levenberg–Marquardt (LM) algorithm as well as how to implement optimal control of a grid-connected converter (GCC) using an RNN. To successfully and efficiently train an RNN using the LM algorithm, a new forward accumulation through time (FATT) algorithm is proposed to calculate the Jacobian matrix required by the LM algorithm. This paper explores how to incorporate FATT into the LM algorithm. The results show that the combination of the LM and FATT algorithms trains RNNs better than the conventional backpropagation through time algorithm. This paper presents an analytical study on the optimal control of GCCs, including theoretically ideal optimal and suboptimal controllers. To overcome the inapplicability of the optimal GCC controller under practical conditions, a new RNN controller with an improved input structure is proposed to approximate the ideal optimal controller. The performance of an ideal optimal controller and a well-trained RNN controller was compared in close to real-life power converter switching environments, demonstrating that the proposed RNN controller can achieve close to ideal optimal control performance even under low sampling rate conditions. The excellent performance of the proposed RNN controller under challenging and distorted system conditions further indicates the feasibility of using an RNN to approximate optimal control in practical applications. [ABSTRACT FROM AUTHOR]
- Published
- 2015
11. Control of a Buck DC/DC Converter Using Approximate Dynamic Programming and Artificial Neural Networks.
- Author
- Dong, Weizhen, Li, Shuhui, Fu, Xingang, Li, Zhongwen, Fairbank, Michael, and Gao, Yixiang
- Subjects
- ARTIFICIAL neural networks; DYNAMIC programming; RECURRENT neural networks; PREDICTIVE control systems; ONLINE education; COST functions; TIME perspective
- Abstract
This paper proposes a novel artificial neural network (ANN) based control method for a dc/dc buck converter. The ANN is trained to implement optimal control based on approximate dynamic programming (ADP). Special characteristics of the proposed ANN control include: 1) The inputs to the ANN contain error signals and integrals of the error signals, enabling the ANN to have PI control ability; 2) The ANN receives voltage feedback signals from the dc/dc converter, making the combined system equivalent to a recurrent neural network; 3) The ANN is trained to minimize a cost function over a long time horizon, making the ANN have a stronger predictive control ability than a conventional predictive controller; 4) The ANN is trained offline, preventing the instability of the network caused by weight adjustments of an on-line training algorithm. The ANN performance is evaluated through simulation and hardware experiments and compared with conventional control methods, which shows that the ANN controller has a strong ability to track rapidly changing reference commands, maintain stable output voltage for a variable load, and manage maximum duty-ratio and current constraints properly. [ABSTRACT FROM AUTHOR]
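Characteristic 1) above — feeding the network both the error signal and its running integral to give PI-like ability — can be sketched as follows (an assumed toy input-construction step, not the paper's controller):

```python
import numpy as np

def controller_inputs(reference, measurement, integral_state, dt):
    """Build the ANN input vector from the error signal and its integral."""
    error = reference - measurement
    integral_state = integral_state + error * dt  # discrete-time integral
    return np.concatenate([error, integral_state]), integral_state

ref = np.array([1.0])     # reference output voltage
meas = np.array([0.2])    # measured output voltage
integ = np.array([0.0])   # accumulated integral of the error

x, integ = controller_inputs(ref, meas, integ, dt=0.01)
print(x)  # the error followed by its integral
```

Because the integral state persists between calls, a proportional-plus-integral response is available to the network without the network itself having to learn to accumulate the error.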
- Published
- 2021
Discovery Service for Jio Institute Digital Library