Database: Academic Search Index / Publisher: mdpi / Search Limiters: Peer Reviewed / Topic: 4 selected - Searchworks@Jio Institute Digital Library Search Results

Showing total 3 results

Search Limiters: Peer Reviewed / Topic: algorithms, artificial intelligence, deep learning, reinforcement learning / Database: Academic Search Index / Publisher: mdpi

1. An Improved Distributed Sampling PPO Algorithm Based on Beta Policy for Continuous Global Path Planning Scheme.

Authors: Xiao, Qianhao; Jiang, Li; Wang, Manman; Zhang, Xin
Subjects: *DISTRIBUTED algorithms, *REINFORCEMENT learning, *NAVIGATION in shipping, *ALGORITHMS
Abstract: Traditional path planning is mainly utilized for path planning in discrete action space, which results in incomplete ship navigation power propulsion strategies during the path search process. Moreover, reinforcement learning experiences low success rates due to its unbalanced sample collection and unreasonable design of reward function. In this paper, an environment framework is designed, which is constructed using the Box2D physics engine and employs a reward function, with the distance between the agent and arrival point as the main, and the potential field superimposed by boundary control, obstacles, and arrival point as the supplement. We also employ the state-of-the-art PPO (Proximal Policy Optimization) algorithm as a baseline for global path planning to address the issue of incomplete ship navigation power propulsion strategy. Additionally, a Beta policy-based distributed sample collection PPO algorithm is proposed to overcome the problem of unbalanced sample collection in path planning by dividing sub-regions to achieve distributed sample collection. The experimental results show the following: (1) The distributed sample collection training policy exhibits stronger robustness in the PPO algorithm; (2) The introduced Beta policy for action sampling results in a higher path planning success rate and reward accumulation than the Gaussian policy at the same training time; (3) When planning a path of the same length, the proposed Beta policy-based distributed sample collection PPO algorithm generates a smoother path than traditional path planning algorithms, such as A*, IDA*, and Dijkstra. [ABSTRACT FROM AUTHOR]
Published: 2023
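Note on result 1: the abstract contrasts a Beta policy with a Gaussian policy for action sampling. The sketch below illustrates one commonly cited reason the choice can matter for bounded continuous actions: a Beta sample lies in [0, 1] and can be rescaled to the action range, whereas a Gaussian sample has unbounded support and typically has to be clipped. This is not the authors' implementation. The function names, the action bounds [-1, 1], and the distribution parameters are hypothetical, and only Python's standard library is used.

```python
import random


def sample_beta_action(alpha, beta, low, high, rng):
    """Draw x ~ Beta(alpha, beta) on [0, 1] and rescale it to [low, high]."""
    x = rng.betavariate(alpha, beta)
    return low + (high - low) * x


def sample_clipped_gaussian_action(mean, std, low, high, rng):
    """Draw from an unbounded Gaussian, then clip into [low, high]."""
    raw = rng.gauss(mean, std)
    return min(max(raw, low), high)


def boundary_fraction(samples, low, high):
    """Fraction of samples sitting exactly on an action bound."""
    hits = sum(1 for s in samples if s == low or s == high)
    return hits / len(samples)


# Hypothetical action range and distribution parameters, for illustration only.
LOW, HIGH = -1.0, 1.0
rng = random.Random(0)

beta_actions = [
    sample_beta_action(2.0, 5.0, LOW, HIGH, rng) for _ in range(10_000)
]
gauss_actions = [
    sample_clipped_gaussian_action(0.8, 0.5, LOW, HIGH, rng)
    for _ in range(10_000)
]

print("Beta samples on a bound:    ", boundary_fraction(beta_actions, LOW, HIGH))
print("Gaussian samples on a bound:", boundary_fraction(gauss_actions, LOW, HIGH))
```

In a PPO setting the two shape parameters would be outputs of the policy network rather than fixed constants; that part is omitted here.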

2. A Multi-Agent Reinforcement Learning Approach to Price and Comfort Optimization in HVAC-Systems.

Authors: Blad, Christian; Bøgh, Simon; Kallesøe, Carsten
Subjects: *REINFORCEMENT learning, *ARTIFICIAL intelligence, *HEATING control, *HEAT pumps, *DEEP learning, *ALGORITHMS
Abstract: This paper addresses the challenge of minimizing training time for the control of Heating, Ventilation, and Air-conditioning (HVAC) systems with online Reinforcement Learning (RL). This is done by developing a novel approach to Multi-Agent Reinforcement Learning (MARL) to HVAC systems. In this paper, the environment formed by the HVAC system is formulated as a Markov Game (MG) in a general sum setting. The MARL algorithm is designed in a decentralized structure, where only relevant states are shared between agents, and actions are shared in a sequence, which are sensible from a system's point of view. The simulation environment is a domestic house located in Denmark and designed to resemble an average house. The heat source in the house is an air-to-water heat pump, and the HVAC system is an Underfloor Heating system (UFH). The house is subjected to weather changes from a data set collected in Copenhagen in 2006, spanning the entire year except for June, July, and August, where heat is not required. It is shown that: (1) When comparing Single Agent Reinforcement Learning (SARL) and MARL, training time can be reduced by 70% for a four temperature-zone UFH system, (2) the agent can learn and generalize over seasons, (3) the cost of heating can be reduced by 19% or the equivalent to 750 kWh of electric energy per year for an average Danish domestic house compared to a traditional control method, and (4) oscillations in the room temperature can be reduced by 40% when comparing the RL control methods with a traditional control method. [ABSTRACT FROM AUTHOR]
Published: 2021

3. Deep-Reinforcement-Learning-Based Two-Timescale Voltage Control for Distribution Systems.

Authors: Zhang, Jing; Li, Yiqi; Wu, Zhi; Rong, Chunyan; Wang, Tao; Zhang, Zhang; Zhou, Suyang
Subjects: *VOLTAGE control, *ARTIFICIAL intelligence, *DEEP learning, *REINFORCEMENT learning, *ALGORITHMS, *VOLTAGE
Abstract: Because of the high penetration of renewable energies and the installation of new control devices, modern distribution networks are faced with voltage regulation challenges. Recently, the rapid development of artificial intelligence technology has introduced new solutions for optimal control problems with high dimensions and dynamics. In this paper, a deep reinforcement learning method is proposed to solve the two-timescale optimal voltage control problem. All control variables are assigned to different agents, and discrete variables are solved by a deep Q network (DQN) agent while the continuous variables are solved by a deep deterministic policy gradient (DDPG) agent. All agents are trained simultaneously with specially designed reward aiming at minimizing long-term average voltage deviation. Case study is executed on a modified IEEE-123 bus system, and the results demonstrate that the proposed algorithm has similar or even better performance than the model-based optimal control scheme and has high computational efficiency and competitive potential for online application. [ABSTRACT FROM AUTHOR]
Published: 2021
