503 results
Search Results
2. Guest Editorial: Special issue on computational methods and artificial intelligence applications in low‐carbon energy systems.
- Author
Wang, Yishen, Zhou, Fei, Guerrero, Josep M., Baker, Kyri, Chen, Yize, Wang, Hao, Xu, Bolun, Xu, Qianwen, Zhu, Hong, and Agwan, Utkarsha
- Subjects
ARTIFICIAL intelligence, ARTIFICIAL neural networks, MACHINE learning, REINFORCEMENT learning, DEEP reinforcement learning, DEEP learning
- Abstract
This document is a guest editorial for a special issue on computational methods and artificial intelligence applications in low-carbon energy systems. The editorial highlights the urgent need for advanced computing and artificial intelligence in the clean energy transition to improve system reliability, economics, and sustainability. The special issue includes 19 original research articles covering topics such as energy forecasting, situational awareness, multi-energy system dispatch, and power system operation. The articles present state-of-the-art methods and techniques in these areas, including wind power forecasting, demand-side flexibility, fault diagnosis of photovoltaic strings, and energy management strategies. The authors express their gratitude to the participating authors and anonymous reviewers for their contributions to the special section. [Extracted from the article]
- Published
- 2024
- Full Text
- View/download PDF
3. Editorial for the Special Issue "Data Science and Big Data in Biology, Physical Science and Engineering".
- Author
Mahmoud, Mohammed
- Subjects
PHYSICAL sciences, BIG data, DEEP learning, ARTIFICIAL neural networks, DATA science, MACHINE learning, REINFORCEMENT learning
- Abstract
This document is an editorial for a special issue of the journal "Technologies" focused on data science and big data in various fields such as biology, physical science, and engineering. The editorial highlights the importance of analyzing large amounts of data generated by digital technologies and the need for data scientists to use artificial intelligence and machine learning to extract valuable knowledge. The special issue includes 12 papers covering topics such as machine learning techniques for customer churn prediction, agile program management in the U.S. Navy, deep learning for cybersecurity in Industry 5.0, self-directed learning during the COVID-19 era, decision tree-based neural networks for data classification, data-driven governance in technology companies, and more. The papers explore different approaches, models, and tools in the context of data science and big data. [Extracted from the article]
- Published
- 2024
- Full Text
- View/download PDF
4. Guest Editorial on "Computational intelligence in analysis and integration of complex systems".
- Author
Zhao, Bo, Zeng, Wenyi, Gao, Weinan, and Zhang, Qichao
- Subjects
COMPUTATIONAL intelligence, SYSTEM integration, MULTIAGENT systems, DIFFERENTIAL evolution, REINFORCEMENT learning, MANIPULATORS (Machinery), CONVOLUTIONAL neural networks, ARTIFICIAL neural networks
- Abstract
In the third paper, "Deep transfer learning: a novel glucose prediction framework for new subjects with Type 2 diabetes", the authors designed a novel cross-subject glucose prediction framework by integrating instance-based and network-based deep transfer learning via segmented continuous glucose monitoring time series. Control and optimization: six papers have been devoted to computational intelligence-based decision-making and analysis of complex systems. Complex systems, which are composed of many interconnected and interactive functional parts, widely exist in nature and human society. [Extracted from the article]
- Published
- 2022
- Full Text
- View/download PDF
5. Intelligent maneuver strategy for hypersonic vehicles in three-player pursuit-evasion games via deep reinforcement learning.
- Author
Tian Yan, Zijian Jiang, Tong Li, Mengjing Gao, and Can Liu
- Subjects
REINFORCEMENT learning, DEEP reinforcement learning, ARTIFICIAL neural networks, REWARD (Psychology), MARKOV processes, SIGNAL convolution
- Abstract
Aiming at the rapid development of anti-hypersonic collaborative interception technology, this paper designs an intelligent maneuver strategy for hypersonic vehicles (HV) based on deep reinforcement learning (DRL) to evade collaborative interception by two interceptors. Under the meticulously designed collaborative interception strategy, the uncertainty and difficulty of evasion are significantly increased and the opportunity for maneuvers is further compressed. This paper accordingly selects the twin delayed deep deterministic policy gradient (TD3) strategy, which acts on a continuous action space, and makes targeted improvements combining deep neural networks to learn the maneuver strategy and achieve successful evasion. Focusing on the time-coordinated interception strategy of two interceptors, the three-player pursuit-evasion (PE) problem is modeled as a Markov decision process, and a double training strategy is proposed to handle both interceptors. In the reward functions of the training process, an energy-saving factor is set to achieve a trade-off between miss distance and energy consumption. In addition, a regression neural network is introduced into the deep neural network of TD3 to enhance the generalization of the intelligent maneuver strategies. Finally, numerical simulations verify that the improved TD3 algorithm can effectively evade the collaborative interception of two interceptors under tough situations, and the algorithm's improvements in convergence speed, generalization, and energy-saving effect are verified. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
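The core of the TD3 algorithm named in the abstract above is its clipped double-Q critic target with target policy smoothing. Below is a minimal NumPy sketch of that target computation; the function names and toy critics are illustrative assumptions, not taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

def td3_target(reward, next_state, q1_target, q2_target, policy_target,
               gamma=0.99, noise_std=0.2, noise_clip=0.5, act_limit=1.0):
    """TD3 critic target: clipped double-Q with target policy smoothing."""
    # Perturb the target policy's action with clipped Gaussian noise (smoothing).
    noise = np.clip(rng.normal(0.0, noise_std), -noise_clip, noise_clip)
    next_action = np.clip(policy_target(next_state) + noise, -act_limit, act_limit)
    # Take the minimum of the two target critics to curb overestimation bias.
    q_min = min(q1_target(next_state, next_action), q2_target(next_state, next_action))
    return reward + gamma * q_min
```

With constant toy critics `q1 = 1.0`, `q2 = 2.0` and `gamma = 0.5`, the target for a reward of 1 is `1 + 0.5 * min(1, 2) = 1.5`, which shows the minimum operator at work.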
6. Advances in Machine Learning.
- Author
Yang, Jihoon and Park, Unsang
- Subjects
MACHINE learning, BOOSTING algorithms, DEEP learning, ARTIFICIAL neural networks, REINFORCEMENT learning, COGNITIVE science, ARTIFICIAL intelligence
- Abstract
Several deep learning models are presented: Ref. [[4]] designs a network intrusion detection model, DLNID, that combines an attention mechanism, a bidirectional long short-term memory (LSTM), and adaptive synthetic sampling, and demonstrates improved performance on severely imbalanced data. In addition, there are a couple of papers on HPO: Ref. [[11]] introduces a greedy k-fold cross-validation method that vastly reduces the average time required to find the best-performing model with or without a computational budget, and experimentally verifies its improved performance over existing methods. Since its inception as a branch of Artificial Intelligence, Machine Learning (ML) has flourished in recent years. [Extracted from the article]
- Published
- 2022
- Full Text
- View/download PDF
7. Online modeling method for composite load model including EVs and battery storage based on measurement data.
- Author
Yin, Yanhe, Zhong, Yi, He, Yi, Li, Guohao, Li, Zhuohuan, Pan, Shixian, Xiao, Dongliang, Han, Jintao, Zhang, Fei, and Kiranmayi, R.
- Subjects
ELECTRIC charge, STORAGE batteries, DEEP learning, ARTIFICIAL neural networks, REINFORCEMENT learning, DEEP reinforcement learning, MACHINE learning
- Abstract
Load models have a significant influence on power system simulation. However, current load modeling approaches can hardly satisfy the diversity and time-varying characteristics of loads [including electric vehicles (EVs) and battery storage] in terms of model accuracy and computing efficiency. An online modeling method for composite load models based on measurement information is proposed in this paper. Firstly, the dominant factors in load model output are analyzed based on the active subspace of parameter space. Then the clustering algorithm is applied to cluster the large number of underlying loads based on the characteristics of load daily output curves. Finally, the underlying loads are equivalently aggregated from the low voltage levels to the high voltage levels to construct the composite load model. Simulation results obtained based on PSCAD/EMTDC demonstrate that the load model constructed by the proposed approach can accurately reflect the actual load characteristics of a power system. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
8. Application of Multiple Deep Neural Networks to Multi-Solution Synthesis of Linkage Mechanisms.
- Author
Chen, Chiu-Hung
- Subjects
ARTIFICIAL neural networks, REINFORCEMENT learning, INDUSTRIAL design, DIE-casting
- Abstract
This paper studies the problem of linkage-bar synthesis by means of multiple deep neural networks (DNNs), which requires the inverse solution of linkage parameters based on a desired trajectory curve. This problem is highly complex due to the fact that the solution space is nonlinear and may contain multiple solutions, while a good quality of learning cannot be obtained by a single neural network approach. Therefore, this paper proposes employing Fourier descriptors to represent trajectory curves in a systematic and normalized form, developing a multi-solution distribution evaluation by random restart local searches (MDE-RRLS) to examine a better solution-space partitioning scheme, utilizing multiple DNNs to learn subspace regions separately, and creating a multi-facet query (MFQuery) to cooperatively predict multiple solutions. The experiments demonstrate that the proposed approach can obtain better or at least competitive outcomes compared to previous work in the literature. Furthermore, to verify the effectiveness and applicability, this paper investigates the design problem of an industrial six-linkage-bar ladle mechanism used in a die-casting system, and the proposed method can obtain several superior design solutions and offer alternatives in a short period of time when faced with redesign requirements. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
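The abstract above uses Fourier descriptors to put trajectory curves into a "systematic and normalized form". A generic version of that idea can be sketched as follows; the normalization shown here (translation via the DC term, scale via the first harmonic) is a standard scheme and an assumption, not necessarily the paper's exact construction:

```python
import numpy as np

def fourier_descriptors(points, n_harmonics=8):
    """Translation- and scale-normalized Fourier descriptors of a closed 2-D curve.

    points: (N, 2) array of trajectory samples along the closed curve.
    """
    z = points[:, 0] + 1j * points[:, 1]   # encode (x, y) as x + iy
    coeffs = np.fft.fft(z) / len(z)        # complex Fourier coefficients
    coeffs[0] = 0.0                        # drop DC term -> translation invariance
    coeffs = coeffs / np.abs(coeffs[1])    # scale by first harmonic -> size invariance
    # Keep only low-order harmonics (positive and negative frequencies).
    return np.concatenate([coeffs[1:n_harmonics + 1], coeffs[-n_harmonics:]])
```

For a unit circle the first harmonic normalizes to exactly 1, and scaling the curve by any factor leaves the descriptor vector unchanged, which is what makes it a usable input for the DNNs.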
9. Optimizing Performance of Hybrid Electrochemical Energy Storage Systems through Effective Control: A Comprehensive Review.
- Author
Clemente, Alejandro, Arias, Paula, Gevorkov, Levon, Trilla, Lluís, Obrador Rey, Sergi, Roger, Xavier Sanchez, Domínguez-García, José Luis, and Filbà Martínez, Àlber
- Subjects
ENERGY storage, ARTIFICIAL neural networks, REINFORCEMENT learning, BATTERY storage plants, DEEP reinforcement learning, ELECTRIC vehicle batteries, ELECTRIC automobiles
- Abstract
The implementation of energy storage system (ESS) technology with an appropriate control system can enhance the resilience and economic performance of power systems. However, none of the storage options available today can perform at their best in every situation. As a matter of fact, an isolated storage solution's energy and power density, lifespan, cost, and response time are its primary performance constraints. Batteries are the essential energy storage component used in electric mobility, industry, and household applications nowadays. In general, the battery energy storage systems (BESS) currently available on the market are based on a homogeneous type of electrochemical battery. However, a hybrid energy storage system (HESS) based on a mixture of various types of electrochemical batteries can potentially provide a better option for high-performance electric cars, heavy-duty electric vehicles, industry, and residential purposes. A hybrid energy storage system combines two or more electrochemical energy storage systems to provide a more reliable and efficient energy storage solution. At the same time, the integration of multiple energy storage systems in an HESS requires advanced control strategies to ensure optimal performance and longevity of the system. This review paper aims to provide a comprehensive overview of the control systems used in HESSs for a wide range of applications. The various control strategies used in HESSs are surveyed, covering traditional methods such as proportional–integral–derivative (PID) control as well as advanced methods such as model predictive control (MPC), droop control (DC), sliding mode control (SMC), rule-based control (RBC), fuzzy logic control (FLC), and artificial neural network (ANN) control.
The paper also highlights recent developments in HESS control systems, including the use of machine learning techniques such as deep reinforcement learning (DRL) and genetic algorithms (GA). The paper provides not only a description and classification of various control approaches but also a comparison of control strategies from a performance-evaluation point of view. The review concludes by summarizing the key findings and future research directions for HESS control systems, which are directly linked to research on machine learning and on mixing different types of control strategies. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
10. Water level control of nuclear steam generators using intelligent hierarchical autonomous controller.
- Author
Peng, Binsen, Ma, Xintong, Xia, Hong, Lin, Linyu, and Liu, Xiaojing
- Subjects
STEAM generators, DEEP learning, MICROGRIDS, ARTIFICIAL neural networks, DEEP reinforcement learning, REINFORCEMENT learning, WATER levels, MACHINE learning
- Abstract
The challenge of water level control in steam generators, particularly at low power levels, has always been a critical aspect of nuclear power plant operation. To address this issue, this paper introduces an intelligent hierarchical autonomous (IHA) controller. This controller employs a CPI controller as the primary controller for direct water level control, coupled with an agent-based controller optimized through a DRL algorithm. The agent dynamically optimizes the parameters of the CPI controller in real time based on the system's state, resulting in improved control performance. Firstly, new observer information is obtained to capture the accurate state of the system, and a new reward function is constructed to evaluate the status of the system and guide the agent's learning process. Secondly, a deep ResNet with good generalization performance is used as the approximator of the action-value function and policy function. Then, the DDPG algorithm is used to train the agent-based controller, and an advanced controller with good performance is obtained after training. Finally, the popular U-tube steam generator (UTSG) model is used to verify the effectiveness of the algorithm. The results demonstrate that the proposed method achieves rise times of 73.9 s, 13.6 s, and 16.4 s at low, medium, and high power levels, respectively. In particular, at low power levels, the IHA controller can restore the water level to its normal state within 200 s. These performances surpass those of the comparative methods, indicating that the proposed method excels not only in water level tracking but also in anti-interference capability. In essence, the IHA controller can autonomously learn the control strategy and reduce its reliance on an expert system, achieving true autonomous control and delivering excellent control performance. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
11. Dynamic Traffic Management Using Reinforcement Learning.
- Author
Shaikh, Aryaan, Bhalekar, Babasaheb, and Futane, Pravin
- Subjects
REINFORCEMENT learning, ARTIFICIAL neural networks, TRAFFIC signs & signals, TRAFFIC engineering, TRAFFIC flow
- Abstract
Traffic congestion has become a major problem in this rapidly growing world. Everyone operating a vehicle, as well as the traffic police in charge of managing traffic, finds it frustrating to be stuck in heavy traffic. Traditional traffic light controllers use fixed, predetermined timings for traffic flow in each direction at a junction. However, a fixed-time traffic signal controller does not work well in places with uneven traffic. A dynamic traffic control system is therefore required, which regulates the traffic signals in accordance with the volume of traffic. This paper proposes a model that uses reinforcement learning (RL) along with deep neural networks (DNN) to manage signal decisions (signal status) for an environment with the help of Simulation of Urban MObility (SUMO). A simulation of a real-world environment consisting of a four-way crossroad junction with four arriving lanes and four exiting lanes is used to train the agent. The main objective of this research is to construct a model that can independently determine the best course of action, providing better traffic management that decreases the average waiting time, lowers congestion, and yields a smooth flow of traffic. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
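The signal-control idea in the abstract above can be illustrated with a deliberately tiny tabular Q-learning loop; the paper itself trains a DNN agent in SUMO, so the two-phase state/action/reward model below is a simplified stand-in, not the authors' setup:

```python
import random

def train_signal_agent(episodes=2000, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Toy tabular Q-learning for a two-phase traffic signal.

    State: which approach (0 = NS, 1 = EW) currently has the longer queue.
    Action: which approach receives the green phase. Reward: +1 if green
    serves the longer queue, -1 otherwise (a stand-in for negative waiting time).
    """
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in (0, 1) for a in (0, 1)}
    for _ in range(episodes):
        state = rng.randint(0, 1)
        # Epsilon-greedy action selection over the two phases.
        if rng.random() < epsilon:
            action = rng.randint(0, 1)
        else:
            action = max((0, 1), key=lambda a: q[(state, a)])
        reward = 1.0 if action == state else -1.0
        next_state = rng.randint(0, 1)            # traffic demand shifts randomly
        best_next = max(q[(next_state, a)] for a in (0, 1))
        # Standard Q-learning temporal-difference update.
        q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
    return q
```

After training, the greedy policy gives green to whichever approach has the longer queue, which is exactly the adaptive behavior a fixed-time controller lacks.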
12. Recursive Neural Network as a Multiple Input–Multiple Output Speed Controller for Electrical Drive of Three-Mass System.
- Author
Zawirski, Krzysztof, Brock, Stefan, and Nowopolski, Krzysztof
- Subjects
ADAPTIVE control systems, PID controllers, SPEED
- Abstract
Electrical drive systems are commonly applied in mechanisms of precise movement, where high-quality position and speed control is especially valuable. Very often, the mechanical part of these systems reveals resonant properties related to the limited stiffness of the interconnections between subsequent parts of the mechanism. In most cases, this sort of system may be described as a model of several linked masses. If the structure of the mechanical part is known and the corresponding parameters are constant and identified, the demanded control quality may be obtained using a properly tuned ADRC or PID controller equipped with appropriate anti-resonance filtration. However, if the parameters of the mechanical part are variant, adaptive control may be considered as a solution. In this paper, an artificial neural network (ANN) is used as the speed controller, and its training method ensures adaptation to the unknown mechanical parameters. The paper is particularly focused on a three-mass system, which, due to its structure, possesses two resonant frequencies. The unique property of the analyzed system is the application of drive units at both ends of the system, so that the controller can influence the resonant system from both sides. The coordination of the drive units is performed by the aforementioned ANN, whose two outputs affect the drive units independently. The derivation of the mathematical model is followed by its implementation in a computer simulation and finally by evaluation on a dedicated laboratory setup, the construction of which is also presented in the paper. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
13. Special Issue "Neural Network for Traffic Forecasting".
- Author
Jiang, Weiwei
- Subjects
COMPUTER network traffic, TRAFFIC estimation, DEEP reinforcement learning, REINFORCEMENT learning, ARTIFICIAL neural networks
- Abstract
Traffic forecasting is an important research topic in intelligent transportation systems and smart cities. This Special Issue aims to collect state-of-the-art results of applying neural networks to traffic forecasting. A collection of five papers is accepted and included in this Special Issue, covering the latest methods such as Artificial Neural Networks (ANNs), Physics-Informed Neural Networks (PINNs), spatio-temporal attention-boosted autoencoders, and Deep Reinforcement Learning (DRL). [Extracted from the article]
- Published
- 2023
- Full Text
- View/download PDF
14. Multimodel Collaboration to Combat Malicious Domain Fluxing.
- Author
Nie, Yuanping, Liu, Shuangshuang, Qian, Cheng, Deng, Congyi, Li, Xiang, Wang, Zhi, and Kuang, Xiaohui
- Subjects
ARTIFICIAL neural networks, DEEP learning, STATISTICAL learning, REINFORCEMENT learning, MACHINE learning
- Abstract
This paper proposes a novel domain-generation-algorithm detection framework based on statistical learning that integrates the detection capabilities of multiple heterogeneous models. The framework includes both traditional machine learning methods based on artificial features and deep learning methods, comprehensively analyzing 34 artificial features and advanced features extracted from deep neural networks. Additionally, the framework evaluates the predictions of the base models based on the fit of the samples to each type of sample set and a predefined significance level. The predictions of the base models are statistically analyzed, and the final decision is made using strategies such as voting, confidence, and credibility. Experimental results demonstrate that the DGA detection framework based on statistical learning achieves a higher detection rate compared to the underlying base models, with accuracy, precision, recall, and F1 scores reaching 0.979, 0.977, 0.981, and 0.979, respectively. The framework also exhibits a stronger adaptability to unknown domains and a certain level of robustness against concept drift attacks. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
15. Self reward design with fine-grained interpretability.
- Author
Tjoa, Erico and Guan, Cuntai
- Subjects
REWARD (Psychology), REINFORCEMENT learning, ARTIFICIAL neural networks
- Abstract
The black-box nature of deep neural networks (DNNs) has brought to attention the issues of transparency and fairness. Deep Reinforcement Learning (Deep RL or DRL), which uses DNNs to learn its policy, value functions, etc., is thus also subject to similar concerns. This paper proposes a way to circumvent these issues through the bottom-up design of neural networks with detailed interpretability, where each neuron or layer has its own meaning and utility that corresponds to a humanly understandable concept. The framework introduced in this paper is called Self Reward Design (SRD), inspired by Inverse Reward Design, and this interpretable design can (1) solve the problem by pure design (although imperfectly) and (2) be optimized like a standard DNN. With deliberate human designs, we show that some RL problems such as lavaland and MuJoCo can be solved using a model constructed with standard NN components with few parameters. Furthermore, with our fish sale auction example, we demonstrate how SRD is used to address situations that would not make sense if black-box models were used, where humanly understandable, semantics-based decisions are required. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
16. Autonomous Generation of Service Strategy for Household Tasks: A Progressive Learning Method With A Priori Knowledge and Reinforcement Learning.
- Author
Zhang, Mengyang, Tian, Guohui, Gao, Huanbing, and Zhang, Ying
- Subjects
REINFORCEMENT learning, REWARD (Psychology), ARTIFICIAL neural networks
- Abstract
Human beings tend to learn unknown knowledge gradually, from the basic to the complex. Based on this point, we propose a progressive learning method for producing service strategies according to requests, combining hierarchical a priori knowledge with reinforcement learning. A service strategy guides how to perform home services and takes into consideration the relationship between actions and objects in the home environment. In this paper, strategy generation is regarded as a text generation problem in question answering (QA). Firstly, hierarchical a priori knowledge, with service-object correlation at the bottom and action-object correlation at the top, is constructed to assist understanding of the relationship between objects and actions in service strategies. Service-object correlation guides how to select proper objects in the correct order, while action-object correlation associates actions in strategies according to the selected objects. Based on this hierarchical a priori knowledge, a progressive learning method is proposed to make the model produce effective strategies with sequential cognition, from service-object correlation (objects) to action-object correlation (actions). After that, reinforcement learning is employed to enhance the progressive guidance by designing rewards in terms of the hierarchical a priori knowledge. Finally, the proposed method is tested with both comparative experiments and ablation studies, and the experimental results demonstrate its superiority in producing comprehensive and logical strategies, indicating that the progressive learning method in our paper can further improve QA performance. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
17. A novel method-based reinforcement learning with deep temporal difference network for flexible double shop scheduling problem.
- Author
Wang, Xiao, Zhong, Peisi, Liu, Mei, Zhang, Chao, and Yang, Shihao
- Subjects
DEEP reinforcement learning, ARTIFICIAL neural networks, REINFORCEMENT learning, TIME-varying networks, PRODUCTION scheduling, FLOW shops, SCHEDULING
- Abstract
This paper studies the flexible double shop scheduling problem (FDSSP), which considers the job shop and assembly shop simultaneously. This raises the problem of scheduling the related tasks in association. To this end, a reinforcement learning algorithm with a deep temporal difference network is proposed to minimize the makespan. Firstly, the FDSSP is defined as the mathematical model of the flexible job-shop scheduling problem joined to the assembly constraint level. It is translated into a Markov decision process that directly selects behavioral strategies according to historical machining state data. Secondly, ten generic state features are input into the deep neural network model to fit the state value function. Similarly, eight simple constructive heuristics are used as candidate actions for scheduling decisions. From the greedy mechanism, optimally combined actions of all machines are obtained for each decision step. Finally, a deep temporal difference reinforcement learning framework is established, and a large number of comparative experiments are designed to analyze the basic performance of this algorithm. The results show that the proposed algorithm outperforms most other methods, contributing to solving practical production problems in the manufacturing industry. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
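The abstract above fits a state-value function with a deep temporal difference network. The update rule that such a network generalizes is the one-step temporal-difference (TD(0)) update, shown here in its tabular form as a minimal sketch:

```python
def td0_update(value, state, reward, next_state, alpha=0.1, gamma=0.95):
    """One-step temporal-difference (TD(0)) update on a tabular state-value function.

    value: dict mapping state -> estimated value; unseen states default to 0.
    Returns the TD error, the quantity a deep TD network regresses toward zero.
    """
    td_error = reward + gamma * value.get(next_state, 0.0) - value.get(state, 0.0)
    value[state] = value.get(state, 0.0) + alpha * td_error
    return td_error
```

In the deep variant, the dictionary lookup is replaced by a neural network over the ten state features, and the same TD error drives gradient descent on the network weights.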
18. Escaping Stagnation through Improved Orca Predator Algorithm with Deep Reinforcement Learning for Feature Selection.
- Author
Olivares, Rodrigo, Ravelo, Camilo, Soto, Ricardo, and Crawford, Broderick
- Subjects
REINFORCEMENT learning, DEEP reinforcement learning, FEATURE selection, ARTIFICIAL neural networks, MACHINE learning, BIOLOGICALLY inspired computing
- Abstract
Stagnation at local optima represents a significant challenge in bio-inspired optimization algorithms, often leading to suboptimal solutions. This paper addresses this issue by proposing a hybrid model that combines the Orca predator algorithm with deep Q-learning. The Orca predator algorithm is an optimization technique that mimics the hunting behavior of orcas. It solves complex optimization problems by exploring and exploiting search spaces efficiently. Deep Q-learning is a reinforcement learning technique that combines Q-learning with deep neural networks. This integration aims to turn the stagnation problem into an opportunity for more focused and effective exploitation, enhancing the optimization technique's performance and accuracy. The proposed hybrid model leverages the biomimetic strengths of the Orca predator algorithm to identify promising regions of the search space, complemented by the fine-tuning capabilities of deep Q-learning to navigate these areas precisely. The practical application of this approach is evaluated using the high-dimensional Heartbeat Categorization Dataset, focusing on the feature selection problem. This dataset, comprising complex electrocardiogram signals, provides a robust platform for testing the feature selection capabilities of the hybrid model. The experimental results are encouraging, showcasing the hybrid strategy's capability to identify relevant features without significantly compromising the performance metrics of machine learning models. This analysis was performed by comparing the improved Orca predator algorithm against its native version and a set of state-of-the-art algorithms. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
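The control flow described above, a global metaheuristic that hands over to a learned exploitation phase once the best score stagnates, can be sketched in a toy feature-selection loop. This is an illustration of the stagnation-triggered switch only: the global move is plain random sampling rather than the Orca predator operators, and the deep Q-learning component is replaced by a greedy bit-flip, so nothing here is the paper's actual algorithm:

```python
import random

def stagnation_aware_search(fitness, n_features, iters=300, patience=20, seed=0):
    """Toy hybrid loop for feature selection over 0/1 masks.

    fitness: maps a 0/1 feature mask (tuple) to a score to maximize.
    Global random exploration runs until the best score stagnates for
    `patience` steps, then local bit-flip exploitation takes over.
    """
    rng = random.Random(seed)
    best = tuple(rng.randint(0, 1) for _ in range(n_features))
    best_score, stall = fitness(best), 0
    for _ in range(iters):
        if stall < patience:
            # Global exploration phase (stand-in for the metaheuristic's moves).
            cand = tuple(rng.randint(0, 1) for _ in range(n_features))
        else:
            # Exploitation phase: flip one bit of the incumbent solution.
            i = rng.randrange(n_features)
            cand = best[:i] + (1 - best[i],) + best[i + 1:]
        score = fitness(cand)
        if score > best_score:
            best, best_score, stall = cand, score, 0
        else:
            stall += 1
    return best, best_score
```

Swapping the bit-flip branch for a Q-network that scores candidate flips is the step that turns this skeleton into the kind of hybrid the paper studies.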
19. Deep learning-powered malware detection in cyberspace: a contemporary review.
- Author
Redhu, Ananya, Choudhary, Prince, Srinivasan, Kathiravan, Das, Tapan Kumar, Kumar, Rajesh, Din, Ikram Ud, and Naqvi, Nuzhat
- Subjects
DEEP learning, ARTIFICIAL neural networks, REINFORCEMENT learning, DEEP reinforcement learning, MACHINE learning, RECURRENT neural networks
- Abstract
This article explores deep learning models in the field of malware detection in cyberspace, aiming to provide insights into their relevance and contributions. The primary objective of the study is to investigate the practical applications and effectiveness of deep learning models in detecting malware. By carefully analyzing the characteristics of malware samples, these models gain the ability to accurately categorize them into distinct families or types, enabling security researchers to swiftly identify and counter emerging threats. The PRISMA 2020 guidelines were used for paper selection, and the time range of the review is January 2015 to December 2023. In the review, various deep learning models, such as Recurrent Neural Networks, Deep Autoencoders, LSTM, Deep Neural Networks, Deep Belief Networks, Deep Convolutional Neural Networks, Deep Generative Models, Deep Boltzmann Machines, Deep Reinforcement Learning, Extreme Learning Machine, and others, are thoroughly evaluated. It highlights their individual strengths and real-world applications in the domain of malware detection in cyberspace. The review also emphasizes that deep learning algorithms consistently demonstrate exceptional performance, exhibiting high accuracy and low false positive rates in real-world scenarios. Thus, this article aims to contribute to a better understanding of the capabilities and potential of deep learning models in enhancing cybersecurity efforts. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
20. Connectivity conservation planning through deep reinforcement learning.
- Author
Equihua, Julián, Beckmann, Michael, and Seppelt, Ralf
- Subjects
REINFORCEMENT learning, DEEP reinforcement learning, ARTIFICIAL neural networks, MOLECULAR connectivity index, FRAGMENTED landscapes, LANDSCAPE assessment
- Abstract
Copyright of Methods in Ecology & Evolution is the property of Wiley-Blackwell and its content may not be copied or emailed to multiple sites or posted to a listserv without the copyright holder's express written permission. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)
- Published
- 2024
- Full Text
- View/download PDF
21. On the optimality of quantum circuit initial mapping using reinforcement learning.
- Author
Elsayed Amer, Norhan, Gomaa, Walid, Kimura, Keiji, Ueda, Kazunori, and El-Mahdy, Ahmed
- Subjects
ARTIFICIAL neural networks, REINFORCEMENT learning, CIRCUIT complexity, TRANSFORMER models
- Abstract
Quantum circuit optimization is an inevitable task with current noisy quantum backends. The task is non-trivial due to varying circuit complexities in addition to hardware-specific noise, topology, and limited connectivity. Currently available methods either rely on heuristics for circuit optimization tasks or on reinforcement learning with complex, unscalable neural networks such as transformers. In this paper, we are concerned with optimizing the initial logical-to-physical mapping selection. Specifically, we investigate whether a reinforcement learning agent with a simple, scalable neural network is capable of finding a near-optimal logical-to-physical mapping that minimizes the number of additional CNOT gates, using only a fixed-length feature vector. To answer this question, we train a Maskable Proximal Policy Optimization agent to progressively take steps towards a near-optimal logical-to-physical mapping on a 20-qubit hardware architecture. Our results show that our agent, coupled with a simple routing evaluation, is capable of outperforming other available reinforcement learning and heuristic approaches on 12 out of 19 test benchmarks, achieving geometric mean improvements of 2.2% and 15% over the best available related work and two heuristic approaches, respectively. Additionally, our neural network model scales linearly as the number of qubits increases. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
22. Advancements and Future Directions in the Application of Machine Learning to AC Optimal Power Flow: A Critical Review.
- Author
-
Jiang, Bozhen, Wang, Qin, Wu, Shengyu, Wang, Yidi, and Lu, Gang
- Subjects
ELECTRICAL load ,MACHINE learning ,ARTIFICIAL neural networks ,DISTRIBUTED power generation ,ALTERNATING currents - Abstract
Optimal power flow (OPF) is a crucial tool in the operation and planning of modern power systems. However, as power system optimization shifts towards larger-scale frameworks, and with the growing integration of distributed generations, the computational time and memory requirements of solving the alternating current (AC) OPF problems can increase exponentially with system size, posing computational challenges. In recent years, machine learning (ML) has demonstrated notable advantages in efficient computation and has been extensively applied to tackle OPF challenges. This paper presents five commonly employed OPF transformation techniques that leverage ML, offering a critical overview of the latest applications of advanced ML in solving OPF problems. The future directions in the application of machine learning to AC OPF are also discussed. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
23. Optimal economic dispatch of a virtual power plant based on gated recurrent unit proximal policy optimization.
- Author
-
Gao, Zhiping, Kang, Wenwen, Chen, Xinghua, Gong, Siru, Liu, Zongxiong, He, Degang, Shi, Shen, Shangguan, Xing-Chen, Wang, Weiyu, and Yingping, Cao
- Subjects
REINFORCEMENT learning ,DEEP reinforcement learning ,ARTIFICIAL neural networks ,PARTIALLY observable Markov decision processes ,BATTERY storage plants ,POWER plants - Abstract
The intermittent renewable energy in a virtual power plant (VPP) brings generation uncertainties, which prevents the VPP from providing a reliable and user-friendly power supply. To address this issue, this paper proposes a gated recurrent unit proximal policy optimization (GRUPPO)-based optimal VPP economic dispatch method. First, models of electrical generation, storage, and consumption are established to form a VPP framework that accounts for the accessibility of VPP state information. The optimal VPP economic dispatch can then be expressed as a partially observable Markov decision process (POMDP) problem. A novel deep reinforcement learning method called GRUPPO is further developed based on the VPP's time-series characteristics. Finally, case studies are conducted over a 24-h period based on actual historical data. The test results illustrate that the proposed economic dispatch can achieve a maximum operation cost reduction of 6.5% and effectively smooth the supply-demand uncertainties. [ABSTRACT FROM AUTHOR]
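The GRU recurrence that lets a policy summarize a VPP's time-series history under partial observability can be written out for the scalar case (a sketch with made-up weights; the paper's GRUPPO uses vector-valued gates inside a PPO policy network):

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def gru_step(x, h, p):
    """One scalar GRU step; p holds six weights (wz, uz, wr, ur, wh, uh).

    The update gate z decides how much of the candidate state replaces the
    old hidden state, letting the unit carry information across a long
    dispatch horizon.
    """
    z = sigmoid(p["wz"] * x + p["uz"] * h)                 # update gate
    r = sigmoid(p["wr"] * x + p["ur"] * h)                 # reset gate
    h_tilde = math.tanh(p["wh"] * x + p["uh"] * (r * h))   # candidate state
    return (1.0 - z) * h + z * h_tilde

params = {"wz": 0.5, "uz": 0.1, "wr": 0.4, "ur": 0.2, "wh": 0.9, "uh": 0.3}
h = 0.0
for x in [0.2, 0.5, -0.1]:   # toy sequence of net-load observations
    h = gru_step(x, h, params)
```

The final hidden state `h` is the fixed-size summary of the whole observation history that the actor and critic condition on.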
- Published
- 2024
- Full Text
- View/download PDF
24. Research on Reactive Power Optimization Strategy under the Intelligent Improvement Model of the Distribution Network.
- Author
-
Yu, Menglin
- Subjects
REACTIVE power ,POWER distribution networks ,ARTIFICIAL neural networks ,MACHINE learning ,REINFORCEMENT learning ,WIND speed ,POWER transmission ,MULTIAGENT systems - Abstract
To improve reactive power optimization in the distribution network, this paper applies a multiagent deep reinforcement learning algorithm to analyze the reactive power optimization strategy of the distribution network and constructs an intelligent optimization model. Moreover, the simulation models of power conversion elements, power transmission elements, control elements, and measurement elements in the platform are described, and the program structure and interactive functions are analyzed. In addition, this paper proposes a data-driven reactive power optimization method for distribution networks. Finally, using historical data and an artificial neural network, this paper extracts electrical quantity data such as load power and distributed power output, and environmental data such as temperature and wind speed, to perform multiagent analysis. The experimental verification shows that the multiagent deep reinforcement learning-based reactive power optimization method proposed in this paper performs very well. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
25. Research on Anomaly Identification and Screening and Metallogenic Prediction Based on Semisupervised Neural Network.
- Author
-
Zhang, Rongqing and Xi, Zhenzhu
- Subjects
SUPERVISED learning ,ARTIFICIAL neural networks ,REINFORCEMENT learning ,CONVOLUTIONAL neural networks - Abstract
This paper first introduces the background of research on neural networks, anomaly identification and screening, and metallogenic prediction under semisupervised learning. It then introduces supervised, semisupervised, unsupervised, and reinforcement learning, analyzes and compares their advantages and disadvantages, and concludes that unsupervised learning is the best way to process the data. In the research method, this paper classifies the obtained geochemical data using semisupervised learning and then trains the obtained samples with a convolutional neural network model to obtain a mineralization prediction model and verify its correctness, which finally provides direction for subsequent mineralization prediction research. [ABSTRACT FROM AUTHOR]
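The semisupervised step, growing the labeled training set with confidently pseudo-labeled samples before a classifier is trained, can be sketched with a toy 1-D nearest-neighbour labeller (the names, the margin rule, and the data are illustrative assumptions, not the paper's procedure):

```python
def nearest_label(x, labeled):
    """1-NN classifier over (value, label) pairs."""
    return min(labeled, key=lambda p: abs(p[0] - x))[1]

def pseudo_label(labeled, unlabeled, margin):
    """Label only the unlabeled points whose class assignment is
    unambiguous: the gap between the nearest point of each class must
    exceed `margin`. Ambiguous points are left out, mirroring the idea of
    growing the training set only with confident samples."""
    out = list(labeled)
    for x in unlabeled:
        dists = {}
        for v, lab in labeled:
            d = abs(v - x)
            dists[lab] = min(d, dists.get(lab, float("inf")))
        ranked = sorted(dists.values())
        if len(ranked) == 1 or ranked[1] - ranked[0] > margin:
            out.append((x, nearest_label(x, labeled)))
    return out

# Hypothetical geochemical indicator values for two classes.
train = pseudo_label([(0.0, "barren"), (10.0, "ore")], [0.5, 9.5, 5.0], margin=2.0)
```

The point at 5.0 sits halfway between the classes, so it is rejected rather than mislabeled; the enlarged set would then feed the supervised CNN stage.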
- Published
- 2022
- Full Text
- View/download PDF
26. AngoraPy: A Python toolkit for modeling anthropomorphic goal-driven sensorimotor systems.
- Author
-
Weidler, Tonio, Goebel, Rainer, and Senden, Mario
- Subjects
ARTIFICIAL neural networks ,LARGE-scale brain networks ,PYTHON programming language ,GOAL (Psychology) ,CONVOLUTIONAL neural networks ,COMPUTATIONAL neuroscience ,NEUROSCIENCES - Abstract
Goal-driven deep learning increasingly supplements classical modeling approaches in computational neuroscience. The strength of deep neural networks as models of the brain lies in their ability to autonomously learn the connectivity required to solve complex and ecologically valid tasks, obviating the need for hand-engineered or hypothesis-driven connectivity patterns. Consequently, goal-driven models can generate hypotheses about the neurocomputations underlying cortical processing that are grounded in macro- and mesoscopic anatomical properties of the network's biological counterpart. Whereas goal-driven modeling is already becoming prevalent in the neuroscience of perception, its application to the sensorimotor domain is currently hampered by the complexity of the methods required to train models comprising the closed sensation-action loop. This paper describes AngoraPy, a Python library that mitigates this obstacle by providing researchers with the tools necessary to train complex recurrent convolutional neural networks that model the human sensorimotor system. To make the technical details of this toolkit more approachable, an illustrative example that trains a recurrent toy model on in-hand object manipulation accompanies the theoretical remarks. An extensive benchmark on various classical, 3D robotic, and anthropomorphic control tasks demonstrates AngoraPy's general applicability to a wide range of tasks. Together with its ability to adaptively handle custom architectures, the flexibility of this toolkit demonstrates its power for goal-driven sensorimotor modeling. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
27. Vehicle Simulation Algorithm for Observations with Variable Dimensions Based on Deep Reinforcement Learning.
- Author
-
Liu, Yunzhuo, Zhang, Ruoning, and Zhou, Shijie
- Subjects
DEEP reinforcement learning ,ARTIFICIAL neural networks ,DEEP learning ,ALGORITHMS ,REINFORCEMENT learning ,TRAFFIC safety - Abstract
Vehicle simulation algorithms play a crucial role in enhancing traffic efficiency and safety by predicting and evaluating vehicle behavior in various traffic scenarios. Recently, vehicle simulation algorithms based on reinforcement learning have demonstrated excellent performance in practical tasks due to their ability to exhibit superior performance with zero-shot learning. However, these algorithms face challenges in field adaptation problems when deployed in task sets with variable-dimensional observations, primarily due to the inherent limitations of neural network models. In this paper, we propose a neural network structure accommodating variations in specific dimensions to enhance existing reinforcement learning methods. Building upon this, a scene-compatible vehicle simulation algorithm is designed. We conducted experiments on multiple tasks and scenarios using the Highway-Env traffic environment simulator. The results of our experiments demonstrate that the algorithm can successfully operate on all tasks using a neural network model with fixed shape, even with variable-dimensional observations. Our model exhibits no degradation in simulation performance when compared to the baseline algorithm. [ABSTRACT FROM AUTHOR]
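One common baseline for feeding variable-dimensional observations to a fixed-shape network, padding plus a validity mask, can be sketched as follows (an illustrative baseline only; the paper proposes a dedicated network structure rather than plain padding):

```python
def pad_observation(obs, max_dim, pad_value=0.0):
    """Embed a variable-length observation into a fixed-shape input:
    the vector is truncated or padded to max_dim and paired with a binary
    mask so downstream layers can ignore the padded slots."""
    clipped = obs[:max_dim]
    padded = clipped + [pad_value] * (max_dim - len(clipped))
    mask = [1.0] * len(clipped) + [0.0] * (max_dim - len(clipped))
    return padded, mask

# A 2-dimensional observation (e.g. two visible vehicles) padded to 4 slots.
padded, mask = pad_observation([1.2, 3.4], max_dim=4)
```

The mask is what prevents the padded zeros from being confused with genuine zero-valued observations, the failure mode the paper's variable-dimension design is meant to avoid.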
- Published
- 2023
- Full Text
- View/download PDF
28. A Review of Deep Reinforcement Learning Algorithms for Mobile Robot Path Planning.
- Author
-
Singh, Ramanjeet, Ren, Jing, and Lin, Xianke
- Subjects
REINFORCEMENT learning ,DEEP reinforcement learning ,MACHINE learning ,ROBOTIC path planning ,MOBILE robots ,ARTIFICIAL neural networks ,PEDESTRIANS ,SPACE robotics - Abstract
Path planning is the most fundamental necessity for autonomous mobile robots. Traditionally, the path planning problem was solved using analytical methods, but these methods require perfect localization in the environment and a fully developed map to plan the path, and they cannot deal with complex environments and emergencies. Recently, deep neural networks have been applied to solve this complex problem. This review paper discusses path-planning methods that use neural networks, including deep reinforcement learning and its different types, such as model-free and model-based, Q-value function-based, policy-based, and actor-critic-based methods. Additionally, a dedicated section delves into the nuances and methods of robot interactions with pedestrians, exploring these dynamics in diverse environments such as sidewalks, road crossings, and indoor spaces, underscoring the importance of social compliance in robot navigation. Finally, the common challenges faced by these methods are discussed, along with solutions applied to optimize them, such as reward shaping, transfer learning, and parallel simulations. [ABSTRACT FROM AUTHOR]
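The Q-value-function-based family the review covers rests on the tabular update Q(s,a) ← Q(s,a) + α[r + γ max_a' Q(s',a') − Q(s,a)]; a minimal corridor-world sketch (toy environment and hyperparameters of our choosing, not from the paper):

```python
import random

def train_corridor(width=4, episodes=400, alpha=0.5, gamma=0.9, eps=0.1, seed=0):
    """Tabular Q-learning on a 1-D corridor: start at cell 0, the goal is
    the rightmost cell; action 0 moves left, action 1 moves right, and the
    only reward is 1.0 for entering the goal."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(width) for a in (0, 1)}
    goal = width - 1
    for _ in range(episodes):
        s = 0
        while s != goal:
            if rng.random() < eps:                      # epsilon-greedy exploration
                a = rng.choice((0, 1))
            else:                                       # greedy, ties broken toward "right"
                a = 1 if q[(s, 1)] >= q[(s, 0)] else 0
            s2 = min(goal, s + 1) if a == 1 else max(0, s - 1)
            r = 1.0 if s2 == goal else 0.0
            # Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
            q[(s, a)] += alpha * (r + gamma * max(q[(s2, 0)], q[(s2, 1)]) - q[(s, a)])
            s = s2
    return q

q = train_corridor()
```

After training, the greedy policy moves right in every cell, and the values decay geometrically with distance from the goal, which is exactly the structure deep Q-networks approximate when the state space is too large for a table.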
- Published
- 2023
- Full Text
- View/download PDF
29. Off-policy and on-policy reinforcement learning with the Tsetlin machine.
- Author
-
Rahimi Gorji, Saeed and Granmo, Ole-Christoffer
- Subjects
MACHINE learning ,REINFORCEMENT learning ,ARTIFICIAL neural networks ,SUPERVISED learning ,PROPOSITION (Logic) - Abstract
The Tsetlin Machine is a recent supervised learning algorithm that has obtained competitive accuracy and resource-usage results across several benchmarks. It has been used for convolution, classification, and regression, producing interpretable rules in propositional logic. In this paper, we introduce the first framework for reinforcement learning based on the Tsetlin Machine. Our framework integrates the value iteration algorithm with the regression Tsetlin Machine as the value function approximator. To obtain accurate off-policy state-value estimation, we propose a modified Tsetlin Machine feedback mechanism that adapts to the dynamic nature of value iteration. In particular, we show that the Tsetlin Machine is able to unlearn and recover from the misleading experiences that often occur at the beginning of training. A key challenge that we address is mapping the intrinsically continuous nature of state-value learning to the propositional Tsetlin Machine architecture, leveraging probabilistic updates. While accurate off-policy, this mechanism learns significantly more slowly than on-policy neural networks. However, by introducing multi-step temporal-difference learning in combination with high-frequency propositional logic patterns, we are able to close the performance gap. Several gridworld instances document that our framework can outperform comparable neural network models, despite being based on simple one-level AND-rules in propositional logic. Finally, we propose how the class of models learnt by our Tsetlin Machine for the gridworld problem can be translated into a more understandable graph structure. The graph structure captures the state-value function approximation and the corresponding policy found by the Tsetlin Machine. [ABSTRACT FROM AUTHOR]
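The multi-step temporal-difference learning used to close the performance gap targets the standard n-step return; a minimal sketch of that computation (the generic formula, not anything Tsetlin-specific):

```python
def n_step_return(rewards, bootstrap_value, gamma):
    """Multi-step temporal-difference target:
    G = r_0 + gamma*r_1 + ... + gamma^(n-1)*r_{n-1} + gamma^n * V(s_n),
    computed backwards from the bootstrapped value of the n-th state."""
    g = bootstrap_value
    for r in reversed(rewards):
        g = r + gamma * g
    return g

# Three rewards, then bootstrap from the value estimate of the state reached.
target = n_step_return([0.0, 0.0, 1.0], bootstrap_value=0.5, gamma=0.9)
```

Here the target is 0.9² · 1 + 0.9³ · 0.5 = 1.1745; spreading credit over n steps is what lets a slow-but-accurate function approximator learn from fewer updates.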
- Published
- 2023
- Full Text
- View/download PDF
30. Hosting Capacity Assessment Strategies and Reinforcement Learning Methods for Coordinated Voltage Control in Electricity Distribution Networks: A Review.
- Author
-
Suchithra, Jude, Robinson, Duane, and Rajabi, Amin
- Subjects
ELECTRIC power distribution ,VOLTAGE control ,REINFORCEMENT learning ,LEARNING strategies ,INFRASTRUCTURE (Economics) - Abstract
Increasing connection rates of rooftop photovoltaic (PV) systems to electricity distribution networks has become a major concern for the distribution network service providers (DNSPs) due to the inability of existing network infrastructure to accommodate high levels of PV penetration while maintaining voltage regulation and other operational requirements. The solution to this dilemma is to undertake a hosting capacity (HC) study to identify the maximum penetration limit of rooftop PV generation and take necessary actions to enhance the HC of the network. This paper presents a comprehensive review of two topics: HC assessment strategies and reinforcement learning (RL)-based coordinated voltage control schemes. In this paper, the RL-based coordinated voltage control schemes are identified as a means to enhance the HC of electricity distribution networks. RL-based algorithms have been widely used in many power system applications in recent years due to their precise, efficient and model-free decision-making capabilities. A large portion of this paper is dedicated to reviewing RL concepts and recently published literature on RL-based coordinated voltage control schemes. A non-exhaustive classification of RL algorithms for voltage control is presented and key RL parameters for the voltage control problem are identified. Furthermore, critical challenges and risk factors of adopting RL-based methods for coordinated voltage control are discussed. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
31. Application of Multilayer Perceptron Genetic Algorithm Neural Network in Chinese-English Parallel Corpus Noise Processing.
- Author
-
Li, Bing, Tuo, Anxie, Kong, Hanyue, Liu, Sujiao, and Chen, Jia
- Subjects
GENETIC algorithms ,REINFORCEMENT learning ,ONLINE algorithms ,ARTIFICIAL neural networks ,REWARD (Psychology) ,NATURAL selection - Abstract
This paper uses a neural network as a predictive model and a genetic algorithm as an online optimization algorithm to simulate the noise processing of a Chinese-English parallel corpus. Drawing on the powerful random global search mechanism of the genetic algorithm, this paper also studies the principle and process of noise processing in the Chinese-English parallel corpus. Aiming at the task of identifying isolated words for unspecified persons, and taking into account the inadequacies of standard genetic algorithms and neural networks, this paper proposes a fast algorithm for training the network using genetic algorithms. Through simulation calculations, the effects of different characteristic parameters, the number of training samples, background noise, and speaker specificity on the recognition result were analyzed, discussed, and compared with the traditional dynamic time comparison method. This paper introduces the idea of reinforcement learning, uses different reward mechanisms to resolve the inconsistency between the loss function and the evaluation metric, and uses different decoding methods to alleviate the exposure bias problem. The method uses simple genetic operations and a survival-of-the-fittest selection mechanism to guide the learning process and determine the direction of the search, and it can search multiple regions of the solution space simultaneously. In addition, it is not restricted by conditions on the search space (such as differentiability, continuity, or unimodality). A method of initializing the translation model's parameters with English subword vectors is also given. The research results show that the genetic-algorithm-based neural network recognition method presented in this paper learns network weights quickly and outperforms the standard genetic algorithm and neural network in all respects; with its high recognition rate and unique application advantages, it can achieve a win-win of time and efficiency. [ABSTRACT FROM AUTHOR]
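The genetic search over network weights, survival-of-the-fittest selection plus random mutation, can be sketched on a toy objective (illustrative hyperparameters and a stand-in fitness function; the paper's algorithm adds crossover and a speech-recognition fitness):

```python
import random

def evolve(fitness, dim, pop_size=30, generations=60, sigma=0.3, seed=1):
    """Minimal real-valued genetic algorithm: truncation selection keeps
    the fitter half each generation (elitism), and children are Gaussian
    mutations of surviving parents. Returns the best genome found."""
    rng = random.Random(seed)
    pop = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)          # survival of the fittest
        parents = pop[: pop_size // 2]
        children = [
            [g + rng.gauss(0.0, sigma) for g in rng.choice(parents)]
            for _ in range(pop_size - len(parents))
        ]
        pop = parents + children
    return max(pop, key=fitness)

# Toy stand-in for "network weights": maximize -sum((w - 0.5)^2),
# whose optimum is the vector (0.5, 0.5, 0.5).
best = evolve(lambda w: -sum((x - 0.5) ** 2 for x in w), dim=3)
```

Note that nothing here requires the fitness to be differentiable, continuous, or unimodal, which is the property the abstract highlights.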
- Published
- 2021
- Full Text
- View/download PDF
32. Airborne Radar Anti-Jamming Waveform Design Based on Deep Reinforcement Learning.
- Author
-
Zheng, Zexin, Li, Wei, and Zou, Kun
- Subjects
DEEP learning ,REINFORCEMENT learning ,RADAR interference ,RADAR in aeronautics ,ARTIFICIAL neural networks ,MACHINE learning ,MARKOV processes - Abstract
Airborne radars are susceptible to heavy clutter, noise, and variable jamming signals in real environments; especially when faced with active main-lobe jamming, the waveform techniques of the traditional regime can no longer meet actual battlefield anti-jamming requirements. It is therefore necessary to study anti-main-lobe-jamming techniques for airborne radars in complex environments to improve their battlefield survivability. In this paper, building on previous research on reinforcement-learning (RL)-based airborne radar anti-jamming waveform design methods that improved the anti-jamming performance of airborne radars, we propose an airborne radar waveform design method based on a deep reinforcement learning (DRL) algorithm under clutter and jamming conditions. The method uses a Markov decision process (MDP) to describe the complex operating environment of airborne radars, calculates the value of the radar anti-jamming waveform strategy under various jamming states using deep neural networks, and designs the optimal anti-jamming waveform strategy based on the duelling double deep Q network (D3QN) algorithm. In addition, the method uses an iterative transformation method (ITM) to generate the time-domain signals of the optimal waveform strategy. Simulation results show that, at a radar transmit power of 5 W, the waveform designed with the proposed deep reinforcement learning algorithm improves the signal-to-jamming-plus-noise ratio (SJNR) by 2.08 dB and 3.03 dB and the target detection probability by 26.79% and 44.25%, respectively, compared with the waveform designed with the reinforcement learning algorithm and the conventional linear frequency modulation (LFM) signal. The proposed method helps airborne radars enhance anti-jamming performance in complex environments while further improving target detection performance. [ABSTRACT FROM AUTHOR]
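The duelling head that distinguishes D3QN from plain double DQN combines a state value with mean-centred action advantages; a minimal sketch of that aggregation (generic formula, with toy numbers):

```python
def dueling_q(value, advantages):
    """Combine the state value V(s) and action advantages A(s, a) the way
    a dueling network head does:
    Q(s, a) = V(s) + A(s, a) - mean_a A(s, a).
    Subtracting the mean advantage makes the V/A decomposition unique."""
    mean_adv = sum(advantages) / len(advantages)
    return [value + a - mean_adv for a in advantages]

# One state worth V(s) = 2.0 with three candidate waveform actions.
q = dueling_q(2.0, [0.5, -0.5, 0.0])
```

Separating "how good is this jamming state" from "how much better is this waveform than average" lets the value stream learn even from transitions where the action choice barely matters.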
- Published
- 2022
- Full Text
- View/download PDF
33. Time-Sensitive and Resource-Aware Concurrent Workflow Scheduling for Edge Computing Platforms Based on Deep Reinforcement Learning.
- Author
-
Zhang, Jiaming, Wang, Tao, and Cheng, Lianglun
- Subjects
DEEP reinforcement learning ,COMPUTING platforms ,EDGE computing ,ARTIFICIAL neural networks ,REINFORCEMENT learning ,PARSING (Computer grammar) ,WORKFLOW - Abstract
The workflow scheduling on edge computing platforms in industrial scenarios aims to efficiently utilize the computing resources of edge platforms to meet user service requirements. Compared to ordinary task scheduling, tasks in workflow scheduling come with predecessor and successor constraints. The solutions to scheduling problems typically include traditional heuristic methods and modern deep reinforcement learning approaches. For heuristic methods, an increase in constraints complicates the design of scheduling rules, making it challenging to devise suitable algorithms; additionally, whenever the environment is updated, the scheduling algorithms must be redesigned. For existing deep reinforcement learning-based scheduling methods, there are often challenges related to training difficulty and computation time. The addition of constraints makes it challenging for neural networks to make decisions while satisfying those constraints. Furthermore, previous methods mainly relied on RNNs and their variants to construct neural network models, lacking a computation time advantage. In response to these issues, this paper introduces a novel workflow scheduling method based on reinforcement learning, which utilizes neural networks for direct decision-making. On the one hand, this approach leverages deep reinforcement learning, eliminating the need for researchers to define complex scheduling rules. On the other hand, it separates the parsing of the workflow and constraint handling from the scheduling decisions, allowing the neural network model to focus on learning how to schedule without having to learn how to handle workflow definitions and constraints among sub-tasks. The method takes resource utilization and response time as its optimization objectives; the network is trained using the PPO algorithm combined with Self-Critic, and a parameter transfer strategy is utilized to find the balance point for multi-objective optimization. Leveraging the advantages of reinforcement learning, the network can be trained and tested using randomly generated datasets. The experimental results indicate that the proposed method can generate different scheduling outcomes to meet various scenario requirements without modifying the neural network. Furthermore, when compared to other deep reinforcement learning methods, the proposed approach demonstrates certain advantages in scheduling performance and computation time. [ABSTRACT FROM AUTHOR]
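The PPO update at the heart of such training loops maximizes a clipped surrogate objective; a single-sample sketch of the loss (the generic PPO formula; the paper's combination with a Self-Critic baseline is not shown):

```python
def ppo_clip_loss(ratio, advantage, eps=0.2):
    """Clipped surrogate objective from PPO, written as a loss to minimize:
    L = -min(ratio * A, clip(ratio, 1 - eps, 1 + eps) * A),
    where ratio = pi_new(a|s) / pi_old(a|s) and A is the advantage."""
    clipped = max(1.0 - eps, min(1.0 + eps, ratio))
    return -min(ratio * advantage, clipped * advantage)

# A positive-advantage sample whose ratio has drifted past the clip range.
loss = ppo_clip_loss(ratio=1.5, advantage=1.0)
```

The clip caps how far a single batch of scheduling decisions can push the policy, which is what keeps training stable enough to reuse one network across randomly generated workflows.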
- Published
- 2023
- Full Text
- View/download PDF
34. AdaXod: a new adaptive and momental bound algorithm for training deep neural networks.
- Author
-
Liu, Yuanxuan and Li, Dequan
- Subjects
ARTIFICIAL neural networks ,OPTIMIZATION algorithms ,REINFORCEMENT learning ,ALGORITHMS ,IMAGE recognition (Computer vision) ,DEEP learning - Abstract
Adaptive algorithms are widely used in deep learning because of their fast convergence. Among them, Adam is the most widely used algorithm. However, studies have shown that Adam's generalization ability is weak. AdaX is a variant of Adam that introduces a novel second-order momentum, modifies the second-order moment of Adam, and has good generalization ability. However, these algorithms may fail to converge due to instability and extreme learning rates during training. In this paper, we propose a new adaptive and momental bound algorithm, called AdaXod, which exponentially averages the learning rate and is particularly useful for training deep neural networks. By setting an adaptively limited learning rate in the AdaX algorithm, the resultant AdaXod effectively eliminates the problem of excessive learning rates in the later stages of neural network training and thus results in stable training. We conduct extensive experiments on different datasets and verify the advantages of the AdaXod algorithm by comparing it with other advanced adaptive optimization algorithms. AdaXod eliminates large learning rates during neural network training and outperforms other optimizers, especially for some neural networks with complex structures, such as DenseNet. [ABSTRACT FROM AUTHOR]
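The general "limit the extreme learning rates" idea can be sketched as an Adam-style step whose effective per-parameter step size is clamped (an AdaBound-style illustration under assumed bounds; AdaXod's exact exponential averaging of the learning rate differs from this):

```python
import math

def adam_bounded_step(w, grad, state, lr=0.01, b1=0.9, b2=0.999,
                      eps=1e-8, lo=1e-4, hi=0.1):
    """One Adam-style update for a scalar parameter, with the effective
    per-parameter learning rate clamped to [lo, hi] so it can neither
    explode when v_hat is tiny nor vanish when v_hat is huge."""
    state["t"] += 1
    state["m"] = b1 * state["m"] + (1 - b1) * grad           # first moment
    state["v"] = b2 * state["v"] + (1 - b2) * grad * grad    # second moment
    m_hat = state["m"] / (1 - b1 ** state["t"])              # bias correction
    v_hat = state["v"] / (1 - b2 ** state["t"])
    step = lr / (math.sqrt(v_hat) + eps)   # effective learning rate
    step = max(lo, min(hi, step))          # clamp the extremes
    return w - step * m_hat

# Minimize f(w) = w^2, whose gradient is 2w, starting from w = 1.
state = {"t": 0, "m": 0.0, "v": 0.0}
w = 1.0
for _ in range(50):
    w = adam_bounded_step(w, grad=2.0 * w, state=state)
```

Near convergence the gradients (and hence v_hat) shrink, which is exactly when an unbounded Adam step can blow up; the clamp caps that growth.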
- Published
- 2023
- Full Text
- View/download PDF
35. Deep Q-Network Approach for Train Timetable Rescheduling Based on Alternative Graph.
- Author
-
Kim, Kyung-Min, Rho, Hag-Lae, Park, Bum-Hwan, and Min, Yun-Hong
- Subjects
ARTIFICIAL neural networks ,TRAIN schedules ,MIXED integer linear programming ,REINFORCEMENT learning ,JOINT use of railroad facilities ,TIME perspective ,TRAFFIC density ,MARKOV processes - Abstract
The disturbance of local areas with complex railway networks and high traffic density not only impedes the efficient use of rail networks in those areas, but also propagates delays to the entire railway network. This has motivated research on train rescheduling problems in high-density local areas to minimize train delays by modifying their planned arrival and departure times. In this paper, we present a train rescheduling method based on Q-learning in reinforcement learning. More specifically, we used deep neural networks to approximate the action-value function, and the underlying Markov decision process (MDP) is based on the alternative graph formulation for the train rescheduling problem. In the proposed MDP formulation, the status of the alternative graph corresponding to the current schedule is defined as the state, and the alternative arc corresponds to the action the agent can take. The MDP is approximately solved via deep Q-learning in which deep neural networks are used to approximate the action-value function in Q-learning. Although the size of the alternative graph depends on the number of trains, our MDP formulation is independent of the number of trains, which makes the proposed method more scalable. The evaluation of the method was performed on a simple railway network and a real-world example in Seoul, South Korea, with randomly generated initial train schedules and train delays. The experimental result showed that the proposed method is comparable to the mixed-integer-linear-programming (MILP)-based exact approach with respect to the quality of the solution. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
36. Recognizing handwritten digits using spiking neural networks with learning algorithms based on sliding mode control theory.
- Author
-
ÖNİZ, Yeşim and AYYILDIZ, Mehmet
- Subjects
ARTIFICIAL neural networks ,SLIDING mode control ,MACHINE learning ,LYAPUNOV stability ,REINFORCEMENT learning ,HANDWRITING recognition (Computer science) ,STABILITY theory - Abstract
In this paper, a spiking neural network (SNN) has been proposed for recognizing the digits written on the LCD screen of an experimental setup. The convergence of the learning algorithm has been ensured by using sliding mode control (SMC) theory and the Lyapunov stability method for the adaptation of the network parameters. The spike response model (SRM) has been utilized in the design of the SNN. The performance of the proposed learning scheme has been evaluated both on the experimental data and on the MNIST dataset. The simulated and experimental results of the SNN structure have been compared with the responses of a conventional neural network (ANN) for which the weight update rules have been also derived using SMC theory. The conducted simulations and experimental studies reveal that convergence can be ensured for the proposed learning scheme and the SNN yields higher recognition accuracy compared to a conventional ANN. [ABSTRACT FROM AUTHOR]
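The membrane dynamics underlying such networks can be illustrated with the simplest leaky integrate-and-fire discretization (toy constants of our choosing, not the paper's spike response model parameters): the potential leaks toward rest, integrates the input, and fires on crossing a threshold.

```python
def simulate_lif(input_current, steps, dt=1.0, tau=10.0, v_thresh=1.0, v_reset=0.0):
    """Leaky integrate-and-fire neuron: the membrane potential v decays
    with time constant tau, integrates the input current, and emits a
    spike (then resets) whenever it crosses the threshold."""
    v = 0.0
    spikes = []
    for t in range(steps):
        v += dt * (-v / tau + input_current)   # Euler step of dv/dt = -v/tau + I
        if v >= v_thresh:
            spikes.append(t)                   # record the spike time
            v = v_reset
    return spikes

spikes = simulate_lif(input_current=0.15, steps=100)
```

With this drive the steady-state potential (tau · I = 1.5) exceeds the threshold, so the neuron fires periodically; a weaker input (e.g. 0.05) never spikes at all, which is the thresholding behavior the learning rules must differentiate through.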
- Published
- 2023
- Full Text
- View/download PDF
37. Joint DNN partitioning and task offloading in mobile edge computing via deep reinforcement learning.
- Author
-
Zhang, Jianbing, Ma, Shufang, Yan, Zexiao, and Huang, Jiwei
- Subjects
ARTIFICIAL neural networks ,MOBILE computing ,EDGE computing ,REINFORCEMENT learning ,ARTIFICIAL intelligence ,POWER resources ,MARKOV processes - Abstract
As Artificial Intelligence (AI) becomes increasingly prevalent, Deep Neural Networks (DNNs) have become a crucial tool for developing and advancing AI applications. Considering the limited computing and energy resources on mobile devices (MDs), it is a challenge to perform compute-intensive DNN tasks on MDs. To address this challenge, mobile edge computing (MEC) provides a viable solution through DNN partitioning and task offloading. However, as the communication conditions between different devices change over time, DNN partitioning on different devices must also change synchronously. This is a dynamic process, which aggravates the complexity of DNN partitioning. In this paper, we delve into the issue of jointly optimizing energy and delay for DNN partitioning and task offloading in a dynamic MEC scenario where each MD and the server adopt pre-trained DNNs for task inference. Taking advantage of the characteristics of DNNs, we first propose a strategy for layered partitioning of DNN tasks to divide the task of each MD into subtasks that can be either processed on the MD or offloaded to the server for computation. Then, we formulate the trade-off between energy and delay as a joint optimization problem, which is further represented as a Markov decision process (MDP). To solve this, we design a DNN partitioning and task offloading (DPTO) algorithm utilizing deep reinforcement learning (DRL), which enables MDs to make optimal offloading decisions. Finally, experimental results demonstrate that our algorithm outperforms existing non-DRL and DRL algorithms with respect to processing delay and energy consumption, and can be applied to different DNN types. [ABSTRACT FROM AUTHOR]
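Before any learning, the layered-partitioning decision itself is a latency trade-off over cut points; a brute-force sketch with made-up per-layer timings (illustrative only, the DPTO agent learns this choice under changing bandwidth and also weighs energy, rather than enumerating):

```python
def best_partition(layer_local_ms, layer_remote_ms, layer_out_kb, input_kb, bandwidth_kbps):
    """Enumerate cut points k: layers [0, k) run on the mobile device, the
    intermediate activations are uploaded, and layers [k, n) run on the
    edge server (k == n means fully local, so nothing is uploaded).
    Returns (k, total latency in ms)."""
    n = len(layer_local_ms)
    best = None
    for k in range(n + 1):
        local = sum(layer_local_ms[:k])
        remote = sum(layer_remote_ms[k:])
        if k == n:
            transfer = 0.0
        else:
            upload_kb = input_kb if k == 0 else layer_out_kb[k - 1]
            transfer = upload_kb / bandwidth_kbps * 1000.0   # kb / (kb/s) -> ms
        total = local + transfer + remote
        if best is None or total < best[1]:
            best = (k, total)
    return best

k, latency = best_partition(
    layer_local_ms=[20.0, 30.0, 40.0],   # per-layer time on the mobile device
    layer_remote_ms=[2.0, 3.0, 4.0],     # per-layer time on the edge server
    layer_out_kb=[50.0, 5.0, 1.0],       # activation size after each layer
    input_kb=100.0,
    bandwidth_kbps=1000.0,
)
```

With these numbers the best cut is after layer 2, where the activations have shrunk enough that uploading them beats both fully local and fully remote execution; as the bandwidth term changes over time, so does the optimal cut, which is why the problem becomes an MDP.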
- Published
- 2023
- Full Text
- View/download PDF
38. Transhumeral Arm Reaching Motion Prediction through Deep Reinforcement Learning-Based Synthetic Motion Cloning.
- Author
-
Ahmed, Muhammad Hannan, Kutsuzawa, Kyo, and Hayashibe, Mitsuhiro
- Subjects
REINFORCEMENT learning ,DEEP reinforcement learning ,ELBOW joint ,ARTIFICIAL neural networks ,RESIDUAL limbs ,ELBOW ,DATA augmentation - Abstract
The lack of intuitive controllability remains a primary challenge in enabling transhumeral amputees to control a prosthesis for arm reaching with residual limb kinematics. Recent advancements in prosthetic arm control have focused on leveraging the predictive capabilities of artificial neural networks (ANNs) to automate elbow joint motion and wrist pronation–supination during target reaching tasks. However, large quantities of human motion data collected from different subjects for various activities of daily living (ADL) tasks are required to train these ANNs. For example, the reaching motion can be altered when the height of the desk is changed; however, it is cumbersome to conduct human experiments for all conditions. This paper proposes a framework for cloning motion datasets using deep reinforcement learning (DRL) to cater to training data requirements. DRL algorithms have been demonstrated to create human-like synergistic motion in humanoid agents to handle redundancy and optimize movements. In our study, we collected real motion data from six individuals performing multi-directional arm reaching tasks in the horizontal plane. We generated synthetic motion data that mimicked similar arm reaching tasks by utilizing a physics simulation and DRL-based arm manipulation. We then trained a CNN-LSTM network with different configurations of training motion data, including DRL, real, and hybrid datasets, to test the efficacy of the cloned motion data. The results of our evaluation showcase the effectiveness of the cloned motion data in training the ANN to predict natural elbow motion accurately across multiple subjects. Furthermore, motion data augmentation through combining real and cloned motion datasets has demonstrated the enhanced robustness of the ANN by supplementing and diversifying the limited training data. These findings have significant implications for creating synthetic dataset resources for various arm movements and fostering strategies for automatized prosthetic elbow motion. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
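The hybrid-dataset idea in the abstract above — supplementing scarce real recordings with DRL-cloned trajectories before training the predictor — can be sketched as follows. Everything here (function names, the 50/50 mixing ratio, tuple-encoded trajectories) is an illustrative assumption, not the paper's code.

```python
# Hypothetical sketch: mix real and DRL-cloned (synthetic) motion samples into one
# training set; synth_ratio controls the synthetic share of the final dataset.
import random

def build_hybrid_dataset(real, synthetic, synth_ratio=0.5, seed=0):
    """Combine real and synthetic trajectories and shuffle them together."""
    rng = random.Random(seed)
    # number of synthetic samples needed so they make up synth_ratio of the total
    n_synth = int(len(real) * synth_ratio / (1 - synth_ratio))
    picked = rng.sample(synthetic, min(n_synth, len(synthetic)))
    hybrid = list(real) + picked
    rng.shuffle(hybrid)
    return hybrid

real = [("real", i) for i in range(6)]          # e.g., six subjects' recordings
synthetic = [("synth", i) for i in range(100)]  # DRL-generated clones
data = build_hybrid_dataset(real, synthetic, synth_ratio=0.5)
```

In this toy setup, a 0.5 ratio pairs the six real trajectories with six synthetic ones, giving the downstream network a larger and more diverse training pool.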
39. BoxStacker: Deep Reinforcement Learning for 3D Bin Packing Problem in Virtual Environment of Logistics Systems.
- Author
-
Murdivien, Shokhikha Amalana and Um, Jumyung
- Subjects
DEEP reinforcement learning ,BIN packing problem ,ARTIFICIAL neural networks ,REINFORCEMENT learning ,VIRTUAL reality ,ARTIFICIAL intelligence - Abstract
Manufacturing systems need to be resilient and self-organizing to adapt to unexpected disruptions, such as product changes or rush orders in the supply chain, while increasing the automation level of robotized logistics processes to cope with the lack of human experts. Deep Reinforcement Learning is a potential solution to more complex problems, introducing artificial neural networks into Reinforcement Learning. In this paper, a game engine was used for Deep Reinforcement Learning training, allowing the learning and result processes to be visualized more intuitively than with other tools, together with a physics engine for a more realistic problem-solving environment. The present research demonstrates that a Deep Reinforcement Learning model can effectively address the real-time sequential 3D bin packing problem by utilizing a game engine to visualize the environment. The results indicate that this approach holds promise for tackling complex logistical challenges in dynamic settings. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
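For context on the sequential 3D bin packing task that the paper attacks with DRL, a minimal greedy baseline can be sketched: place each arriving axis-aligned box at the first feasible candidate corner. This is an illustrative stand-in, not the authors' method; the candidate-corner heuristic and all names are assumptions.

```python
# Toy sequential 3D bin packing: boxes arrive one at a time and are placed greedily
# at candidate corners (bin origin plus corners of already-placed boxes).
def fits(box, pos, bin_size, placed):
    """Check bounds and overlap for an axis-aligned box at pos."""
    x, y, z = pos; w, d, h = box
    W, D, H = bin_size
    if x + w > W or y + d > D or z + h > H:
        return False
    for (px, py, pz), (pw, pd, ph) in placed:
        # two cuboids collide iff their extents overlap on all three axes
        if (x < px + pw and px < x + w and
                y < py + pd and py < y + d and
                z < pz + ph and pz < z + h):
            return False
    return True

def greedy_pack(boxes, bin_size):
    placed, skipped = [], []
    for box in boxes:
        candidates = [(0, 0, 0)]
        for (px, py, pz), (pw, pd, ph) in placed:
            candidates += [(px + pw, py, pz), (px, py + pd, pz), (px, py, pz + ph)]
        for pos in sorted(candidates):
            if fits(box, pos, bin_size, placed):
                placed.append((pos, box))
                break
        else:
            skipped.append(box)  # no feasible position found
    return placed, skipped

placed, skipped = greedy_pack([(2, 2, 2), (2, 2, 2), (3, 3, 3)], bin_size=(4, 4, 4))
```

A DRL agent replaces the fixed candidate ordering with a learned placement policy; the feasibility check, however, stays essentially the same in any simulated environment.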
40. An Implementation of Actor-Critic Algorithm on Spiking Neural Network Using Temporal Coding Method.
- Author
-
Lu, Junqi, Wu, Xinning, Cao, Su, Wang, Xiangke, and Yu, Huangchao
- Subjects
ARTIFICIAL neural networks ,REINFORCEMENT learning ,TIME-varying networks ,ALGORITHMS ,FLIGHT testing ,GRIDS (Cartography) - Abstract
Featured Application: Rapid decision-making on micro drones. Taking advantage of the faster speed, lower resource consumption, and better biological interpretability of spiking neural networks, this paper developed a novel spiking neural network reinforcement learning method using an actor-critic architecture and temporal coding. A simple improved leaky integrate-and-fire (LIF) model was used to describe the behavior of a spiking neuron. The actor-critic network structure and the update formulas using temporally encoded information were then provided. The model was examined on four tasks: decision-making, a gridworld, a UAV flying through a window, and avoiding a flying basketball. In the 5 × 5 grid map, the learned value function was close to the ideal one and the quickest way from one state to another was found. A UAV trained by this method was able to fly through the window quickly in simulation. An actual flight test of a UAV avoiding a flying basketball was also conducted: with this model, the success rate of the test was 96% and the average decision time was 41.3 ms. The results show the effectiveness and accuracy of the temporally coded spiking neural network RL method. In conclusion, this work attempts to provide insights into developing spiking neural network reinforcement learning methods for decision-making and autonomous control of unmanned systems. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
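The leaky integrate-and-fire (LIF) model named in the abstract can be written in a few lines. This is a generic textbook LIF update with illustrative parameter values, not the paper's specific improved variant.

```python
# One Euler step of a leaky integrate-and-fire neuron: the membrane potential decays
# toward rest, integrates input current, and spikes (then resets) at threshold.
def lif_step(v, i_in, v_rest=0.0, v_thresh=1.0, tau=10.0, dt=1.0):
    """Return (new_membrane_potential, spiked_flag)."""
    dv = (-(v - v_rest) + i_in) * (dt / tau)
    v = v + dv
    if v >= v_thresh:
        return v_rest, True   # emit a spike and reset to rest
    return v, False

# Drive the neuron with a constant current and record spike times.
v, spikes = 0.0, []
for t in range(100):
    v, s = lif_step(v, i_in=1.5)
    if s:
        spikes.append(t)
```

With a constant suprathreshold input the neuron fires periodically; in temporal coding, it is exactly these spike times (rather than spike rates) that carry the information used by the actor-critic updates.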
41. Occupancy Reward-Driven Exploration with Deep Reinforcement Learning for Mobile Robot System.
- Author
-
Kamalova, Albina, Lee, Suk Gyu, and Kwon, Soon Hak
- Subjects
REINFORCEMENT learning ,MOBILE robots ,ARTIFICIAL neural networks ,LINEAR velocity ,ANGULAR velocity - Abstract
This paper investigates the solution to a mobile-robot exploration problem following autonomous driving principles. The exploration task is formulated in this study as the process of building a map while a robot moves in an indoor environment, beginning from full uncertainty. The sequence of robot decisions about how to move defines the exploration strategy that this paper aims to investigate, applying one of the Deep Reinforcement Learning methods known as the Deep Deterministic Policy Gradient (DDPG) algorithm. A custom environment is created representing the mapping process with a map visualization, a robot model, and a reward function. The actor-critic network receives input data from, and sends output actions to, the custom environment. The input is the data from the laser sensor mounted on the robot. The output is the continuous actions of the robot in terms of linear and angular velocities. The training results of this study show the strengths and weaknesses of the DDPG algorithm for the robotic mapping problem. The implementation was developed on the MATLAB platform using its corresponding toolboxes. A comparison with another exploration algorithm is also provided. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
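One ingredient of the DDPG algorithm used above is worth making concrete: the actor and critic each keep a slowly-tracking target copy, updated by Polyak averaging. The sketch below shows that soft update with parameters as plain lists; the tau value is an illustrative assumption.

```python
# DDPG soft (Polyak) target-network update: target <- tau * source + (1 - tau) * target.
# A small tau keeps the target parameters changing slowly, which stabilizes training.
def soft_update(target, source, tau=0.005):
    return [(1 - tau) * t + tau * s for t, s in zip(target, source)]

target = [0.0, 0.0]   # stand-in for target-network parameters
source = [1.0, 2.0]   # stand-in for the trained network's parameters
for _ in range(3):
    target = soft_update(target, source, tau=0.5)
```

After each update the target parameters move a fraction tau of the way toward the source; with tau=0.5 and three updates, they reach 7/8 of the gap.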
42. The Synergy of Double Neural Networks for Bridge Bidding.
- Author
-
Zhang, Xiaoyu, Lin, Rongheng, Bo, Yuchang, and Yang, Fangchun
- Subjects
ARTIFICIAL neural networks ,REINFORCEMENT learning ,BIDS ,BRIDGES ,RULES of games ,ARTIFICIAL intelligence - Abstract
Artificial intelligence (AI) has made many breakthroughs in the perfect information game. Nevertheless, Bridge, a multiplayer imperfect information game, is still quite challenging. Bridge consists of two parts: bidding and playing. Bidding accounts for about 75% of the game and playing for about 25%. Expert-level teams are generally indistinguishable at the playing level, so bidding is the more decisive factor in winning or losing. The two teams can communicate using different systems during the bidding phase. However, existing bridge bidding models focus on at most one bidding system, which does not conform to the real game rules. This paper proposes a deep reinforcement learning model that supports multiple bidding systems, which can compete with players using different bidding systems and exchange hand information normally. The model mainly comprises two deep neural networks: a bid selection network and a state evaluation network. The bid selection network can predict the probabilities of all bids, and the state evaluation network can directly evaluate the optional bids and make decisions based on the evaluation results. Experiments show that the bidding model is not limited by a single bidding system and has superior bidding performance. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
43. Unmanned aerial vehicle–human collaboration route planning for intelligent infrastructure inspection.
- Author
-
Pan, Yue, Li, Linfeng, Qin, Jianjun, Chen, Jin‐Jian, and Gardoni, Paolo
- Subjects
ARTIFICIAL neural networks ,DEEP reinforcement learning ,REINFORCEMENT learning ,INFRASTRUCTURE (Economics) ,COMBINATORIAL optimization - Abstract
Motivated by the strengths of unmanned aerial vehicle (UAV), the UAV–human collaboration route planning (UHCRP) for intelligent infrastructure inspection is a problem worthy of discussion to help reduce human costs and minimize the risk of noninspected infrastructures under limited resources. To facilitate UHCRP, this paper proposes a novel deep reinforcement learning (DRL)‐based approach to handle multi‐source uncertain features and constraints at high speed. To begin with, UHCRP is mathematically described and reformulated as a dual interdependent deep reinforcement learning (diDRL) framework to reflect real‐world scenarios. Afterward, a novel policy network named the attention‐based deep neural network (A‐DNN) is introduced to learn the route planning decisions for the combinatorial optimization problem. In particular, A‐DNN is made up of an encoder and a dual decoder for UAV and human inspection, where the multi‐head attention mechanism is incorporated to generate richer representations for model performance improvement. The performance of the proposed dual multi‐head attention model (DAM) has been tested in simulations and a real‐world case study regarding wind farm inspection. Results indicate that DAM under the sampling decoding strategy can deliver a high‐quality path plan and show better generalizability for larger scale problem sizes compared to the single‐head attention model (SAM), the multi‐head attention model (AM), and two baseline models, namely OR‐Tools and a genetic algorithm. Moreover, DAM trained by randomly generated data can be directly employed to solve the practical problem with standardization of inputs. Overall, DRL integrates decision‐making for inspection method selection and inspected infrastructure selection, providing adaptive and intelligent inspection path planning for UAV and human in complex and dynamic engineering environments. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
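The scaled dot-product attention at the core of the paper's multi-head policy network can be sketched in pure Python (a single head shown; the multi-head version runs several such maps in parallel and concatenates them). All shapes and values here are illustrative.

```python
# Single-head scaled dot-product attention over small Python lists:
# weights = softmax(Q K^T / sqrt(d)), output = weights @ V.
import math

def attention(Q, K, V):
    d = len(Q[0])
    out = []
    for q in Q:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in K]
        m = max(scores)                      # subtract max for numerical stability
        exps = [math.exp(s - m) for s in scores]
        Z = sum(exps)
        weights = [e / Z for e in exps]
        out.append([sum(w * v[j] for w, v in zip(weights, V))
                    for j in range(len(V[0]))])
    return out

Q = [[1.0, 0.0]]                 # one query
K = [[1.0, 0.0], [0.0, 1.0]]     # two keys
V = [[1.0], [0.0]]               # two values
out = attention(Q, K, V)
```

The query aligns with the first key, so the output leans toward the first value (about 0.67 here) without ignoring the second — the soft weighting that lets an encoder-decoder build rich representations of candidate inspection routes.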
44. Machine learning and logic: a new frontier in artificial intelligence.
- Author
-
Ganesh, Vijay, Seshia, Sanjit A., and Jha, Somesh
- Subjects
MACHINE learning ,ARTIFICIAL intelligence ,REINFORCEMENT learning ,ARTIFICIAL neural networks ,LOGIC - Abstract
Machine learning and logical reasoning have been the two foundational pillars of Artificial Intelligence (AI) since its inception, and yet, until recently the interactions between these two fields have been relatively limited. Despite their individual success and largely independent development, there are new problems on the horizon that seem solvable only via a combination of ideas from these two fields of AI. These problems can be broadly characterized as follows: how can learning be used to make logical reasoning and synthesis/verification engines more efficient and powerful, and in the reverse direction, how can we use reasoning to improve the accuracy, generalizability, and trustworthiness of learning. In this perspective paper, we address the above-mentioned questions with an emphasis on certain paradigmatic trends at the intersection of learning and reasoning. Our intent here is not to be a comprehensive survey of all the ways in which learning and reasoning have been combined in the past. Rather we focus on certain recent paradigms where corrective feedback loops between learning and reasoning seem to play a particularly important role. Specifically, we observe the following three trends: first, the use of learning techniques (especially, reinforcement learning) in sequencing, selecting, and initializing proof rules in solvers/provers; second, combinations of inductive learning and deductive reasoning in the context of program synthesis and verification; and third, the use of solver layers in providing corrective feedback to machine learning models in order to help improve their accuracy, generalizability, and robustness with respect to partial specifications or domain knowledge. We believe that these paradigms are likely to have significant and dramatic impact on AI and its applications for a long time to come. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
45. Deep Neural Networks in Power Systems: A Review.
- Author
-
Khodayar, Mahdi and Regan, Jacob
- Subjects
ARTIFICIAL neural networks ,DEEP learning ,REINFORCEMENT learning ,ARTIFICIAL intelligence ,TRENDS ,ELECTRIC power distribution grids - Abstract
Identifying statistical trends for a wide range of practical power system applications, including sustainable energy forecasting, demand response, energy decomposition, and state estimation, is regarded as a significant task given the rapid expansion of power system measurements in terms of scale and complexity. In the last decade, deep learning has arisen as a new kind of artificial intelligence technique that expresses power grid datasets via an extensive hypothesis space, resulting in outstanding performance in comparison with the majority of recent algorithms. This paper investigates the theoretical benefits of deep data representation in the study of power networks. We examine deep learning techniques described and deployed in a variety of supervised, unsupervised, and reinforcement learning scenarios. We explore different scenarios in which discriminative deep frameworks, such as Stacked Autoencoder networks and Convolutional Networks, and generative deep architectures, including Deep Belief Networks and Variational Autoencoders, solve problems. This study's empirical and theoretical evaluation of deep learning encourages long-term studies on improving this modern category of methods to accomplish substantial advancements in the future of electrical systems. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
46. Artificial‐Intelligence‐Powered Lower Limb Assistive Devices: Future of Home Care Technologies.
- Author
-
Mehr, Javad K., Akbari, Mojtaba, Faridi, Pouria, Xing, Hongjun, Mushahwar, Vivian K., and Tavakoli, Mahdi
- Subjects
ASSISTIVE technology ,CENTRAL pattern generators ,ROBOTIC exoskeletons ,ARTIFICIAL neural networks ,HUMAN-robot interaction - Abstract
Healthcare systems are burdened by mobility impairments resulting from aging and neurological conditions. One of the recent advances in robotics is lower limb assistive/rehabilitative devices that can make independent living possible. Nonetheless, some limitations need to be addressed before robotics can be used in home‐based applications. This paper describes the current state of the art in intelligent motion planning and control of lower limb assistive devices, which have addressed some of these challenges. Adaptable central pattern generators and the divergent component of motion are introduced as methods for personalized motion planning based on physical human–robot interaction (pHRI). Uncertainty analysis for neural networks is introduced to increase safety in motion planning based on pHRI. For the case that a user cannot apply physical interaction, a reinforcement‐learning‐based approach is introduced to switch between different modes of walking based on the user's input via a push button embedded in a walker. Moreover, a smart walker is introduced as a device that can be synchronized with the lower limb exoskeleton to assist users with their daily activities. Also, a roadmap for future steps that can make lower limb assistive/rehabilitative devices a good fit for home use is introduced. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
47. Artificial Intelligence Applications in Project Scheduling: A Systematic Review, Bibliometric Analysis, and Prospects for Future Research.
- Author
-
Bahroun, Zied, Tanash, Moayad, As'ad, Rami, and Alnajar, Mohamad
- Subjects
ARTIFICIAL intelligence ,BIBLIOMETRICS ,ARTIFICIAL neural networks ,EVIDENCE gaps ,REINFORCEMENT learning - Abstract
The availability of digital infrastructures and the fast-paced development of accompanying revolutionary technologies have triggered an unprecedented reliance on Artificial Intelligence (AI) techniques both in theory and practice. Within the AI domain, Machine Learning (ML) techniques stand out as essential facilitators, largely enabling machines to possess human-like cognitive and decision-making capabilities. This paper provides a focused review of the literature addressing applications of emerging ML tools to solve various Project Scheduling Problems (PSPs). In particular, it employs bibliometric and network analysis tools along with a systematic literature review to analyze a pool of 104 papers published between 1985 and August 2021. The conducted analysis unveiled the top contributing authors, the most influential papers, as well as the existing research tendencies and thematic research topics within this field of study. A noticeable growth in the number of relevant studies has been seen recently, with a steady increase since 2018. Most of the studies adopted Artificial Neural Networks, Bayesian Networks, and Reinforcement Learning techniques to tackle PSPs under a stochastic environment, where these techniques are frequently hybridized with classical metaheuristics. The majority of works (57%) addressed basic Resource Constrained PSPs and only 15% are devoted to the project portfolio management problem. Furthermore, this study clearly indicates that the application of AI techniques to efficiently handle PSPs is still in its infancy, underscoring the need for further research in this area. This work also identifies current research gaps and highlights a multitude of promising avenues for future research. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
48. Multi-Lane Differential Variable Speed Limit Control via Deep Neural Networks Optimized by an Adaptive Evolutionary Strategy.
- Author
-
Feng, Jianshuai, Shi, Tianyu, Wu, Yuankai, Xie, Xiang, He, Hongwen, and Tan, Huachun
- Subjects
ARTIFICIAL neural networks ,REINFORCEMENT learning ,SPEED limits ,REWARD (Psychology) ,TRAVEL time (Traffic engineering) ,GLOBAL optimization - Abstract
In advanced transportation-management systems, variable speed limits are a crucial application. Deep reinforcement learning methods have been shown to have superior performance in many applications, as they are an effective approach to learning environment dynamics for decision-making and control. However, they face two significant difficulties in traffic-control applications: reward engineering with delayed reward and brittle convergence properties with gradient descent. To address these challenges, evolutionary strategies are well suited as a class of black-box optimization techniques inspired by natural evolution. Additionally, the traditional deep reinforcement learning framework struggles to handle the delayed reward setting. This paper proposes a novel approach using the covariance matrix adaptation evolution strategy (CMA-ES), a gradient-free global optimization method, to handle the task of multi-lane differential variable speed limit control. The proposed method uses a deep-learning-based method to dynamically learn optimal and distinct speed limits among lanes. The parameters of the neural network are sampled using a multivariate normal distribution, and the dependencies between the variables are represented by a covariance matrix that is optimized dynamically by CMA-ES based on the freeway's throughput. The proposed approach is tested on a freeway with simulated recurrent bottlenecks, and the experimental results show that it outperforms deep reinforcement learning-based approaches, traditional evolutionary search methods, and the no-control scenario. Our proposed method demonstrates a 23% improvement in average travel time and an average 4% improvement in CO, HC, and NOx emissions. Furthermore, the proposed method produces explainable speed limits and has desirable generalization power. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
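CMA-ES adapts a full covariance matrix across generations; the deliberately simplified, gradient-free (mu, lambda) evolution strategy below, with a fixed diagonal covariance, only illustrates the sample-evaluate-recombine loop the paper relies on. The toy quadratic objective stands in for the freeway-throughput reward, and all parameter values are assumptions.

```python
# Simplified evolution strategy: sample candidates from a normal distribution around
# the current mean, keep the elite, and recombine them into the new mean.
import random

def es_minimize(f, mean, sigma=2.0, pop=20, elite=5, iters=100, seed=1):
    rng = random.Random(seed)
    for _ in range(iters):
        samples = [[m + rng.gauss(0, sigma) for m in mean] for _ in range(pop)]
        samples.sort(key=f)                      # best (lowest objective) first
        best = samples[:elite]
        mean = [sum(s[j] for s in best) / elite  # recombine elite into new mean
                for j in range(len(mean))]
    return mean

# Toy objective: distance to hypothetical "optimal speed limits" of (60, 80) km/h.
opt = es_minimize(lambda x: (x[0] - 60) ** 2 + (x[1] - 80) ** 2, mean=[50.0, 50.0])
```

Because only objective values are compared, the loop needs no gradients — the property that lets the paper optimize network weights directly against delayed, non-differentiable throughput rewards.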
49. An AI-Based Power Reserve Control Strategy for Photovoltaic Power Generation Systems Participating in Frequency Regulation of Microgrids.
- Author
-
Zhou, Sihan, Qin, Liang, Ruan, Jiangjun, Wang, Jing, Liu, Haofeng, Tang, Xu, Wang, Xiaole, and Liu, Kaipei
- Subjects
PHOTOVOLTAIC power generation ,PHOTOVOLTAIC power systems ,ARTIFICIAL intelligence ,REINFORCEMENT learning ,MAXIMUM power point trackers ,MICROGRIDS - Abstract
In this paper, a novel AI-based power reserve control strategy is proposed for photovoltaic (PV) power generation systems participating in the frequency regulation (FR) of microgrids. The proposed strategy uses a frequency response module to determine the target power reserve ratio of the PV system based on microgrid frequency deviation, as well as a power reserve control module to obtain the target duty cycle, which is input to the BOOST converter. The use of artificial neural networks (ANN) in the power reserve control module enables the PV system to work at a specified power reserve ratio, producing appropriate power and mitigating frequency fluctuations in the microgrid. Additionally, a deep reinforcement learning (DRL) algorithm is employed as the decision maker for variable step-size control and initial power reserve ratio determination. Simulations were performed to validate the effectiveness of the proposed method, demonstrating a significant reduction in average frequency deviation by 72.36% when subjected to random variations in irradiance intensity and load conditions. Overall, the proposed AI-based power reserve control strategy has good potential for practical applications in real-world microgrids, promoting the absorption of new energy led by PV and reducing the phenomenon of light abandonment. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
50. Deep Learning Movement Intent Decoders Trained With Dataset Aggregation for Prosthetic Limb Control.
- Author
-
Dantas, Henrique, Warren, David J., Wendelken, Suzanne M., Davis, Tyler S., Clark, Gregory A., and Mathews, V John
- Subjects
MULTILAYER perceptrons ,ARTIFICIAL neural networks ,DEEP learning ,ARTIFICIAL hands ,KALMAN filtering ,BIOMEDICAL signal processing - Abstract
Significance: The performance of traditional approaches to decoding movement intent from electromyograms (EMGs) and other biological signals commonly degrades over time. Furthermore, conventional algorithms for training neural network based decoders may not perform well outside the domain of the state transitions observed during training. The work presented in this paper mitigates both these problems, resulting in an approach that has the potential to substantially improve the quality of life of people with limb loss. Objective: This paper presents and evaluates the performance of four decoding methods for volitional movement intent from intramuscular EMG signals. Methods: The decoders are trained using the dataset aggregation (DAgger) algorithm, in which the training dataset is augmented during each training iteration based on the decoded estimates from previous iterations. Four competing decoding methods, namely polynomial Kalman filters (KFs), multilayer perceptron (MLP) networks, convolutional neural networks (CNN), and long short-term memory (LSTM) networks, were developed. The performances of the four decoding methods were evaluated using EMG datasets recorded from two human volunteers with transradial amputation. Short-term analyses, in which the training and cross-validation data came from the same dataset, and long-term analyses, in which the training and testing were done in different datasets, were performed. Results: Short-term analyses of the decoders demonstrated that CNN and MLP decoders performed significantly better than KF and LSTM decoders, showing an improvement of up to 60% in the normalized mean-square decoding error in cross-validation tests. Long-term analyses indicated that the CNN, MLP, and LSTM decoders performed significantly better than a KF-based decoder at most analyzed cases of temporal separations (0–150 days) between the acquisition of the training and testing datasets. 
Conclusion: The short-term and long-term performances of MLP- and CNN-based decoders trained with DAgger demonstrated their potential to provide more accurate and naturalistic control of prosthetic hands than alternate approaches. [ABSTRACT FROM AUTHOR]
- Published
- 2019
- Full Text
- View/download PDF
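The DAgger training loop the paper uses can be sketched abstractly: each iteration rolls out the *current* policy, relabels the visited states with expert actions, aggregates them into the dataset, and retrains. The 1-D "environment", expert, and nearest-neighbour learner below are toy stand-ins for the EMG decoding task, not the paper's setup.

```python
# Dataset aggregation (DAgger): train on states the learner actually visits,
# labeled with the expert's actions, so the policy sees its own state distribution.
def expert(state):
    return 1 if state < 5 else -1   # expert pushes the state toward 5

class NNPolicy:
    """Trivial 1-nearest-neighbour 'decoder' over (state, action) pairs."""
    def __init__(self):
        self.data = []
    def fit(self, data):
        self.data = list(data)
    def act(self, state):
        if not self.data:
            return 0
        s, a = min(self.data, key=lambda sa: abs(sa[0] - state))
        return a

def dagger(iters=5, horizon=10):
    policy, dataset = NNPolicy(), []
    for _ in range(iters):
        state = 0
        for _ in range(horizon):
            dataset.append((state, expert(state)))  # relabel visited state with expert action
            state += policy.act(state)              # but follow the *learner's* action
        policy.fit(dataset)                         # retrain on the aggregated dataset
    return policy

policy = dagger()
```

Because later iterations add states reached under the learner's own (imperfect) actions, the final policy behaves correctly even off the expert's trajectory — the property the paper exploits to keep decoders robust outside the state transitions seen in initial training.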