Search

Your search for the keyword "policy gradient" returned 421 results.

Search Results

1. Relabeling and policy distillation of hierarchical reinforcement learning.

2. A multi-step on-policy deep reinforcement learning method assisted by off-policy policy evaluation.

3. Experimental Implementation of a TD3 Agent Based Speed Controller for Direct Torque Control of PMSM Drives.

4. Sample complexity of variance-reduced policy gradient: weaker assumptions and lower bounds.

5. Investigating the Efficacy of Deep Reinforcement Learning Models in Detecting and Mitigating Cyber-attacks: a Novel Approach.

6. Multi-agent cooperative electronic countermeasure method based on reinforcement learning.

7. A policy gradient hyper-heuristic algorithm for solving the capacitated vehicle routing problem.

8. Anti-conflict AGV path planning in automated container terminals based on multi-agent reinforcement learning.

9. Landscape Analysis of Stochastic Policy Gradient Methods

10. Enhancing Adversarial Robustness for Deep Metric Learning via Attention-Aware Knowledge Guidance

11. A Reinforcement Learning Framework for Lung Segmentation of COVID-19 and Pneumonia Affected Chest X-Ray Image

13. Enhancing Policy Gradient for Traveling Salesman Problem with Data Augmented Behavior Cloning

16. Reinforce Model Tracklet for Multi-Object Tracking

17. List-Based Workflow Scheduling Utilizing Deep Reinforcement Learning

19. Reinforcement learning with dynamic convex risk measures.

20. Optimal Power Allocation in Optical GEO Satellite Downlinks Using Model-Free Deep Learning Algorithms.

21. Adaptive bias-variance trade-off in advantage estimator for actor–critic algorithms.

22. Vision-based control in the open racing car simulator with deep and reinforcement learning.

23. Reinforcement learning-based cost-sensitive classifier for imbalanced fault classification.

24. FeMIP: detector-free feature matching for multimodal images with policy gradient.

25. An action-stable update algorithm for policy gradient algorithms in continuous time.

26. Combining Neural Networks with Logic Rules.

27. Modeling limit order trading with a continuous action policy for deep reinforcement learning.

28. Reinforced mixture learning.

29. Regret Analysis of a Markov Policy Gradient Algorithm for Multiarm Bandits.

30. Exploring the Use of Invalid Action Masking in Reinforcement Learning: A Comparative Study of On-Policy and Off-Policy Algorithms in Real-Time Strategy Games.

31. Block Policy Mirror Descent.

32. Credit assignment with predictive contribution measurement in multi-agent reinforcement learning.

33. A task allocation algorithm based on reinforcement learning in spatio-temporal crowdsourcing.

34. DrugEx v3: scaffold-constrained drug design with graph transformer-based reinforcement learning

35. Standardising policy and technology responses in the immediate aftermath of a pandemic: a comparative and conceptual framework

36. Policy Gradient for Arabic to English Neural Machine Translation

37. RLPassGAN: Password Guessing Model Based on GAN with Policy Gradient

38. Policy Gradient Reinforcement Learning Method for Backward Motion Control of Tractor-Trailer Mobile Robot

39. An Open Domain Question Answering System Trained by Reinforcement Learning

40. Robust reinforcement learning algorithm based on pigeon-inspired optimization

41. Policy gradient adaptive dynamic programming for nonlinear discrete-time zero-sum games with unknown dynamics.

42. Decentralized multi-task reinforcement learning policy gradient method with momentum over networks.

43. Multi-label sequence generating model via label semantic attention mechanism.

44. Reinforcement Learning-Based Approach for Minimizing Energy Loss of Driving Platoon Decisions.

45. Human Pathogenic Monkeypox Disease Recognition Using Q-Learning Approach.

46. A policy search method based on particle swarm optimization and deep reinforcement learning.

47. Research on Wargame Decision-Making Method Based on Multi-Agent Deep Deterministic Policy Gradient.

48. Deep Reinforcement Learning for Aircraft Longitudinal Control Augmentation System.

49. An Initial Residual Stress Inference Method by Incorporating Monitoring Data and Mechanism Model

50. DrugEx v3: scaffold-constrained drug design with graph transformer-based reinforcement learning.
