Sparse Variational Deterministic Policy Gradient for Continuous Real-Time Control.

IEEE Transactions on Industrial Electronics (2021)

Abstract
Recent advances in deep reinforcement learning for real control tasks have attracted interest from researchers and field engineers in a variety of industrial areas. However, in most cases, the optimal policies obtained by deep reinforcement learning are difficult to implement on cost-effective, lightweight platforms such as mobile devices because of their computational complexity and excessive memory usage. For this reason, this article proposes an off-policy deep reinforcement learning algorithm called the sparse variational deterministic policy gradient (SVDPG). SVDPG provides highly efficient policy-network compression within the standard reinforcement learning framework. It integrates Bayesian pruning, a state-of-the-art neural network compression technique, with the policy update of an actor-critic architecture. SVDPG is shown to achieve a high compression rate of policy networks on continuous-control benchmark tasks while preserving competitive performance. Its suitability for low-computing-power devices is demonstrated by comparing memory requirements and computation time on a commercial microcontroller unit. Finally, the proposed SVDPG is confirmed to be reliable in real-world scenarios by applying it to the swing-up control of an inverted pendulum system.
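
The abstract describes SVDPG as combining a deterministic policy-gradient actor-critic update with Bayesian pruning of the policy network. The sketch below is a minimal illustration of that idea, not the authors' implementation: it pairs a DDPG-style actor loss with sparse variational dropout layers (Molchanov et al., 2017), whose per-weight dropout rates can be thresholded to prune the policy network. The layer sizes, KL weight, and pruning threshold are illustrative assumptions; the critic, replay buffer, and target networks are omitted.

```python
# Minimal sketch of Bayesian-pruning-style policy compression in an actor-critic
# setting, assuming a PyTorch environment. Not the paper's code.
import torch
import torch.nn as nn
import torch.nn.functional as F


class SparseVDLinear(nn.Module):
    """Linear layer with per-weight log-variance for sparse variational dropout."""

    def __init__(self, in_features, out_features):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_features, in_features) * 0.05)
        self.bias = nn.Parameter(torch.zeros(out_features))
        # log sigma^2 of the multiplicative Gaussian noise on each weight
        self.log_sigma2 = nn.Parameter(torch.full((out_features, in_features), -10.0))

    def log_alpha(self):
        # alpha = sigma^2 / w^2; a large alpha means the weight is effectively dropped
        return torch.clamp(self.log_sigma2 - 2.0 * torch.log(self.weight.abs() + 1e-8), -10.0, 10.0)

    def forward(self, x):
        if self.training:
            # local reparameterization: sample the noisy pre-activations
            mean = F.linear(x, self.weight, self.bias)
            std = torch.sqrt(F.linear(x * x, torch.exp(self.log_sigma2)) + 1e-8)
            return mean + std * torch.randn_like(mean)
        # at inference, zero out weights whose dropout rate exceeds a threshold (assumed: 3.0)
        mask = (self.log_alpha() < 3.0).float()
        return F.linear(x, self.weight * mask, self.bias)

    def kl(self):
        # Molchanov et al. (2017) approximation of the KL term for variational dropout
        k1, k2, k3 = 0.63576, 1.87320, 1.48695
        la = self.log_alpha()
        neg_kl = k1 * torch.sigmoid(k2 + k3 * la) - 0.5 * F.softplus(-la) - k1
        return -neg_kl.sum()


class Actor(nn.Module):
    """Deterministic policy network built from prunable layers."""

    def __init__(self, state_dim, action_dim, hidden=64):
        super().__init__()
        self.l1 = SparseVDLinear(state_dim, hidden)
        self.l2 = SparseVDLinear(hidden, action_dim)

    def forward(self, state):
        return torch.tanh(self.l2(F.relu(self.l1(state))))

    def kl(self):
        return self.l1.kl() + self.l2.kl()


def actor_loss(actor, critic, states, kl_weight=1e-4):
    """Deterministic policy gradient objective plus the sparsity-inducing KL penalty."""
    actions = actor(states)
    return -critic(states, actions).mean() + kl_weight * actor.kl()
```

As a rough usage check, `Actor(state_dim=3, action_dim=1)` can be evaluated on a batch of random states and its `kl()` inspected; after training, the fraction of weights with `log_alpha() >= 3.0` gives the achievable pruning rate under these assumptions.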
Keywords
Bayes methods,Machine learning,Neural networks,Learning (artificial intelligence),Computational modeling,Standards,Optimization,Bayesian compression,deep reinforcement learning,inverted pendulum system,sparse Bayesian deep learning