Cognitive Optimal-Setting Control of AIoT Industrial Applications With Deep Reinforcement Learning

IEEE Transactions on Industrial Informatics (2021)

Cited by 26 | Views 100
Abstract
For industrial applications of the Artificial Intelligence of Things (AIoT), mechanical control usually affects overall product output and the production schedule. Recently, more and more engineers have applied deep reinforcement learning to mechanical control to improve company profits. However, overfitting often occurs during the training stage of deep reinforcement learning, which leads to accidental control actions and increases the risk of overcontrol. To address this problem, this article proposes an expected advantage learning method that moderates the maximum expectation value of expectation-based deep reinforcement learning for industrial applications. In the tanh softmax policy, the sigmoid function is replaced with the tanh function as the activation for the softmax values, so that the proposed expectation-based method can reduce value overfitting in cognitive computing. In the experiments, the Deep Q-Network (DQN) algorithm, the advantage learning (AL) algorithm, and the proposed expected advantage learning method were evaluated in every episode on four criteria: total score, total steps, average score, and highest score. Compared with the AL algorithm, the total score of the proposed expected advantage learning method increased by 6% over the same number of training runs. This shows that the action probability distribution of the proposed expected advantage learning method outperforms the traditional softmax strategy for optimal-setting control of industrial applications.
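The core idea in the abstract — squashing action values with tanh before the softmax so that no single overestimated Q-value can dominate the action distribution — can be sketched as follows. This is a minimal illustration only: the function name, the temperature parameter, and the exact placement of tanh are assumptions, not the authors' published formulation.

```python
import math

def tanh_softmax_policy(q_values, temperature=1.0):
    """Hypothetical sketch of a tanh softmax action policy.

    tanh bounds each Q-value to (-1, 1), so an overestimated (overfit)
    action-value cannot push its softmax probability arbitrarily close
    to 1, which moderates the maximum expectation value.
    """
    squashed = [math.tanh(q) / temperature for q in q_values]
    m = max(squashed)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in squashed]
    total = sum(exps)
    return [e / total for e in exps]
```

With an ordinary softmax, a Q-value of 10 versus competitors near 0.5 and 2.0 would receive almost all of the probability mass; under the tanh-squashed version its probability stays bounded, keeping exploration alive during training.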
Keywords
Cognitive learning, deep reinforcement learning, expectation-based method, overfitting