Multi-view reinforcement learning for sequential decision-making with insufficient state information

International Journal of Machine Learning and Cybernetics（2024）

引用 0|浏览5

暂无评分

摘要

Most reinforcement learning methods describe sequential decision-making as a Markov decision process where the effect of action is only decided by the current state. But this is reasonable only if the state is correctly defined and the state information is sufficiently observed. Thus the learning efficiency of reinforcement learning methods based on Markov decision process is limited when the state information is insufficient. Partially observable Markov decision process and history-based decision process are respectively proposed to describe sequential decision-making with insufficient state information. However, these two processes are easy to ignore the important information from the current observed state. Therefore, the learning efficiency of reinforcement learning methods based on these two processes is also limited when the state information is insufficient. In this paper, we propose a multi-view reinforcement learning method to solve this problem. The motivation is that the interaction information between the agent and its environment should be considered from the views of history, present, and future to overcome the insufficiency of state information. Based on these views, we construct a multi-view decision process to describe sequential decision-making with insufficient state information. A multi-view reinforcement learning method is proposed by combining the multi-view decision process and the actor-critic framework. In the proposed method, multi-view clustering is performed to ensure that each type of sample can be sufficiently exploited. Experiments illustrate that the proposed method is more effective than the compared state-of-the-arts. The source code can be downloaded from https://github.com/jamieliuestc/MVRL.

查看译文

关键词

Multi-view reinforcement learning,Multi-view clustering,Insufficient state information,Multi-view decision process

AI 理解论文

溯源树

样例

生成溯源树，研究论文发展脉络

Chat Paper

正在生成论文摘要