Federated reinforcement learning approach for detecting uncertain deceptive target using autonomous dual UAV system

Haythem Bany Salameh, Mohannad Alhafnawi, Ala'eddin Masadeh, Yaser Jararweh

Information Processing & Management (2023)

Abstract

This paper develops a cooperative federated reinforcement learning (RL) strategy that enables two unmanned aerial vehicles (UAVs) to cooperate in learning and predicting the movements of an intelligent deceptive target in a given search area. The proposed strategy allows the UAVs to cooperate autonomously by exchanging their gained experience, which maximizes target detection performance and accelerates learning while maintaining privacy. Specifically, we consider a monitoring model that includes a search area, a charging station, two cooperative UAVs, an intelligent, deceptive, uncertain moving target, and a fake (false) target. Each UAV is equipped with a limited-capacity rechargeable battery and a communication unit for exchanging the gained experience. The problem of maximizing the detection probability of the uncertain deceptive target using cooperative UAVs is mathematically modeled as a search-benefit maximization problem, which is then reformulated as a Markov decision process (MDP) due to the uncertain nature of the problem. Because there is no prior information on the targets' movement, cooperative RL is used to tackle the problem. The proposed cooperative RL-based algorithm is a distributed collaborative mechanism that enables the two UAVs, i.e., the agents, to interact individually with the operating environment and maximize their cumulative rewards by converging to a shared policy while preserving privacy. Simulation results indicate that a cooperative RL-based dual UAV system can noticeably improve the target detection probability, reduce the detection performance, and accelerate the learning speed.
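
The idea of two agents learning locally and converging to a shared policy can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: tabular Q-learning as the local learner, element-wise averaging of Q-tables as the aggregation step, a four-cell search area, a target that moves in a fixed cycle, and a state defined as the target's last known cell. Battery limits, the charging station, and the fake target are omitted.

```python
import random

N_CELLS = 4                       # hypothetical search area split into 4 cells
ALPHA, GAMMA, EPS = 0.5, 0.9, 0.2  # learning rate, discount, exploration rate


def target_next(cell):
    """Hypothetical target motion (unknown to the agents): a fixed cycle."""
    return (cell + 1) % N_CELLS


def new_q():
    """Q-table indexed as q[state][action], initialised to zero."""
    return [[0.0] * N_CELLS for _ in range(N_CELLS)]


def local_round(q, rng, steps=200):
    """One UAV runs epsilon-greedy Q-learning on its own copy of the table.

    State  = target's last known cell (a simplifying assumption).
    Action = cell the UAV searches next; reward 1 if the target is there.
    """
    cell = rng.randrange(N_CELLS)
    for _ in range(steps):
        state = cell
        if rng.random() < EPS:
            action = rng.randrange(N_CELLS)
        else:
            action = max(range(N_CELLS), key=lambda a: q[state][a])
        cell = target_next(cell)
        reward = 1.0 if action == cell else 0.0
        td_target = reward + GAMMA * max(q[cell])
        q[state][action] += ALPHA * (td_target - q[state][action])
    return q


def federated_average(q_tables):
    """Aggregate by element-wise mean; only Q-values are exchanged,
    never the raw trajectories each agent observed."""
    n = len(q_tables)
    return [[sum(q[s][a] for q in q_tables) / n for a in range(N_CELLS)]
            for s in range(N_CELLS)]


def train(rounds=20, seed=0):
    """Alternate local learning on two agents with table averaging."""
    rng = random.Random(seed)
    shared = new_q()
    for _ in range(rounds):
        local_tables = [local_round([row[:] for row in shared], rng)
                        for _ in range(2)]
        shared = federated_average(local_tables)
    return shared


def greedy_policy(q):
    """Cell to search in each state under the shared Q-table."""
    return [max(range(N_CELLS), key=lambda a: q[s][a]) for s in range(N_CELLS)]
```

Calling `greedy_policy(train())` returns, for each last known target cell, the cell the shared policy would search next. Averaging whole tables is only one possible aggregation rule; it is used here because it keeps each agent's experience local while still producing a single shared policy.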

Keywords

Cooperative learning, Federated learning, Artificial intelligence, Emerging UAV, Indoor environment
