Conservative and Risk-Aware Offline Multi-Agent Reinforcement Learning for Digital Twins
CoRR (2024)
Abstract
Digital twin (DT) platforms are increasingly regarded as a promising
technology for controlling, optimizing, and monitoring complex engineering
systems such as next-generation wireless networks. An important challenge in
adopting DT solutions is their reliance on data collected offline, lacking
direct access to the physical environment. This limitation is particularly
severe in multi-agent systems, for which conventional multi-agent reinforcement
learning (MARL) requires online interactions with the environment. A direct application
of online MARL schemes to an offline setting would generally fail due to the
epistemic uncertainty entailed by the limited availability of data. In this
work, we propose an offline MARL scheme for DT-based wireless networks that
integrates distributional RL and conservative Q-learning to address the
environment's inherent aleatoric uncertainty and the epistemic uncertainty
arising from limited data. To further exploit the offline data, we adapt the
proposed scheme to the centralized training decentralized execution framework,
allowing joint training of the agents' policies. The proposed MARL scheme,
referred to as multi-agent conservative quantile regression (MA-CQR), addresses
general risk-sensitive design criteria and is applied to the trajectory
planning problem in drone networks, showcasing its advantages.
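The abstract combines two ingredients: distributional RL via quantile regression (to capture aleatoric uncertainty) and a conservative Q-learning (CQL) penalty (to counter epistemic uncertainty from limited offline data). The sketch below is not the paper's MA-CQR algorithm; it is a minimal, single-agent illustration of the two generic building blocks, with hypothetical function names, using the standard QR-DQN quantile Huber loss and the standard CQL log-sum-exp penalty.

```python
import numpy as np

def quantile_huber_loss(td_errors, taus, kappa=1.0):
    """Quantile-regression Huber loss (QR-DQN style).

    td_errors: per-quantile TD errors, shape (num_quantiles,)
    taus: quantile fractions in (0, 1), same shape
    """
    abs_err = np.abs(td_errors)
    huber = np.where(abs_err <= kappa,
                     0.5 * td_errors ** 2,
                     kappa * (abs_err - 0.5 * kappa))
    # Asymmetric weighting: over-/under-estimation penalized per quantile.
    weight = np.abs(taus - (td_errors < 0).astype(float))
    return float(np.mean(weight * huber))

def cql_penalty(q_values, dataset_action):
    """Conservative penalty: log-sum-exp over all actions minus the
    Q-value of the action actually seen in the offline dataset.
    Minimizing it pushes Q down on out-of-distribution actions."""
    logsumexp = np.log(np.sum(np.exp(q_values)))
    return float(logsumexp - q_values[dataset_action])

def cvar_from_quantiles(quantile_values, alpha):
    """Risk-sensitive criterion: CVaR at level alpha, estimated as the
    mean of the worst alpha-fraction of (sorted) return quantiles."""
    sorted_q = np.sort(quantile_values)
    k = max(1, int(np.ceil(alpha * len(sorted_q))))
    return float(np.mean(sorted_q[:k]))
```

In an offline objective these terms would typically be combined as `quantile_huber_loss(...) + alpha_cql * cql_penalty(...)`, with the policy extracted by maximizing a risk measure such as `cvar_from_quantiles` instead of the mean return; the weighting and the multi-agent centralized-training extension are specific to the paper.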