Rethinking Out-of-Distribution Detection for Reinforcement Learning: Advancing Methods for Evaluation and Detection
arXiv (2024)
Abstract
While reinforcement learning (RL) algorithms have been successfully applied
across numerous sequential decision-making problems, their generalization to
unforeseen testing environments remains a significant concern. In this paper,
we study the problem of out-of-distribution (OOD) detection in RL, which
focuses on identifying situations at test time that RL agents have not
encountered in their training environments. We first propose a clarification of
terminology for OOD detection in RL, which aligns it with the literature from
other machine learning domains. We then present new benchmark scenarios for OOD
detection, which introduce anomalies with temporal autocorrelation into
different components of the agent-environment loop. We argue that such
scenarios have been understudied in the current literature, despite their
relevance to real-world situations. Confirming our theoretical predictions, our
experimental results suggest that state-of-the-art OOD detectors are not able
to identify such anomalies. To address this problem, we propose a novel method
for OOD detection, which we call DEXTER (Detection via Extraction of Time
Series Representations). By treating environment observations as time series
data, DEXTER extracts salient time series features, and then leverages an
ensemble of isolation forest algorithms to detect anomalies. We find that
DEXTER can reliably identify anomalies across benchmark scenarios, exhibiting
superior performance compared to both state-of-the-art OOD detectors and
high-dimensional changepoint detectors adopted from statistics.
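
The following is a minimal sketch, not the paper's benchmark or DEXTER's actual feature set, of why temporally autocorrelated anomalies can evade per-step statistics while remaining visible to time series features. It assumes a one-dimensional observation-noise signal and compares uncorrelated noise with AR(1) noise of the same marginal standard deviation. The helper names (`ar1_noise`, `lag1_autocorr`, `window_features`), the AR(1) coefficient of 0.9, and the choice of three features are all illustrative assumptions.

```python
import random
import statistics


def ar1_noise(n, phi, sigma, rng):
    """Generate n samples of AR(1) noise with marginal std sigma (|phi| < 1).

    The innovation std is scaled by sqrt(1 - phi^2) so that the stationary
    marginal distribution is the same for every phi; only the temporal
    dependence changes.
    """
    innovation_sd = sigma * (1.0 - phi ** 2) ** 0.5
    x = rng.gauss(0.0, sigma)  # draw the first sample from the stationary law
    samples = []
    for _ in range(n):
        samples.append(x)
        x = phi * x + rng.gauss(0.0, innovation_sd)
    return samples


def lag1_autocorr(xs):
    """Sample lag-1 autocorrelation of a sequence."""
    m = statistics.fmean(xs)
    num = sum((a - m) * (b - m) for a, b in zip(xs, xs[1:]))
    den = sum((a - m) ** 2 for a in xs)
    return num / den


def window_features(xs):
    """A tiny, hypothetical time series feature vector for one window."""
    return {
        "mean": statistics.fmean(xs),
        "std": statistics.pstdev(xs),
        "lag1_autocorr": lag1_autocorr(xs),
    }


rng = random.Random(0)
# Nominal case (assumed): uncorrelated noise.
white_feats = window_features(ar1_noise(2000, phi=0.0, sigma=1.0, rng=rng))
# Anomalous case (assumed): same marginal std, strong temporal correlation.
corr_feats = window_features(ar1_noise(2000, phi=0.9, sigma=1.0, rng=rng))

print("uncorrelated  :", white_feats)
print("autocorrelated:", corr_feats)
```

In this setup the two windows have similar standard deviations, so a detector that scores each observation independently has little to distinguish them, whereas the lag-1 autocorrelation feature separates them clearly. According to the abstract, DEXTER passes its extracted time series features to an ensemble of isolation forests; this sketch stops before that step.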