Generalized Predictive Model for Autonomous Driving
CVPR 2024
Abstract
In this paper, we introduce the first large-scale video prediction model in
the autonomous driving discipline. To avoid the constraint of high-cost data
collection and strengthen the generalization ability of our model, we acquire
massive data from the web and pair it with diverse, high-quality text
descriptions. The resulting dataset contains over 2000 hours of driving
videos, spanning areas all over the world with diverse weather conditions and
traffic scenarios. Inheriting the merits from recent latent diffusion models,
our model, dubbed GenAD, handles the challenging dynamics in driving scenes
with novel temporal reasoning blocks. We show that it can generalize to
various unseen driving datasets in a zero-shot manner, surpassing general or
driving-specific video prediction counterparts. Furthermore, GenAD can be
adapted into an action-conditioned prediction model or a motion planner,
holding great potential for real-world driving applications.