Fast, parallel implementation of particle filtering on the GPU architecture

EURASIP Journal on Advances in Signal Processing (2013)

Abstract
In this paper, we introduce a modified cellular particle filter (CPF) that we mapped onto a graphics processing unit (GPU) architecture. We developed this filter adaptation from a state-of-the-art CPF technique. Mapping this filter realization onto a highly parallel architecture entailed a shift in the logical representation of the particles: the original two-dimensional organization is reordered as a one-dimensional ring topology. We carried out proof-of-concept measurements on two models with an NVIDIA Fermi architecture GPU. For a bearing-only tracking model with 16,384 particles and a neighbourhood size of 256, this design achieved a kernel time of 411 μs per state and a global running time of 77 ms for a sequence of 24 states. For a commonly used benchmark model at the same configuration, we achieved a kernel time of 266 μs per state and a global running time of 124 ms for all 100 states. Kernel time includes random number generation on the GPU with curand. These results attest to the effective and fast use of the particle filter in high-dimensional, real-time applications.
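The one-dimensional ring topology mentioned above can be illustrated with a small CPU-side sketch. It is not the paper's GPU kernel: it assumes that each particle resamples only from a window of neighbours centred on its own position on the ring, with wrap-around at the ends. The function names `ring_neighbourhood` and `local_resample` and the toy sizes are hypothetical, chosen for illustration only.

```python
import random


def ring_neighbourhood(i, n_particles, size):
    """Indices of the `size` particles around position i, wrapping on the ring."""
    half = size // 2
    return [(i + k) % n_particles for k in range(-half, size - half)]


def local_resample(particles, weights, size, rng):
    """Assumed local resampling: each slot draws its replacement from its own
    ring neighbourhood, with probability proportional to the neighbours' weights."""
    n = len(particles)
    resampled = []
    for i in range(n):
        idx = ring_neighbourhood(i, n, size)
        local_weights = [weights[j] for j in idx]
        chosen = rng.choices(idx, weights=local_weights, k=1)[0]
        resampled.append(particles[chosen])
    return resampled


# Toy configuration (the abstract reports 16,384 particles and a neighbourhood of 256).
rng = random.Random(0)
particles = [float(i) for i in range(16)]
weights = [1.0 if i == 5 else 1e-12 for i in range(16)]
resampled = local_resample(particles, weights, size=4, rng=rng)
```

Because every slot reads only a fixed-size window, the work per particle is independent of the total particle count, which is the property that makes this kind of layout attractive for one-thread-per-particle execution.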
Keywords
Graphic Processing Unit,Shared Memory,Particle Filter,Neighbourhood Size,Global Memory