Fast, parallel implementation of particle filtering on the GPU architecture

Anna Gelencsér-Horváth, Gábor János Tornai, András Horváth, György Cserey

EURASIP Journal on Advances in Signal Processing (2013)

Abstract

In this paper, we introduce a modified cellular particle filter (CPF) that we mapped onto a graphics processing unit (GPU) architecture. We developed this adaptation from a state-of-the-art CPF technique. Mapping the filter onto a highly parallel architecture entailed a shift in the logical representation of the particles: the original two-dimensional organization is reordered as a one-dimensional ring topology. We performed proof-of-concept measurements on two models with an NVIDIA Fermi architecture GPU. For a bearing-only tracking model, the design achieved a 411-μs kernel time per state and a 77-ms global running time for all states, using 16,384 particles with a neighbourhood size of 256 on a sequence of 24 states. For a commonly used benchmark model at the same configuration, we achieved a 266-μs kernel time per state and a 124-ms global running time for all 100 states. Kernel time includes random number generation on the GPU with cuRAND. These results attest to the effective and fast use of the particle filter in high-dimensional, real-time applications.
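The idea of a one-dimensional ring topology with a fixed neighbourhood size can be illustrated with a minimal sketch. This is not the authors' GPU implementation: it is a sequential Python illustration, and the function names `ring_neighbourhood` and `local_resample`, the centred window, and the weight-proportional ancestor draw are all assumptions made for the example.

```python
import random


def ring_neighbourhood(i, n_particles, size):
    """Indices of the `size` particles around particle i on a ring.

    Assumption: the window starts size // 2 positions before i and
    wraps around at both ends of the particle array.
    """
    half = size // 2
    return [(i - half + k) % n_particles for k in range(size)]


def local_resample(weights, size, rng):
    """For each particle, draw one ancestor from its ring neighbourhood.

    Ancestors are drawn with probability proportional to the weights of
    the neighbours only, so no global sum over all particles is needed.
    Assumption: every neighbourhood has at least one positive weight.
    """
    n = len(weights)
    ancestors = []
    for i in range(n):
        idx = ring_neighbourhood(i, n, size)
        local_w = [weights[j] for j in idx]
        ancestors.append(rng.choices(idx, weights=local_w, k=1)[0])
    return ancestors


if __name__ == "__main__":
    rng = random.Random(0)
    weights = [rng.random() + 0.1 for _ in range(16)]
    print(local_resample(weights, size=4, rng=rng))
```

Because each particle reads only a fixed-size window of neighbours, the loop body is independent across particles, which is the property that makes this kind of local resampling a natural fit for one thread per particle on a GPU.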

Keywords

Graphic Processing Unit, Shared Memory, Particle Filter, Neighbourhood Size, Global Memory
