FuseFPS: Accelerating Farthest Point Sampling with Fusing KD-tree Construction for Point Clouds
arXiv (2023)
Abstract
Point cloud analytics has become a critical workload for embedded and mobile
platforms across various applications. Farthest point sampling (FPS) is a
fundamental and widely used kernel in point cloud processing. However, the
heavy external memory access makes FPS a performance bottleneck for real-time
point cloud processing. Although bucket-based farthest point sampling can
significantly reduce unnecessary memory accesses during the point sampling
stage, the KD-tree construction stage becomes the predominant contributor to
execution time. In this paper, we present FuseFPS, an architecture and
algorithm co-design for bucket-based farthest point sampling. We first propose
a hardware-friendly sampling-driven KD-tree construction algorithm. The
algorithm fuses the KD-tree construction stage into the point sampling stage,
further reducing memory accesses. Then, we design an efficient accelerator for
bucket-based point sampling. The accelerator can offload the entire
bucket-based FPS kernel at a low hardware cost. Finally, we evaluate our
approach on various point cloud datasets. The detailed experiments show that
compared to the state-of-the-art accelerator QuickFPS, FuseFPS achieves about
4.3× and about 6.1× improvements in speed and power efficiency,
respectively.
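
For readers unfamiliar with the kernel, the following is a minimal sketch of plain (non-bucketed) farthest point sampling. It is not the FuseFPS algorithm or the QuickFPS design; it only illustrates why the baseline is memory-bound: every sampling iteration re-reads all points to update their distance to the selected set. The function names `squared_distance` and `farthest_point_sampling` and the `start_index` parameter are illustrative assumptions, not taken from the paper.

```python
from typing import List, Sequence

Point = Sequence[float]


def squared_distance(a: Point, b: Point) -> float:
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def farthest_point_sampling(
    points: List[Point], num_samples: int, start_index: int = 0
) -> List[int]:
    """Return indices of `num_samples` points chosen by baseline FPS.

    Illustrative sketch only: the seed point (`start_index`) is an
    assumption; implementations often pick it at random.
    """
    if not 0 < num_samples <= len(points):
        raise ValueError("num_samples must be in 1..len(points)")

    # min_dist[i] = squared distance from point i to the nearest selected point.
    min_dist = [float("inf")] * len(points)
    selected: List[int] = []
    current = start_index

    for _ in range(num_samples):
        selected.append(current)
        # Full pass over the point cloud: this is the per-iteration memory
        # traffic that bucket-based approaches aim to reduce.
        for i, p in enumerate(points):
            d = squared_distance(p, points[current])
            if d < min_dist[i]:
                min_dist[i] = d
        # The next sample is the point farthest from the selected set.
        current = max(range(len(points)), key=min_dist.__getitem__)

    return selected


cloud = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (10.0, 0.0, 0.0), (5.0, 0.0, 0.0)]
print(farthest_point_sampling(cloud, 3))  # prints [0, 2, 3]
```

With N points and S samples, this baseline performs S full passes over the data. Bucket-based variants, as described in the abstract, partition the points with a KD-tree so that most buckets can be skipped in each iteration.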