Enabling the Reflex Plane with the nanoPU

Stephen Ibanez,Alex Mallery,Serhat Arslan,Theo Jepsen,Muhammad Shahbaz,Changhoon Kim,Nick McKeown

arxiv（2022）

引用 0|浏览40

暂无评分

摘要

Many recent papers have demonstrated fast in-network computation using programmable switches, running many orders of magnitude faster than CPUs. The main limitation of writing software for switches is the constrained programming model and limited state. In this paper we explore whether a new type of CPU, called the nanoPU, offers a useful middle ground, with a familiar C/C++ programming model, and potentially many terabits/second of packet processing on a single chip, with an RPC response time less than 1 $\mu$s. To evaluate the nanoPU, we prototype and benchmark three common network services: packet classification, network telemetry report processing, and consensus protocols on the nanoPU. Each service is evaluated using cycle-accurate simulations on FPGAs in AWS. We found that packets are classified 2$\times$ faster and INT reports are processed more than an order of magnitude quickly than state-of-the-art approaches. Our production quality Raft consensus protocol, running on the nanoPU, writes to a 3-way replicated key-value store (MICA) in 3 $\mu$s, twice as fast as the state-of-the-art, with 99\% tail latency of only 3.26 $\mu$s. To understand how these services can be combined, we study the design and performance of a {\em network reflex plane}, designed to process telemetry data, make fast control decisions, and update consistent, replicated state within a few microseconds.

查看译文

关键词

reflex plane

AI 理解论文

溯源树

样例

生成溯源树，研究论文发展脉络

Chat Paper

正在生成论文摘要