AgileWatts: An Energy-Efficient CPU Core Idle-State Architecture for Latency-Sensitive Server Applications

2022 55th IEEE/ACM International Symposium on Microarchitecture (MICRO), 2022

Abstract
User-facing applications running in modern datacenters exhibit irregular request patterns and are implemented using a multitude of services with tight latency requirements (30–250$\mu$s). These characteristics render existing energy-conserving techniques ineffective when processors are idle, because of the long transition time (on the order of 100$\mu$s) from a deep CPU core idle power state (C-state). While prior works propose management techniques to mitigate this inefficiency, we tackle it at its root with AgileWatts (AW): a new deep CPU core C-state architecture optimized for datacenter server processors targeting latency-sensitive applications.

AW drastically reduces the transition latency from deep CPU core idle power states while retaining most of their power savings, based on three key ideas. First, AW eliminates the latency (several microseconds) of saving/restoring the core context when powering the core off/on in a deep idle state by i) implementing medium-grained power-gates, carefully distributed across the CPU core, and ii) retaining context in the power-ungated domain. Second, AW eliminates the flush latency (several tens of microseconds) of the L1/L2 caches when entering a deep idle state by keeping L1/L2 content power-ungated. A small control logic also remains ungated to serve cache coherence traffic. AW implements cache sleep-mode and leakage reduction for the power-ungated domain by lowering a core's voltage to the minimum operational level. Third, using a state-of-the-art power-efficient all-digital phase-locked loop (ADPLL) clock generator, AW keeps the PLL active and locked during the idle state, cutting microseconds of wake-up latency at negligible power cost.

Our evaluation with an accurate industrial-grade simulator calibrated against an Intel Skylake server shows that AW reduces the energy consumption of Memcached by up to 71% (35% on average) with less than 1% end-to-end performance degradation. We observe similar trends for other evaluated services (MySQL and Kafka). AW's new deep C-states C6A and C6AE reduce transition time by up to 900$\times$ compared to the deepest existing idle state C6, while consuming only 7% and 5% of the active state (C0) power, respectively.
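
The trade-off described in the abstract can be illustrated with a rough back-of-the-envelope model. The sketch below is not the paper's methodology or simulator. It assumes a simplified energy model in which a core pays full C0 power during the state transition and a fraction of C0 power while resident in the idle state, and in which the state is not entered at all when the idle gap is shorter than the transition time. The function name `idle_energy_uj`, the normalized 1 W active power, and the C6 power fraction are hypothetical values chosen only for illustration; the 100 µs C6 latency, the 900x best-case speedup, and the 7% C6A power figure come from the abstract.

```python
# Simplified idle-energy model (illustrative assumptions, not the paper's method).

C0_POWER_W = 1.0                              # hypothetical normalized active power
C6_TRANSITION_US = 100.0                      # "order of 100 us" per the abstract
C6A_TRANSITION_US = C6_TRANSITION_US / 900.0  # best case: "up to 900x" faster
C6_POWER_FRAC = 0.02                          # ASSUMPTION: not given in the abstract
C6A_POWER_FRAC = 0.07                         # 7% of C0 power per the abstract


def idle_energy_uj(gap_us, transition_us, idle_power_frac, c0_power_w=C0_POWER_W):
    """Energy in microjoules spent over one idle gap of length gap_us.

    Assumed model: if the gap is no longer than the transition time, the
    state is not entered and the core burns C0 power for the whole gap.
    Otherwise the transition is charged at C0 power and the remaining
    time at idle_power_frac * C0 power.
    """
    if gap_us <= transition_us:
        return c0_power_w * gap_us
    resident_us = gap_us - transition_us
    return c0_power_w * transition_us + idle_power_frac * c0_power_w * resident_us


if __name__ == "__main__":
    # Hypothetical idle gaps, picked to sit near the latency range in the abstract.
    for gap_us in (30.0, 100.0, 250.0):
        c6 = idle_energy_uj(gap_us, C6_TRANSITION_US, C6_POWER_FRAC)
        c6a = idle_energy_uj(gap_us, C6A_TRANSITION_US, C6A_POWER_FRAC)
        print(f"gap={gap_us:6.1f} us  C6={c6:8.2f} uJ  C6A={c6a:8.2f} uJ")
```

Under these assumptions, short idle gaps cannot amortize a transition on the order of 100 µs, so a legacy deep state saves nothing, while a state with a much shorter transition spends almost the whole gap at its low residency power. The exact numbers depend entirely on the assumed parameters.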
Keywords
Energy Efficiency, Power Management, Latency-Sensitive Applications