Designing Efficient Asynchronous Memory Operations Using Hardware Copy Engine: A Case Study with I/OAT

IPDPS (2007)

Abstract
Memory copies for bulk data transport incur large overheads due to CPU stalling, small register-size data movement, etc. Intel's I/O Acceleration Technology (I/OAT) offers an asynchronous memory copy engine in kernel space which alleviates such overheads. In this paper, we propose a set of designs for asynchronous memory operations in user space, both for a single process (as an offloaded memcpy()) and for IPC using the copy engine. We analyze our design based on overlap efficiency, performance, and cache utilization. Our microbenchmark results show that using the copy engine for performing memory copies can achieve close to 87% overlap with computation. Further, the copy engine improves the copy latency of bulk memory data transfers by 50% and avoids cache pollution effects. With the emergence of multi-core architectures, support for asynchronous memory operations holds a lot of promise in reducing the gap between memory and processor performance.
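The start/compute/wait pattern behind an offloaded memcpy() can be illustrated with a minimal sketch. This is not the paper's design and does not use I/OAT: a helper pthread stands in for the hardware copy engine, and the names `async_copy_t`, `async_memcpy_start`, `async_memcpy_wait`, and `overlap_demo` are hypothetical, chosen only for illustration. Because the helper thread still copies with the CPU, the sketch shows only how a caller might overlap a copy with computation, not the latency or cache-pollution benefits reported above.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical handle for one in-flight copy; not an I/OAT structure. */
typedef struct {
    pthread_t worker;   /* helper thread standing in for the copy engine */
    void *dst;
    const void *src;
    size_t len;
} async_copy_t;

/* Runs on the helper thread and performs the actual copy. */
static void *copy_worker(void *arg) {
    async_copy_t *c = arg;
    assert(c != NULL);
    memcpy(c->dst, c->src, c->len);
    return NULL;
}

/* Start the copy and return immediately. Returns 0 on success. */
int async_memcpy_start(async_copy_t *c, void *dst, const void *src, size_t len) {
    c->dst = dst;
    c->src = src;
    c->len = len;
    return pthread_create(&c->worker, NULL, copy_worker, c);
}

/* Block until the copy has completed. Returns 0 on success. */
int async_memcpy_wait(async_copy_t *c) {
    return pthread_join(c->worker, NULL);
}

/* Usage: overlap a 1 MiB copy with unrelated computation.
 * Returns 0 if the destination matches the source afterwards. */
int overlap_demo(void) {
    enum { N = 1 << 20 };
    static char src[N], dst[N];
    async_copy_t c;
    volatile unsigned long acc = 0;
    unsigned long i;

    memset(src, 0x5a, N);
    if (async_memcpy_start(&c, dst, src, N) != 0)
        return -1;
    for (i = 0; i < 1000000UL; i++)   /* work overlapped with the copy */
        acc += i;
    if (async_memcpy_wait(&c) != 0)
        return -1;
    return memcmp(dst, src, N) == 0 ? 0 : 1;
}
```

The caller must not read `dst` or modify `src` between the start and wait calls; the same constraint would apply to any asynchronous copy interface, whether the copy is carried out by a thread or by a hardware engine.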
Keywords
acceleration, space technology, data transfer, pollution, hardware, linear predictive coding, inter process communication, computer architecture, engines, dma, kernel