Code Compression

Abstract
Current research in compiler optimization counts mainly CPU time and perhaps the first cache level or two. This view has been important but is becoming myopic, at least from a system-wide viewpoint, as the ratio of network and disk speeds to CPU speeds grows exponentially. For example, we have seen the CPU idle for most of the time during paging, so compressing pages can increase total performance even though the CPU must decompress or interpret the page contents. Another profile shows that many functions are called just once, so reduced paging could pay for their interpretation overhead. This paper describes:

• Measurements that show how code compression can save space and total time in some important real-world scenarios.

• A compressed executable representation that is roughly the same size as gzipped x86 programs and can be interpreted without decompression. It can also be compiled to high-quality machine code at 2.5 megabytes per second on a 120MHz Pentium processor.

• A compressed "wire" representation that must be decompressed before execution but is, for example, roughly 21% the size of SPARC code when compressing gcc.

For correspondence: {jens,todd,will}@cs.arizona.edu, Dept of Computer Science, Gould Simpson Building, Tucson, AZ 85721; {cwfraser,steveluc}@microsoft.com, One Microsoft Way, Redmond, WA 98052. Originally published in Proceedings of the ACM SIGPLAN '97 Conference on Programming Language Design and Implementation (PLDI): xxx-yyy. Copyright © 1997 by the Association for Computing Machinery, Inc.

Introduction

Computer programs are delivered to the CPU via networks, disks, and caches, all of which can be bottlenecks. In some important scenarios, it can be significantly faster to send compressed code that is then interpreted or decompressed and executed. This fact is self-evident when delivering code over 28.8kbaud modems, but it can be true for faster networks, for paging from disk, and even for cache misses if the decompressor is fast enough.

We consider two important bottlenecks: transmission and memory. When transmission is the bottleneck, we want the best possible compression, and we can afford to expand the compressed program before executing it. We call such codes "wire" codes because a wire is the bottleneck. When memory is the bottleneck, the code (at least seldom-used code) must be stored and interpreted in compressed form. Code includes jumps and calls, so we need random access to at least the basic blocks. If some code must be compiled to run fast enough, the JIT (just-in-time) compilation rate must be very high. When both transmission and memory are bottlenecks, it may make sense to decompress a wire code into a compressed interpretable form. The literature on general-purpose data compression [Bell et al.] offers many techniques.
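The random-access requirement above is the key structural constraint on an in-memory compressed code: a jump or call may land on any basic block, so each block must be expandable independently. The following is a minimal sketch of that idea, not the paper's actual encoding: it compresses every block separately with zlib (a stand-in general-purpose compressor) and records per-block sizes so a single block can be expanded on demand. The block contents, sizes, and the pack/unpack helpers are all invented for illustration.

/* Sketch: per-basic-block compression with a size index, so code can
 * stay compressed in memory yet be expanded one block at a time when
 * control flow reaches it. Compile with: cc blockdemo.c -lz */
#include <stdio.h>
#include <string.h>
#include <zlib.h>

#define NBLOCKS 3
#define MAXBLK  256

static unsigned char store[NBLOCKS][MAXBLK]; /* compressed blocks */
static uLongf clen[NBLOCKS];                 /* compressed sizes  */
static uLong  ulen[NBLOCKS];                 /* original sizes    */

/* compress each basic block independently (error checks elided) */
static void pack(const unsigned char blk[][MAXBLK], const uLong len[]) {
    for (int i = 0; i < NBLOCKS; i++) {
        clen[i] = MAXBLK;
        ulen[i] = len[i];
        compress(store[i], &clen[i], blk[i], len[i]);
    }
}

/* random access: expand just one block, e.g. on a jump into it */
static uLong unpack(int i, unsigned char *out) {
    uLongf outlen = ulen[i];
    uncompress(out, &outlen, store[i], clen[i]);
    return outlen;
}

int main(void) {
    unsigned char blocks[NBLOCKS][MAXBLK];
    uLong len[NBLOCKS] = { 64, 96, 48 };
    for (int i = 0; i < NBLOCKS; i++)
        memset(blocks[i], 0x90 + i, len[i]);  /* stand-in "code" bytes */

    pack(blocks, len);
    for (int i = 0; i < NBLOCKS; i++)
        printf("block %d: %lu -> %lu bytes\n", i, len[i], (unsigned long)clen[i]);

    unsigned char out[MAXBLK];
    uLong n = unpack(1, out);                 /* decompress only block 1 */
    printf("expanded block 1: %lu bytes, first byte 0x%02x\n",
           (unsigned long)n, out[0]);
    return 0;
}

A real system would replace zlib with a specialized encoding like those described below and would index compressed blocks by branch-target address rather than by array position.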
Our tasks have been mainly to find combinations of techniques that suit the specialized problem of compressing virtual machine (VM) code, and to determine how to generate compact automata that accurately predict the next VM operator or operand based on the current context, so that tokens common in the current context can be given the shortest encodings. This paper concerns only VM code, though some of the techniques clearly apply to machine-specific code as well.

This paper describes two code compressors, the best that we've found for each of our two scenarios. The compressors are quite different, but both gather information about the common patterns that appear in the code, and both divide the stream of code into several smaller streams, one holding the operators and one holding the literal operands for each operator (or class of related operators) that needs a literal operand. A sketch of this stream splitting and context-based encoding appears after the list below. The compressors are:

• A wire VM code that yields programs almost one-fifth the size of SPARC code.

• An interpretable VM code called "Byte-coded RISC" or "BRISC," which is roughly 30% larger than the wire format but still about the same size as non-interpretable gzipped x86 programs. We can interpret BRISC code with a typical 12x time penalty while cutting working-set size by over 40%. Alternately, we can compile BRISC at over 2.5 megabytes per second, producing x86 machine code over 100 times faster than, for example, all commercial JIT compilers known to us. This high compilation rate permits us to recompile the program at each execution for clients with no local disk cache. The delivery time from the network or disk can mask some or even all of the recompilation time, and the code runs within 1.08x of the speed of fully optimized machine code generated by Microsoft Visual C++ 5.0. BRISC can also trim memory requirements for large desktop applications and compress programs to fit within the memory requirements of embedded systems.

Both codes support client-side and server-side compilation. Server-side compilation is necessary to efficiently deliver large application programs. For example, existing JIT compilers must allocate registers on the client, which is expensive and, for the best results, super-linear in the length of the input program. By performing code optimization before a program is downloaded, a mobile code system can dramatically reduce the time necessary to generate machine code on the client.
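Both compressors rest on the two ideas above: splitting the instruction stream into separate operator and literal-operand streams, and giving the shortest encodings to tokens that are common in the current context. The sketch below illustrates both on a toy VM, using a per-context move-to-front transform as a simple stand-in for the paper's predictive automata: opcodes that commonly follow the current opcode drift to the front of that context's list and receive small ranks, which a back-end variable-length coder would encode in few bits. The opcode space, instruction format, and function names are all invented for illustration.

/* Sketch: split a toy VM stream into an operator stream and a literal
 * stream, and apply a per-context (= previous opcode) move-to-front
 * transform to the operators. Not the paper's actual BRISC encoding. */
#include <stdio.h>
#include <string.h>

#define NOPS 4                    /* toy opcode space: 0..3 */

static unsigned char mtf[NOPS][NOPS]; /* one MTF list per context */

static void mtf_init(void) {
    for (int c = 0; c < NOPS; c++)
        for (int i = 0; i < NOPS; i++)
            mtf[c][i] = (unsigned char)i;
}

/* encode opcode `op` in context `ctx`: return its rank, move to front */
static int mtf_encode(int ctx, int op) {
    int i;
    for (i = 0; mtf[ctx][i] != op; i++) ;
    memmove(&mtf[ctx][1], &mtf[ctx][0], (size_t)i);
    mtf[ctx][0] = (unsigned char)op;
    return i;                     /* small rank => short code downstream */
}

int main(void) {
    /* toy instruction stream: (opcode, literal) pairs */
    struct { int op; int lit; } code[] = {
        {0,5}, {1,0}, {0,7}, {1,0}, {0,9}, {2,0}, {3,100}, {0,2}, {1,0}
    };
    int n = (int)(sizeof code / sizeof code[0]);

    int ops[16], lits[16], nlits = 0;
    mtf_init();
    int ctx = 0;
    for (int i = 0; i < n; i++) {
        ops[i] = mtf_encode(ctx, code[i].op);   /* operator stream */
        if (code[i].op == 0 || code[i].op == 3) /* these opcodes take literals */
            lits[nlits++] = code[i].lit;        /* separate literal stream */
        ctx = code[i].op;
    }

    printf("operator ranks: ");
    for (int i = 0; i < n; i++) printf("%d ", ops[i]);
    printf("\nliteral stream: ");
    for (int i = 0; i < nlits; i++) printf("%d ", lits[i]);
    printf("\n");
    return 0;
}

Running this prints an operator stream dominated by ranks 0 and 1 once the common opcode successions have been seen, while the literals sit in their own stream where a separate numeric coder can exploit their very different statistics.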