Technical Reports (CIS)

Document Type

Technical Report

Date of this Version

December 2004

Comments

University of Pennsylvania Department of Computer and Information Science Technical Report No. MS-CIS-04-28.

Abstract

The effectiveness of static code optimizations--including static optimizations performed "just-in-time"--is limited by some basic constraints: (i) a limited number of logical registers, (ii) a function- or region-bounded optimization scope, and (iii) the requirement that transformations be valid along all possible paths.

RENO is a modified MIPS-R10000 style register renaming mechanism augmented with physical register reference counting that uses map-table "short-circuiting" to implement dynamic versions of several well-known static optimizations: move elimination, common subexpression elimination, register allocation, and constant folding. Because it implements these optimizations dynamically, RENO can overcome some of the limitations faced by static compilers and apply optimizations where static compilers cannot. RENO has many more registers at its disposal--the entire physical register file. Its optimizations naturally cross function boundaries and any other compilation-region boundary. And RENO performs optimizations along the dynamic path without being affected by other, non-taken paths. If the dynamic path proves incorrect due to misspeculation, RENO optimizations are naturally rolled back along with the code they optimize.
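As an illustration of map-table short-circuiting combined with reference counting, the following is a minimal Python sketch of move elimination. It is a toy model under stated assumptions, not the report's implementation: the class `Renamer`, its method names, and the choice to release the old mapping immediately at rename (a real pipeline would defer this until the overwriting instruction commits) are all hypothetical.

```python
class Renamer:
    """Toy renamer: a map table plus physical-register reference counts.

    Hypothetical model for illustration only; names and policies are
    assumptions, not taken from the report.
    """

    def __init__(self, num_logical, num_physical):
        # Initially, logical register i maps to physical register i.
        self.map_table = list(range(num_logical))
        self.ref_count = [0] * num_physical
        for preg in self.map_table:
            self.ref_count[preg] += 1
        self.free_list = [p for p in range(num_physical)
                          if self.ref_count[p] == 0]

    def _release(self, preg):
        # Simplification: release at rename time rather than at commit.
        self.ref_count[preg] -= 1
        if self.ref_count[preg] == 0:
            self.free_list.append(preg)

    def rename_move(self, dst, src):
        """Eliminate 'move dst, src' by sharing src's physical register."""
        preg = self.map_table[src]
        self.ref_count[preg] += 1      # one more logical name refers to preg
        old = self.map_table[dst]
        self.map_table[dst] = preg     # short-circuit: no new register, no execution
        self._release(old)
        return preg

    def rename_dest(self, dst):
        """Conventional renaming: allocate a fresh physical register."""
        preg = self.free_list.pop(0)
        self.ref_count[preg] = 1
        old = self.map_table[dst]
        self.map_table[dst] = preg
        self._release(old)
        return preg
```

In this sketch a shared physical register returns to the free list only when its count reaches zero, which is why reference counting is needed once two logical registers can name the same physical register.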

RENO unifies several previously proposed optimizations: dynamic move elimination [14] (RENOME), register integration [24] (RENOCSE), and speculative memory bypassing (the dynamic counterpart of register allocation) [14, 21, 22, 24] (RENORA). To this union, we add a new optimization: RENOCF, a dynamic version of constant folding. RENOCF extends the map-table from logical-register --> [physical-register] to logical-register --> [physical-register : displacement]. RENOCF uses this extended map-table format to eliminate register-immediate additions--which account for a surprisingly high fraction of the dynamic instructions in SPECint and MediaBench programs--and fuse them to dependent instructions. The most common fusion scenario is the fusion of a register-immediate addition to another addition, e.g., a memory address calculation. RENOCF implements this fusion essentially "for free" using 3-input adders.
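The extended map-table format can be sketched in a few lines of Python. This is a hedged illustration, assuming a naive register allocator and unbounded displacements; the names `FoldingRenamer`, `rename_addi`, `rename_load`, and `address` are hypothetical, and reference counting and displacement-width limits are omitted.

```python
class FoldingRenamer:
    """Toy map table: logical register -> (physical register, displacement).

    Hypothetical sketch; not the report's implementation.
    """

    def __init__(self, num_logical):
        self.map_table = [(i, 0) for i in range(num_logical)]
        self.next_preg = num_logical   # naive allocator (assumption)

    def rename_addi(self, dst, src, imm):
        """Fold 'addi dst, src, imm' into the map table; nothing executes."""
        preg, disp = self.map_table[src]
        self.map_table[dst] = (preg, disp + imm)

    def rename_load(self, dst, base, offset):
        """Rename 'load dst, offset(base)', carrying the pending displacement."""
        base_preg, disp = self.map_table[base]
        new_preg = self.next_preg
        self.next_preg += 1
        self.map_table[dst] = (new_preg, 0)
        # The renamed load names its base register and two immediates.
        return (new_preg, base_preg, disp, offset)


def address(regfile, renamed_load):
    """Model of a 3-input adder: base value + displacement + offset."""
    _, base_preg, disp, offset = renamed_load
    return regfile[base_preg] + disp + offset
```

For example, renaming `addi r1, r1, 8` followed by `load r3, 16(r1)` leaves the addition unexecuted; the load computes its address from the original physical register plus both immediates.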

The RENO mechanism works solely with physical register names and immediate values; it does not read or write the physical register file or use any non-immediate values for any purpose. This isolated structure allows us to implement RENO within a two-stage renaming pipeline.

Cycle-level simulation shows that RENO can dynamically eliminate or fold 22% of the dynamic instructions in both SPECint2000 and MediaBench; RENOCF itself is responsible for 12% and 16%, respectively. Because dataflow dependences are collapsed around eliminated instructions, RENO improves performance by averages of 8% and 13%, respectively. Alternatively, because eliminated instructions do not consume issue queue entries, physical registers, or issue, bypass, register file, and execution bandwidth, RENO can be used to absorb the performance impact of a significantly scaled-down execution core.

Date Posted: 04 August 2005