Date of this Version
Andrew D. Hilton, Santosh Nagarakatte, and Amir Roth, "iCFP: Tolerating all-level cache misses in in-order processors", . February 2009.
Growing concerns about power have revived interest in in-order pipelines. In-order pipelines sacrifice single-thread performance. Specifically, they do not allow execution to flow freely around data cache misses. As a result, they have difficulties overlapping independent misses with one another. Previously proposed techniques like Runahead execution and Multipass pipelining have attacked this problem. In this paper, we go a step further and introduce iCFP (in-order Continual Flow Pipeline), an adaptation of the CFP concept to an in-order processor. When iCFP encounters a primary data cache or 12 miss, it checkpoints the register file and transitions into an "advance " execution mode. Miss-independent instructions execute as usual and even update register state. Miss- dependent instructions are diverted into a slice buffer, un-blocking the pipeline latches. When the miss returns, iCFP "rallies" and executes the contents of the slice buffer, merging miss-dependent state with miss- independent state along the way. An enhanced register dependence tracking scheme and a novel store buffer design facilitate the merging process. Cycle-level simulations show that iCFP out-performs Runahead, Multipass, and SLTP, another non-blocking in-order pipeline design.
multiprocessing systems, pipeline processing, Runahead execution, all-level cache, in-order continual flow pipeline, in-order pipelines, in-order processors, miss-independent instructions, multipass pipelining, register dependence tracking scheme, register file
Date Posted: 08 June 2009