iCFP: Tolerating all-level cache misses in in-order processors
Subject
pipeline processing
Runahead execution
all-level cache
in-order continual flow pipeline
in-order pipelines
in-order processors
miss-independent instructions
multipass pipelining
register dependence tracking scheme
register file
Abstract
Growing concerns about power have revived interest in in-order pipelines. In-order pipelines sacrifice single-thread performance. Specifically, they do not allow execution to flow freely around data cache misses. As a result, they have difficulty overlapping independent misses with one another. Previously proposed techniques like Runahead execution and Multipass pipelining have attacked this problem. In this paper, we go a step further and introduce iCFP (in-order Continual Flow Pipeline), an adaptation of the CFP concept to an in-order processor. When iCFP encounters a primary data cache or L2 miss, it checkpoints the register file and transitions into an "advance" execution mode. Miss-independent instructions execute as usual and even update register state. Miss-dependent instructions are diverted into a slice buffer, unblocking the pipeline latches. When the miss returns, iCFP "rallies" and executes the contents of the slice buffer, merging miss-dependent state with miss-independent state along the way. An enhanced register dependence tracking scheme and a novel store buffer design facilitate the merging process. Cycle-level simulations show that iCFP outperforms Runahead, Multipass, and SLTP, another non-blocking in-order pipeline design.
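The advance and rally flow described in the abstract can be illustrated with a small functional sketch. This is a toy model under stated assumptions, not the paper's design: it handles a single outstanding load miss, treats every non-load instruction as an addition of its source registers, and ignores timing, the store buffer, and checkpoint recovery. The names `Instr`, `run_icfp`, `poisoned`, and `last_writer` are hypothetical, and the per-register last-writer sequence number is only a simple stand-in for the register dependence tracking scheme the abstract mentions.

```python
from dataclasses import dataclass


@dataclass
class Instr:
    dest: str            # destination register name
    srcs: tuple          # source register names
    op: str = "add"      # "add" or "load_miss" (hypothetical op names)


def run_icfp(program, initial_regs, miss_value):
    """Toy advance/rally model; returns (final registers, deferred sequence numbers)."""
    regs = dict(initial_regs)
    poisoned = set()      # registers whose value depends on the pending miss
    last_writer = {}      # register -> sequence number of its latest writer
    slice_buffer = []     # deferred (seq, instr, captured ready operands)

    # Advance mode: miss-independent instructions execute, dependents are diverted.
    for seq, ins in enumerate(program):
        last_writer[ins.dest] = seq
        if ins.op == "load_miss":
            poisoned.add(ins.dest)
            slice_buffer.append((seq, ins, {}))
        elif any(s in poisoned for s in ins.srcs):
            # Capture the operands that are ready now; they may be overwritten later.
            captured = {s: regs[s] for s in ins.srcs if s not in poisoned}
            poisoned.add(ins.dest)
            slice_buffer.append((seq, ins, captured))
        else:
            regs[ins.dest] = sum(regs[s] for s in ins.srcs)
            poisoned.discard(ins.dest)   # an independent write clears the poison

    # Rally: the miss has returned; replay the slice buffer in program order.
    slice_vals = {}       # values produced by replayed slice instructions
    for seq, ins, captured in slice_buffer:
        if ins.op == "load_miss":
            value = miss_value
        else:
            value = sum(captured[s] if s in captured else slice_vals[s]
                        for s in ins.srcs)
        slice_vals[ins.dest] = value
        # Merge: update the register file only if no younger instruction wrote dest.
        if last_writer[ins.dest] == seq:
            regs[ins.dest] = value

    return regs, [seq for seq, _, _ in slice_buffer]


program = [
    Instr("r1", (), "load_miss"),   # 0: load that misses in the cache
    Instr("r2", ("r1", "r3")),      # 1: miss-dependent, diverted to the slice buffer
    Instr("r4", ("r3", "r3")),      # 2: miss-independent
    Instr("r3", ("r4", "r4")),      # 3: miss-independent, overwrites a captured source
    Instr("r2", ("r4", "r3")),      # 4: miss-independent, younger writer of r2
]
final_regs, deferred = run_icfp(program, {"r1": 0, "r2": 0, "r3": 1, "r4": 0}, 10)
```

In this example, instruction 1 is replayed during the rally but does not update `r2`, because instruction 4 is the younger writer; the final register state matches what a strictly in-order execution of the same program would produce.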