iCFP: Tolerating all-level cache misses in in-order processors

Hilton, Andrew D; Nagarakatte, Santosh; Roth, Amir

Files

Hilton_2009.pdf (249.81 KB)

Penn collection

Departmental Papers (CIS)

Subject

multiprocessing systems
pipeline processing
Runahead execution
all-level cache
in-order continual flow pipeline
in-order pipelines
in-order processors
miss-independent instructions
multipass pipelining
register dependence tracking scheme
register file

Permalink

https://repository.upenn.edu/handle/20.500.14332/6459

Author

Hilton, Andrew D

Nagarakatte, Santosh

Roth, Amir

Abstract

Growing concerns about power have revived interest in in-order pipelines. In-order pipelines sacrifice single-thread performance. Specifically, they do not allow execution to flow freely around data cache misses. As a result, they have difficulties overlapping independent misses with one another. Previously proposed techniques like Runahead execution and Multipass pipelining have attacked this problem. In this paper, we go a step further and introduce iCFP (in-order Continual Flow Pipeline), an adaptation of the CFP concept to an in-order processor. When iCFP encounters a primary data cache or L2 miss, it checkpoints the register file and transitions into an "advance" execution mode. Miss-independent instructions execute as usual and even update register state. Miss-dependent instructions are diverted into a slice buffer, unblocking the pipeline latches. When the miss returns, iCFP "rallies" and executes the contents of the slice buffer, merging miss-dependent state with miss-independent state along the way. An enhanced register dependence tracking scheme and a novel store buffer design facilitate the merging process. Cycle-level simulations show that iCFP outperforms Runahead, Multipass, and SLTP, another non-blocking in-order pipeline design.
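The advance and rally modes described in the abstract can be illustrated with a toy software model. This is a minimal sketch under stated assumptions, not the design from the paper: the `Instr` record, the `misses` flag, the per-register poison set, the `last_writer` table, and the capture of ready source values at deferral time are all hypothetical simplifications chosen for illustration. Stores, the store buffer, timing, and checkpoint recovery are not modeled.

```python
from dataclasses import dataclass


@dataclass
class Instr:
    op: str               # "load" (reads memory[imm]) or "add" (sum of srcs + imm)
    dest: str
    srcs: tuple = ()
    imm: int = 0
    misses: bool = False  # hypothetical flag: this load misses in the cache


def execute(ins, vals, memory):
    """Compute the result of one instruction from the given source values."""
    if ins.op == "load":
        return memory[ins.imm]
    return sum(vals[s] for s in ins.srcs) + ins.imm


def run_in_order(program, regs, memory):
    """Reference model: plain in-order execution with no misses."""
    regs = dict(regs)
    for ins in program:
        regs[ins.dest] = execute(ins, regs, memory)
    return regs


def run_icfp_toy(program, regs, memory):
    """Toy advance/rally model (illustrative assumptions, see lead-in)."""
    regs = dict(regs)
    checkpoint = dict(regs)  # mirrors the abstract; this sketch never restores it
    poisoned = set()         # registers whose value depends on an outstanding miss
    last_writer = {}         # register -> index of the youngest instruction writing it
    slice_buffer = []        # deferred (index, instruction, captured ready sources)

    # Advance mode: miss-independent instructions execute and update registers;
    # miss-dependent ones are diverted to the slice buffer.
    for i, ins in enumerate(program):
        is_miss = ins.op == "load" and ins.misses
        if is_miss or any(s in poisoned for s in ins.srcs):
            captured = {s: regs[s] for s in ins.srcs if s not in poisoned}
            slice_buffer.append((i, ins, captured))
            poisoned.add(ins.dest)
        else:
            regs[ins.dest] = execute(ins, regs, memory)
            poisoned.discard(ins.dest)  # overwritten by a miss-independent value
        last_writer[ins.dest] = i

    # Rally mode: the miss has returned; replay the slice in program order.
    slice_vals = {}  # values produced by earlier slice instructions
    for i, ins, captured in slice_buffer:
        vals = {s: captured[s] if s in captured else slice_vals[s] for s in ins.srcs}
        result = execute(ins, vals, memory)
        slice_vals[ins.dest] = result
        # Merge: write back only if no younger instruction overwrote the register.
        if last_writer[ins.dest] == i:
            regs[ins.dest] = result
    return regs


program = [
    Instr("load", "r1", imm=100, misses=True),  # the miss
    Instr("add", "r2", ("r1",), 1),             # miss-dependent, deferred
    Instr("add", "r3", ("r4",), 10),            # miss-independent
    Instr("add", "r2", ("r3",)),                # independent, overwrites r2
    Instr("add", "r5", ("r1", "r3")),           # dependent, captures r3
    Instr("add", "r3", ("r4",), 1),             # independent, overwrites r3
]
initial = {"r1": 0, "r2": 0, "r3": 0, "r4": 5, "r5": 0}
memory = {100: 7}

final = run_icfp_toy(program, initial, memory)
assert final == run_in_order(program, initial, memory)
print(final)
```

In this model the `last_writer` check stands in for the merging step: a deferred instruction updates the register file during the rally only when no younger instruction has since written the same register, so the final state matches plain in-order execution.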

Date of presentation

2009-02-14

Conference name

IEEE 15th International Symposium on High Performance Computer Architecture (HPCA 2009)

Conference dates

14-18 Feb. 2009

Comments

Copyright 2009 IEEE. Reprinted from: Hilton, A.; Nagarakatte, S.; Roth, A., "iCFP: Tolerating all-level cache misses in in-order processors," High Performance Computer Architecture, 2009. HPCA 2009. IEEE 15th International Symposium on, pp. 431-442, 14-18 Feb. 2009. URL: http://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=4798281&isnumber=4798227 This material is posted here with permission of the IEEE. Such permission of the IEEE does not in any way imply IEEE endorsement of any of the University of Pennsylvania's products or services. Internal or personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution must be obtained from the IEEE by writing to pubs-permissions@ieee.org. By choosing to view this document, you agree to all provisions of the copyright laws protecting it.

Collection

Presentations