GROK-FPGA: Generating Real on-Chip Knowledge for FPGA Fine-Grain Delays Using Timing Extraction
Electrical and Electronics
Circuit variation is one of the biggest problems to overcome if Moore's Law is to continue. It is no longer possible to maintain an abstraction of identical devices without huge yield losses, performance penalties, and energy costs. Current techniques such as margining and grade binning are used to deal with this problem. However, they tend to be conservative, offering limited solutions that will not scale as variation increases. Conventional circuits use limited tests and statistical models to determine the margining and binning required to counteract variation. If the limited tests fail, the whole chip is discarded. On the other hand, reconfigurable circuits, such as FPGAs, can use more fine-grained, aggressive techniques that carefully choose which resources to use in order to mitigate variation. Knowing which resources to use and avoid, however, requires measurement of underlying variation. We present Timing Extraction, a methodology that allows measurement of process variation without expensive testers nor highly invasive techniques, rather, relying only on resources already available on conventional FPGAs. It takes advantage of the fact that we can measure the delay of logic paths between any two registers. Measuring enough paths, provides the information necessary to decompose the delay of each path into individual components-essentially, forming a system of linear equations. Determining which paths to measure requires simple graph transformation algorithms applied to a representation of the FPGA circuit. Ultimately, this process decomposes the FPGA into individual components and identifies which paths to measure for computing the delay of individual components. We apply Timing Extraction to 18 commercially available Altera Cyclone III (65 nm) FPGAs. We measure 22×28 logic clusters and the interconnect within and between cluster. Timing Extraction decomposes this region into 1,356,182 components, classified into 10 categories, requiring 2,736,556 path measurements. With an accuracy of ±3.2 ps, our measurements reveal regional variation on the order of 50 ps, systematic variation from 30 ps to 70 ps, and random variation in the clusters with σ=15 ps and in the interconnect with σ=62 ps.