As we approach atomic-scale logic, we must accommodate an increased rate of manufacturing defects, transient upsets, and in-field persistent failures. High defect rates demand reconfiguration to avoid defective components, and transient upsets demand online error detection to catch failures. Combining these techniques we can detect in-field persistent failures when they occur and reconfigure around them. However, since failures may be logically masked for long periods of time, persistent failures may accumulate silently; this integration of errors over time means the effective failure rate for persistent errors can exceed transient upset rates. As a result, logic scrubbing is necessary to prevent the silent accumulation of an undetectable number of persistent errors. We provide simple analysis to illustrate quantitatively how this phenomena can be a concern.
Date of this Version
fault tolerance, logic scrubbing, reconfiguration, reliability, testing
Date Posted: 12 December 2008
This document has been peer reviewed.