Departmental Papers (ESE)


Aggressive pipelining allows FPGAs to achieve high throughput on many digital signal processing applications. However, cyclic data dependencies in the computation can limit pipelining and reduce the efficiency and speed of an FPGA implementation. Saturated accumulation is an important example where such a cycle limits the throughput of signal processing applications. We show how to reformulate saturated addition as an associative operation so that we can use a parallel prefix calculation to perform saturated accumulation at any data rate supported by the device. This allows us, for example, to design a 16-bit saturated accumulator that operates at 280 MHz on a Xilinx Spartan-3 (XC3S-5000-4), the maximum frequency supported by the component's digital clock manager (DCM).
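The reformulation described in the abstract can be illustrated in software. The following is a minimal sketch, not the paper's hardware design: it assumes each input x is treated as the function y -> clamp(y + x, MIN_VAL, MAX_VAL), stored as a triple (offset, low, high), and that two such triples compose into another triple of the same form. Because function composition is associative, the triples can be combined in any grouping, which is the property a parallel prefix computation needs. All names here (lift, compose, apply_fn, and the 16-bit bounds) are illustrative assumptions.

```python
from itertools import accumulate

# Assumed 16-bit signed saturation bounds (illustrative).
MIN_VAL, MAX_VAL = -32768, 32767


def clamp(v, lo, hi):
    """Limit v to the closed range [lo, hi]."""
    return min(max(v, lo), hi)


def lift(x):
    """Represent 'saturated add of x' as the triple (offset, low, high),
    standing for the function y -> clamp(y + offset, low, high)."""
    return (x, MIN_VAL, MAX_VAL)


def compose(f, g):
    """Return the triple for 'apply f first, then g'.
    The result is again a clamp-after-add function, so the set of
    triples is closed under composition."""
    a1, l1, h1 = f
    a2, l2, h2 = g
    return (a1 + a2, clamp(l1 + a2, l2, h2), clamp(h1 + a2, l2, h2))


def apply_fn(f, y):
    """Evaluate the function represented by triple f at y."""
    a, lo, hi = f
    return clamp(y + a, lo, hi)


def serial_accumulate(xs, y0=0):
    """Reference: the cyclic, one-step-at-a-time saturated accumulator."""
    out, y = [], y0
    for x in xs:
        y = clamp(y + x, MIN_VAL, MAX_VAL)
        out.append(y)
    return out


def prefix_accumulate(xs, y0=0):
    """Same outputs, computed from prefix compositions of the triples.
    accumulate() runs sequentially here; associativity of compose() is
    what would let hardware evaluate the prefixes as a tree."""
    prefixes = accumulate(map(lift, xs), compose)
    return [apply_fn(f, y0) for f in prefixes]


if __name__ == "__main__":
    samples = [30000, 10000, -5000, -32768, -32768, 20000]
    print(serial_accumulate(samples))
    print(prefix_accumulate(samples))
```

Both functions should print the same sequence. Python integers do not overflow, so the offset field grows without bound in this sketch; a hardware version would have to account for the extra bits needed by the intermediate values.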

Document Type

Conference Paper

Date of this Version

December 2005


Copyright 2005 IEEE. Reprinted from Proceedings of the 2005 IEEE International Conference on Field-Programmable Technology, December 2005, pages 19-26.

This material is posted here with permission of the IEEE. Such permission of the IEEE does not in any way imply IEEE endorsement of any of the University of Pennsylvania's products or services. Internal or personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution must be obtained from the IEEE. By choosing to view this document, you agree to all provisions of the copyright laws protecting it.

NOTE: At the time of publication, author André DeHon was affiliated with the California Institute of Technology. Currently, he is a faculty member in the School of Engineering at the University of Pennsylvania.



Date Posted: 11 July 2008

This document has been peer reviewed.