An Operand-Optimized Asynchronous IEEE 754 Double-Precision Floating-Point Adder
The authors present the design and implementation of an asynchronous, high-performance, IEEE 754-compliant double-precision floating-point adder (FPA). They provide a detailed breakdown of the power consumption of the FPA datapath and use it to motivate a number of data-dependent optimizations for energy efficiency. Their baseline asynchronous FPA has a throughput of 2.15 GHz while consuming 69.3 pJ per operation in a 65 nm bulk process. For the same set of nonzero operands, their optimizations improve the FPA's energy efficiency to 30.2 pJ per operation while preserving average throughput, a 56.7% reduction in energy relative to the baseline design.
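
As a quick sanity check on the figures above, the following minimal sketch recomputes the relative energy saving and the implied average power from the rounded numbers in the abstract. The function names are hypothetical and introduced only for illustration. The power estimate assumes the adder runs continuously at the quoted 2.15 GHz, which the abstract does not state. From the rounded values the saving comes out at roughly 56.4%, close to the quoted 56.7%. The abstract does not say how the quoted figure was derived, so the small gap is left unexplained here.

```python
# Figures quoted in the abstract (rounded as published).
BASELINE_PJ = 69.3      # energy per operation, baseline FPA
OPTIMIZED_PJ = 30.2     # energy per operation, optimized FPA
THROUGHPUT_GHZ = 2.15   # operations per nanosecond


def energy_reduction(baseline_pj: float, optimized_pj: float) -> float:
    """Fractional energy saving relative to the baseline (hypothetical helper)."""
    return (baseline_pj - optimized_pj) / baseline_pj


def average_power_mw(energy_pj: float, throughput_ghz: float) -> float:
    """Average power in mW, assuming continuous operation at full throughput.

    pJ/op * Gop/s = 1e-12 J * 1e9 /s = 1e-3 W, so the product is already in mW.
    """
    return energy_pj * throughput_ghz


if __name__ == "__main__":
    saving = energy_reduction(BASELINE_PJ, OPTIMIZED_PJ)
    print(f"energy reduction: {saving:.1%}")
    print(f"baseline power:   {average_power_mw(BASELINE_PJ, THROUGHPUT_GHZ):.1f} mW")
    print(f"optimized power:  {average_power_mw(OPTIMIZED_PJ, THROUGHPUT_GHZ):.1f} mW")
```

Under the continuous-operation assumption, this puts the baseline near 149 mW and the optimized design near 65 mW.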