National Technical University of Athens
In this paper, the authors present a design technique for coarse-grained reconfigurable cores targeting mainly DSP applications. The proposed technique inlines flexibility into custom Carry-Save-Arithmetic (CSA) datapaths by exploiting a stable and canonical interconnection scheme. This canonical interconnection is revealed by a uniformity transformation imposed on the basic architectures of CSA multipliers and CSA chain-adders/subtracters. The design flow for implementing the core is analyzed in detail, and the advanced mapping opportunities are presented. The paper concludes with experimental results showing that the proposed architecture achieves an average latency reduction of 32.63% compared with a datapath built from primitive computational resources, while maintaining sufficient hardware utilization.