Heterogeneous Coarse-Grained Processing Elements: a Template Architecture for Embedded Processing Acceleration

Provided by: edaa
Topic: Hardware
Format: PDF
Reconfigurable architectures are good candidates for application accelerators that cannot be set in stone at production time. FPGAs, however, often suffer from the area and performance penalties intrinsic to gate-level reconfigurability. To reduce this overhead, Coarse-Grained Reconfigurable Arrays (CGRAs) are reconfigurable at the ALU level. A successful design needs more than computational power, though, because the main bottleneck is usually memory transfers. Just as the integration of hardwired multiplier and memory blocks enabled FPGAs to implement digital signal processing applications efficiently, in this paper the authors study a customizable architecture template based on heterogeneous processing elements (multipliers, ALU clusters and memories) that provides enough flexibility to realize fast pipelined implementations of various loop kernels on a CGRA.
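
As a rough illustration of why the mix of processing elements matters for pipelined loop kernels, the sketch below computes the standard resource-constrained lower bound on the initiation interval used in modulo scheduling: for each kind of unit, the number of operations per iteration divided by the number of available units, rounded up, with the maximum taken over all kinds. It is not taken from the paper. The template sizes, the kernel operation counts, and the names `template`, `kernel_ops`, `resource_min_ii` and `bottleneck` are all hypothetical, and the bound ignores routing and data dependencies.

```python
from math import ceil

# Hypothetical template instance: number of processing elements of each kind.
# These counts are illustrative assumptions, not figures from the paper.
template = {"mul": 2, "alu": 4, "mem": 2}

# Hypothetical loop kernel: operations needed per iteration, by kind.
kernel_ops = {"mul": 4, "alu": 4, "mem": 5}


def _cycles(kind, ops, units):
    """Cycles per iteration needed by one kind of unit, if it were the only limit."""
    available = units.get(kind, 0)
    if available == 0:
        raise ValueError(f"kernel needs '{kind}' units but the template has none")
    return ceil(ops[kind] / available)


def resource_min_ii(ops, units):
    """Resource-constrained lower bound on the initiation interval (in cycles)."""
    needed = [kind for kind, count in ops.items() if count > 0]
    return max((_cycles(kind, ops, units) for kind in needed), default=1)


def bottleneck(ops, units):
    """Kind of unit that determines the bound (first one found on a tie)."""
    needed = [kind for kind, count in ops.items() if count > 0]
    return max(needed, key=lambda kind: _cycles(kind, ops, units))


if __name__ == "__main__":
    print("lower bound on initiation interval:", resource_min_ii(kernel_ops, template))
    print("limiting unit kind:", bottleneck(kernel_ops, template))
```

With these assumed numbers the memory elements, not the multipliers or ALUs, set the bound, which mirrors the abstract's point that memory transfers are usually the main bottleneck. Adding memory elements to the hypothetical template lowers the bound until another kind of unit becomes limiting.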
