Fast and Lightweight Support for Nested Parallelism on Cluster-Based Embedded Many-Cores

Provided by: European Design and Automation Association
Format: PDF
Several recent many-core accelerators have been architected as fabrics of tightly coupled shared-memory clusters. A hierarchical interconnection system is used (with a crossbar-like medium inside each cluster and a Network-on-Chip (NoC) at the global level), which makes memory operations non-uniform. Nested parallelism represents a powerful programming abstraction for these architectures: a first level of parallelism can be used to distribute coarse-grained tasks to clusters, and additional levels of fine-grained parallelism can be distributed to processors within a cluster.
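As a rough illustration of the two-level scheme described in the abstract, the sketch below uses nested OpenMP regions in C. OpenMP is an assumption made here for illustration, since the abstract does not name a programming model. The function name `nested_sum` and the constants `NUM_CLUSTERS` and `PES_PER_CLUSTER` are hypothetical. The outer loop hands one coarse-grained chunk to each cluster, and the inner loop spreads the elements of that chunk across the processors of the cluster.

```c
#include <stddef.h>
#ifdef _OPENMP
#include <omp.h>
#endif

#define NUM_CLUSTERS 4      /* hypothetical number of clusters */
#define PES_PER_CLUSTER 8   /* hypothetical processors per cluster */

/* Sums n values using two nested levels of parallelism. */
long nested_sum(const int *data, size_t n)
{
    long total = 0;
    /* Coarse-grained chunk size, rounded up so every element is covered. */
    size_t chunk = (n + NUM_CLUSTERS - 1) / NUM_CLUSTERS;

#ifdef _OPENMP
    /* Allow two active levels; otherwise the inner region runs on one thread. */
    omp_set_max_active_levels(2);
#endif

    /* Level 1: one coarse-grained task per cluster. */
    #pragma omp parallel for num_threads(NUM_CLUSTERS) reduction(+:total)
    for (int c = 0; c < NUM_CLUSTERS; c++) {
        size_t begin = (size_t)c * chunk;
        size_t end = (begin + chunk < n) ? begin + chunk : n;
        long partial = 0;

        /* Level 2: fine-grained work shared by the processors of one cluster. */
        #pragma omp parallel for num_threads(PES_PER_CLUSTER) reduction(+:partial)
        for (long i = (long)begin; i < (long)end; i++) {
            partial += data[i];
        }
        total += partial;
    }
    return total;
}
```

Compiled without OpenMP support, the pragmas are ignored and the function runs sequentially with the same result. On a real cluster-based platform, the mapping of outer threads to clusters and inner threads to local processors depends on the runtime, which this sketch does not model.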