Programming Many-Core Architectures - A Case Study: Dense Matrix Computations on the Intel SCC Processor
A message-passing, distributed-memory parallel computer on a chip is one possible design for future many-core architectures. The authors discuss initial experiences with the Intel Single-chip Cloud Computer (SCC) research processor, a prototype architecture that places 48 cores on a single die; the cores can communicate via a small, shared, on-die buffer. The experiment is to port a state-of-the-art, distributed-memory, dense matrix library, Elemental, to this architecture and to gain insight from the experience. The authors show that the library's attention to programmability, especially its proper abstraction for collective communication, greatly aids the porting effort.