Forwardflow: Scalable, RAM-Based Dataflow Execution

Date Added: Sep 2009
Format: PDF

Power (and thermal) limits have forced an industry-wide shift from increasingly complex uniprocessors to multicore chips with 4, 8, and even 16 simpler processor cores. Yet Amdahl's Law suggests that these cores should not be too simple, lest they exacerbate even a parallel application's sequential bottlenecks. Furthermore, running all cores at full speed will soon exceed the chip's power envelope. Ideally, future chip multiprocessors (CMPs) should use cores that trade off power and performance, allowing the system to scale up a core's Instruction-Level Parallelism (ILP) and Memory-Level Parallelism (MLP) to improve sequential performance.
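
The Amdahl's Law argument above can be illustrated with a small back-of-the-envelope calculation. This is only a sketch under stated assumptions, not a model taken from the paper: the function name, the 90% parallel fraction, and the relative core speeds (0.5 for a simple core, 1.0 for a scaled-up core) are all hypothetical values chosen for illustration.

```python
def amdahl_speedup(parallel_fraction: float,
                   n_cores: int,
                   parallel_core_speed: float,
                   serial_core_speed: float) -> float:
    """Speedup over a single baseline core of speed 1.0.

    Assumptions (illustrative only): the sequential part runs on one core
    at `serial_core_speed`; the parallel part is split evenly across
    `n_cores` cores, each running at `parallel_core_speed`.
    """
    serial_time = (1.0 - parallel_fraction) / serial_core_speed
    parallel_time = parallel_fraction / (n_cores * parallel_core_speed)
    return 1.0 / (serial_time + parallel_time)


# Hypothetical workload: 90% of the work is parallelizable.
# Case 1: 16 simple cores, each half as fast as the baseline core.
simple_cores = amdahl_speedup(0.9, 16, 0.5, 0.5)

# Case 2: the same 16 cores, but one core is scaled up to baseline
# speed while the sequential part runs.
scaled_core = amdahl_speedup(0.9, 16, 0.5, 1.0)

print(f"simple cores only:   {simple_cores:.2f}x")
print(f"one scaled-up core:  {scaled_core:.2f}x")
```

Under these assumed numbers, the all-simple configuration reaches a speedup of roughly 3.2x, while scaling up a single core during the sequential phase raises it to roughly 4.7x, because the sequential 10% of the work dominates the total run time once the parallel part is spread over many cores.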