In this paper, the authors present a profiling of the EEMBC parallel benchmark programs, intended to support the evaluation and future development of scalable SMP architectures. A modern Multi-Processor System-on-Chip (MPSoC) includes tens of IP blocks, such as CPUs, memories, input/output devices, and HW accelerators. Figure 1 shows an example of their 64-core target system. Each core has a private I-cache and D-cache and can access a shared L2 cache space through an L1-to-L2 network. The 16 L2 cache banks are distributed across the entire processor, and each is co-located with one of the 16 memory controllers.
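
As a rough illustration of the organization described above, the following Python sketch models the 64 cores and the 16 L2 banks, each paired with a memory controller. The core and bank counts come from the text; everything else is an assumption made only for illustration. In particular, the 64-byte line size, the one-to-one pairing of bank and controller indices, and the line-interleaved address-to-bank mapping in the hypothetical `l2_bank_for_address` helper are not stated by the authors.

```python
from dataclasses import dataclass

NUM_CORES = 64       # from the text: 64-core target system
NUM_L2_BANKS = 16    # from the text: 16 L2 banks, 16 memory controllers
LINE_BYTES = 64      # ASSUMPTION: cache line size is not given in the text


@dataclass(frozen=True)
class L2Bank:
    bank_id: int
    memory_controller_id: int  # controller located together with this bank


def build_l2_banks():
    """Create the 16 banks; pairing bank i with controller i is an assumption."""
    return [L2Bank(bank_id=i, memory_controller_id=i) for i in range(NUM_L2_BANKS)]


def l2_bank_for_address(addr: int) -> int:
    """ASSUMED mapping: consecutive cache lines go to consecutive banks."""
    return (addr // LINE_BYTES) % NUM_L2_BANKS
```

Under this assumed mapping, any core that misses in its private L1 caches would send the request over the L1-to-L2 network to the bank returned by `l2_bank_for_address`, so sequential accesses spread across all 16 banks rather than concentrating on one.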