Optimizing Latency and Throughput for Spawning Processes on Massively Multi-Core Processors
The execution of an SPMD application involves running multiple instances of a process, possibly with varying arguments. With the widespread adoption of massively multi-core processors, attention has turned to harnessing their abundant compute resources effectively and power-efficiently. Although much work has been done on optimizing distributed process launch using hierarchical techniques, the performance of spawning processes within a single node has received little study. Reducing the latency of spawning a new process locally results in faster global job launch. Further, emerging dynamic and resilient execution models are premised on maintaining process pools for fault isolation and on launching many processes in a relatively short period of time.