Building multithreaded parallel machines from stock processors is becoming increasingly attractive because of their high performance/price ratio. However, no quantitative analysis has been reported on the effectiveness of various node configurations or their impact on overall performance. In this paper, the authors explore three different node configurations in detail and compare their dynamic characteristics through instruction-level simulation with six benchmark programs. Their experiments show that employing a dedicated processor for communication and synchronization is a reasonable approach because it can almost double the performance.