Routed Inter-ALU Networks for ILP Scalability and Performance
Modern processors rely heavily on broadcast networks to bypass instruction results to dependent instructions in the pipeline. However, as clock rates increase, architectures get wider, and pipelines get deeper, broadcasting becomes more complex, slower, and more difficult to implement. This complexity is compounded by shrinking feature size, as the communication speed decreases relative to transistor switching speeds. This paper examines the fundamental needs of bypassing networks and proposes a method for classifying these Inter-ALU Networks based on how operands are routed from producers to consumers.