Institute of Electrical and Electronics Engineers
Most hardware compilers apply loop pipelining to increase the parallelism achieved, but pipelining is restricted to only the innermost level of a nested loop. In this paper, the authors extend and adapt an existing outer loop pipelining approach, known as single dimension software pipelining, to generate schedules for FPGA hardware coprocessors. Each loop level in nine test loops is pipelined, and the resulting schedules are implemented in VHDL and targeted to an Altera Stratix II FPGA. The results show that, for all but one of the loops, the fastest solution occurs when pipelining is applied one to three levels above the innermost loop.