Provided by: Association for Computing Machinery
Automatic vectorization is critical to improving the performance of compute-intensive programs on modern processors. However, current production compilers leave much room for improvement in auto-vectorization, which can be achieved through careful vector-code synthesis that uses a variety of loop transformations (e.g., unroll-and-jam and interchange). As the set of transformations considered grows, selecting the most effective combination becomes a significant challenge: the cost models currently used in vectorizing compilers are often unable to identify the best choices.
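To make the two transformations named above concrete, the following is a minimal sketch, not taken from the paper, that applies loop interchange and unroll-and-jam by hand to a small matrix-vector product. The function names, the unroll factor of 2, and the choice of kernel are assumptions made purely for illustration. Python is used only to show the loop structure; a vectorizing compiler would apply such rewrites to C or Fortran loop nests.

```python
# Illustrative sketch only: hand-applied loop transformations on y = A * x.
# All function names and the unroll factor are hypothetical choices.

def matvec_baseline(a, x):
    """Original loop nest: i outer, j inner."""
    n, m = len(a), len(x)
    y = [0] * n
    for i in range(n):
        for j in range(m):
            y[i] += a[i][j] * x[j]
    return y


def matvec_interchanged(a, x):
    """Loop interchange: j outer, i inner.

    The innermost loop now walks over i, so consecutive iterations
    update different elements of y rather than accumulating into one.
    """
    n, m = len(a), len(x)
    y = [0] * n
    for j in range(m):
        for i in range(n):
            y[i] += a[i][j] * x[j]
    return y


def matvec_unroll_and_jam(a, x):
    """Unroll-and-jam: unroll the outer i loop by 2, fuse the inner j loops.

    Each x[j] is now reused for two rows within one inner iteration.
    """
    n, m = len(a), len(x)
    y = [0] * n
    i = 0
    while i + 1 < n:
        for j in range(m):
            y[i] += a[i][j] * x[j]
            y[i + 1] += a[i + 1][j] * x[j]
        i += 2
    # Remainder loop handles the last row when n is odd.
    for r in range(i, n):
        for j in range(m):
            y[r] += a[r][j] * x[j]
    return y


A = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
X = [1, 0, -1]
```

All three variants compute the same result; they differ only in iteration order and data reuse, which is what a cost model would have to weigh when choosing among them.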