1. Outer-Loop Auto-Vectorization for SIMD Architectures Based on Open64 Compiler
- Author
Zhao Rong-cai, Li Yingying, Wang Qi, and Wang Dong
- Subjects
Computer science; Loop inversion; Data parallelism; Loop fusion; 020208 electrical & electronic engineering; 02 engineering and technology; Parallel computing; 020202 computer hardware & architecture; Loop fission; 0202 electrical engineering, electronic engineering, information engineering; Loop interchange; SIMD; Inner loop; Loop dependence analysis
- Abstract
SIMD (Single Instruction Multiple Data) extensions are acceleration components integrated into general-purpose processors, aimed at exploiting the instruction-level and data-level parallelism of multimedia and scientific computing programs. Currently, most automatic vectorization methods for SIMD architectures target innermost loops; this approach has been used for many years, and its effectiveness is widely accepted. In this paper, we propose outer-loop vectorization, which vectorizes the outer loop directly, as a better alternative for some loop nests. For those loop nests, it can extract more data-level parallelism and make better use of spatial locality than inner-loop vectorization, improving program efficiency. This paper presents its implementation. We first revisit the preliminary analysis of outer-loop vectorization based on the Open64 compiler, then present data type conversion and code generation in detail. Finally, we propose two optimization methods that boost the performance of outer-loop vectorization, achieving a speedup of 20% on average and up to 50%.
- Published
- 2016
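To make the idea in the abstract concrete, the following is a minimal sketch in portable C of what vectorizing the outer loop of a nest can look like. It is not the paper's implementation and does not use Open64; the function names, the vector length `VL`, and the column-wise dot product kernel are all assumptions chosen for illustration. In the scalar nest, the inner loop is a reduction that walks memory with stride `n`, while the outer iterations are independent and touch contiguous elements, which is the kind of nest where the outer loop is the more attractive vectorization target.

```c
#include <assert.h>
#include <stddef.h>

#define VL 4  /* assumed vector length (e.g. 4 floats per SIMD register) */

/* Scalar reference: out[i] = sum over j of a[j*n + i] * c[j].
   Hypothetical kernel; 'a' is an m-by-n row-major matrix. */
void column_dot_scalar(const float *a, const float *c, float *out,
                       size_t n, size_t m)
{
    for (size_t i = 0; i < n; i++) {      /* outer loop: independent iterations */
        float sum = 0.0f;
        for (size_t j = 0; j < m; j++)    /* inner loop: reduction, stride-n loads */
            sum += a[j * n + i] * c[j];
        out[i] = sum;
    }
}

/* Outer-loop vectorized form, written with explicit lanes instead of
   intrinsics. VL consecutive outer iterations run together; each lane
   keeps its own accumulator, so no horizontal reduction is needed and
   the loads from 'a' are contiguous. */
void column_dot_outer(const float *a, const float *c, float *out,
                      size_t n, size_t m)
{
    assert(a != NULL && c != NULL && out != NULL);

    size_t i = 0;
    for (; i + VL <= n; i += VL) {        /* outer loop advances by VL */
        float acc[VL] = {0.0f};           /* one accumulator per lane */
        for (size_t j = 0; j < m; j++) {  /* inner loop stays sequential */
            const float *row = &a[j * n + i];
            for (size_t lane = 0; lane < VL; lane++)
                acc[lane] += row[lane] * c[j];  /* c[j] broadcast to all lanes */
        }
        for (size_t lane = 0; lane < VL; lane++)
            out[i + lane] = acc[lane];
    }
    for (; i < n; i++) {                  /* scalar remainder when n % VL != 0 */
        float sum = 0.0f;
        for (size_t j = 0; j < m; j++)
            sum += a[j * n + i] * c[j];
        out[i] = sum;
    }
}
```

Calling both functions on the same inputs should give matching results, since each lane accumulates in the same order as the scalar loop. A real compiler would map the `lane` loops to SIMD loads, a broadcast of `c[j]`, and a vector multiply-add; the explicit lane arrays here only stand in for that.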