Exploiting Parallelism of Imperfect Nested Loops on Coarse-Grained Reconfigurable Architectures.

Authors :
Yin, Shouyi
Lin, Xinhan
Liu, Leibo
Wei, Shaojun
Source :
IEEE Transactions on Parallel & Distributed Systems. Nov 2016, Vol. 27 Issue 11, p3199-3213. 15p.
Publication Year :
2016

Abstract

Coarse-grained reconfigurable architecture (CGRA) is a promising parallel computing platform that provides high performance, high power efficiency and flexibility. However, for imperfect nested loops, existing loop mapping methods often result in low execution performance and poor hardware utilization. To tackle this problem, this paper makes three contributions: 1) a highly effective and general approach to map imperfect loops on CGRA; 2) a global optimization strategy to search for the optimal initiation intervals (IIs); 3) a powerful kernel compression method to reduce the oversized kernel. Experimental results show that our approach can reduce the total computing latency by 20.5, 58.5 and 73.2 percent compared to the state-of-the-art approaches on $2 \times 2$, $4 \times 4$ and $8 \times 8$ CGRA, respectively. Moreover, the compilation time and configuration context size are acceptable in practice. [ABSTRACT FROM AUTHOR]

Details

Language :
English
ISSN :
1045-9219
Volume :
27
Issue :
11
Database :
Academic Search Index
Journal :
IEEE Transactions on Parallel & Distributed Systems
Publication Type :
Academic Journal
Accession number :
118689413
Full Text :
https://doi.org/10.1109/TPDS.2016.2531678