An Efficient Bottleneck Planes Exclusion Method for Reconfiguring 3D VLSI Arrays
- Source :
- IEEE Transactions on Parallel and Distributed Systems; February 2024, Vol. 35 Issue: 2 p250-263, 14p
- Publication Year :
- 2024
Abstract
- With the ever-increasing integration and parallel computing capabilities of 3D processor arrays, processor element (PE) failures caused by various factors have become more prevalent. A fault-tolerant mechanism that uses the remaining fault-free PEs to reconfigure a sub-array therefore becomes critical. In this paper, we study the problem of reconfiguring a 3D sub-array with as many fault-free PEs as possible, which has been shown to be NP-complete in previous work. Although prior algorithms are effective under low fault densities, they are severely limited under high fault densities. To address this, we first define the bottleneck of the 3D processor array, propose a novel method to identify the physical bottleneck plane that restricts the reconfigurable size of the logical sub-array, and prove its correctness. Then, we propose an effective compensation strategy that can fully utilize the fault-free PEs in the bottleneck plane. Under this strategy, a sliding-window weight calculation method is proposed to determine the priority of compensation. Finally, we propose a heuristic algorithm that can construct the maximum target array from different dimensions in polynomial time. Experimental results demonstrate that the proposed algorithm exhibits favorable performance in terms of harvest and degradation. For the random-failure model, the improvement in the harvest for fault-free PEs is up to 32.03% on a $32 \times 32 \times 32$ host array with a 20% fault density. For the clustered fault model, the improvement in harvest is up to 70.63% on a $32 \times 32 \times 32$ host array containing 12 cluster failures of size $6 \times 6 \times 6$.
Details
- Language :
- English
- ISSN :
- 1045-9219 and 1558-2183
- Volume :
- 35
- Issue :
- 2
- Database :
- Supplemental Index
- Journal :
- IEEE Transactions on Parallel and Distributed Systems
- Publication Type :
- Periodical
- Accession number :
- ejs64995160
- Full Text :
- https://doi.org/10.1109/TPDS.2023.3339961