Fine-grained bit-flip protection for relaxation methods
- Publication Year :
- 2019
Abstract
- [EN] Resilience is considered a challenging, under-addressed issue that the high performance computing (HPC) community will have to face in order to produce reliable Exascale systems by the beginning of the next decade. As part of a push toward a resilient HPC ecosystem, in this paper we propose an error-resilient iterative solver for sparse linear systems based on stationary component-wise relaxation methods. Starting from a plain implementation of the Jacobi iteration, our approach introduces a low-cost component-wise technique that detects bit-flips, rejects some component updates, and thereby turns the initially synchronized solver into an asynchronous iteration. Our experimental study, using sparse incomplete factorizations from a collection of real-world applications and a practical GPU implementation, exposes the convergence delay incurred by the fault-tolerant implementation and its practical performance.
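The idea of rejecting suspicious component updates inside a Jacobi sweep can be illustrated with a minimal sketch. The abstract does not state the detection criterion, so the test used here is an assumption: for a strictly diagonally dominant matrix and a zero initial guess, every fault-free Jacobi component stays below a bound computable in advance, and any candidate above it is discarded. The function name `protected_jacobi`, the `fault` hook, and the dense list-of-lists storage are all hypothetical choices made for brevity; this is not the authors' GPU implementation.

```python
import math

def protected_jacobi(A, b, sweeps=100, fault=None):
    """Jacobi sweeps with a per-component plausibility check (illustrative only).

    Assumes A is strictly diagonally dominant and the initial guess is zero.
    `fault(k, i, value)` is a hypothetical hook that may corrupt an update.
    Returns (x, number_of_rejected_updates).
    """
    n = len(b)
    # rho: infinity norm of the Jacobi iteration matrix; below 1 by assumption.
    rho = max(sum(abs(A[i][j]) for j in range(n) if j != i) / abs(A[i][i])
              for i in range(n))
    c_max = max(abs(b[i] / A[i][i]) for i in range(n))
    # Fault-free iterates satisfy |x_i| <= c_max / (1 - rho); small rounding slack.
    bound = c_max / (1.0 - rho) * (1.0 + 1e-9)

    x = [0.0] * n
    rejected = 0
    for k in range(sweeps):
        x_new = list(x)
        for i in range(n):
            s = sum(A[i][j] * x[j] for j in range(n) if j != i)
            cand = (b[i] - s) / A[i][i]
            if fault is not None:
                cand = fault(k, i, cand)  # simulated soft error
            if math.isfinite(cand) and abs(cand) <= bound:
                x_new[i] = cand
            else:
                # Reject: component i keeps its old value, so later sweeps
                # read stale data, as in an asynchronous iteration.
                rejected += 1
        x = x_new
    return x, rejected

if __name__ == "__main__":
    A = [[4.0, -1.0, 0.0], [-1.0, 4.0, -1.0], [0.0, -1.0, 4.0]]
    b = [3.0, 2.0, 3.0]
    # Scale one update by 2**60 to mimic a flipped exponent bit.
    flip = lambda k, i, v: v * 2.0**60 if (k, i) == (5, 1) else v
    print(protected_jacobi(A, b, fault=flip))
```

A static bound like this only catches corruptions large enough to leave the admissible range; smaller perturbations pass through and are damped by the contraction of the iteration itself. Because a fault-free candidate computed from in-range values is always in range, a rejection never blocks later updates of the same component.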
Details
- Database :
- OAIster
- Notes :
- TEXT, English
- Publication Type :
- Electronic Resource
- Accession number :
- edsoai.on1258889757
- Document Type :
- Electronic Resource