SPSRG: a prediction approach for correlated failures in distributed computing systems.
- Source :
- Cluster Computing. Dec 2016, Vol. 19 Issue 4, p1703-1721. 19p.
- Publication Year :
- 2016
Abstract
- Failure instances in distributed computing systems (DCSs) have exhibited temporal and spatial correlations, where a single failure instance can trigger a set of failure instances simultaneously or successively within a short time interval. In this work, we propose a correlated failure prediction approach (CFPA) to predict correlated failures of computing elements in DCSs. The approach models correlated-failure patterns using the concept of probabilistic shared risk groups and makes a prediction for correlated failures by exploiting an association rule mining approach in a parallel way. We conduct extensive experiments to evaluate the feasibility and effectiveness of CFPA using both failure traces from Los Alamos National Lab and simulated datasets. The experimental results show that the proposed approach outperforms other approaches in both the failure prediction performance and the execution time, and can potentially provide better prediction performance in a larger system. [ABSTRACT FROM AUTHOR]
Details
- Language :
- English
- ISSN :
- 1386-7857
- Volume :
- 19
- Issue :
- 4
- Database :
- Academic Search Index
- Journal :
- Cluster Computing
- Publication Type :
- Academic Journal
- Accession number :
- 119755006
- Full Text :
- https://doi.org/10.1007/s10586-016-0633-2