Improving Availability of Multicore Real-Time Systems Suffering Both Permanent and Transient Faults.

Authors :
Zhou, Junlong
Hu, Xiaobo Sharon
Ma, Yue
Sun, Jin
Wei, Tongquan
Hu, Shiyan
Source :
IEEE Transactions on Computers; Dec 2019, Vol. 68, Issue 12, pp. 1785-1801, 17 pages
Publication Year :
2019

Abstract

CMOS scaling has greatly increased concerns for both lifetime reliability due to permanent faults and soft-error reliability due to transient faults. Most existing works focus on only one of the two reliability concerns, but techniques used to increase one type of reliability may often adversely impact the other type. A few efforts do consider both types of reliability together and use two different metrics to quantify the two types of reliability. However, for many systems, the user's concern is to maximize system availability by improving the mean time to failure (MTTF), regardless of whether the failure is caused by permanent or transient faults. Addressing this concern requires a uniform metric to measure the effect due to both types of faults. This paper introduces a novel analytical expression for calculating the MTTF due to transient faults. Using this new formula and an existing method to evaluate system MTTF, we tackle the problem of maximizing availability for multicore real-time systems with consideration of permanent and transient faults. A framework is proposed to solve the system availability maximization problem. Experimental results on a hardware board and simulation results of synthetic tasks show that our scheme significantly improves system MTTF (and hence availability) compared with existing techniques. [ABSTRACT FROM AUTHOR]

Details

Language :
English
ISSN :
0018-9340
Volume :
68
Issue :
12
Database :
Complementary Index
Journal :
IEEE Transactions on Computers
Publication Type :
Academic Journal
Accession number :
139649845
Full Text :
https://doi.org/10.1109/TC.2019.2935042