ACEDR: Automatic Compiler Error Detection and Recovery for COTS CPU and Caches.

Authors :
Nezzari, Y.
Bridges, C. P.
Source :
IEEE Transactions on Reliability. Sep 2019, Vol. 68, Issue 3, p859-871. 13p.
Publication Year :
2019

Abstract

Recently there has been an increasing demand for more powerful processors for next-generation space missions, such as communication and Earth observation. The challenge is how to improve the reliability of the processor under “single event effects” in orbit. We have previously proposed a new way of implementing any traditional software error detection and correction technique at instruction level, capable of covering both the CPU and caches of “commercial off-the-shelf” processors. In this paper, a novel way of evaluating the software protection is presented, based on a theoretical model and software injection experiments to predict the reliability of the whole processing architecture. The fault injection evaluates the ability of the protection code to detect and recover from errors, as well as the accuracy of the reliability models, by comparing the reliability given by the theoretical predictions with the reliability measured in the injection experiments. Automatic compiler error detection and recovery improves the reliability of the system by reducing the error rate of “single event upsets.” In some benchmarks, the error rate was reduced to less than 1%. This research has been tested on two machines: an Intel Core i5-3470 and a Raspberry Pi 3. On the first processor, the overhead was less than 15%, and on the second, the overhead was less than 17%. This research can also be ported to multiple high-level languages, with the ability to cover multiple instructions and datatypes. [ABSTRACT FROM AUTHOR]

Details

Language :
English
ISSN :
0018-9529
Volume :
68
Issue :
3
Database :
Academic Search Index
Journal :
IEEE Transactions on Reliability
Publication Type :
Academic Journal
Accession number :
138433579
Full Text :
https://doi.org/10.1109/TR.2019.2925086