1. Can we avoid rounding-error estimation in HPC codes and still get trustful results?
- Author
-
Jézéquel, Fabienne; Graillat, Stef; Mukunoki, Daichi; Imamura, Toshiyuki; Iakymchuk, Roman
Affiliations: Performance et Qualité des Algorithmes Numériques (PEQUAN), LIP6, Sorbonne Université (SU), Centre National de la Recherche Scientifique (CNRS); Université Panthéon-Assas (UP2); RIKEN Center for Computational Science (RIKEN CCS), Kobe; Fraunhofer Institute for Industrial Mathematics (Fraunhofer ITWM), Fraunhofer-Gesellschaft
Funding: European Union's Horizon 2020 research and innovation programme under the Marie Curie grant agreement No. 842528 (Robust project); Japan Society for the Promotion of Science (JSPS) KAKENHI Grant No. 19K20286
- Subjects
floating-point arithmetic; BLAS; rounding errors; numerical validation; [INFO] Computer Science [cs]; Discrete Stochastic Arithmetic (DSA)
- Abstract
Numerical validation improves the reliability of computations that rely on floating-point operations by ensuring that their results are trustworthy. Discrete Stochastic Arithmetic (DSA) makes it possible to validate the accuracy of floating-point computations using random rounding. However, it can incur a large performance overhead compared with standard floating-point operations. In this article, we show that, with perturbed data, standard floating-point arithmetic can be used instead of DSA for numerical validation. For instance, for codes including matrix multiplications, we can directly use the level-3 BLAS matrix multiplication routine (GEMM) performed with standard floating-point arithmetic. Consequently, we achieve a significant performance improvement by avoiding the overhead of DSA operations and by exploiting the speed of highly optimized BLAS implementations. Finally, we demonstrate the performance gain of Intel MKL routines compared against the DSA version of the BLAS routines.
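A minimal sketch of the idea described in the abstract: instead of instrumenting every operation with DSA, randomly perturb the input data (here by one unit in the last place, as a stand-in for DSA's random rounding), run the plain matrix product several times, and estimate the number of significant digits from the spread of the samples in the CESTAC/DSA style. All function names below are illustrative, not from the paper; NumPy's `@` dispatches to an optimized BLAS GEMM.

```python
import numpy as np

def perturbed_matmul_samples(A, B, n_samples=3, rng=None):
    """Run standard floating-point GEMM on randomly perturbed
    copies of the inputs (illustrative, not the paper's code)."""
    rng = np.random.default_rng(rng)
    samples = []
    for _ in range(n_samples):
        # Perturb each entry by +/- one ulp, chosen at random --
        # a stand-in for DSA's random rounding of the inputs.
        dA = A + rng.choice([-1.0, 1.0], A.shape) * np.spacing(A)
        dB = B + rng.choice([-1.0, 1.0], B.shape) * np.spacing(B)
        samples.append(dA @ dB)  # plain, highly optimized BLAS GEMM
    return samples

def significant_digits(samples):
    """CESTAC-style estimate: digits ~ log10(|mean| / std), per entry."""
    S = np.stack(samples)
    mean = S.mean(axis=0)
    std = S.std(axis=0, ddof=1)
    with np.errstate(divide="ignore", invalid="ignore"):
        digits = np.log10(np.abs(mean) / std)
    # Entries with zero spread are exact to working precision (~15.9 digits).
    return np.where(std == 0, 15.9, digits)

A = np.array([[1.0, 2.0], [3.0, 4.0]])
B = np.array([[5.0, 6.0], [7.0, 8.0]])
est = significant_digits(perturbed_matmul_samples(A, B, rng=0))
```

For this well-conditioned product the estimated accuracy is close to full double precision; an ill-conditioned product would show fewer stable digits across the perturbed runs, which is the validation signal DSA provides without the per-operation overhead.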
- Published
- 2020