Search

Your search for author "Hager, Georg" returned 402 results

Search Results

1. Cache Blocking of Distributed-Memory Parallel Matrix Power Kernels

2. CloverLeaf on Intel Multi-Core CPUs: A Case Study in Write-Allocate Evasion

3. Physical Oscillator Model for Supercomputing

4. SPEChpc 2021 Benchmarks on Ice Lake and Sapphire Rapids Infiniband Clusters: A Performance and Energy Case Study

5. Algebraic Temporal Blocking for Sparse Iterative Solvers on Multi-Core CPUs

6. Making Applications Faster by Asynchronous Execution: Slowing Down Processes or Relaxing MPI Collectives

7. MD-Bench: Engineering the in-core performance of short-range molecular dynamics kernels from state-of-the-art simulation packages

8. Orthogonal layers of parallelism in large-scale eigenvalue computations

9. Exploring Techniques for the Analysis of Spontaneous Asynchronicity in MPI-Parallel Applications

10. The Role of Idle Waves, Desynchronization, and Bottleneck Evasion in the Performance of Parallel Programs

11. Level-based Blocking for Sparse Matrices: Sparse Matrix-Power-Vector Multiplication

12. Analytical Performance Estimation during Code Generation on Modern GPUs

13. Opening the Black Box: Performance Estimation during Code Generation for GPUs

14. Analytic Modeling of Idle Waves in Parallel Programs: Communication, Cluster Topology, and Noise Impact

15. ECM modeling and performance tuning of SpMV and Lattice QCD on A64FX

16. Exploring Techniques for the Analysis of Spontaneous Asynchronicity in MPI-Parallel Applications

18. An analytic performance model for overlapping execution of memory-bound loop kernels on multicore CPUs

19. Performance Modeling of Streaming Kernels and Sparse Matrix-Vector Multiplication on A64FX

20. Understanding HPC Benchmark Performance on Intel Broadwell and Cascade Lake Processors

21. Desynchronization and Wave Pattern Formation in MPI-Parallel and Hybrid Memory-Bound Programs

22. Automatic Throughput and Critical Path Analysis of x86 and ARM Assembly Kernels

24. A Recursive Algebraic Coloring Technique for Hardware-Efficient Symmetric Sparse Matrix-Vector Multiplication

25. Bridging the Architecture Gap: Abstracting Performance-Relevant Properties of Modern Server Processors

26. Collecting and Presenting Reproducible Intranode Stencil Performance: INSPECT

27. Propagation and Decay of Injected One-Off Delays on Clusters: A Case Study

28. Performance Engineering for Real and Complex Tall & Skinny Matrix Multiplication Kernels on GPUs

29. Analytic Performance Modeling and Analysis of Detailed Neuron Simulations

31. Automated Instruction Stream Throughput Prediction for Intel and AMD Microarchitectures

32. Benefits from using mixed precision computations in the ELPA-AEO and ESSEX-II eigensolver projects

33. Chebyshev Filter Diagonalization on Modern Manycore Processors and GPGPUs

34. On the accuracy and usefulness of analytic energy models for contemporary multicore processors

35. Validation of hardware events for successful performance pattern identification in High Performance Computing

36. A domain-specific language and matrix-free stencil code for investigating electronic properties of Dirac and topological materials

37. CRAFT: A library for easier application-level Checkpoint/Restart and Automatic Fault Tolerance

38. LIKWID Monitoring Stack: A flexible framework enabling job specific performance monitoring for the masses

39. An analysis of core- and chip-level architectural features in four generations of Intel server processors

40. Kerncraft: A Tool for Analytic Performance Modeling of Loop Kernels

41. ESSEX: Equipping Sparse Solvers For Exascale

42. Performance Engineering for a Tall & Skinny Matrix Multiplication Kernel on GPUs

43. Performance analysis of the Kahan-enhanced scalar product on current multi- and manycore processors

44. Analysis of Intel's Haswell Microarchitecture Using The ECM Model and Microbenchmarks

45. Optimization of an electromagnetics code with multicore wavefront diamond blocking and multi-dimensional intra-tile parallelization

46. Multi-dimensional intra-tile parallelization for memory-starved stencil computations

47. High-performance implementation of Chebyshev filter diagonalization for interior eigenvalue computations

48. Automatic Loop Kernel Analysis and Performance Modeling With Kerncraft

49. GHOST: Building blocks for high performance sparse linear algebra on heterogeneous systems

50. Short Note on Costs of Floating Point Operations on current x86-64 Architectures: Denormals, Overflow, Underflow, and Division by Zero
