Search

Your search for author "Wellein, Gerhard" returned 388 results.

Search Results

1. Cache Blocking of Distributed-Memory Parallel Matrix Power Kernels

2. Charge-order melting in the one-dimensional Edwards model

3. Alya towards Exascale: Optimal OpenACC Performance of the Navier-Stokes Finite Element Assembly on GPUs

4. CloverLeaf on Intel Multi-Core CPUs: A Case Study in Write-Allocate Evasion

5. Physical Oscillator Model for Supercomputing

6. SPEChpc 2021 Benchmarks on Ice Lake and Sapphire Rapids Infiniband Clusters: A Performance and Energy Case Study

7. Algebraic Temporal Blocking for Sparse Iterative Solvers on Multi-Core CPUs

8. Making Applications Faster by Asynchronous Execution: Slowing Down Processes or Relaxing MPI Collectives

9. MD-Bench: Engineering the in-core performance of short-range molecular dynamics kernels from state-of-the-art simulation packages

10. MD-Bench: A generic proxy-app toolbox for state-of-the-art molecular dynamics algorithms

11. Exploring Techniques for the Analysis of Spontaneous Asynchronicity in MPI-Parallel Applications

12. The Role of Idle Waves, Desynchronization, and Bottleneck Evasion in the Performance of Parallel Programs

13. Level-based Blocking for Sparse Matrices: Sparse Matrix-Power-Vector Multiplication

14. Analytical Performance Estimation during Code Generation on Modern GPUs

15. Opening the Black Box: Performance Estimation during Code Generation for GPUs

16. Valley filtering in strain-induced $\alpha$-$\mathcal{T}_3$ quantum dots

17. Analytic Modeling of Idle Waves in Parallel Programs: Communication, Cluster Topology, and Noise Impact

18. ECM modeling and performance tuning of SpMV and Lattice QCD on A64FX

19. MD-Bench: A Generic Proxy-App Toolbox for State-of-the-Art Molecular Dynamics Algorithms

20. Exploring Techniques for the Analysis of Spontaneous Asynchronicity in MPI-Parallel Applications

22. An analytic performance model for overlapping execution of memory-bound loop kernels on multicore CPUs

23. Performance Modeling of Streaming Kernels and Sparse Matrix-Vector Multiplication on A64FX

24. Multiway $p$-spectral graph cuts on Grassmann manifolds

25. Understanding HPC Benchmark Performance on Intel Broadwell and Cascade Lake Processors

26. Desynchronization and Wave Pattern Formation in MPI-Parallel and Hybrid Memory-Bound Programs

27. Automatic Throughput and Critical Path Analysis of x86 and ARM Assembly Kernels

30. A Recursive Algebraic Coloring Technique for Hardware-Efficient Symmetric Sparse Matrix-Vector Multiplication

31. Bridging the Architecture Gap: Abstracting Performance-Relevant Properties of Modern Server Processors

32. Collecting and Presenting Reproducible Intranode Stencil Performance: INSPECT

33. Propagation and Decay of Injected One-Off Delays on Clusters: A Case Study

34. Performance Engineering for Real and Complex Tall & Skinny Matrix Multiplication Kernels on GPUs

35. Analytic Performance Modeling and Analysis of Detailed Neuron Simulations

37. Automated Instruction Stream Throughput Prediction for Intel and AMD Microarchitectures

38. Benefits from using mixed precision computations in the ELPA-AEO and ESSEX-II eigensolver projects

39. Chebyshev Filter Diagonalization on Modern Manycore Processors and GPGPUs

40. Lattice Boltzmann Benchmark Kernels as a Testbed for Performance Analysis

41. Validation of hardware events for successful performance pattern identification in High Performance Computing

43. CRAFT: A library for easier application-level Checkpoint/Restart and Automatic Fault Tolerance

44. LIKWID Monitoring Stack: A flexible framework enabling job specific performance monitoring for the masses

45. An analysis of core- and chip-level architectural features in four generations of Intel server processors

46. Kerncraft: A Tool for Analytic Performance Modeling of Loop Kernels

47. Extreme Scale-out SuperMUC Phase 2 - lessons learned

48. ESSEX: Equipping Sparse Solvers For Exascale

49. EXASTEEL: Towards a Virtual Laboratory for the Multiscale Simulation of Dual-Phase Steel Using High-Performance Computing

50. Performance Engineering for a Tall & Skinny Matrix Multiplication Kernel on GPUs
