Introduction to Performance Engineering
Performance engineering is a complex activity: it builds on established, fundamental knowledge, but also requires active investigation.
The goal is typically to reduce the resources consumed by a given computational task, be it time, compute time, or energy, which is becoming increasingly important.
This brief introductory course aims to:
- give an overview of the knowledge required for performance engineering work, in the spirit of reducing the unknown unknowns for beginners, with a focus on HPC and scientific computing;
- discuss the tools (profiling, tracing) used to analyze application behaviour in the programming languages most commonly used in HPC for scientific research;
- give elementary notions of computer architecture, including CPUs, memory systems, GPUs, and filesystems, which set the maximum performance that can be achieved;
- give a hands-on demonstration of the performance optimization loop.
The choice of topics is based on the sources of inspiration (see below) and on the personal experience of the maintainers of this lesson, with a slight bias (which must be acknowledged and managed) towards interesting problems.
Episodes
- The closed loop of performance tuning
- Performance Characterization of applications
- Walkthrough: Dense Matrix-Matrix multiplications (Part 1)
- Elements of Computer Architecture and their effects
- A brief LIKWID demo: topology and microbenchmarks
- What a compiler (and the hardware) can do for you
- Reproducibility problems in performance measurements
Credits
The current course material is inspired by and based on:
- The course Introduction to Performance Engineering on HPC by Holger Obermaier and Begatim Bytyqi from KIT
- The material for the Node-Level Performance Engineering course at HLRS
- The Algorithmica.org book/website by Sergey Slotin
Supplementary Material