diff --git a/_quarto.yml b/_quarto.yml
index fd031fa15534a068f976979ff2dc2d19f6b63ddc..2d41e3284735a88a9afd18799629335a84ad53c4 100644
--- a/_quarto.yml
+++ b/_quarto.yml
@@ -36,8 +36,8 @@ website:
       - "lectures/user-experience/slides.qmd"
       # - "lectures/testing/slides.qmd"
       # - "lectures/git2/slides.qmd"
-      # - "lectures/parallelism/slides.qmd"
-      # - "lectures/hardware/slides.qmd"
+      - "lectures/parallelism/slides.qmd"
+      - "lectures/hardware/slides.qmd"
       # - "lectures/file-and-data-systems/slides.qmd"
       # - "lectures/memory-hierarchies/slides.qmd"
       # - "lectures/student-talks/slides.qmd"
diff --git a/lectures/hardware/slides.qmd b/lectures/hardware/slides.qmd
new file mode 100644
index 0000000000000000000000000000000000000000..6d7db39c2a57bc75f24a3f74ad4ddfb4f3a870b6
--- /dev/null
+++ b/lectures/hardware/slides.qmd
@@ -0,0 +1,54 @@
+---
+title: "Example lecture"
+author: "Tobias Kölling, Florian Ziemen, and the teams at MPI-M and DKRZ"
+---
+
+# Preface
+
+* This is an example lecture for the generic computing skills course
+
+## Idea
+
+*optimize output for **analysis***
+
+::: {.smaller}
+(not write throughput)
+:::
+
+## Chunking & hierarchy
+
+:::: {.columns}
+
+::: {.column width="50%"}
+| Grid     | Cells  |
+|---------:|-------:|
+| 1° by 1° | 0.06M  |
+| 10 km    | 5.1M   |
+| 5 km     | 20M    |
+| 1 km     | 510M   |
+| 200 m    | 12750M |
+:::
+
+::: {.column width="50%"}
+| Screen      | Pixels |
+|------------:|-------:|
+| VGA         | 0.3M   |
+| Full HD     | 2.1M   |
+| MacBook 13" | 4.1M   |
+| 4K          | 8.8M   |
+| 8K          | 35.4M  |
+:::
+
+::::
+
+It's **impossible** to look at the entire globe in full resolution.
+
+## Load data at the resolution necessary for the analysis
+
+![](https://easy.gems.dkrz.de/_images/gorski_f1.jpg)
+
+## Highlight! {background-color=var(--dark-bg-color)}
+
+* This slide is either important or has a special purpose.
+* You can use it to ask the audience a question or to start a hands-on session.
diff --git a/lectures/parallelism/slides.qmd b/lectures/parallelism/slides.qmd
new file mode 100644
index 0000000000000000000000000000000000000000..7a29e47efbccd79864a939b5b5added6227e4118
--- /dev/null
+++ b/lectures/parallelism/slides.qmd
@@ -0,0 +1,157 @@
+---
+title: "Parallelism"
+author: "CF, GM, JFE FIXME"
+---
+
+# Motivation
+
+* We have a serial code and want to make it faster
+* Plan of action:
+  * Cut the problem into smaller pieces
+  * Use independent compute resources for each piece
+* Outlook for next week: the individual computing element no longer
+  gets much faster, but there are more of them
+* FIXME: What else?
+
+## This lecture
+
+* Is mostly about parallelism as a concept
+* Next week: hardware using this concept
+
+[comment]: # (Thinking about it, I think we should not give a theoretical definition here,
+but first give the example and explain parallelism there. Eventually, with the
+task parallelism we should probably give a real definition and different flavours.)
+
+# Our example problem
+
+:::: {.columns}
+
+::: {.column width="50%"}
+* 1D tsunami equation
+* Korteweg–De Vries equation
+* Discretization not numerically accurate
+* [Wikipedia](https://en.wikipedia.org/wiki/Korteweg%E2%80%93De_Vries_equation)
+* FIXME
+:::
+
+::: {.column width="50%"}
+FIXME show plot of a soliton
+:::
+
+::::
+
+# Our example problem
+
+FIXME
+show some central loop
+
+# Decomposing problem domains
+
+## Our problem domain
+
+FIXME
+
+## Other problem domains
+
+* ICON's domain decomposition
+* maybe something totally different?
+
+FIXME
+
+# Introducing OpenMP
+
+* A popular way to parallelize code
+* Pragma-based parallelization API
+  * You annotate your code with parallel regions and the compiler does the rest
+* OpenMP uses something called threads
+  * Wait until next week for a definition
+
+```c
+#pragma omp parallel for
+for (int i = 0; i < N; ++i)
+  a[i] = 2 * i;
+```
+
+## Hands-on Session! {background-color=var(--dark-bg-color) .leftalign}
+
+1. Compile and run the example serially. Use `time ./serial.x` to time the execution.
+2. Compile and run the example using OpenMP. Use `OMP_NUM_THREADS=2 time ./omp.x` to time the execution.
+3. Now add
+   * `schedule(static,1)`
+   * `schedule(static,10)`
+   * `schedule(FIXMEsomethingelse)`
+   * `schedule(FIXMEsomethingelse)`
+
+   and find out how the OpenMP runtime decomposes the problem domain.
+
+# Reductions FIXME title should be more generic
+
+## What is happening here?
+
+```c
+int a[] = {2, 4, 6};
+int N = 3, sum = 0;
+for (int i = 0; i < N; ++i)
+  sum = sum + a[i];
+```
+
+## What is happening here?
+
+```c
+int a[] = {2, 4, 6};
+int N = 3, sum = 0;
+#pragma omp parallel for
+for (int i = 0; i < N; ++i)
+  sum = sum + a[i];
+```
+
+[comment]: # (Can something go wrong?)
+
+## Solution
+
+```c
+int a[] = {2, 4, 6};
+int N = 3, sum = 0;
+#pragma omp parallel for reduction(+:sum)
+for (int i = 0; i < N; ++i)
+  sum = sum + a[i];
+```
+
+# Doing stuff wrong
+
+## What is going wrong here?
+
+```c
+int temp = 0;
+#pragma omp parallel for
+for (int i = 0; i < N; ++i) {
+  temp = 2 * a[i];
+  b[i] = temp + 4;
+}
+```
+
+## Solution
+
+```c
+int temp = 0;
+#pragma omp parallel for private(temp)
+for (int i = 0; i < N; ++i) {
+  temp = 2 * a[i];
+  b[i] = temp + 4;
+}
+```
+
+This problem is called a "data race".
+
+## Other common errors
+
+* Race conditions
+  * The outcome of the program depends on the relative timing of multiple threads.
+* Deadlocks
+  * Multiple threads each wait for a resource that another thread holds, so none of them can proceed.
+* Inconsistency
+  * FIXME
+
+# Finally: A definition of parallelism
+
+> "Parallel computing is a type of computation in which many calculations or processes are carried out simultaneously."
+>
+> — Wikipedia
+
+FIXME: Citation correct?!
+
+## Types of parallelism
+
+* Data parallelism (what we've been discussing)
+* Task parallelism (example: atmosphere–ocean coupling)
+* Instruction-level parallelism (see next week)
+
+## Summary: Preconditions for parallel execution
+
+FIXME, if we want to do that
+
+# FIXME
+
+* Homework:
+  * Do something where you run into hardware constraints (e.g. NUMA, too many threads, ...)
+  * Give some example with a race condition or similar and have them find it.
+* Add maybe:
+  * Are there theoretical concepts like Amdahl's law that we should explain? (I don't like Amdahl)
+  * Strong/weak scaling?