diff --git a/lectures/memory-hierarchies/slides.qmd b/lectures/memory-hierarchies/slides.qmd
index d7caad410b88702485a39c9d509ee7d3329fdf9a..ba9f8509bd6119170c3d9bf301b4f85e837cb284 100644
--- a/lectures/memory-hierarchies/slides.qmd
+++ b/lectures/memory-hierarchies/slides.qmd
@@ -446,11 +446,18 @@ Based on a typical Levante node (AMD EPYC 7763)
 
 ## Memory Mountain (1/2)
 
-Describe what is done
+:::{.smaller}
+
+Code for the program from "Computer Systems":
+
+<https://csapp.cs.cmu.edu/3e/mountain.tar>
+
+:::
 
- - Influence of block size
- - Influence of stride
- - Measurements done on Levante compute node
+ - Process a representative amount of data
+ - Use `stride` between array elements to control spatial locality
+ - Use `size` of array to control temporal locality
+ - Also warm up the cache before the actual measurements
@@ -469,8 +476,10 @@ $\approx$ Factor 20 between best and worst access
 
 ## Hands-On {.handson}
 
- - Either example with optimal and sub-optimal memory access, i.e. cache blocking (see `nproma`)
- - Or OpenMP reduction (hand implementation), reference/continuation from parallelism lecture?!
+1. Download and extract the C source code of the memory mountain program linked on the previous slides
+2. Compile the program and run it on your PC or a Levante compute node
+3. Which factor do you get between best and worst performance?
+4. (optional) Visualize your results
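
For readers of this change, here is a minimal sketch of the measurement idea described by the new bullets: a strided read over an array of a given size, with one warm-up pass before timing. It is an illustration under stated assumptions, not the code from `mountain.tar`. The names `sum_strided`, `elems_touched`, `read_throughput_mb_s` and the macro `STRIDE_DEMO_MAIN` are hypothetical, and `clock()` is a coarse timer that a serious benchmark would likely replace.

```c
#include <stddef.h>
#include <time.h>

/* Sum every `stride`-th element of the first `elems` entries (stride >= 1).
   A larger stride uses fewer elements per cache line, i.e. less spatial
   locality. `volatile` keeps the compiler from eliding or merging the loads. */
static long sum_strided(const volatile long *data, size_t elems, size_t stride)
{
    long sum = 0;
    for (size_t i = 0; i < elems; i += stride)
        sum += data[i];
    return sum;
}

/* Number of elements that sum_strided actually reads. */
static size_t elems_touched(size_t elems, size_t stride)
{
    return (elems + stride - 1) / stride;
}

static volatile long sink; /* keeps the computed sums observable */

/* Read throughput in MB/s for `reps` timed passes (hypothetical helper).
   Returns 0.0 if the run was too short for clock() to resolve. */
static double read_throughput_mb_s(const long *data, size_t elems,
                                   size_t stride, int reps)
{
    sink = sum_strided(data, elems, stride); /* warm up the cache */

    clock_t start = clock();
    for (int r = 0; r < reps; r++)
        sink = sum_strided(data, elems, stride);
    clock_t stop = clock();

    double seconds = (double)(stop - start) / CLOCKS_PER_SEC;
    double mbytes = (double)reps * (double)elems_touched(elems, stride)
                    * sizeof(long) / 1e6;
    return seconds > 0.0 ? mbytes / seconds : 0.0;
}

#ifdef STRIDE_DEMO_MAIN /* assumption: build with -DSTRIDE_DEMO_MAIN */
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    size_t elems = (size_t)1 << 24; /* chosen to exceed typical cache sizes */
    long *data = malloc(elems * sizeof *data);
    if (data == NULL)
        return 1;
    for (size_t i = 0; i < elems; i++)
        data[i] = 1;

    /* Vary the stride at fixed size; varying `elems` instead would probe
       temporal locality, as in the bullets above. */
    for (size_t stride = 1; stride <= 16; stride *= 2)
        printf("stride %2zu: %.0f MB/s\n", stride,
               read_throughput_mb_s(data, elems, stride, 4));

    free(data);
    return 0;
}
#endif
```

Building with something like `cc -O1 -DSTRIDE_DEMO_MAIN sketch.c` (file name hypothetical) prints one throughput value per stride. The absolute numbers depend on the machine and on the timer resolution, so they are only useful for comparing strides against each other.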