From d1b74a97d6611995e6fdc86839c1b1b2a6859003 Mon Sep 17 00:00:00 2001
From: Dominik Zobel <zobel@dkrz.de>
Date: Wed, 26 Jun 2024 13:24:44 +0200
Subject: [PATCH] Update memory mountain slides and exercise

---
 lectures/memory-hierarchies/slides.qmd | 21 +++++++++++++++------
 1 file changed, 15 insertions(+), 6 deletions(-)
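
Note (not part of the commit): the bullets added to the "Memory Mountain (1/2)"
slide can be illustrated with a minimal C sketch. This is not the code from
mountain.tar. The names `sweep` and `measure`, the repetition count, and the
use of `clock()` for timing are assumptions made for illustration only.

```c
#include <stddef.h>
#include <time.h>

/* Sum every `stride`-th element of the first `elems` entries.
 * A larger stride means worse spatial locality. */
static long sweep(const long *data, size_t elems, size_t stride)
{
    long sum = 0;
    for (size_t i = 0; i < elems; i += stride)
        sum += data[i];
    return sum;
}

/* Estimate read throughput in MB/s for one (size, stride) pair.
 * `bytes` is the working-set size: a smaller working set fits into a
 * faster cache level, which means better temporal locality.
 * `sink` receives the sums so the compiler cannot drop the loops.
 * Returns 0.0 if the arguments are degenerate or the timer did not advance. */
static double measure(const long *data, size_t bytes, size_t stride,
                      volatile long *sink)
{
    const int reps = 100;              /* hypothetical repetition count */
    size_t elems = bytes / sizeof(long);
    if (elems == 0 || stride == 0)
        return 0.0;

    *sink += sweep(data, elems, stride);       /* warm up the cache */

    clock_t t0 = clock();                      /* CPU time, standard C */
    for (int r = 0; r < reps; r++)
        *sink += sweep(data, elems, stride);
    clock_t t1 = clock();

    double secs = (double)(t1 - t0) / CLOCKS_PER_SEC;
    if (secs <= 0.0)
        return 0.0;

    /* Bytes actually read: one long per visited element, per repetition. */
    double accesses = (double)((elems + stride - 1) / stride);
    return (double)reps * accesses * sizeof(long) / secs / 1e6;
}
```

Calling `measure` for a grid of sizes and strides on one sufficiently large,
initialized array yields a throughput table of the kind the slides plot. For
very small working sets the timer resolution may be too coarse, in which case
a higher repetition count would be needed.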

diff --git a/lectures/memory-hierarchies/slides.qmd b/lectures/memory-hierarchies/slides.qmd
index d7caad4..ba9f850 100644
--- a/lectures/memory-hierarchies/slides.qmd
+++ b/lectures/memory-hierarchies/slides.qmd
@@ -446,11 +446,18 @@ Based on a typical Levante node (AMD EPYC 7763)
 
 ## Memory Mountain (1/2)
 
-Describe what is done
+:::{.smaller}
+
+Source code of the memory mountain program from "Computer Systems: A Programmer's Perspective":
+
+<https://csapp.cs.cmu.edu/3e/mountain.tar>
+
+:::
 
- - Influence of block size
- - Influence of stride
- - Measurements done on Levante compute node
+ - Process a representative amount of data
+ - Use `stride` between array elements to control spatial locality
+ - Use `size` of array to control temporal locality
+ - Also warm up the cache before the actual measurements
 
 
 
@@ -469,8 +476,10 @@ $\approx$ Factor 20 between best and worst access
 
 ## Hands-On {.handson}
 
- - Either example with optimal and sub-optimal memory access, i.e. cache blocking (see `nproma`)
- - Or OpenMP reduction (hand implementation), reference/continuation from parallelism lecture?!
+1. Download and extract the C source code of the memory mountain program linked on the previous slides
+2. Compile the program and run it on your PC or a Levante compute node
+3. Which factor do you get between best and worst performance?
+4. (optional) Visualize your results
 
 
 
-- 
GitLab