Commit 2cc6b5c6 authored by Florian Ziemen

cleanup

parent c90e19b0
Merge request !11: File and Data Systems
Pipeline #71521 passed
@@ -5,11 +5,11 @@ author: "Florian Ziemen and Karl-Hermann Wieners"
# Storing data
* Recording of information in a medium
* Magnetic tapes
* Hard disks
* Hand writing
* Deoxyribonucleic acid (DNA)
# Topics
@@ -46,8 +46,8 @@ Disk quotas for prj 30001639 (pid 30001639):
## Latency
* How long does it take until we get the first bit of data?
* Crucial when opening many small files (e.g. starting python)
* Less crucial when reading one big file start-to-end.
* Largely determined by moving parts in the storage medium.
## Continuous read / write
@@ -62,7 +62,7 @@ Disk quotas for prj 30001639 (pid 30001639):
## Caching
* Keeping data *in memory* for frequent re-use.
* Storage media like disks usually have small caches with better properties.
* e.g. HDD of 16 TB with 512 MB of RAM cache.
* Operating systems also cache reads.
* Caching writes in RAM is risky: data is lost on a power failure / system crash.
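The re-use idea can be sketched in Python with `functools.lru_cache` standing in for a disk cache. `read_block` and the call counter are purely illustrative, not part of any real storage API:

```python
from functools import lru_cache

calls = {"disk_reads": 0}

@lru_cache(maxsize=128)
def read_block(block_id: int) -> bytes:
    # Stand-in for a slow read from the storage medium.
    calls["disk_reads"] += 1
    return b"x" * 512  # pretend this is a 512-byte block

for _ in range(3):
    read_block(42)  # only the first call reaches the "disk"

print(calls["disk_reads"])  # -> 1
```

The two repeated calls are served from the cache, so the "medium" is read only once.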
@@ -80,7 +80,7 @@ Disk quotas for prj 30001639 (pid 30001639):
| Tape | minutes | 300 MB/s | minimal | ~ 5 |
* All figures based on a quick Google search in 06/2024.
* RAM needs electricity to keep the data (*volatile memory*).
* All but tape usually remain powered in an HPC.
@@ -93,27 +93,27 @@ Disk quotas for prj 30001639 (pid 30001639):
## Solid-state disk/flash drives
* Non-volatile electronic medium.
  - Keeps state (almost) without energy supply.
* High speed, also under random access.
## Hard disk
* Stacks of magnetic disks with read/write heads.
* Spinning to make every point accessible by heads.
* Basically a stack of modern record players.
* Good for bigger files, not ideal for random access.
## Tape
* Spool of magnetizable bands.
* Serialized access only.
* Used for backup / long-term storage.
## Hands-on {.handson}
::: {.smaller}
{{< embed timer.ipynb echo=true >}}
:::
Take this set of calls, and measure the write speed for different file sizes on your `/scratch/`.
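A minimal sketch of such a measurement (the `write_speed` helper is an assumption, not the notebook's code; point it at a path on your `/scratch/` instead of the temporary directory used here):

```python
import os
import tempfile
import time

def write_speed(num_bytes: int, path: str) -> float:
    """Write num_bytes to path and return the rate in MB/s."""
    data = os.urandom(num_bytes)
    start = time.perf_counter()
    with open(path, "wb") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())  # force the data past the OS write cache
    elapsed = time.perf_counter() - start
    return num_bytes / elapsed / 1e6

# Replace this temporary directory with a path on /scratch/.
with tempfile.TemporaryDirectory() as target_dir:
    for size in (1_000_000, 10_000_000, 100_000_000):
        rate = write_speed(size, os.path.join(target_dir, "testfile"))
        print(f"{size:>12} B: {rate:8.1f} MB/s")
```

Without the `os.fsync` the numbers mostly reflect the RAM write cache rather than the file system.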
# Storage Architectures
@@ -316,9 +316,10 @@ Add a similar function to the previous one for reading, and read the data you ju
* Data is presented as immutable "objects" ("BLOB")
* Each object has a globally unique identifier (e.g. UUID or hash)
* Objects may be assigned names, grouped in "buckets"
* Generally supports creation ("put") and retrieval ("get"), can support much more (versioning, etc.)
* Focus on data distribution and replication, fast read access
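A toy content-addressed put/get interface can make this concrete (all names here are illustrative; real systems such as S3 add buckets, versioning, and replication on top):

```python
import hashlib

class ObjectStore:
    """Toy object store: immutable blobs addressed by a content hash."""

    def __init__(self):
        self._objects = {}

    def put(self, blob: bytes) -> str:
        oid = hashlib.sha256(blob).hexdigest()  # globally unique identifier
        self._objects[oid] = blob               # objects are never modified
        return oid

    def get(self, oid: str) -> bytes:
        return self._objects[oid]

store = ObjectStore()
oid = store.put(b"model output, timestep 42")
print(store.get(oid) == b"model output, timestep 42")  # -> True
```

Because the identifier is derived from the content, putting the same blob twice yields the same ID, and a stored object can never change under its ID.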
## Object storage -- metadata
* Object metadata stored independently of data location
@@ -350,10 +351,9 @@ Protection against
* downtimes due to hardware failure
## Backups
* Keep old states of the file system available.
* Need at least as much space as the (compressed version of the) data being backed up.
* Often low-freq full backups and hi-freq incremental backups to balance space requirements and restoring time
* Ideally at different locations
* Automate them!
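The incremental part boils down to "copy only what changed since the last run". A simplified, mtime-based sketch (real tools like rsync or tar track changes more robustly):

```python
import os
import shutil

def incremental_backup(src: str, dst: str, last_backup_time: float) -> list[str]:
    """Copy files under src modified after last_backup_time into dst."""
    copied = []
    for root, _dirs, files in os.walk(src):
        for name in files:
            path = os.path.join(root, name)
            if os.path.getmtime(path) > last_backup_time:
                rel = os.path.relpath(path, src)
                target = os.path.join(dst, rel)
                os.makedirs(os.path.dirname(target), exist_ok=True)
                shutil.copy2(path, target)  # copy2 preserves timestamps
                copied.append(rel)
    return copied
```

Restoring then means replaying the last full backup plus all incremental ones since, which is why the two are balanced against each other.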
@@ -365,7 +365,7 @@ Combining multiple harddisks into bigger / more secure combinations - often at c
* RAID 0 distributes the blocks across all disks - more space, but data loss if one fails.
* RAID 1 mirrors one disk on an identical copy.
* RAID 5 is similar to 0, but with one extra disk for (distributed) parity info
* RAID 6 is similar to 5, but with two extra disks for parity info (levante uses 8+2 disks).
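The parity trick behind RAID 5 is a bytewise XOR across the data blocks: if any one disk fails, its block is the XOR of the surviving blocks and the parity. A toy example with 4-byte blocks:

```python
def xor_parity(blocks: list[bytes]) -> bytes:
    """Bytewise XOR of equally sized blocks (RAID-5-style parity)."""
    parity = bytes(len(blocks[0]))
    for block in blocks:
        parity = bytes(a ^ b for a, b in zip(parity, block))
    return parity

# A 3+1 stripe: three data disks plus one parity block.
disks = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_parity(disks)

# If disk 1 fails, XOR-ing the survivors with the parity recovers it.
recovered = xor_parity([disks[0], disks[2], parity])
print(recovered == disks[1])  # -> True
```

RAID 6 uses a second, independent parity so that any two failed disks can be reconstructed.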
## Erasure coding
@@ -389,14 +389,15 @@ The file system becomes an independent system.
* All nodes see the same set of files.
* A set of central servers manages the file system.
* All nodes accessing the Lustre file system run local *clients*.
* Many nodes can write into the same file at the same time (MPI-IO).
* Optimized for high traffic volumes in large files.
## Metadata and storage servers
* The index is spread over a group of Metadata servers (MDS, 8 for /work on levante).
* The files are spread over another group (40 OSS / 160 OST on levante).
* Every directory is tied to one MDS.
* A file is tied to one or more OSTs.
* An OST contains many hard disks.
## The work file system of levante in context