
Add HSM archive procedure so that we can crawl metadata afterwards.

Martin Bergemann requested to merge init into main

This is a new repository that uses slk to archive data. Currently it uses slk's command line interface rather than its REST API. The data that is going to be archived needs to roughly follow a DRS structure. By roughly I mean that the code assumes that the data files reside in a directory and that all files within one directory differ only by their time stamp.

For example:

/work/ch1187/freva-regiklim/data/observations/grid/DWD/DWD/radolan/1hr/atmos/1hr/r1i1p1/v20210528/pr/pr_1hr_DWD_radolan_r1i1p1_2000-2009.nc
/work/ch1187/freva-regiklim/data/observations/grid/DWD/DWD/radolan/1hr/atmos/1hr/r1i1p1/v20210528/pr/pr_1hr_DWD_radolan_r1i1p1_2010-2019.nc
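
For illustration, here is a minimal sketch (not the repository code) of how the overall time range covered by such a directory could be derived, assuming every file name ends in a `<start>-<end>` time stamp as in the example above:

```python
import re
from pathlib import Path

# Files within one directory are assumed to differ only by the trailing
# <start>-<end> time stamp, e.g. pr_1hr_DWD_radolan_r1i1p1_2000-2009.nc
TIME_SUFFIX = re.compile(r"_(\d+)-(\d+)\.nc$")

def time_range(files):
    """Return the overall (start, end) period covered by a list of files."""
    starts, ends = [], []
    for path in files:
        match = TIME_SUFFIX.search(path.name)
        if match is None:
            raise ValueError(f"{path} has no <start>-<end> time stamp")
        starts.append(match.group(1))
        ends.append(match.group(2))
    return min(starts), max(ends)

version_dir = Path(
    "/work/ch1187/freva-regiklim/data/observations/grid/DWD/DWD/"
    "radolan/1hr/atmos/1hr/r1i1p1/v20210528/pr"
)
print(time_range(sorted(version_dir.glob("*.nc"))))  # ('2000', '2019')
```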

The code would then bundle the files by time and create a tar archive, pr_1hr_DWD_radolan_r1i1p1_2000-2019.tar, and push the tar file to the HSM at /arch/ch1187/freva-regiklim/data/observations/grid/DWD/DWD/radolan/1hr/atmos/1hr/r1i1p1/v20210528/pr/pr_1hr_DWD_radolan_r1i1p1_2000-2019.tar
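
A hedged sketch of that bundle-and-push step could look like the following; the helper name and path prefixes are illustrative, and it assumes the `slk archive <source> <destination>` command line semantics:

```python
import subprocess
import tarfile
from pathlib import Path

def bundle_and_push(files, start, end,
                    work_prefix="/work", arch_prefix="/arch"):
    """Tar the files of one directory and push the bundle to the HSM."""
    src_dir = files[0].parent
    # Replace the per-file time stamp with the overall range of the bundle,
    # e.g. pr_1hr_DWD_radolan_r1i1p1_2000-2019.tar
    stem = files[0].name.rsplit("_", 1)[0]
    tar_path = src_dir / f"{stem}_{start}-{end}.tar"
    with tarfile.open(tar_path, "w") as tar:
        for path in files:
            tar.add(path, arcname=path.name)
    # Mirror the /work directory tree under /arch on the HSM (assumed
    # `slk archive <source> <destination>` call).
    hsm_dir = Path(arch_prefix) / src_dir.relative_to(work_prefix)
    subprocess.run(["slk", "archive", str(tar_path), str(hsm_dir)],
                   check=True)
```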

Using this procedure, it would be straightforward to crawl metadata from the HSM and ingest it into the Solr databrowser.
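
For example, a crawl step along these lines could list the archives with `slk list` and post minimal records to Solr; the Solr URL, core name, and record layout are assumptions for illustration, not part of this merge request:

```python
import subprocess
import requests

def crawl_and_ingest(hsm_path, solr_url="http://localhost:8983/solr/files"):
    """List tar bundles on the HSM and post minimal records to Solr."""
    listing = subprocess.run(
        ["slk", "list", hsm_path],
        check=True, capture_output=True, text=True,
    ).stdout
    # slk list prints one resource per line with the name in the last
    # column; keep only the tar bundles created by the archive step.
    docs = [
        {"file": f"{hsm_path}/{line.split()[-1]}"}
        for line in listing.splitlines()
        if line.strip().endswith(".tar")
    ]
    # Solr's update handler accepts a JSON array of documents.
    requests.post(
        f"{solr_url}/update?commit=true", json=docs
    ).raise_for_status()
```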

@k204221 @k204229 @k204237 @k204206

