
Intake-esm support from DKRZ Data Management

This repo contains material for preparing, scheduling and archiving intake-esm catalogs.

The cron jobs that keep the intake catalogs up to date include

  • updating catalogs for the CMIP data pool of DKRZ (builder/dkrz_PROJECT_STORE.py)
  • testing catalogs (test/check_load_catalog_PROJECT.py)
  • hosting and archiving catalogs at /pool/data/catalogs and in the cloud (archive-catalog.sh)
  • creating statistics for catalogs, including KPIs such as the number of files and datasets
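
The statistics step can be sketched roughly as follows. This is a minimal illustration, not the script used in this repo: it assumes that each row of a .csv.gz catalog describes one file and that a dataset is identified by a combination of columns. The function name `catalog_stats` and the column names in the usage note are hypothetical.

```python
import csv
import gzip


def catalog_stats(csv_gz_path, dataset_columns):
    """Count files (rows) and datasets (unique values of dataset_columns).

    Assumption: one row per file; the columns that define a dataset
    depend on the project and must be supplied by the caller.
    """
    n_files = 0
    datasets = set()
    with gzip.open(csv_gz_path, mode="rt", newline="") as handle:
        for row in csv.DictReader(handle):
            n_files += 1
            datasets.add(tuple(row[col] for col in dataset_columns))
    return {"files": n_files, "datasets": len(datasets)}
```

For a catalog with hypothetical columns `source_id` and `variable_id`, `catalog_stats(path, ["source_id", "variable_id"])` returns a dictionary with the two counts.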

One main catalog collects all catalogs in /pool/data/catalogs and serves as the entry point for DKRZ's intake users.

environment.yml

Use this file with conda env create -f environment.yml to create a software environment in which you can run the notebooks within this repository.

esm-collections/

All esm-collections available at DKRZ are stored in this folder. They are .json files that can be opened with intake.open_esm_datastore().
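
A minimal sketch of working with one of these files is shown below. It assumes the descriptor follows the ESM collection specification, with top-level keys such as `id`, `description` and `catalog_file`, and that intake-esm is installed, for example through environment.yml. The helper names `describe_collection` and `open_collection` are illustrative only.

```python
import json


def describe_collection(json_path):
    """Return a few top-level fields of an esm-collection descriptor.

    Assumption: the keys below exist in the descriptor; missing keys
    are reported as None rather than raising an error.
    """
    with open(json_path, encoding="utf-8") as handle:
        spec = json.load(handle)
    return {key: spec.get(key) for key in ("id", "description", "catalog_file")}


def open_collection(json_path):
    """Open the collection with intake-esm (imported lazily)."""
    import intake  # needs intake and intake-esm in the active environment

    return intake.open_esm_datastore(json_path)
```

After `col = open_collection("path/to/collection.json")` (a placeholder path), `col.df` holds the catalog as a pandas DataFrame, and `col.search(...)` filters it by column values.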

builder/

This folder contains the scripts that generate the catalog databases (.csv.gz).
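
As a rough illustration of what such a builder does, the sketch below turns a list of file paths into a gzip-compressed CSV. The directory layout, the column names and the functions `path_to_row` and `write_catalog` are assumptions made for this example; the actual builder scripts may work differently.

```python
import csv
import gzip
from pathlib import PurePosixPath

# Hypothetical layout: <root>/<source_id>/<experiment_id>/<variable_id>/<file>.nc
COLUMNS = ["source_id", "experiment_id", "variable_id", "path"]


def path_to_row(path, root):
    """Split a file path below root into catalog columns."""
    parts = PurePosixPath(path).relative_to(root).parts
    if len(parts) != 4:
        raise ValueError(f"unexpected layout: {path}")
    source_id, experiment_id, variable_id, _filename = parts
    return {
        "source_id": source_id,
        "experiment_id": experiment_id,
        "variable_id": variable_id,
        "path": str(path),
    }


def write_catalog(paths, root, out_path):
    """Write one row per file to a .csv.gz file and return the row count."""
    rows = [path_to_row(p, root) for p in paths]
    with gzip.open(out_path, "wt", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=COLUMNS)
        writer.writeheader()
        writer.writerows(rows)
    return len(rows)
```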

tests/

These scripts test the newly generated catalogs. If the tests are successful, the old catalogs are moved to an archive/ directory to document the update process.
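
The test-then-archive flow could look roughly like the sketch below. It is a simplified stand-in, not the repo's test scripts: the load check only verifies that the .csv.gz file is readable, has the required columns and contains at least one row. Promoting the new catalog to the live path afterwards is an assumption, and the names `catalog_loads` and `replace_catalog` are hypothetical.

```python
import csv
import gzip
import shutil
from datetime import date
from pathlib import Path


def catalog_loads(csv_gz_path, required_columns):
    """Return True if the catalog is readable, has the columns and is non-empty."""
    try:
        with gzip.open(csv_gz_path, "rt", newline="") as handle:
            reader = csv.DictReader(handle)
            header = reader.fieldnames or []
            first_row = next(reader, None)
    except (OSError, EOFError, UnicodeDecodeError, csv.Error):
        return False
    return first_row is not None and set(required_columns) <= set(header)


def replace_catalog(new_path, live_path, archive_dir, required_columns, today=None):
    """Archive the live catalog and promote the new one, but only if it passes."""
    if not catalog_loads(new_path, required_columns):
        return False  # keep the old catalog untouched
    live_path, archive_dir = Path(live_path), Path(archive_dir)
    if live_path.exists():
        archive_dir.mkdir(parents=True, exist_ok=True)
        stamp = (today or date.today()).isoformat()
        # The date prefix records when the catalog was replaced.
        shutil.move(str(live_path), str(archive_dir / f"{stamp}_{live_path.name}"))
    shutil.move(str(new_path), str(live_path))
    return True
```

Checking before moving anything means a failed build never overwrites or archives the catalog that users currently rely on.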

If you cannot find files in recent catalogs, check whether they have been retracted by searching for them in the ESGF browser interface with 'show all versions' enabled.

If you want to report anything, please create an issue within this repo.