Ingest different DRS formats

Merged Ghost User requested to merge drs-specs into main

This PR is a significant rewrite of how we handle DRS file paths to allow us to ingest paths of a few different known DRS formats.

This involved:

  • creating a new crate that handles only loading the different DRS types
  • changing ingestion to load the different types and then convert them to our metadata format
  • fixing some issues where data-dir did not behave as it should
    • I also made it optional; when it is omitted, all datasets are ingested
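
For readers less familiar with Rust, the shape of the change described above can be sketched roughly as follows. This is only an illustration under assumptions: the names `DrsPath`, `Metadata`, `Dataset`, and `select_datasets` are hypothetical and not taken from the actual crates, the fields are reduced to a minimum, and the rule that data-dir keeps datasets whose root lies under it is a guess at the intended behaviour.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical: one variant per DRS format the loader crate understands.
#[derive(Debug, Clone, PartialEq)]
enum DrsPath {
    Cmip5 { model: String, variable: String },
    Cordex { domain: String, rcm_name: String, variable: String },
}

/// Hypothetical: the common metadata format that ingestion works with.
#[derive(Debug, Clone, PartialEq)]
struct Metadata {
    model: String,
    variable: String,
}

/// Each format is loaded in its own shape and then converted to `Metadata`.
impl From<DrsPath> for Metadata {
    fn from(path: DrsPath) -> Self {
        match path {
            DrsPath::Cmip5 { model, variable } => Metadata { model, variable },
            // Assumption: for CORDEX the RCM name plays the role of the model.
            DrsPath::Cordex { rcm_name, variable, .. } => Metadata {
                model: rcm_name,
                variable,
            },
        }
    }
}

/// Hypothetical dataset entry as it might appear in the config.
#[derive(Debug)]
struct Dataset {
    name: String,
    root_dir: PathBuf,
}

/// With no data-dir every dataset is selected; otherwise only datasets
/// whose root directory lies under the given directory (assumed rule).
fn select_datasets<'a>(datasets: &'a [Dataset], data_dir: Option<&Path>) -> Vec<&'a Dataset> {
    match data_dir {
        None => datasets.iter().collect(),
        Some(dir) => datasets
            .iter()
            .filter(|d| d.root_dir.starts_with(dir))
            .collect(),
    }
}
```

Keeping the per-format types separate from `Metadata` means a new DRS format only needs a new variant and one more `match` arm, while the rest of ingestion stays unchanged.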

Merge request reports

Pipeline #19027 passed

Pipeline passed for 9f500ae5 on drs-specs

Approval is optional

Merged (Apr 3, 2025 6:38am UTC)

Merge details

  • Changes merged into main with 56802713 (commits were squashed).
  • Deleted the source branch.

Pipeline #19120 passed

Pipeline passed for 56802713 on main

Activity

  • Martin Bergemann
  • Many of my questions come from the fact that I am having trouble understanding the Rust code. I started consulting the Rust book, but it will take me months, if not years, of practice to even come close to the level of this code. So in general I think it would be a good idea to keep an eye on the doc-strings so that they at least explain what's going on, even if it seems like a no-brainer to you.

    Also, would it be possible to add an example TOML file to the README, with some comments? That would be great.

  • I'll try this out tomorrow with our project data.

  • Martin Bergemann
  • I tried to run this with the following config: /work/ch1187/freva-regiklim/freva/drs_config.toml

    freva-ingest --data-dir /work/bb1203/freva/model/regional --config-dir  /work/ch1187/freva-regiklim/freva 
    2022-06-02T17:08:10.183794Z  WARN ingest_dataset{dataset="nukleus" batch_size=1000}: freva::drs::ingest: /work/bb1203/freva/model/regional/nukleus/output/GER-3km/GERICS/ECMWF-ERAINT/evaluation/r1i1p1/GERICS-REMO2015/v1/1hr/clivi/v20201205/clivi_GER-3km_ECMWF-ERAINT_evaluation_r1i1p1_GERICS-REMO2015_v1_1hr_200909010030-200909302330.nc not a valid drs file, skipping:
    InvalidCordexPath(InvalidCordexPathError { reason: "Parsing Error: Error { input: \"nukleus/output/GER-3km/GERICS/ECMWF-ERAINT/evaluation/r1i1p1/GERICS-REMO2015/v1/1hr/clivi/v20201205/clivi_GER-3km_ECMWF-ERAINT_evaluation_r1i1p1_GERICS-REMO2015_v1_1hr_200909010030-200909302330.nc\", code: Verify }" })

    I don't really understand why this isn't working. Could you check what is wrong?

  • Ghost User added 8 commits

    • fd835210 - Make `ingest_opts` a ref again
    • 727084ee - Add config example to readme
    • d2e5949c - Add warn(missing_docs) to drs and fix issues
    • d06121f2 - Forgot to make warn(missing_docs) apply to the module
    • b4d03d5e - Add docs to the rest of drs
    • 607cb64a - Add some actual documentation to `drs`
    • ba1687e3 - Rearrange code for handling data-dir
    • f92835d9 - Remove the cordex activity and product constraint
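
The `code: Verify` in the log above suggests that a verification step in the parser rejected one of the path components, and commit f92835d9 points at the activity and product components. The sketch below shows one plausible reading: a CORDEX-style path has 13 components, and a strict check on the first one would reject `nukleus`. The actual constraints in the original code are not shown in this thread, so `parse_cordex`, `CordexParts`, the `strict` flag, and the `cordex` comparison are all hypothetical.

```rust
/// Components of a CORDEX-style DRS path, in directory order.
#[derive(Debug, PartialEq)]
struct CordexParts<'a> {
    activity: &'a str,
    product: &'a str,
    domain: &'a str,
    institution: &'a str,
    driving_model: &'a str,
    experiment: &'a str,
    ensemble: &'a str,
    rcm_name: &'a str,
    rcm_version: &'a str,
    frequency: &'a str,
    variable: &'a str,
    version: &'a str,
    filename: &'a str,
}

/// Splits a relative path into its 13 components. With `strict` set, the
/// activity must be "cordex" (assumed constraint, for illustration only).
fn parse_cordex(path: &str, strict: bool) -> Result<CordexParts<'_>, String> {
    let p: Vec<&str> = path.split('/').collect();
    if p.len() != 13 {
        return Err(format!("expected 13 components, found {}", p.len()));
    }
    if strict && !p[0].eq_ignore_ascii_case("cordex") {
        return Err(format!("activity must be 'cordex', found '{}'", p[0]));
    }
    Ok(CordexParts {
        activity: p[0],
        product: p[1],
        domain: p[2],
        institution: p[3],
        driving_model: p[4],
        experiment: p[5],
        ensemble: p[6],
        rcm_name: p[7],
        rcm_version: p[8],
        frequency: p[9],
        variable: p[10],
        version: p[11],
        filename: p[12],
    })
}
```

Under these assumptions, the path from the log fails with `strict` set and parses once the check is dropped, which would match the intent of removing the constraint.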
