SPDX-FileCopyrightText: 2023 Max Planck Institute for Meteorology, Yaco authors
SPDX-License-Identifier: BSD-3-Clause
-->
# Yaco
Yaco is a tool to receive data from YAC and write it to different output formats. It is designed to be easily extendible for new output formats and grids.
## Configuration
Yaco is configured using a YAML file that is specified at runtime and is expected to follow the format described below.
### Glossary
- **field**: an array provided by YAC - mostly (semantically) a 2d array with values of a model variable
- **collection**: an array of fields as provided by YAC - mostly different height layers in the model or variants of a variable
- **map**: a dictionary mapping each key to a value - as per the yaml standard
- **sequence**: a list of values as per the yaml standard
- **scalar**: a single value in yaml that is neither a map nor a sequence
### Basic format
At the yaml root level, several keys have to be specified with their corresponding value:
- `output`: Specifies the output format - currently `fdb` and `netcdf` are supported.
- `grid`: Specifies the grid using a map with keys:
  - `name`: Name of the grid as identified in YAC
  - `type`: Type of the grid - currently only `healpix` is supported
  - Further grid-specific keys. For `type: healpix` these are:
    - `nside`: NSide as per HEALPix specification
    - `order`: Order of the vertices - `nested` or `ring`
- `missing_value` (optional): Value to be used when a value in a field is missing (default: `-9e33`).
- `yac_input`:
Additionally, each output format requires further configuration, as described in the following sections.
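
As a minimal sketch, the root-level keys listed above could be combined as follows. The grid name and the `nside` value are hypothetical placeholders, and `yac_input` is left out because its contents are not described here.

```yaml
# Illustrative root-level configuration; all values are placeholders.
output: netcdf            # or: fdb
grid:
  name: healpix_grid      # hypothetical grid name as identified in YAC
  type: healpix           # currently the only supported grid type
  nside: 64               # hypothetical value
  order: nested           # or: ring
missing_value: -9e33      # optional; shown with its documented default
# yac_input is also listed above; its contents are not described in this section.
```
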
### Metadata for GRIB and FDB output
For GRIB/FDB output, the key `fdb_writer` with a map as value has to be specified with keys:
- `fdb_config`: Path to the FDB configuration file
- `grib_metadata`: Metadata to be included in the GRIB message that is encoded for each field, given as a sequence of maps, each with at least one key/value pair (see below)
- `fdb_metadata`: Metadata to be handed to FDB, given as a sequence of maps, each with at least one key/value pair (see below)
- `collections`: A map with keys being the names of the collections as known in YAC and the values being a map with keys `grib_metadata` and/or `fdb_metadata` (see below)
The metadata to be included in the GRIB message that is encoded for each field, as well as the keys to be handed to FDB, can be specified globally under `fdb_writer` (applying to every field) and per collection (in the respective map under `collections`, applying to every field of that collection), with the latter taking precedence over the former. Both metadata sets are given as key/value pairs under `grib_metadata` and `fdb_metadata`, respectively. Note that the metadata for GRIB and the metadata for FDB are set entirely independently of each other; where they should be equal, this has to be specified explicitly (see below).
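
The following sketch illustrates global versus per-collection metadata. The collection name, the path, and all metadata keys and values are hypothetical placeholders; which GRIB keys are valid depends on ecCodes, and which FDB keys are valid depends on the FDB schema in use.

```yaml
fdb_writer:
  fdb_config: /path/to/fdb_config.yaml    # placeholder path
  # Global metadata: applied to every field.
  grib_metadata:
    - centre: 98                          # hypothetical value
    - typeOfLevel: surface
  fdb_metadata:
    - class: od                           # hypothetical value
    - stream: oper                        # hypothetical value
  collections:
    temperature:                          # hypothetical collection name as known in YAC
      # Per-collection metadata: takes precedence over the global entries.
      grib_metadata:
        - typeOfLevel: heightAboveGround  # overrides the global `surface`
      fdb_metadata:
        - levtype: hl                     # hypothetical value; only set for this collection
```
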
Each metadata set is given as a sequence of maps, each with at least one key/value pair. Note that if a map contains more than one pair, their order is not defined by the YAML specification.
If the value is a scalar, it is used as a constant: as an integer or float if it can be interpreted as such, otherwise as a plain string. Dynamic values can be derived from different sources by providing the value as a map with the key `source` set to one of:
- `yaco`: The value is taken from a Yaco variable at the time the field is written out. The variable is specified using the key `key` with one of the following values:
  - `collection_index`: Yields the current index of the field in its collection (starting with `0`).
    - If the `offset` key is given additionally, the specified offset value will be added (e.g. `offset: 1` would make the numbering start with `1`). Cannot be combined with `lookup`.
    - If the `lookup` key is given additionally, the yielded value is taken from a list specified using `lookup_table`. The value of `lookup` can be:
      - `direct`: Use the first value of the `lookup_table` for the first field in a collection, the second for the second, ...
      - `accumulate`: Like `direct`, but accumulate first, i.e. the first field will get a value of `0`, the second that of the first value in the `lookup_table`, the third that of the sum of the table's first two values, ...
      - `accumulate_half`: Like `accumulate`, but only half of the most recently looked-up value is added, i.e. `0` for the first field, `0.5 * lookup_table[0]` for the second, `lookup_table[0] + 0.5 * lookup_table[1]` for the third, ...
  - `timestep_begin`: The time at the beginning of the (output) timestep for which the variable is written. Needs a `format` key with a value of the format of the time string (e.g. `"%Y-%m-%dT%H:%M:%SZ"`).
  - `timestep_end`: The time at the end of the (output) timestep for which the variable is written. Needs a `format` key with a value of the format of the time string (e.g. `"%Y-%m-%dT%H:%M:%SZ"`).
  - `missing_value`: Yields the missing value - default (`-9e33`) or as specified globally using the `missing_value` key.
  - `has_missing_values`: Yields `1` if missing values have been found in the field or `0` otherwise.
- `grib`: The value is taken from the GRIB message of the current field. The GRIB key is specified using the key `key` and corresponds to a GRIB key as understood by the ecCodes library; hence it can be one of the keys explicitly specified in the `grib_metadata` sections of the Yaco configuration or one implicitly set during GRIB encoding via ecCodes. This allows the user to explicitly take FDB metadata from GRIB where necessary. When the additional key `maybeempty` is set to `true`, the FDB key will not be set if the corresponding key is not present in the GRIB message (rather than raising an error).
- `grid`: Similar to Yaco variables, the grid configured to be mapped to has properties that can be used as values to ensure that each value has a single source of truth in the Yaco configuration.
  - For `type: healpix` grids, the following variables are available:
    - `nside`: NSide as per HEALPix specification
    - `sides`: Number of sides of the grid (2^nside)
    - `order`: Order of the vertices - `nested` or `ring`
    - `name`: Name of the grid as identified in YAC
    - `type`: Equal to `healpix`
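
The sources described above can be sketched in a single configuration as follows. This is an illustration under assumptions, not a reference configuration: the collection names, the path, and the choice of GRIB and FDB keys are hypothetical. It is also assumed that `lookup_table` is given as a sibling of `lookup` and that grid properties are selected via `key`, as for the other sources.

```yaml
fdb_writer:
  fdb_config: /path/to/fdb_config.yaml      # placeholder path
  grib_metadata:
    # Constant scalar value.
    - centre: 98                            # hypothetical value
    # Grid property, so that `nside` is only defined once (under `grid`).
    - Nside:
        source: grid
        key: nside                          # assumption: selected via `key`
    # Time at the beginning of the output timestep, formatted as a date.
    - dataDate:
        source: yaco
        key: timestep_begin
        format: "%Y%m%d"
    # Missing-value handling taken from Yaco.
    - missingValue:
        source: yaco
        key: missing_value
    - bitmapPresent:
        source: yaco
        key: has_missing_values             # 1 if missing values were found, else 0
  fdb_metadata:
    # Taken from the encoded GRIB message; skipped if the key is absent.
    - param:
        source: grib
        key: paramId
        maybeempty: true
  collections:
    temperature:                            # hypothetical collection name
      grib_metadata:
        # Field index shifted to start at 1: fields 0, 1, 2 yield 1, 2, 3.
        - level:
            source: yaco
            key: collection_index
            offset: 1
    soil_moisture:                          # hypothetical collection name
      grib_metadata:
        # Looked-up value derived from the field index.
        - level:
            source: yaco
            key: collection_index
            lookup: accumulate_half
            lookup_table: [10, 20, 30]      # assumption: sibling key of `lookup`
```

Following the rules above, the first three fields of `soil_moisture` would get the values `0`, `5` and `20` with `accumulate_half`; with `accumulate` they would get `0`, `10` and `30`, and with `direct` they would get `10`, `20` and `30`.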
### Metadata for NetCDF output
For NetCDF output (`output: netcdf`), the key `netcdf_writer` has to be specified with a map as value, with keys:
- `reference_time`: Reference time (ISO format) for the resulting NetCDF files
- `collections`: A map with keys being the names of the collections as known in YAC and the values being a map with keys:
  - `filename`: Filename for the resulting NetCDF file (must not be shared between collections)
  - For collections whose fields should be written as different variables (e.g. also for 2D fields):
    - `collection`: A sequence of the length equal to the number of variables in the collection, each with a map with keys:
      - `variable`: A map with the key `name` with the value being the name of the variable in the NetCDF file
  - For collections whose fields should be written as different layers of the same variable (e.g. for 3D fields):
    - `variable`: A map with the key `name` with the value being the name of the variable in the NetCDF file
    - `dimension`: A map with the key `name` with the value being the name of the dimension to be used for the level of the variable in the NetCDF file
    - `collection`: A sequence of the length equal to the number of levels in the collection, each with a map with keys:
      - `dim_value`: Value of the level (e.g. the height)
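
A sketch of a NetCDF configuration covering both kinds of collection is shown below. The collection names, filenames, variable names, reference time and level values are hypothetical placeholders.

```yaml
output: netcdf
netcdf_writer:
  reference_time: "2020-01-01T00:00:00"   # hypothetical ISO-formatted time
  collections:
    # Fields written as different variables (e.g. several 2D fields).
    surface_fields:                       # hypothetical collection name
      filename: surface.nc
      collection:                         # one entry per field in the collection
        - variable:
            name: tas
        - variable:
            name: pr
    # Fields written as layers of one variable (e.g. a 3D field).
    air_temperature:                      # hypothetical collection name
      filename: ta.nc                     # must differ from the other filenames
      variable:
        name: ta
      dimension:
        name: height
      collection:                         # one entry per level in the collection
        - dim_value: 10.0
        - dim_value: 100.0
        - dim_value: 1000.0
```
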
## Extending
Yaco's structure is designed to be easily extendible, in particular for other output formats and grids.
### Output formats
To add new output formats, a new class can be added in the `src/data_handler` directory that inherits from the `DataHandler` class and implements the `handle` method, which is called for each collection. Additionally, a creator function needs to be implemented with the signature `std::unique_ptr<DataHandler> DataHandler::createXXX(config::Value config)` and declared in `include/yaco/data_handler.hpp`. The selection and creation of the respective writer class also needs to be added in `src/main.cpp` accordingly.
### Grids
To add new grids, a new class can be added in the `src/grid` directory that inherits from the `Grid` (or the `UnstructuredGrid`) class and implements the respective methods (cf. `include/yaco/grid.hpp`). The selection and creation of the respective grid also needs to be added in `src/main.cpp` accordingly.