# Introduction

This project provides scripts to build and install the libraries needed for ICON I/O into working configurations with as little effort as possible.

The intention is, on the one hand, to gather recipes that allowed for a successful build at multiple sites, to highlight the specific and general issues involved, and to keep track of required patches. On the other hand, it is hoped that, should a new, desirable version of some library arise, recompiling the whole stack becomes an easier affair. Since the libraries need to be consistent with each other, rapidly building the whole stack and running the corresponding test suites should improve confidence that the resulting installation can actually be used for ICON.

# Scope

This script facilitates a consistent from-source installation for the following packages:

+ __libaec__
    Adaptive entropy coding data compression library, used by __hdf5__ and __eccodes__.
+ __HDF5__
    Data container library commonly used by netcdf version 4 and later.
+ __Pnetcdf__, also known as __parallel-netcdf__
    A library for parallel access to those file formats of the netcdf library that do not use the HDF5 library.
+ __NetCDF-C__
    A library to access a data file format common in a number of physical sciences.
+ __netcdf-fortran__
    A Fortran wrapper library for netcdf-c.
+ __eccodes__
    A library to read and/or write data in the WMO GRIB1 and GRIB2 formats. Provided by ECMWF.
+ __YAXT__
    Facilitates various data exchanges for arrays distributed over multiple MPI tasks. Effectively removes the necessity to code individual MPI message passing calls.
+ __PPM__
    Partitioning and Parallelization Module, a library to aid in various recurring tasks of parallel programs.
+ __CDI__
    Provides an abstraction for multiple data formats to ease switching between COARDS conforming netCDF and WMO GRIB formats. Includes the *CDI-PIO* parallelization layer needed for parallel output from a number of climate/weather models.


In various parts of the scripts, each package is referred to by its name in all lower-case. This so-called package key is used to index the various associative arrays that store information on each package.

## Scripts Overview

The supplied scripts provide the following:

+ __build-cdi-pio-stack.sh__
    This is the main driver that is used by all the system-specific scripts. It provides the basic recipes and takes care of the dependencies and general handling of static and/or dynamic libraries. It provides various seams (see below) to customize its operation to the specific properties of the target system.
+ __build-cdi-pio-stack-daint-cce.sh__
    Builds the library stack for Piz Daint at CSCS with the current Cray compiler.
+ __build-cdi-pio-stack-daint-cce-10.0.2.sh__
    Builds the library stack for Piz Daint at CSCS with the older Cray compiler version 10.0.2.
+ __build-cdi-pio-stack-daint-pgi.sh__
    Builds the library stack for Piz Daint at CSCS with the PGI compiler version 20.1.
+ __build-cdi-pio-stack-juwels-booster-nvhpc-20.11.sh__
    Builds the libraries for JSC JUWELS Booster with the NVIDIA compilers, version 20.11.
+ __build-cdi-pio-stack-juwels-booster-nvhpc-21.5.sh__
    Builds the libraries for JSC JUWELS Booster with the NVIDIA compilers, version 21.5.
+ __build-cdi-pio-stack-mistral-intel-openmpi2.sh__
    Builds the libraries on DKRZ Mistral with the Intel compiler, version 17.0.6 for compatibility with more user setups.
+ __build-cdi-pio-stack-mistral-nag-openmpi2.sh__
    Builds the libraries on DKRZ Mistral with the NAG compiler version 6.2.
+ __build-cdi-pio-stack-vader-gcc-10.2-ompi-4.0.5.sh__
    Builds the whole stack on the DKRZ ML cluster with gcc 10.2.0 and OpenMPI 4.0.5.
+ __build-cdi-pio-stack-vader-icc-impi.sh__
    Builds the whole stack on the DKRZ ML cluster with Intel compiler 2021.1 and Intel MPI 2021.1.
+ __build-cdi-pio-stack-vader-icc-impi-O0.sh__
    Builds the whole stack on the DKRZ ML cluster with Intel compiler 2021.1 and Intel MPI 2021.1 in a debugging configuration.

After some initial difficulties, most of the wrappers can now perform the build in a tmpfs file system for much improved speed.

## Invoking the main driver script

The main driver can be passed the following settings via command-line arguments (preferred) or via environment variables:

+ __PAR_BUILD__
    Number of parallel make jobs to use. Default is to use 22 tasks.
+ __EXTRA_MAKE_ARGS__
    Extra arguments to pass to `make`. This can e.g. be used to set some common make variables not treated explicitly below.
+ __EXTRA_MAKE_CHECK_ARGS__
    Extra arguments to pass to `make check`. This can e.g. be used to set some common make variables not treated explicitly below.
+ __make__
    Name of the make program to invoke. This defaults to the value of the `MAKE` variable or, if that is unset, to `make`, or to `gmake` when `make` cannot be found in the `PATH`.
+ __build__
    A tag to represent the build target, usually by the name and version of compiler and MPI library.
+ __stages__
    build-cdi-pio-stack.sh runs the following steps for each package it considers to be built:

    + *download*
        Downloads the source package archives and/or git clones.
    + *unpack*
        Expands archived sources and applies patches. Note that patches are only applied at the initial unpacking. Once the source directory exists, no patches are applied.
    + *build*
        Performs the `configure && make` steps of the build for every package with unfinished installation.
    + *check*
        Runs `make check` in each build directory.
    + *install*
        Performs a `make install` for each package not already installed.

    Optionally, the *recheck* stage re-runs the test suite of packages already successfully installed.

    Separating the stages is most useful where the download must be carried out on another system because the HPC system blocks some network connections, or where the environment for running `make check` differs significantly from the build environment, e.g. because a batch allocation must be provided for testing.
+ __CC__, __FC__, __CXX__, __F77__
    Commands to invoke the C, Fortran and C++ compilers. __CC__, __FC__ and __CXX__ default to `mpicc`, `mpifort` and `mpic++` respectively. __F77__ is special in that it defaults to the value of __FC__.
+ __CPPFLAGS__, __CFLAGS__, __FCFLAGS__, __FFLAGS__, __CXXFLAGS__
    Initial flag variables for the C preprocessor and compiler, the Fortran and Fortran 77 compilers and the C++ compiler. These default to `-g -O2`, but it is recommended to at least add flags to adjust code generation for the target architecture, e.g. `-march=native` for gcc.
+ __AR__
    Command to create a standard Unix (library) archive, defaults to `ar`.
+ __RANLIB__
    Command to index a standard Unix (library) archive, defaults to `ranlib`.
+ __CC_PIC_FLAGS__
    Flag(s) to __CC__ to produce object files that can be incorporated into dynamic shared objects. The default value is `-fPIC`.
+ __LIBS__ and __LDFLAGS__
    These variables are meant to hold flags and library specifications in case some special library is needed (e.g. a custom malloc library) or some parts of the library stack are already installed by other means.
+ __MPI_LAUNCH__
    Program to start MPI-parallelized programs, defaults to what the configuration scripts of the libraries pick up by automatic configuration, typically `mpirun`. When using the Slurm batch scheduler, `srun` is usually preferred; on Cray systems `aprun` might be needed.
+ __CMAKE_EXTRA_ARGS__
    The contents of this variable are expanded into multiple arguments to be added to the invocation of cmake for eccodes.
+ __libtype__
    Should take one of the values `static`, `shared`, or `both`. Default is to build shared objects only.
+ __packages_dl__
    Is an associative array mapping each package to its downloadable archive URL.
+ __CC_rpath_flag__, __FC_rpath_flag__
    This defaults to `-Wl,-rpath,` and is meant to be the prefix for additions to the RUNPATH and/or RPATH entries of shared objects via compiler link step flags. Notably, FC_rpath_flag must be set to `-Wl,-Wl,,-rpath,,` for the NAG Fortran compiler.
+ __package_git__, __package_git_branch__
    URL and branch to use for packages to retrieve via `git clone`.
+ __basedir__
    Path at which to root the following by default:

    + __archivedir__
        Where to store downloaded archive files, defaults to `$basedir/archive`.
    + __srcdir__
        Where to unpack the source archives to. When set to e.g. `/some/path`, netcdf-c 4.7.4 would be unpacked to `/some/path/netcdf-c-4.7.4`. Defaults to `$basedir/src`.
    + __builddir__
        Where to perform compilations. If at all possible, it is suggested to put this on tmpfs to reduce total build time significantly. Defaults to `$basedir/build/$build` with individual packages being built in a subdirectory corresponding to the package name.
    + __prefix__
        Where to install packages to, defaults to `${basedir}/opt/${build}`. If the `multi_installs` variable is set, each package is installed into an individual sub-directory, and the following substitutions are performed on `$prefix`:

        + `%k` is replaced by the package base name, e.g. libaec
        + `%n` is replaced by the package base name followed by the version or git commit hash, separated by a dash.
        + `%b` is replaced by `$build`.
        + `%v` is replaced by the package version
+ __NC_H5_CACHE_SIZE__, __NC_H5_CHUNK_CACHE_NELEMS__
    Tunables for netcdf-c default chunking parameters. The default values are 4194304 and 1009 respectively.
+ __SCRATCH__
    A directory with sufficient free space for the large file tests of the various test suites. Must be writable for programs started by __MPI_LAUNCH__.

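
The `%k`, `%n`, `%b` and `%v` substitutions on `$prefix` described above can be illustrated with a small stand-alone sketch. The function name `expand_prefix` and its argument order are hypothetical and not taken from build-cdi-pio-stack.sh; the driver may implement the substitution differently. The sketch also assumes that none of the inserted values contain `|`, `&` or a backslash.

```shell
#!/bin/sh
# Hypothetical helper, for illustration only; not part of the driver script.
# Arguments: $1 = prefix template, $2 = package base name,
#            $3 = package version (or git commit hash), $4 = build tag.
# Assumption: $2, $3 and $4 contain no '|', '&' or backslash characters.
expand_prefix() {
  printf '%s\n' "$1" | sed \
    -e "s|%k|$2|g" \
    -e "s|%n|$2-$3|g" \
    -e "s|%b|$4|g" \
    -e "s|%v|$3|g"
}

# Example values are made up for this sketch.
expand_prefix '/opt/stack/%b/%n' netcdf-c 4.7.4 gcc-10.2
# prints /opt/stack/gcc-10.2/netcdf-c-4.7.4
```

With a template such as `/opt/stack/%b/%k/%v`, the same call would place each package in its own versioned directory below a per-build root.
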

Additionally, per-package variables can be provided by prefixing the following suffixes with the canonicalized package name. To canonicalize a name, every character that is neither alphanumeric nor an underscore is replaced by an underscore and all letters are converted to lower case, e.g. NetCDF-C becomes the `netcdf_c` prefix. Optionally, the prefix can be extended with the canonicalized version to provide for fixes known to be needed for a specific package version only.

+ *prefix*_configure
    Contains extra arguments to pass to the configure step of the package denoted by *prefix*.
+ *prefix*_configure_env
    This variable undergoes variable expansion as a prefix of the configure command and can for example be used to run configure with a different shell or via `salloc`.
+ *prefix*_check_env
    This variable serves the same purpose as *prefix*_configure_env but for the stage running the test suite.

Presuming a package with canonical name pkg is already installed in the system and the various settings are adjusted to make use of it, adding the following argument suppresses building the package:

    --use-from-system=pkg
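
The canonicalization rule for package names can be sketched as follows. This is a minimal illustration, not the code used by the driver: the function name `canonicalize_pkg_name` is hypothetical, and only the name is canonicalized here, since the exact way a version is appended to the prefix is not specified above.

```shell
#!/bin/sh
# Hypothetical helper, for illustration only; not part of the driver script.
# Converts a package name to lower case and replaces every character
# that is neither alphanumeric nor an underscore with an underscore.
canonicalize_pkg_name() {
  printf '%s\n' "$1" \
    | LC_ALL=C tr '[:upper:]' '[:lower:]' \
    | LC_ALL=C sed 's/[^a-z0-9_]/_/g'
}

# Derive the name of the variable holding extra configure arguments.
prefix=$(canonicalize_pkg_name "NetCDF-C")
printf '%s\n' "${prefix}_configure"
# prints netcdf_c_configure
```

Under this rule, extra configure arguments for NetCDF-C would be supplied in a variable named `netcdf_c_configure`.
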