Added documentation for notebook

3d7294cc · Fabian Wachsmann · d868f7a9 · 3d7294cc
Commit 3d7294cc authored 4 years ago by Fabian Wachsmann
--- a/notebooks/use-case_analyse-time-series_xarray_cmip6.ipynb
+++ b/notebooks/use-case_analyse-time-series_xarray_cmip6.ipynb
@@ -134,7 +134,7 @@
    "### Post-process the data\n",
    "\n",
    "We define a function to calculate the __global and yearly mean__ of the variable. <br>\n",
-    "We assume that the grid is _rectangular_ . That implies that grid cell area is proportional to the cosinus of the latitude.\n",
+    "We assume that the grid is _rectangular_ (this is not valid for other sources). This implies that the grid cell area is proportional to the cosine of the latitude, so we can use these cosines as the _weights_ for the global mean.\n",
    "\n",
    "With these weights we can calculate a global mean.\n",
    "The yearly mean is calculated with a groupby."
@@ -242,7 +242,7 @@
    "    xlabel=\"Year\",\n",
    "    ylabel=label,\n",
    "    value_label=label,\n",
-    "    legend=\"bottom_right\",\n",
+    "    legend=\"top_left\",\n",
    "    ncol=3,\n",
    "    title=\"Global and yearly mean anomaly in comparison with\" \"1851-1880 \",\n",
    "    grid=True,\n",
@@ -261,9 +261,9 @@
 ],
 "metadata": {
  "kernelspec": {
-   "display_name": "test_env",
+   "display_name": "Python [conda env:root] *",
   "language": "python",
-   "name": "test_env"
+   "name": "conda-root-py"
  },
  "language_info": {
   "codemirror_mode": {
@@ -275,7 +275,7 @@
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
-   "version": "3.8.3"
+   "version": "3.7.3"
  }
 },
 "nbformat": 4,

 %% Cell type:markdown id: tags:

 ## Plot the time series of a simulated variable from CMIP6 data for 1850-2100
 We will plot the __anomaly__ of a variable with respect to a reference period __1851-1880__ from the historical simulation.

 To understand the code, we recommend being familiar with
 - intake-esm <br>
 check the notebook `tutorial_find-data_catalogs_intake-esm.ipynb`
 - pandas
 - xarray

 We use three different experiments from the CMIP6 data pool:
 - historical <br>
 simulation of the time span 1850-2014
 - ssp245
 - ssp585 <br>
 projections for the future 2015-2100

 We focus on the German Earth System Models (ESM)
 - MPI-ESM1-2-HR <br>
 Mainly developed by Max-Planck-Institut
 - AWI-CM-1-1-MR <br>
 Mainly developed by Alfred-Wegener-Institut

 See also:
 https://www.dkrz.de/projekte-und-partner/HLRE-Projekte/focus/neues-klimamodell

 %% Cell type:code id: tags:

 ``` python
 import intake
 import pandas as pd
 import hvplot.pandas
 import numpy as np
 ```

 %% Cell type:code id: tags:

 ``` python
 # Choose one of
 # pr, psl, tas, tasmax, tasmin, clt
 variable_id = "tas"
 ```

 %% Cell type:markdown id: tags:

 ### Finding data
 We open the CMIP6 catalog with intake

 %% Cell type:code id: tags:

 ``` python
 col_url = "https://swift.dkrz.de/v1/dkrz_a44962e3ba914c309a7421573a6949a6/intake-esm/mistral-cmip6.json"
 col = intake.open_esm_datastore(col_url)
 ```

 %% Cell type:markdown id: tags:

 #### Subsetting

 %% Cell type:markdown id: tags:

 We define a query and specify values for some columns.
 In the following case, we look for the specified variable in monthly resolution for the 3 different experiments.

 _Search for the ESMs by `source_id` rather than for their institutions by `institution_id`, because some experiments may have been conducted by other institutions._

 %% Cell type:code id: tags:

 ``` python
 query = dict(
    variable_id=variable_id,
    table_id="Amon",
    experiment_id=["historical", "ssp245", "ssp585"],
    source_id=["MPI-ESM1-2-HR", "AWI-CM-1-1-MR"],
 )
 # piControl = pre-industrial control, a simulation representing a stable climate from 1850 for more than 100 years (not queried here).
 # historical = historical simulation, 1850-2014
 # ssp245, ssp585 = scenarios based on Shared Socioeconomic Pathways (SSPs), i.e. projected socioeconomic global changes. The simulations cover 2015-2100
 cat = col.search(**query)
 ```
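
 The query above can be read as a column-wise filter on the catalog table: a row is kept only if, for every queried column, its value is one of the requested values. The following is a minimal sketch of that idea with plain pandas, not the intake-esm implementation. The miniature catalog, the `OTHER-ESM` entry and the helper `filter_catalog` are hypothetical and only serve the illustration.

```python
import pandas as pd

# Hypothetical miniature catalog; the real catalog has many more columns and rows.
catalog = pd.DataFrame(
    {
        "source_id": ["MPI-ESM1-2-HR", "MPI-ESM1-2-HR", "AWI-CM-1-1-MR", "OTHER-ESM"],
        "experiment_id": ["historical", "piControl", "ssp585", "historical"],
        "table_id": ["Amon", "Amon", "Amon", "Amon"],
        "variable_id": ["tas", "tas", "tas", "tas"],
    }
)


def filter_catalog(df, **query):
    """Keep rows whose value in each queried column is one of the requested values."""
    mask = pd.Series(True, index=df.index)
    for column, wanted in query.items():
        # A single string is treated like a one-element list.
        wanted = [wanted] if isinstance(wanted, str) else list(wanted)
        mask &= df[column].isin(wanted)
    return df[mask]


subset = filter_catalog(
    catalog,
    variable_id="tas",
    table_id="Amon",
    experiment_id=["historical", "ssp245", "ssp585"],
    source_id=["MPI-ESM1-2-HR", "AWI-CM-1-1-MR"],
)
print(subset)
```

 In this sketch the `piControl` row and the row of the hypothetical other model are dropped, while both German ESM rows remain.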

 %% Cell type:markdown id: tags:

 ### Accessing the data

 The catalog can give us a dictionary of xarray datasets. They are aggregated over time and member. The keys of the dictionary are concatenated directory names.

 %% Cell type:code id: tags:

 ``` python
 xrdsetdict = cat.to_dataset_dict()
 # Keys of the dataset dictionary are concatenations of 6 column labels:
 for key in xrdsetdict.keys():
    print(key)
 ```

 %% Cell type:markdown id: tags:

 ### Post-process the data

 We define a function to calculate the __global and yearly mean__ of the variable. <br>
-We assume that the grid is _rectangular_ . That implies that grid cell area is proportional to the cosinus of the latitude.
+We assume that the grid is _rectangular_ (this is not valid for other sources). This implies that the grid cell area is proportional to the cosine of the latitude, so we can use these cosines as the _weights_ for the global mean.

 With these weights we can calculate a global mean.
 The yearly mean is calculated with a groupby.
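
 To see why the weights matter, here is a small numpy-only sketch on a synthetic field. It assumes a regular latitude-longitude grid with 5 degree spacing. The function `cos_weighted_global_mean` and the temperature-like field are made up for the illustration and are not part of the notebook.

```python
import numpy as np


def cos_weighted_global_mean(field, lat_deg):
    """Cosine-of-latitude weighted mean of a (lat, lon) field on a regular grid."""
    weights = np.cos(np.deg2rad(lat_deg))  # one weight per latitude row
    zonal_mean = field.mean(axis=1)        # average over longitude first
    return np.average(zonal_mean, weights=weights)


# Cell centres of a hypothetical 5 degree grid.
lat = np.arange(-87.5, 90.0, 5.0)
lon = np.arange(0.0, 360.0, 5.0)

# Synthetic field: 300 at the equator, 260 towards the poles, constant in longitude.
field = 300.0 - 40.0 * np.abs(np.sin(np.deg2rad(lat)))[:, None] * np.ones(lon.size)

unweighted = field.mean()
weighted = cos_weighted_global_mean(field, lat)
print(unweighted, weighted)
```

 The unweighted mean gives the many small polar cells the same influence as the large tropical cells, so it comes out colder than the weighted mean in this example.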

 %% Cell type:code id: tags:

 ``` python
 def global_yearly_mean(dfirst):
    # Get weights
    weights = np.cos(np.deg2rad(dfirst.lat))
    # Weighted Variable
    varset = dfirst.get(variable_id)
    wvar = varset.weighted(weights)
    # Global mean (weighted over lon and lat):
    wvargm = wvar.mean(("lon", "lat"))
    # Yearly mean:
    wvargmym = wvargm.groupby("time.year").mean("time")
    return wvargmym.values
 ```
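
 The `groupby("time.year")` step in the function above collects all time steps of the same calendar year and averages them. The same idea can be sketched with pandas on a synthetic monthly series. The values are made up, and pandas is used here instead of xarray only to keep the example small.

```python
import numpy as np
import pandas as pd

# Synthetic monthly series covering two full years (values 0..23).
time = pd.date_range("2000-01-01", periods=24, freq="MS")
monthly = pd.Series(np.arange(24, dtype=float), index=time)

# Group the monthly values by calendar year and average each group.
yearly = monthly.groupby(monthly.index.year).mean()
print(yearly)
```

 Each month counts equally in this mean, regardless of its number of days.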

 %% Cell type:markdown id: tags:

 We calculate the historical reference of the variable from the years 1851-1880 in _one_ historical simulation of _one_ ESM.
 With that we can afterwards calculate anomalies.
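
 As a minimal sketch of the anomaly calculation, assume a synthetic series of yearly means with a small linear trend. The numbers are invented. Only the reference period 1851-1880 is taken from the text above.

```python
import numpy as np

# Synthetic yearly global means (hypothetical values with a small linear trend).
years = np.arange(1850, 1900)
yearly_mean = 287.0 + 0.01 * (years - 1850)

# Reference value: mean over the 30 years 1851-1880 (both ends included).
ref_mask = (years >= 1851) & (years <= 1880)
reference = yearly_mean[ref_mask].mean()

# Anomaly: deviation of every year from the reference value.
anomaly = yearly_mean - reference
print(reference, anomaly[:3])
```

 By construction, the anomalies average to zero over the reference period, so later values show the change relative to 1851-1880.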

 %% Cell type:code id: tags:

 ``` python
 historical = [key for key in xrdsetdict.keys() if "historical" in key][0]
 dshist = xrdsetdict[historical]
 tashist = dshist.sel(time=dshist.time.dt.year.isin(range(1851, 1881)))
 # 10member
 tashistgmym = global_yearly_mean(tashist)
 tashistgmymm = tashistgmym.mean()
 ```

 %% Cell type:markdown id: tags:

 We retrieve attributes from the variable to label axes in the plot.

 %% Cell type:code id: tags:

 ``` python
 lname = dshist.get(variable_id).attrs["long_name"]
 units = dshist.get(variable_id).attrs["units"]
 label = "Delta " + lname + " [" + units + "]"
 ```

 %% Cell type:markdown id: tags:

 The plot is generated by a function on top of a __pandas dataframe__.

 We define one and write the results of each global yearly mean into that dataframe. We use parts of the keys of the dataset dictionary as column names of the dataframe.
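
 The following sketch shows how a column name can be derived from a dictionary key and how values are written into a year-indexed dataframe. The key is a hypothetical example with 6 dot-separated labels, because the actual keys depend on the catalog. The anomaly values are invented.

```python
import pandas as pd

# Hypothetical key; the actual keys depend on the catalog's configuration.
key = "CMIP.MPI-M.MPI-ESM1-2-HR.historical.Amon.gn"
column = ".".join(key.split(".")[1:4])  # keep the 2nd to 4th label

# Empty frame indexed by year; years without data stay NaN.
frame = pd.DataFrame(index=range(1850, 1854), columns=[column])
years = [1850, 1851, 1852]
frame.loc[years, column] = [0.10, -0.05, 0.20]
print(frame)
```

 Because the frame is created for the full range of years first, experiments that cover only part of the period leave the remaining rows empty instead of shifting the data.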

 %% Cell type:code id: tags:

 ``` python
 lxr = list(xrdsetdict.keys())
 columns = [".".join(elem.split(".")[1:4]) for elem in lxr]
 print(columns)
 tasgmympd = pd.DataFrame(index=range(1850, 2101), columns=columns)
 for key in xrdsetdict.keys():
    print([".".join(key.split(".")[1:4])])
    datatoappend = global_yearly_mean(xrdsetdict[key])[0, :] - tashistgmymm
    years = list(xrdsetdict[key].get(variable_id).groupby("time.year").groups.keys())
    tasgmympd.loc[years, ".".join(key.split(".")[1:4])] = datatoappend
 ```

 %% Cell type:markdown id: tags:

 ### Plotting the data

 %% Cell type:code id: tags:

 ``` python
 tasgmympd.hvplot.line(
    xlabel="Year",
    ylabel=label,
    value_label=label,
-    legend="bottom_right",
+    legend="top_left",
    ncol=3,
    title="Global and yearly mean anomaly in comparison with 1851-1880",
    grid=True,
    height=600,
    width=820,
 )
 ```

 %% Cell type:code id: tags:

 ``` python
 ```