diff --git a/lectures/good-practice/slides.qmd b/lectures/good-practice/slides.qmd
index ec167295c0c66c2ac1fac1f85f01a4a0a36ffa7d..ffd0c13121152ea5b8a4b4f681f044ac206f9f1c 100644
--- a/lectures/good-practice/slides.qmd
+++ b/lectures/good-practice/slides.qmd
@@ -6,11 +6,15 @@ author: "Bjorn Stevens and Theresa Mieslinger"
 # Good Scientific Practice
 *Building trust in research. And in your own work.*
-## What is it and why should we care?
-* Principles fomulated by the research community that define proper research behaviour with the aim to ensure a high quality, robustness and reproducibility of results (publications, data, code, software).
-* How it relates to this course: it provides rules for building software, using own and other software/data and for communicate the usage of software/data.
+## What is it?
+Principles formulated by the research community that define ***proper research behaviour*** with the aim of ensuring high quality, robustness and reproducibility of results (publications, data, code, software).
-## The pillars of Good Scientific Practice
+## Why should we care?
+* guidelines for building software
+* using your own and others' software/data
+* communicating the usage of software/data
+
+## The pillars of Good Scientific Practice {.smaller}
 ::: {.incremental}
 * **Reliability** in ensuring the quality of research, reflected in the design, methodology, analysis, and use of resources.
@@ -22,33 +26,203 @@ author: "Bjorn Stevens and Theresa Mieslinger"
 ## The pillars of Good Scientific Practice
-* Reliability / **Reproducibility**
-  * primary data
-  * data management and sharing
+* **Reliability & Reproducibility**
+  * *primary data*
+  * *data management*
 * Honesty
 * **Respect & Accountability**
-  * authorship
-  * proper citation and referencing
+  * *authorship*
+  * *citation*
-# Reproducibility
+# Reliability & Reproducibility
 ## What do we want to reproduce?
{.special}
 ::: {.fragment}
-*the argument*
+*the scientific argument*
+:::
+
+::: {.notes}
+* example: you run ICON to model a future climate scenario for 2050
+* example: calculate the orbit of EarthCARE
+:::
+
+## What do we need to save and how? {.special}
+
+### Primary Data
+::: {.fragment}
+* code, configuration, input data
+* data management
 :::
-## What is needed to reproduce the argument? What do we need to save and how? {.special}
+::: {.notes}
+* What is needed to reproduce the argument?
+:::
 ## What should we document? {.special}
+:::fragment
+*intent and usage*
+:::
+##
+### Documentation
+
+| self-explanatory code | [Commenting Showing Intent (CSI)](https://standards.mousepawmedia.com/en/stable/csi.html) | Docstrings & Manuals |
+| ----------- | ----------- | ----------- |
+| Actual behaviour | Intent and design of code | software / API usage |
+| developers / maintainers | developers / maintainers | end-developers and end-users |
+| **WHAT** does the code do? | **WHY** was the code written? | **HOW** to use it? |
+
+
+:::{.notes}
+* self-explanatory code: meaningful variable and function names, modular structure
+* CSI: makes code language-agnostic!
+* Docs: includes docstrings
+:::
+
+##
+### Commenting Showing Intent (CSI)
+
+:::leftalign
+Bad example stating **WHAT**
+
+```cpp
+// set box_width to equal the floor of items and 17
+int items_per_box = floor(items/17);
+```
+:::fragment
+Good example stating **WHY**
+
+```cpp
+/* Divide our items among 17 boxes.
+ * We'll deal with the leftovers later. */
+int items_per_box = floor(items/17);
+```
+:::
+:::
+
+::: {.notes}
+* don't state the obvious
+:::
+
+##
+### Docstrings and Manuals
+
+:::: {.columns}
+
+::: {.column width="50%" .smaller}
+
+```python
+def sum(a, axis=None, dtype=None, out=None, keepdims=np._NoValue,
+        initial=np._NoValue, where=np._NoValue):
+    """
+    Sum of array elements over a given axis.
+
+    Parameters
+    ----------
+    a : array_like
+        Elements to sum.
+    axis : None or int or tuple of ints, optional
+        Axis or axes along which a sum is performed. The default,
+        axis=None, will sum all of the elements of the input array. If
+        axis is negative it counts from the last to the first axis.
-## FAIR data
-* state the ideas and problems of FAIR data
+        .. versionadded:: 1.7.0
+
+        If axis is a tuple of ints, a sum is performed on all of the axes
+        specified in the tuple instead of a single axis or all the axes as
+        before.
+    dtype : dtype, optional
+        The type of the returned array and of the accumulator in which the
+        elements are summed. The dtype of `a` is used by default unless `a`
+        has an integer dtype of less precision than the default platform
+        integer. In that case, if `a` is signed then the platform integer
+        is used while if `a` is unsigned then an unsigned integer of the
+        same precision as the platform integer is used.
+    out : ndarray, optional
+        Alternative output array in which to place the result. It must have
+        the same shape as the expected output, but the type of the output
+        values will be cast if necessary.
+    keepdims : bool, optional
+        If this is set to True, the axes which are reduced are left
+        in the result as dimensions with size one. With this option,
+        the result will broadcast correctly against the input array.
+
+        If the default value is passed, then `keepdims` will not be
+        passed through to the `sum` method of sub-classes of
+        `ndarray`, however any non-default value will be. If the
+        sub-class' method does not implement `keepdims` any
+        exceptions will be raised.
+    initial : scalar, optional
+        Starting value for the sum. See `~numpy.ufunc.reduce` for details.
+
+        .. versionadded:: 1.15.0
+
+    where : array_like of bool, optional
+        Elements to include in the sum. See `~numpy.ufunc.reduce` for details.
+
+        .. versionadded:: 1.17.0
+
+    Returns
+    -------
+    sum_along_axis : ndarray
+        An array with the same shape as `a`, with the specified
+        axis removed.
If `a` is a 0-d array, or if `axis` is None, a scalar
+        is returned. If an output array is specified, a reference to
+        `out` is returned.
+```
+:::
+
+::: {.column width="50%"}
+-> this docstring is rendered as the [numpy.sum](https://numpy.org/doc/stable/reference/generated/numpy.sum.html) API documentation
+
+
+:::{.info .smaller}
+See also Python docstring conventions [PEP-257](https://peps.python.org/pep-0257/).
+:::
+:::
+::::
+
+## (Meta)data and FAIR principles
+* **Findable**: unique identifiers, metadata registered in a searchable resource
+* **Accessible**: (meta)data retrievable via a standardised communication protocol
+* **Interoperable**: compatibility with other data through, e.g., [CF conventions](http://cfconventions.org/) and common data formats such as `netCDF`, `zarr`, `csv`
+* **Reusable**: (meta)data description, attributes, data usage license
+
+:::{.info .smaller}
+[FAIR principles](https://www.go-fair.org/fair-principles/)
+:::
+
+## Beyond FAIR data
+Make it FUN :)
+
+:::: {.columns}
+
+::: {.column width="50%"}
+**Issues with FAIR data**
+
+* data availability is not guaranteed
+* access is possible only with credentials
+* DOIs don't point to the data, but only to landing pages
+:::
+
+::: {.column width="50%"}
+**Make it FUN to work with the data**
+
+* design user-friendly datasets
+* make them publicly available
+:::
+::::
 ## Which tools shall we use?
-* open source / development
-* trustworthy sources
+### Open Source
+A decentralized software development model that encourages open collaboration.
+It relies on goal-oriented yet loosely coordinated participants who cooperate voluntarily to create a product (or service) of economic value, which is made freely available to contributors and noncontributors alike.
+
+* trustworthy sources, supply-chain bugs
+::: {.notes}
+* you are responsible for the result
+:::
 ## Summary on Reproducibility
 * save the primary data needed to reproduce the argument of your scientific study
@@ -61,8 +235,10 @@ author: "Bjorn Stevens and Theresa Mieslinger"
 CC0 versus CC-BY
 ## Intellectual Property (IP)
+Copyright versus Copyleft
 ## Using AI
+* you are still responsible for your scientific results
 ## Summary on Authorship and Credit
@@ -96,3 +272,6 @@ Good Scientific Practice ensures research integrity and the advancement of knowl
 # Further Reading
 * [European Code of Conduct for Research Integrity](https://allea.org/wp-content/uploads/2023/06/European-Code-of-Conduct-Revised-Edition-2023.pdf)
 * [DFG Guidlines for Safeguarding Good Research Practice. Code of Conduct](https://zenodo.org/records/6472827)
+
+# Slide example
+
diff --git a/lectures/good-practice/static/numpy_api_example.png b/lectures/good-practice/static/numpy_api_example.png
new file mode 100644
index 0000000000000000000000000000000000000000..cf4767be0c663223e8e258e2b354a074261c48fa
Binary files /dev/null and b/lectures/good-practice/static/numpy_api_example.png differ
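
The three documentation layers introduced in the slides (self-explanatory code, an intent comment, and a docstring) can be sketched together in one small function. This is only an illustrative sketch: the function name `items_per_box` and its `boxes` parameter are hypothetical, borrowed from the CSI slide's C++ snippet and rewritten in Python, and the docstring layout follows the numpydoc style shown in the `numpy.sum` excerpt.

```python
def items_per_box(items: int, boxes: int = 17) -> int:
    """
    Number of items that fit evenly into each box.

    Parameters
    ----------
    items : int
        Total number of items to distribute.
    boxes : int, optional
        Number of boxes to fill. Defaults to 17, as in the slide example.

    Returns
    -------
    int
        Items per box; leftover items are not included.
    """
    # Intent comment (WHY): divide our items evenly among the boxes.
    # We'll deal with the leftovers later.
    return items // boxes


if __name__ == "__main__":
    # The docstring (HOW) is what help() shows to users of the function.
    help(items_per_box)
    print(items_per_box(100))
```

The meaningful names carry the WHAT, the comment carries the WHY, and the docstring carries the HOW, so each audience in the documentation table finds its information in one place.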