Commit 6cf1eee8 authored by Theresa Mieslinger

added content to Reliability section

parent bd7ac67f
1 merge request: !57 First draft for Good Scientific and Coding Practice
@@ -6,11 +6,15 @@ author: "Bjorn Stevens and Theresa Mieslinger"
# Good Scientific Practice
*Building trust in research. And in your own work.*
## What is it and why should we care? ## What is it?
* Principles formulated by the research community that define proper research behaviour, with the aim of ensuring high quality, robustness and reproducibility of results (publications, data, code, software). Principles formulated by the research community that define ***proper research behaviour***, with the aim of ensuring high quality, robustness and reproducibility of results (publications, data, code, software).
* How it relates to this course: it provides rules for building software, for using your own and others' software/data, and for communicating the usage of software/data.
## The pillars of Good Scientific Practice ## Why should we care?
* guidelines for building software
* using your own and others' software/data
* communicating the usage of software/data
## The pillars of Good Scientific Practice {.smaller}
::: {.incremental}
* **Reliability** in ensuring the quality of research, reflected in the design, methodology, analysis, and use of resources.
@@ -22,33 +26,203 @@ author: "Bjorn Stevens and Theresa Mieslinger"
## The pillars of Good Scientific Practice
* Reliability / **Reproducibility** * **Reliability & Reproducibility**
* primary data * *primary data*
* data management and sharing * *data management*
* Honesty
* **Respect & Accountability**
* authorship * *authorship*
* proper citation and referencing * *citation*
# Reproducibility # Reliability & Reproducibility
## What do we want to reproduce? {.special}
::: {.fragment}
*the argument* *the scientific argument*
:::
::: {.notes}
* example: you run ICON to model a future climate scenario for 2050
* example: calculate the orbit of EarthCARE
:::
## What do we need to save and how? {.special}
### Primary Data
::: {.fragment}
* code, configuration, input data
* data management
:::
## What is needed to reproduce the argument? What do we need to save and how? {.special} ::: {.notes}
* What is needed to reproduce the argument?
:::
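
One possible way to record the primary data behind a result is a small manifest that stores the configuration together with checksums of the input files. The sketch below is only an illustration under stated assumptions: the function names `sha256_of` and `build_manifest`, the manifest layout, and the example configuration are all hypothetical and not part of the course material.

```python
import hashlib
import json
import sys
import tempfile
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(config, input_files):
    """Collect configuration, input checksums and the Python version."""
    return {
        "python_version": sys.version.split()[0],
        "config": config,
        "inputs": {str(p): sha256_of(p) for p in input_files},
    }


if __name__ == "__main__":
    # Hypothetical input file and configuration, created in a temporary
    # directory so the example does not touch any real data.
    with tempfile.TemporaryDirectory() as tmp:
        input_file = Path(tmp) / "input.txt"
        input_file.write_text("example input data\n")
        manifest = build_manifest({"time_step_s": 60}, [input_file])
        print(json.dumps(manifest, indent=2))
```

Storing such a manifest next to the results (and the code under version control) makes it possible to check later whether a rerun used the same inputs.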
## What should we document? {.special}
:::fragment
*intent and usage*
:::
##
### Documentation
| self-explanatory code | [Commenting Showing Intent (CSI)](https://standards.mousepawmedia.com/en/stable/csi.html) | Docstrings & Manuals |
| -----------| ----------- | ----------- |
| Actual behaviour | Intent and design of code | software / API usage |
| developers / maintainers | developers / maintainers | end-developers and -users |
| **WHAT** does the code do? | **WHY** was the code written? | **HOW** to use it? |
:::{.notes}
* self-explanatory code: meaningful variable and function names, modular structure
* CSI: makes code language-agnostic!
* Documentation: includes docstrings
:::
##
### Commenting Showing Intent (CSI)
:::leftalign
Bad example stating **WHAT**
```cpp
// set box_width to equal the floor of items and 17
int items_per_box = floor(items/17);
```
:::fragment
Good example stating **WHY**
```cpp
/* Divide our items among 17 boxes.
* We'll deal with the leftovers later. */
int items_per_box = floor(items/17);
```
:::
:::
::: {.notes}
* don't state the obvious
:::
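
The same idea carries over to other languages. As a rough Python sketch (the constant `BOXES` and both function names are made up for illustration, mirroring the box example above), the comments state why each step exists rather than restating the arithmetic:

```python
# Number of boxes we have to fill (hypothetical value taken from the
# slide example above).
BOXES = 17


def items_per_box(items):
    # Divide our items evenly among the boxes.
    # Leftovers are handled separately by leftover_items().
    return items // BOXES


def leftover_items(items):
    # Whatever does not divide evenly still has to be shipped,
    # so we report it instead of silently dropping it.
    return items % BOXES


if __name__ == "__main__":
    print(items_per_box(100), leftover_items(100))
```

If the code were ported to another language, the comments would remain valid because they describe intent, not syntax.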
##
### Docstrings and Manuals
:::: {.columns}
::: {.column width="50%" .smaller}
```python
def sum(a, axis=None, dtype=None, out=None, keepdims=np._NoValue,
        initial=np._NoValue, where=np._NoValue):
    """
    Sum of array elements over a given axis.

    Parameters
    ----------
    a : array_like
        Elements to sum.
    axis : None or int or tuple of ints, optional
        Axis or axes along which a sum is performed. The default,
        axis=None, will sum all of the elements of the input array. If
        axis is negative it counts from the last to the first axis.

        .. versionadded:: 1.7.0

        If axis is a tuple of ints, a sum is performed on all of the axes
        specified in the tuple instead of a single axis or all the axes as
        before.
    dtype : dtype, optional
        The type of the returned array and of the accumulator in which the
        elements are summed. The dtype of `a` is used by default unless `a`
        has an integer dtype of less precision than the default platform
        integer. In that case, if `a` is signed then the platform integer
        is used while if `a` is unsigned then an unsigned integer of the
        same precision as the platform integer is used.
    out : ndarray, optional
        Alternative output array in which to place the result. It must have
        the same shape as the expected output, but the type of the output
        values will be cast if necessary.
    keepdims : bool, optional
        If this is set to True, the axes which are reduced are left
        in the result as dimensions with size one. With this option,
        the result will broadcast correctly against the input array.

        If the default value is passed, then `keepdims` will not be
        passed through to the `sum` method of sub-classes of
        `ndarray`, however any non-default value will be. If the
        sub-class' method does not implement `keepdims` any
        exceptions will be raised.
    initial : scalar, optional
        Starting value for the sum. See `~numpy.ufunc.reduce` for details.

        .. versionadded:: 1.15.0

    where : array_like of bool, optional
        Elements to include in the sum. See `~numpy.ufunc.reduce` for details.

        .. versionadded:: 1.17.0

    Returns
    -------
    sum_along_axis : ndarray
        An array with the same shape as `a`, with the specified
        axis removed. If `a` is a 0-d array, or if `axis` is None, a scalar
        is returned. If an output array is specified, a reference to
        `out` is returned.
    """
```
:::
::: {.column width="50%"}
-> the docstring is used to generate the [numpy.sum](https://numpy.org/doc/stable/reference/generated/numpy.sum.html) API documentation
![](static/numpy_api_example.png)
:::{.info .smaller}
See also Python docstring conventions [PEP-257](https://peps.python.org/pep-0257/).
:::
:::
::::
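
For your own functions a much shorter docstring in the same layout is usually enough. The following is a minimal sketch, assuming a made-up helper `celsius_to_kelvin`; it also shows that the docstring is stored on the function object, which is where `help()` and documentation generators read it from.

```python
def celsius_to_kelvin(temperature_c):
    """
    Convert a temperature from degrees Celsius to kelvin.

    Parameters
    ----------
    temperature_c : float
        Temperature in degrees Celsius.

    Returns
    -------
    float
        Temperature in kelvin.
    """
    return temperature_c + 273.15


if __name__ == "__main__":
    # The docstring is available at runtime via the __doc__ attribute.
    print(celsius_to_kelvin.__doc__)
```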
## (Meta)data and FAIR principles
* **Findable**: unique identifiers, metadata registered in a searchable resource
* **Accessible**: (meta)data retrievable via a standardised communication protocol
* **Interoperable**: compatibility with other data through, e.g., [CF conventions](http://cfconventions.org/) and common data formats such as `netCDF`, `zarr`, `csv`.
* **Reusable**: (meta)data description, attributes, data usage license
:::{.info .smaller}
[FAIR principles](https://www.go-fair.org/fair-principles/)
:::
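
A simple way to make these principles concrete is to check a dataset's metadata for a minimum set of attributes before publishing. The sketch below is an assumption-laden illustration, not an official FAIR checklist: the attribute names in `REQUIRED_ATTRIBUTES`, the function `missing_fair_attributes`, and the example metadata are all hypothetical.

```python
# Hypothetical minimum set of metadata attributes, each loosely mapped
# to the FAIR principle it supports. Not an official checklist.
REQUIRED_ATTRIBUTES = {
    "identifier": "Findable",
    "access_url": "Accessible",
    "conventions": "Interoperable",
    "license": "Reusable",
}


def missing_fair_attributes(metadata):
    """Return {attribute: principle} for attributes that are absent or empty."""
    return {
        name: principle
        for name, principle in REQUIRED_ATTRIBUTES.items()
        if not metadata.get(name)
    }


if __name__ == "__main__":
    # Made-up metadata record for illustration; the license is missing.
    example_metadata = {
        "identifier": "example-dataset-001",
        "access_url": "https://example.org/dataset/001",
        "conventions": "CF-1.8",
    }
    print(missing_fair_attributes(example_metadata))
```

A check like this only confirms that the fields exist; whether the identifier resolves or the data are actually retrievable still has to be verified separately.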
## Beyond FAIR data
Make it FUN :)
:::: {.columns}
::: {.column width="50%"}
**Issues with FAIR data**
* data availability not guaranteed
* access possible only with credentials
* DOIs don't point to data, but only to landing pages
:::
::: {.column width="50%"}
**Make it FUN to work with the data**
* design user-friendly datasets
* make them publicly available
:::
::::
## Which tools shall we use?
* open source / development ### Open Source
* trustworthy sources decentralized software development model that encourages open collaboration
relies on goal-oriented yet loosely coordinated participants who cooperate voluntarily to create a product (or service) of economic value, which is made freely available to contributors and noncontributors alike
* trustworthy sources, supply chain bug
::: {.notes}
* you are responsible for the result
:::
## Summary on Reproducibility
* save the primary data needed to reproduce the argument of your scientific study
@@ -61,8 +235,10 @@ author: "Bjorn Stevens and Theresa Mieslinger"
CC0 versus CC-BY
## Intellectual Property (IP)
Copyright versus Copyleft
## Using AI
* you are still responsible for your scientific results
## Summary on Authorship and Credit
@@ -96,3 +272,6 @@ Good Scientific Practice ensures research integrity and the advancement of knowl
# Further Reading
* [European Code of Conduct for Research Integrity](https://allea.org/wp-content/uploads/2023/06/European-Code-of-Conduct-Revised-Edition-2023.pdf)
* [DFG Guidelines for Safeguarding Good Research Practice. Code of Conduct](https://zenodo.org/records/6472827)
# Slide example
lectures/good-practice/static/numpy_api_example.png

331 KiB
