Looking for datasets to use in teaching and research data management (RDM) training? Check out the list below!

Questions regarding other training material? Contact our Helpdesk!

This is a list of well-documented training datasets for use in research data management training, covering different data types and different aspects of research data management.

Training datasets

Training datasets are essential for the effective teaching and training of young researchers. They form the basis for teaching data skills and analysis methods. Here, we are referring to datasets used in tutorials on research data management, as demo datasets for tools or methods, or as examples of challenges in data handling. This definition does not cover datasets used to train AI applications.

To be labelled as a training dataset, a dataset has to:

  • be FAIR (Findable, Accessible, Interoperable, Reusable).
  • be freely available, with an appropriate license and open data format.
  • be of reasonable size.
  • be citable.
  • enable easy-to-understand but interesting questions to be addressed.
  • be sufficiently documented.
  • be either “perfect” or contain didactic errors.

For an overview, see the poster on what training datasets are in the context of NFDI4Biodiversity (in German): Signer, J., Schlägel, U., Tschink, D., & Röder, J. (2024). Trainingsdatensätze. Zenodo. https://doi.org/10.5281/zenodo.13805722

Training datasets help to illustrate all stages of the data life cycle (DLC), e.g.

  • Metadata standards to describe and structure (newly collected) data
  • (Reproducible) processing of data
  • (Reproducible) data analysis
  • Workflows to archive, share and publish data for personal and/or public re-use

Figure 1: Data life cycle. Source: RDMkit: The ELIXIR Research Data Management toolkit for Life Sciences, https://rdmkit.elixir-europe.org

Biological Datasets

Nature conservation

Biology

Forestry

Genetics

Taxonomy, Traits

Animal Tracking

Time Series

Spatial Data

Environmental Datasets

Land cover

Other Collections




Do you have questions or feedback, or do you need help?

Contact our Helpdesk for direct support.


