Wondering if your data are suitable for archival & publication in one of the data centers? Get to know the Data Centers below!
Ready to archive your data? Submit your data via the GFBio submission service.
Questions regarding your data? Contact our Helpdesk!
About the Data Center Network
The ten Data Centers listed below are infrastructure partners in the GFBio broker network. They include seven Data Centers located at natural history collections throughout Germany. They are departments of science informatics and data infrastructures at recognized science institutions, devoted to managing, storing, archiving and publishing various types of bio- and geodiversity data. Data submitted through GFBio are transmitted to a matching GFBio Data Center, selected on the basis of the profiles below, and curated by its data curators. Data are available via various facets of the GFBio search portal.
The core group of ten GFBio data centers has agreed on a number of consensus documents, tools, data pipelines, standards and technical formats for interoperability.
Data Submissions to the Data Centers
Data curation
Submitted data will be checked for quality and scope. If the data are suitable for one of the data centers, the curators begin to harmonise the data set. This entails checking for mandatory fields, converting values into standardised formats (such as date, time or coordinates) and using standardised vocabulary (metadata standards) to describe the data. The curation process takes effort, and the curator relies on continuous communication with the data submitter. After curation, the Findability and Reusability of the data are greatly enhanced and the data are ready to be archived and subsequently published. Data can be placed under a moratorium to be published at a later date; however, extension of the moratorium period is limited and varies according to the data center's policy.
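The harmonisation steps named above can be illustrated with a minimal Python sketch. It is not the curators' actual tooling: the mandatory field names, the input date format and the helper functions are assumptions made for illustration, and each data center defines its own requirements.

```python
from datetime import datetime

# Hypothetical mandatory fields; each data center defines its own list.
MANDATORY_FIELDS = ("scientific_name", "collection_date", "latitude", "longitude")


def missing_fields(record):
    """Return the mandatory fields that are absent or empty in a record."""
    missing = []
    for field in MANDATORY_FIELDS:
        value = record.get(field)
        if value is None or not str(value).strip():
            missing.append(field)
    return missing


def normalise_date(value, input_format="%d.%m.%Y"):
    """Convert a date string in the assumed input format to ISO 8601 (YYYY-MM-DD)."""
    return datetime.strptime(value, input_format).date().isoformat()


def dms_to_decimal(degrees, minutes, seconds, hemisphere):
    """Convert degrees, minutes, seconds and a hemisphere letter to decimal degrees."""
    decimal = degrees + minutes / 60 + seconds / 3600
    # Southern and western coordinates are negative in decimal notation.
    return -decimal if hemisphere.upper() in ("S", "W") else decimal


# Example record with an empty latitude and a non-ISO date.
record = {
    "scientific_name": "Quercus robur",
    "collection_date": "03.05.2021",
    "latitude": "",
    "longitude": "13.30",
}

print(missing_fields(record))
print(normalise_date(record["collection_date"]))
print(dms_to_decimal(52, 30, 0, "N"))
```

A check of this kind only flags problems; resolving them still depends on the communication between curator and submitter.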
Shared archival
Biodiversity is influenced by many factors. Data generated to answer questions about biodiversity are therefore often heterogeneous and address different aspects of biodiversity, such as the occurrence of species in a certain environment. To archive the data in the most suitable repository, heterogeneous data sets are divided, archived in the respective data centers and interlinked via their persistent identifiers.
- e!DAL-PGP – Plant Genomics and Phenomics Research Data Repository
- ENA – European Nucleotide Archive
- LIB – Leibniz-Institut zur Analyse des Biodiversitätswandels
- PANGAEA – Data Publisher for Earth & Environmental Science
- SNSB – Staatliche Naturwissenschaftliche Sammlungen Bayerns – SNSB IT Center, München
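The idea of shared archival, where one heterogeneous data set is divided, archived in several data centers and interlinked via persistent identifiers, can be sketched as follows. This is an illustrative assumption, not the GFBio implementation: the routing table, the data type labels and the placeholder identifiers are hypothetical, and in practice identifiers are issued by the archiving data center.

```python
# Hypothetical routing table, loosely based on the data center profiles on this page.
ROUTING = {
    "nucleotide_sequence": "ENA",
    "environmental_measurement": "PANGAEA",
    "plant_phenotyping_images": "e!DAL-PGP",
}


def split_and_link(parts):
    """Assign each part to a data center and cross-reference its sibling parts.

    `parts` is a list of dicts with the keys "data_type" and "pid". The
    identifier is assumed to have been issued by the archiving data center.
    """
    archived = []
    for part in parts:
        archived.append({
            "data_center": ROUTING[part["data_type"]],
            "pid": part["pid"],
            # Every part points to the identifiers of all other parts.
            "related": [p["pid"] for p in parts if p["pid"] != part["pid"]],
        })
    return archived


# Placeholder identifiers only; these are not real accession numbers or DOIs.
parts = [
    {"data_type": "nucleotide_sequence", "pid": "placeholder-accession-1"},
    {"data_type": "environmental_measurement", "pid": "placeholder-doi-1"},
]

linked = split_and_link(parts)
for entry in linked:
    print(entry)
```

Because every part carries the identifiers of its siblings, a reader who finds one part in one data center can follow the links to the rest of the data set.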
Data Centers
Data Centers specializing in Plant, Nucleotide and Environmental Data
e!DAL-PGP archives, curates and publishes cross-domain, plant-related research data that do not fit into existing repositories because of their size or scope. This includes, for example:
- image collections from plant phenotyping and microscopy
- unfinished genomes
- genotyping data
- visualizations of morphological plant models
- data from mass spectrometry
- software & documents
The European Nucleotide Archive archives, curates and publishes nucleotide sequence data and associated metadata.
The GFBio Brokerage Service provides timely, standards-compliant deposition of all molecular sequence data into the public repositories of the INSDC. The key components of the service include: (a) support for metadata standardization, curation and quality control, (b) negotiation of embargo periods, including communication with INSDC, (c) parallel submission of environmental metadata to PANGAEA and other GFBio data centers, and (d) cross-linking of sequence data with environmental (PANGAEA) or other contextual or related data via accession number and DOI.
PANGAEA archives, curates and publishes multidisciplinary (e.g. geochemical, biological observational and occurrence) data from marine and terrestrial environments. Curation includes user support, definition of data set granularity, quality control, archival format transformation, metadata description and control. Supported data types include tabular data as well as binary data, e.g. multimedia.
Data Centers at Natural Science Collections
BGBM – Botanic Garden and Botanical Museum Berlin, Freie Universität Berlin
Learn more about the data center BGBM
DSMZ – German Collection of Microorganisms
Learn more about the data center DSMZ
MfN – Leibniz Institute for Research on Evolution and Biodiversity, Berlin
The Leibniz Institute for the Analysis of Biodiversity Change (LIB) is a foundation under public law. The Biodiversity Data Center, as part of the LIB, hosts, archives, publishes and distributes data from biodiversity research and zoological collections.
The Biodiversity Data Center handles and curates data on:
- The specimens of the institute's collection, including provenance, distribution, habitat, and taxonomic data.
- Observations, recordings and measurements from field research, monitoring and ecological inventories.
- Morphological measurements and descriptions of specimens.
- Genetic barcode libraries.
- Genetic and molecular research data associated with specimens or environmental samples.
Learn more about the data center MfN
SGN – Senckenberg Gesellschaft für Naturforschung
Learn more about the data center SGN
The Staatliche Naturwissenschaftliche Sammlungen Bayerns (SNSB) is a natural history collection facility in Bavaria. The SNSB IT Center, as part of the SNSB, is its institutional repository, primarily for scientific bio- and geodiversity data from the natural history collections belonging to the SNSB.
The mission comprises research activities in the field of biodiversity informatics and data science. Software is mainly designed and set up following the concepts of the Diversity Workbench (DWB). DWB software tools are registered in bio.tools, a service of ELIXIR Europe.
The SNSB IT Center, as a GFBio Data Center, supports scientists and institutions by offering DWB support and workshops. Additional services are provided on a case-by-case basis. They might include sustainable DWB management of data from their generation up to persistent storage, archiving and publication of approved, quality-controlled, standardized and well-structured (i.e. FAIR):
- occurrence and provenance data (from specimens, biological samples and observations)
- taxonomic and checklist data
- trait data (e.g., morphological, anatomical, chemical and molecular descriptions)
SMNS – State Museum of Natural History Stuttgart
Learn more about the data center SMNS