About the Data Center Network
The ten Data Centers listed below are infrastructure partners in the GFBio broker network. They are departments of science informatics and data infrastructure at recognized science institutions, devoted to managing, storing, archiving and publishing various types of bio- and geodiversity data. Data submitted through GFBio are transmitted to a matching GFBio Data Center, selected on the basis of the profiles below, and curated by its data curators.
The core group of ten GFBio data centers has agreed on a number of consensus documents, tools, data pipelines, standards and technical formats for interoperability, as published in the GFBio Wiki. Each data center has its own profile (see below) and provides a portfolio of services in accordance with the core tasks of the respective organisation. Data are available via various facets of the GFBio search portal and, where appropriate, can be visualised with the VAT tool/GeoEngine.
Wondering if your data are suitable for archival & publication in one of the data centers?
Get to know the data centers below!
Data Curation and Shared Archival
Data curation
Submitted data are checked for quality and scope. If they are suitable for one of the data centers, the curators begin harmonising the data set. This entails checking for mandatory fields, converting values into standardised formats (such as date, time or coordinates) and using standardised vocabulary (metadata standards) to describe the data. The curation process takes effort, and the curator relies on continuous communication with the data submitter. After curation, the Findability and Reusability of the data are greatly enhanced, and the data are ready to be archived and subsequently published. Data can be placed under a moratorium to be published at a later date; however, extension of the moratorium period is limited and varies according to each data center's policy.
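The harmonisation steps described above (checking mandatory fields, converting dates and coordinates into standardised formats) can be illustrated with a minimal Python sketch. This is not the tooling used by any GFBio data center: the field names, the accepted date layouts and all function names are assumptions chosen purely for illustration.

```python
from datetime import datetime

# Hypothetical mandatory fields; each data center defines its own set.
MANDATORY_FIELDS = ("scientific_name", "collection_date", "latitude", "longitude")

# Hypothetical input date layouts accepted by this sketch.
DATE_FORMATS = ("%Y-%m-%d", "%d.%m.%Y", "%d/%m/%Y")


def missing_fields(record):
    """Return the mandatory fields that are absent or empty in a record."""
    missing = []
    for field in MANDATORY_FIELDS:
        value = record.get(field)
        if value is None or not str(value).strip():
            missing.append(field)
    return missing


def to_iso_date(value):
    """Convert a date string in one of DATE_FORMATS to ISO 8601 (YYYY-MM-DD)."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(value.strip(), fmt).date().isoformat()
        except ValueError:
            continue  # try the next known layout
    raise ValueError(f"unrecognised date: {value!r}")


def to_decimal_degrees(degrees, minutes, seconds, hemisphere):
    """Convert degrees/minutes/seconds plus hemisphere letter to decimal degrees."""
    decimal = degrees + minutes / 60 + seconds / 3600
    # Southern and western hemispheres are negative by convention.
    return -decimal if hemisphere.upper() in ("S", "W") else decimal
```

A curator-side script might call `missing_fields` first and return the list to the submitter, which reflects the continuous communication that the curation process relies on.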
Shared archival
Biodiversity is influenced by many factors. Data generated to answer questions about biodiversity are therefore often heterogeneous and address different aspects of biodiversity, such as the occurrence of species in a certain environment. To archive the data in the most suitable repository, heterogeneous data sets are divided, archived in the respective data centers and interlinked via their persistent identifiers.
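The idea of dividing a heterogeneous data set and interlinking the parts can be sketched as follows. This is only an illustration under stated assumptions: the `data_type` key, the function names and the placeholder identifiers are hypothetical, and real persistent identifiers are assigned by the archiving data centers.

```python
def split_by_data_type(records):
    """Group records by their (hypothetical) 'data_type' key."""
    parts = {}
    for record in records:
        parts.setdefault(record["data_type"], []).append(record)
    return parts


def interlink(parts, pids):
    """Describe each part with its own identifier and those of its sibling parts.

    `pids` maps a data type to the identifier assigned to that part;
    the values used here are placeholders, not real identifiers.
    """
    return {
        data_type: {
            "pid": pids[data_type],
            "records": rows,
            "related_pids": [pids[other] for other in parts if other != data_type],
        }
        for data_type, rows in parts.items()
    }


records = [
    {"data_type": "occurrence", "id": 1},
    {"data_type": "sequence", "id": 2},
    {"data_type": "occurrence", "id": 3},
]
parts = split_by_data_type(records)
linked = interlink(
    parts,
    {"occurrence": "pid:example/part-a", "sequence": "pid:example/part-b"},
)
```

Each part carries the identifiers of the other parts, so a reader who finds one part in one repository can follow the links to the rest of the original data set.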
Data Centers
Data Centers specializing in Plant, Nucleotide and Environmental Data
e!DAL-PGP archives, curates and publishes cross-domain, plant-related research data that exceed the limits of existing repositories because of their size or scope. This includes, for example:
- image collections from plant phenotyping and microscopy
- unfinished genomes
- genotyping data
- visualizations of morphological plant models
- data from mass spectrometry
- software & documents
PANGAEA archives, curates and publishes multidisciplinary (e.g. geochemical, biological observational and occurrence) data from marine and terrestrial environments. Curation includes user support, definition of data set granularity, quality control, archival format transformation, metadata description and control. Supported data types include tabular data as well as binary data, e.g. multimedia.
Data Centers at Natural Science Collections
BGBM – Botanic Garden and Botanical Museum Berlin, Freie Universität Berlin
DSMZ – German Collection of Microorganisms
Learn more about the data center BGBM
Learn more about the data center DSMZ
MfN – Leibniz Institute for Research on Evolution and Biodiversity, Berlin
The Leibniz Institute for the Analysis of Biodiversity Change is a foundation under public law. The Biodiversity Data Center as part of the LIB is aimed at hosting, archiving, publishing and distributing data from biodiversity research and zoological collections.
The Biodiversity Data Center handles and curates data on:
- The specimens of the institute's collection, including provenance, distribution, habitat, and taxonomic data.
- Observations, recordings and measurements from field research, monitoring and ecological inventories.
- Morphological measurements and descriptions of specimens.
- Genetic barcode libraries.
- Genetic and molecular research data associated with specimens or environmental samples.
Learn more about the data center MfN
SGN – Senckenberg Gesellschaft für Naturforschung
SNSB – Staatliche Naturwissenschaftliche Sammlungen Bayerns
Learn more about the data center SGN
The Staatliche Naturwissenschaftliche Sammlungen Bayerns (SNSB) is a natural history collection facility in Bavaria. The SNSB IT Center as part of the SNSB is its institutional repository primarily for scientific bio- and geodiversity data of the natural history collections belonging to the SNSB.
The mission comprises research activities in the field of biodiversity informatics and data science. Software is mainly designed and set up following the concepts of the Diversity Workbench (DWB). DWB software tools are registered in bio.tools, a service of ELIXIR Europe.
As a GFBio Data Center, the SNSB IT Center supports scientists and institutions by offering DWB support and workshops. Additional services are provided on a case-by-case basis. They might include sustainable DWB management of data from their generation through to persistent storage, archiving and publication of approved, quality-controlled, standardized and well-structured (i.e. FAIR):
- occurrence and provenance data (from specimens, biological samples and observations)
- taxonomic and checklist data
- trait data (e.g., morphological, anatomical, chemical and molecular descriptions)
Learn more about the data center SNSB
SMNS – State Museum of Natural History Stuttgart
Learn more about the data center SMNS