Overview

A Django-based web application that harvests metadata from the European Nucleotide Archive (ENA) and exposes it through an OAI-PMH endpoint (source code: https://gitlab-pe.gwdg.de/gfbio/ena2pansimple)


LOGO (work in progress)

Status: Productive

Weblink: https://ena2pansimple.gfbio.org/

Target group: data user

Keywords: indexing, harvesting, search

RDC Integration: integrated

Product owner: GfBio e.V.


Getting started

A web-based solution that hooks into the metadata API of the European Nucleotide Archive (ENA) (https://bit.ly/3pPEdv0) to harvest records. The tool includes a scheduler for regular metadata harvesting and a database to store the metadata. As a post-processing step, the tool converts the harvested items into different target formats using XSLT transformations. Currently the tool supports the formats oai-pmh and pansimple. Finally, the service provides access to the collected and transformed records through an OAI-PMH compliant API that OAI-PMH harvester clients can consume.

User Guide

Users who want to consume the data can query the OAI-PMH compliant API, which can be found at https://ena2pansimple.gfbio.org/oai/?verb=Identify. Some simple examples of how to query the endpoint are listed at https://ena2pansimple.gfbio.org/about/. More general information on how to use the endpoint can be found in the OAI-PMH documentation itself: http://www.openarchives.org/OAI/openarchivesprotocol.html
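
As a rough illustration of how a client might talk to the endpoint, the following Python sketch builds OAI-PMH request URLs and extracts record identifiers from a ListIdentifiers response. It uses only the standard library. The function names are hypothetical and not part of the tool. The metadata prefix oai_dc is assumed here because the OAI-PMH specification requires every repository to support it; the prefixes this endpoint actually offers should be checked with the ListMetadataFormats verb.

```python
from urllib.parse import urlencode
from urllib.request import urlopen
import xml.etree.ElementTree as ET

# Base URL taken from the Identify link in this guide.
BASE_URL = "https://ena2pansimple.gfbio.org/oai/"

# XML namespace defined by the OAI-PMH 2.0 specification.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def build_oai_url(verb, **params):
    """Return a request URL for the given OAI-PMH verb and arguments."""
    query = {"verb": verb}
    query.update(params)
    return BASE_URL + "?" + urlencode(query)


def fetch(url):
    """Download a response body as text (performs a network request)."""
    with urlopen(url, timeout=30) as response:
        return response.read().decode("utf-8")


def parse_list_identifiers(xml_text):
    """Return (identifiers, resumption_token) from a ListIdentifiers response.

    The token is None when the response carries no further pages.
    """
    root = ET.fromstring(xml_text)
    identifiers = [
        element.text
        for element in root.findall(".//oai:header/oai:identifier", OAI_NS)
    ]
    token_element = root.find(".//oai:resumptionToken", OAI_NS)
    token = None
    if token_element is not None and token_element.text:
        token = token_element.text
    return identifiers, token
```

A harvesting client could call fetch(build_oai_url("ListIdentifiers", metadataPrefix="oai_dc")), pass the result to parse_list_identifiers, and repeat the request with the returned resumptionToken as the only argument besides the verb until the token is None.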

Developer Guide

The tool is developed in a Dockerized Django setup. The repository contains detailed instructions on how to get it running locally, which you can use to try it out yourself and for further development (https://gitlab-pe.gwdg.de/gfbio/ena2pansimple).

References