Overview

Status: PRODUCTIVE

Weblink: https://ena2pansimple.gfbio.org/

Target group: data user

Keywords: indexing, harvesting, search

RDC Integration: integrated

Product owner: GfBio e.V.

A Django-based web application that harvests metadata from the European Nucleotide Archive and exposes it through an OAI-PMH endpoint (https://gitlab-pe.gwdg.de/gfbio/ena2pansimple)


Getting started

A web-based solution that hooks into the European Nucleotide Archive (ENA) (https://bit.ly/3pPEdv0) metadata API to harvest records. The tool includes a scheduler for regular metadata harvesting and a database to store the metadata. As a post-processing step after harvesting, the tool converts the items into different target formats using XSLT transformations. Currently the tool supports the formats oai-pmh and pansimple. Finally, the service provides access to the collected and transformed resources via an OAI-PMH compliant API, so that OAI-PMH harvester clients can consume the records.

User Guide

Users who want to consume the data can query the OAI-PMH compliant API, which can be found here: https://ena2pansimple.gfbio.org/oai/?verb=Identify. Some simple examples of how to query the endpoint are listed here: https://ena2pansimple.gfbio.org/about/. More general information on how to use the endpoint can be found in the documentation of OAI-PMH itself: http://www.openarchives.org/OAI/openarchivesprotocol.html
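The following is a minimal sketch of how a client might query the endpoint from Python using only the standard library. It is not taken from the project itself. The helper names (build_request_url, parse_list_records, harvest_identifiers) are hypothetical, and the metadata prefix to pass is an assumption that should be checked against the response of the ListMetadataFormats verb. The verbs, the resumptionToken mechanism and the XML namespace come from the OAI-PMH 2.0 specification.

```python
from urllib.parse import urlencode
from urllib.request import urlopen
import xml.etree.ElementTree as ET

# Base URL of the OAI-PMH endpoint as given on this page.
BASE_URL = "https://ena2pansimple.gfbio.org/oai/"
# XML namespace used by OAI-PMH 2.0 responses.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def build_request_url(verb, **params):
    """Build a request URL for an OAI-PMH verb (hypothetical helper)."""
    query = {"verb": verb}
    query.update({k: v for k, v in params.items() if v is not None})
    return BASE_URL + "?" + urlencode(query)


def parse_list_records(xml_text):
    """Return (record identifiers, resumption token or None) from a ListRecords response."""
    root = ET.fromstring(xml_text)
    identifiers = [
        el.text
        for el in root.findall(".//oai:record/oai:header/oai:identifier", OAI_NS)
    ]
    token_el = root.find(".//oai:resumptionToken", OAI_NS)
    # An empty or missing resumptionToken element means the list is complete.
    token = token_el.text if token_el is not None and token_el.text else None
    return identifiers, token


def harvest_identifiers(metadata_prefix):
    """Yield all record identifiers, following resumption tokens page by page."""
    url = build_request_url("ListRecords", metadataPrefix=metadata_prefix)
    while url:
        with urlopen(url) as response:
            identifiers, token = parse_list_records(response.read())
        yield from identifiers
        # Follow-up requests carry only the verb and the resumption token.
        url = build_request_url("ListRecords", resumptionToken=token) if token else None
```

Calling build_request_url("Identify") reproduces the Identify link given above. To harvest, first open build_request_url("ListMetadataFormats") to see which metadata prefixes the endpoint actually offers, then pass one of them to harvest_identifiers.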

Developer Guide

The tool is developed in a dockerized Django setup. The repository contains detailed instructions on how to get it running locally, which you can use to try it out yourself and for further development.


