Getting started
A web-based solution that hooks into the European Nucleotide Archive (ENA).
User Guide
Users interested in accessing the data can query the OAI-PMH compliant API provided by ena2pansimple. The API endpoint is accessible at https://ena2pansimple.gfbio.org/oai/?verb=Identify
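As a minimal sketch of how the endpoint above could be queried from a script, the following Python snippet builds an OAI-PMH request URL and extracts a few fields from an `Identify` response. The helper names (`build_request_url`, `fetch`, `parse_identify`) are illustrative and not part of ena2pansimple; the element names come from the OAI-PMH 2.0 specification.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode
from urllib.request import urlopen

# Base URL taken from the endpoint given in this guide.
BASE_URL = "https://ena2pansimple.gfbio.org/oai/"
# XML namespace used by every OAI-PMH 2.0 response.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def build_request_url(verb, **params):
    """Return the request URL for an OAI-PMH verb plus optional arguments."""
    query = {"verb": verb}
    query.update({key: value for key, value in params.items() if value is not None})
    return BASE_URL + "?" + urlencode(query)


def fetch(url, timeout=30):
    """Download a response body as text (hypothetical helper, needs network access)."""
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


def parse_identify(xml_text):
    """Extract basic repository information from an Identify response."""
    root = ET.fromstring(xml_text)
    identify = root.find("oai:Identify", OAI_NS)
    if identify is None:
        raise ValueError("response does not contain an Identify element")
    return {
        "repositoryName": identify.findtext("oai:repositoryName", namespaces=OAI_NS),
        "baseURL": identify.findtext("oai:baseURL", namespaces=OAI_NS),
        "protocolVersion": identify.findtext("oai:protocolVersion", namespaces=OAI_NS),
    }
```

With network access, `parse_identify(fetch(build_request_url("Identify")))` should return a small dictionary describing the repository. The actual values depend on the live service and are not shown here.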
ena2pansimple offers straightforward examples of how to query the endpoint. These examples can be found at https://ena2pansimple.gfbio.org/about/
For more comprehensive information on how to use the endpoint, users are encouraged to consult the official OAI-PMH documentation, available at http://www.openarchives.org/OAI/openarchivesprotocol.html. It describes the protocol in depth, including its architecture, operations, and the types of requests that can be made, and is the reference for users who want to understand how OAI-PMH works and how to use it for data harvesting.
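To illustrate the kind of harvesting the protocol is designed for, here is a hedged sketch of a `ListRecords` loop that follows resumption tokens until the list is complete. It assumes the `oai_dc` metadata prefix, which OAI-PMH requires every repository to support; the formats ena2pansimple actually offers can be checked with the `ListMetadataFormats` verb. The function names and the injectable `fetch_page` parameter are illustrative choices, not part of the tool.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "https://ena2pansimple.gfbio.org/oai/"
# OAI-PMH 2.0 namespace in the {uri}tag form that ElementTree expects.
OAI = "{http://www.openarchives.org/OAI/2.0/}"


def http_fetch(params):
    """Send one OAI-PMH request and return the XML body (needs network access)."""
    with urlopen(BASE_URL + "?" + urlencode(params), timeout=60) as response:
        return response.read().decode("utf-8")


def iter_record_headers(fetch_page=http_fetch, metadata_prefix="oai_dc"):
    """Yield (identifier, datestamp) for every record, page by page."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    while True:
        root = ET.fromstring(fetch_page(params))
        error = root.find(OAI + "error")
        if error is not None:
            raise RuntimeError(f"OAI-PMH error {error.get('code')}: {error.text}")
        for header in root.iter(OAI + "header"):
            yield (
                header.findtext(OAI + "identifier"),
                header.findtext(OAI + "datestamp"),
            )
        # A missing or empty resumptionToken marks the end of the list.
        token = root.findtext(f"{OAI}ListRecords/{OAI}resumptionToken")
        token = (token or "").strip()
        if not token:
            break
        # resumptionToken is an exclusive argument: send it with the verb only.
        params = {"verb": "ListRecords", "resumptionToken": token}
```

Passing the fetch function as a parameter keeps the paging logic separate from the HTTP call, so the loop can be exercised with canned XML before it is pointed at the live endpoint.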
Developer Guide
The ena2pansimple tool is developed within a Dockerized Django environment. Encapsulating the application and its dependencies in this way means the tool can be easily deployed, run, and extended by developers and users alike.
Whether you want to try out ena2pansimple or develop it further, the repository provides instructions and resources for getting the tool running locally on your machine.
For detailed setup instructions and the source code, visit the GitLab repository at https://gitlab-pe.gwdg.de/gfbio/ena2pansimple. There you will find the documentation needed to get ena2pansimple up and running in your local environment, including steps for Docker installation, setting up the Django environment, and configuring the application to connect to the ENA repository for data retrieval.
RDC Integration
ena2pansimple plays a pivotal role in the GFBio (German Federation for Biological Data) ecosystem by providing structured access to the metadata of submissions to the European Nucleotide Archive (ENA). As a central component of the GFBio harvesting, indexing, and search infrastructure, it bridges the gap between vast biological data in the ENA repository and potential end-users, facilitating efficient data discovery and utilization.
The tool serves as the primary interface for harvesting metadata from the ENA, transforming it into formats that are compatible with the broader GFBio infrastructure. Once the data is harvested, ena2pansimple integrates it into the GFBio Elasticsearch index. This process not only standardizes the data but also makes it readily searchable, significantly enhancing the accessibility and usability of the information contained within the ENA.
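The harvest, transform, and index flow described above can be sketched roughly as follows. This is not the GFBio implementation: the index name `example-index`, the document field names, and the use of Dublin Core titles are all assumptions made for illustration. Only the newline-delimited action/document layout follows the format of the Elasticsearch bulk API.

```python
import json
import xml.etree.ElementTree as ET

NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def record_to_document(record):
    """Map one OAI-PMH record element to a flat dict (hypothetical field names)."""
    header = record.find("oai:header", NS)
    return {
        "id": header.findtext("oai:identifier", namespaces=NS),
        "datestamp": header.findtext("oai:datestamp", namespaces=NS),
        "titles": [el.text for el in record.iterfind(".//dc:title", NS)],
    }


def to_bulk_payload(xml_text, index_name="example-index"):
    """Build a bulk-API body: one action line plus one document line per record."""
    root = ET.fromstring(xml_text)
    lines = []
    for record in root.iterfind(".//oai:record", NS):
        header = record.find("oai:header", NS)
        if header is None or header.get("status") == "deleted":
            continue  # deleted records carry no metadata to index
        doc = record_to_document(record)
        lines.append(json.dumps({"index": {"_index": index_name, "_id": doc["id"]}}))
        lines.append(json.dumps(doc))
    # The bulk format requires every line, including the last, to end in a newline.
    return "".join(line + "\n" for line in lines)
```

The resulting string could be sent to an Elasticsearch `_bulk` endpoint with the `application/x-ndjson` content type. Using the OAI identifier as the document `_id` is one possible design choice that makes repeated harvests overwrite existing entries rather than duplicate them.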
The integration of harvested data into the GFBio Elasticsearch index is a critical step that enables the GFBio Search API to provide access to this metadata. This means that all components within the Research Data Commons (RDC) can leverage this consolidated, searchable pool of metadata, facilitating interdisciplinary research and collaboration.