Publication of Type 1 Data via BioCASe Data Pipelines at SGN Data Center
The SGN Data Center is one of the seven GFBio Collection Data Centers, which are core components of the GFBio Submission, Repository and Archiving Infrastructure. Data archiving and publication at SGN are mainly based on the database systems SeSam and AQUiLA, both developed in-house.
- The workflow built on these central components is described in Figure 1 and in the text below.
Export of GFBio DIPs from the SGN in-house management system
- The citation follows the GFBio citation pattern. If needed, it is edited in close collaboration with the data provider.
- Example: Janssen, R. (2016). Digitalisierung und taxonomische Überarbeitung der Unionida-Sammlung des Senckenberg Forschungsinstituts und Naturmuseums Frankfurt am Main. Senckenbergianum Frankfurt. [Dataset]. Data Publisher: Senckenberg Gesellschaft für Naturforschung – Leibniz Institute, Frankfurt. http://www.senckenberg.de/root/index.php?page_id=297
- The licenses for the data packages are ingested in SeSam/AQUiLA during the submission/ingestion process.
- The licenses for multimedia objects are stored in SeSam/AQUiLA together with the multimedia URLs. GFBio promotes CC licenses; SGN's preferred license is CC BY-SA 4.0.
- GFBio data and metadata created during submission
- The metadata generated during GFBio submission are processed via the JIRA ticket system and ingested into SeSam/AQUiLA (work in progress). Additional metadata and the original research data are imported into SeSam/AQUiLA. Additional parameters are assigned manually by the data producers in cooperation with the SGN data curator.
- GFBio IDs according to GFBio consensus documents
- All GFBio IDs, as well as other external IDs where available, are stored in SeSam/AQUiLA.
- Occurrence data according to GFBio consensus documents
- The occurrence data are stored in SeSam/AQUiLA at two levels of granularity: (a) dataset level and (b) unit level.
- Other (meta)data
- Multimedia and morphological data are stored in SeSam/AQUiLA.
- Other (meta)data recommended or mandatory for export are stored in SeSam/AQUiLA.
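The citation layout used in the example above can be sketched in a few lines of Python. This is only an illustration: the field split is inferred from that single example, the official GFBio citation pattern may define further elements, and the class and field names (`DatasetCitation`, `creators`, `publisher`, and so on) are hypothetical and do not correspond to SeSam/AQUiLA fields.

```python
from dataclasses import dataclass


@dataclass
class DatasetCitation:
    """Hypothetical container; field names are assumptions, not GFBio terms."""
    creators: str
    year: int
    title: str
    publisher: str
    url: str


def format_citation(c: DatasetCitation) -> str:
    # Layout inferred from the single example in the text:
    # Creators (Year). Title. [Dataset]. Data Publisher: Publisher. URL
    return (
        f"{c.creators} ({c.year}). {c.title}. [Dataset]. "
        f"Data Publisher: {c.publisher}. {c.url}"
    )
```

Keeping the citation as structured fields and rendering it on export would make later edits, such as those agreed with the data provider, a matter of changing one field rather than rewriting the string.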
Transformation of DIPs for SGN archiving system
- Archiving of DIPs
- All DIPs are created as zipped ABCD 2.06 XML archives using a standard, manually triggered function of the BioCASe Provider Software. Senckenberg IT backs up the stored DIPs daily, in accordance with the "Technical documentation of long-term archiving solutions at the GFBio collection data centers".
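At SGN the archives are produced by the BioCASe Provider Software itself. The following Python sketch does not reproduce that function; it only illustrates what a zipped archive of ABCD XML documents amounts to. The function name, the directory layout and the flat file naming are assumptions made for this example.

```python
import zipfile
from pathlib import Path


def zip_abcd_documents(xml_dir: Path, archive_path: Path) -> list[str]:
    """Pack every *.xml file in xml_dir into one compressed zip archive."""
    xml_files = sorted(xml_dir.glob("*.xml"))
    with zipfile.ZipFile(archive_path, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for xml_file in xml_files:
            # Store files flat, by file name only, without directory paths.
            zf.write(xml_file, arcname=xml_file.name)
    # Return the archived file names so the caller can log or verify them.
    return [f.name for f in xml_files]
```

Returning the list of archived names gives a simple handle for checking that the archive holds exactly the expected documents before it is handed over to the backup.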
Transformation of DIPs for publication in GFBio data portal and VAT tool
- using BioCASe Provider Software
- Access to the SGN data sources is provided via the BioCASe Monitor Service (BMS).
- This includes links to the BioCASe Local Query Tool and to the Consistency Check for the ABCD consensus elements as agreed in GFBio.
- Indexing/harvesting for access via GFBio Data Portal and VAT-System
- After all quality checks have been passed, the final data package is announced to the central GFBio indexing/harvesting process and can then be accessed via the GFBio Data Portal.
- For georeferenced data, the Data Portal provides an import into the VAT-System.
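The text does not state which criteria the Data Portal applies to decide whether a record counts as georeferenced. As a rough illustration only, the sketch below assumes that a unit is georeferenced when its ABCD 2.06 record carries both `LatitudeDecimal` and `LongitudeDecimal` inside a `CoordinatesLatLong` element. The function name is hypothetical and the rule is an assumption, not the portal's documented behaviour.

```python
import xml.etree.ElementTree as ET

# ABCD 2.06 namespace.
NS = {"abcd": "http://www.tdwg.org/schemas/abcd/2.06"}


def georeferenced_unit_ids(abcd_xml: str) -> list[str]:
    """Return the UnitID of every unit carrying decimal latitude and longitude."""
    root = ET.fromstring(abcd_xml)
    ids = []
    for unit in root.iter(f"{{{NS['abcd']}}}Unit"):
        lat = unit.find(".//abcd:CoordinatesLatLong/abcd:LatitudeDecimal", NS)
        lon = unit.find(".//abcd:CoordinatesLatLong/abcd:LongitudeDecimal", NS)
        # Assumed rule: both decimal coordinates must be present and non-empty.
        if lat is not None and lon is not None and lat.text and lon.text:
            ids.append(unit.findtext("abcd:UnitID", default="", namespaces=NS))
    return ids
```

A check of this kind could be run on an exported ABCD document to estimate how many units of a data package would be candidates for the VAT import.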