Major Types of Biological Data

Within GFBio we distinguish five major types of biological data. They are used in the "Service Description" of the individual Collection Data Centers as well as in the technical documentation of processing tools.

Type 1: Biodiversity and occurrence data

 * These are data from the classical collection and alpha-diversity research domain, i.e. digital objects with taxon name(s), georeference information (e.g. locality), date and, often, referenced resources such as multimedia objects.
 * We distinguish between
   * Type 1a: Collection Data (with reference to a physical object)
   * Type 1b: Observation Data (without reference to a physical object)
 * Used standards
   * ABCD (Access to Biological Collection Data) and extensions
   * DwC (Darwin Core) and extensions
   * DC (Dublin Core) as included in ABCD and DwC for basic bibliographic information
 * Used identifiers
   * primary identifier: biological (digital) object (digital specimen or observation)
   * main secondary information: geo-information and time, related (multimedia) resources
 * Example packages
   * Herbarium Berolinense Asteraceae (digital specimens), also accessible via GFBio VAT (to be confirmed)
   * IBF Monitoring of Orthoptera (digital observations), also accessible via GFBio VAT
   * Media and additional measurements belonging to the description of Cophyla fortuna (digital specimens), also accessible via GFBio VAT (to be confirmed)
 * Notes
   * The time investment for individual scientific data curation by data providers and GFBio data managers before and during data transformation varies.
   * Editorial comment: please do not place this in the introductory definition of the data type, because it is specific (marginal) submission information.
   * Reply: agreed. This information can be discussed in TG2 and deleted again. It refers to the overall data pipelines and not only to submission, and was therefore renamed to "Notes".
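
The distinction between Type 1a and Type 1b can be illustrated with a minimal Python sketch. It assumes that records are flat dictionaries keyed by Darwin Core terms and that the `basisOfRecord` value indicates whether a physical object exists; this mapping, the function name `classify_type1` and all record values are hypothetical and are not part of the GFBio definition.

```python
# Darwin Core basisOfRecord values that are ASSUMED here to imply a
# physical object (Type 1a). This mapping is an illustration only.
PHYSICAL_BASES = {
    "PreservedSpecimen",
    "LivingSpecimen",
    "FossilSpecimen",
    "MaterialSample",
}


def classify_type1(record: dict) -> str:
    """Return '1a' for collection data and '1b' for observation data."""
    return "1a" if record.get("basisOfRecord") in PHYSICAL_BASES else "1b"


# Hypothetical records: primary identifier plus the main secondary
# information (geo-information, time, related multimedia resources).
specimen = {
    "occurrenceID": "urn:example:specimen:0001",
    "basisOfRecord": "PreservedSpecimen",
    "scientificName": "Bellis perennis L.",
    "decimalLatitude": 52.45,
    "decimalLongitude": 13.30,
    "eventDate": "1998-05-14",
    "associatedMedia": "https://example.org/media/0001.jpg",
}
observation = {
    "occurrenceID": "urn:example:observation:0001",
    "basisOfRecord": "HumanObservation",
    "scientificName": "Bellis perennis L.",
    "decimalLatitude": 52.45,
    "decimalLongitude": 13.30,
    "eventDate": "2017-06-02",
}

print(classify_type1(specimen), classify_type1(observation))
```

In practice the decision may depend on more than one field, so the rule above should be read as a starting point rather than as the criterion used by the data centers.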

Type 2: Taxon Data

 * These are taxon-related data (e.g. in a catalogue, checklist or so-called red list).
 * Used standards
   * ABCD (Access to Biological Collection Data) and extensions
   * DwC (Darwin Core) and extensions
   * DC (Dublin Core) as included in ABCD and DwC for basic bibliographic information
 * Used identifiers
   * primary identifier: class name (taxon), e.g. as defined by the nomenclatural rules of the three International Codes of Biological Nomenclature
     * International Code of Nomenclature for algae, fungi, and plants
     * International Code of Zoological Nomenclature
     * International Code of Nomenclature of Prokaryotes
   * main secondary information: taxonomic classifications and concepts, synonymy, vernacular names, geographic and conservation status information etc.
 * Example packages
   * Taxon list of vascular plants from Bavaria, Germany, compiled in the context of the BFL project, also accessible via the GFBio terminology service and as taxon backbone in the GFBio portal
   * Taxon list of animals with German names (worldwide) compiled at the SMNS, also accessible via the GFBio terminology service and as taxon backbone in the GFBio portal
 * Notes
   * The time investment for individual scientific data curation by data providers and GFBio data managers before and during data transformation varies.
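
How a taxon name acts as primary identifier, with synonymy and vernacular names as secondary information, can be sketched as follows. The example assumes a flat list of dictionaries keyed by Darwin Core taxon terms; the identifiers `t1` and `t2`, the helper `accepted_name` and the record layout are hypothetical and do not describe a GFBio service.

```python
# Hypothetical taxon list using Darwin Core taxon terms.
taxa = [
    {
        "taxonID": "t1",
        "scientificName": "Solanum lycopersicum L.",
        "taxonRank": "species",
        "taxonomicStatus": "accepted",
        "acceptedNameUsageID": "t1",
        "nomenclaturalCode": "ICN",
        "vernacularName": "Tomate",
    },
    {
        "taxonID": "t2",
        "scientificName": "Lycopersicon esculentum Mill.",
        "taxonRank": "species",
        "taxonomicStatus": "synonym",
        "acceptedNameUsageID": "t1",
        "nomenclaturalCode": "ICN",
    },
]


def accepted_name(taxon_id: str, records: list) -> str:
    """Resolve a taxon record to the scientific name of its accepted taxon."""
    by_id = {r["taxonID"]: r for r in records}
    record = by_id[taxon_id]
    # Fall back to the record itself if no accepted usage is given.
    accepted = by_id[record.get("acceptedNameUsageID") or taxon_id]
    return accepted["scientificName"]


print(accepted_name("t2", taxa))
```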

Type 3: Environmental Biological and Ecological Data

 * These are environmental biological and ecological study data, including functional and phylogenetic trait data and other kinds of analysis data.
 * Used standards
   * EML (Ecological Metadata Language)
   * DELTA (Description Language for Taxonomy, for trait data)
   * SDD (Structured Descriptive Data, for trait data)
   * GML (Geography Markup Language) and ISO 19139 metadata
 * Used identifiers
   * a)
     * primary identifier: biological class concept (e.g. OTU or OFU)
     * main secondary information: trait and environmental (analysis, measurement, transformation, translocation) information
   * b)
     * primary identifier: environmental and ecological study item and event
     * main secondary information: biological and ecological information, measurements and description of the environment
 * Example packages
   * SDD example with EML for basic bibliographic information (coming soon)
   * EML example with table data and bibliographic information in EML (coming soon)
   * Ferger et al. (2018) Various investigations to analyze the effects on species richness of birds during the KiLi (Kilimanjaro) Project
 * Notes
   * The time investment for individual scientific data curation before and during the transformation of (matrix) data into a highly structured, standard-schema-compliant format at data item level might be high. The data management process therefore has to be agreed between the data provider and the GFBio data curator before starting (see DMPs).

Type 4: Non-Molecular Analysis Data

 * These are non-molecular analysis data (data sets and/or data packages) in their original data file format (often RAW format).
 * Used standards
   * EML (Ecological Metadata Language) for basic bibliographic information
   * DC (with Pansimple XSD) for basic bibliographic information
 * Used identifiers
   * primary identifier: as provided by data producer
   * main secondary information: as provided by data producer
 * Example packages
   * from Jena Data Center?
   * from PANGAEA?
 * Notes
   * This type of data is accepted provided that it is well documented, comes with a core set of standard-compliant metadata and is appropriate for long-term archiving.
   * The time investment for individual scientific data curation by data providers and GFBio data managers before and during data transformation might be limited.

Type 5: Molecular Sequence Data

 * These are molecular sequence data including MIxS-compliant metadata.
 * Used standards
   * MIxS (Minimum Information about any (x) Sequence)
 * Used identifiers
   * primary identifier: molecular sample accession
   * main secondary information: geo-information and time
 * Example package
   * PRJEB26997
 * Notes
   * The time investment for individual scientific data curation by data providers and GFBio data managers before and during data transformation might be limited.
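
The five types described above can be summarised as a small lookup structure, for example when a processing tool has to state which data types it supports. The following Python sketch only restates the standards and primary identifiers listed on this page; the names `DataType`, `GFBIO_DATA_TYPES` and `types_using` are hypothetical and not part of any GFBio software.

```python
from typing import NamedTuple


class DataType(NamedTuple):
    name: str
    standards: tuple
    primary_identifier: str


# Restates the lists above; extensions of the standards are not modelled.
GFBIO_DATA_TYPES = {
    1: DataType("Biodiversity and occurrence data",
                ("ABCD", "DwC", "DC"),
                "biological (digital) object"),
    2: DataType("Taxon data",
                ("ABCD", "DwC", "DC"),
                "class name (taxon)"),
    3: DataType("Environmental biological and ecological data",
                ("EML", "DELTA", "SDD", "GML", "ISO 19139"),
                "biological class concept, or study item and event"),
    4: DataType("Non-molecular analysis data",
                ("EML", "DC"),
                "as provided by data producer"),
    5: DataType("Molecular sequence data",
                ("MIxS",),
                "molecular sample accession"),
}


def types_using(standard: str) -> list:
    """Return the numbers of all data types that list the given standard."""
    return [number for number, data_type in GFBIO_DATA_TYPES.items()
            if standard in data_type.standards]


print(types_using("EML"))
```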

For more details see also
 * Data exchange standards, protocols and formats relevant for the collection data domain within the GFBio network
 * Technical documentation of GFBio publication of type 1 data