Notwithstanding considerable differences in specific characteristics, formats and quantities, there are three basic categories of research data:
- raw data: not yet processed original data, e.g. sample content, measurement data or photos;
- primary data: processed data, e.g. converted or corrected raw data, assembled and commented data;
- secondary data: data sampled with a different intention and re-used in another context.
Metadata are often mentioned as another type of data, although the distinction between metadata and data depends on the viewpoint and on how the data are used. A useful definition is that metadata describe the content, context, quality, structure, provenance and accessibility of data objects (Michener et al. 1997).
In GFBio, five major types of biological data are distinguished for use in the "Service Description" of the individual Collection Data Centers.
- Type 1 data are biodiversity and occurrence data, which are handled by the ABCD and DwC standards and their extensions (see Data exchange standards, protocols and formats relevant for the collection data domain within the GFBio network and Technical documentation of GFBio publication of type 1 data). Type 1a are collection data; type 1b are observation data without a collection object. (primary identifier=biological (digital) object, with or without material, with geo-information and time as main secondary information)
- Type 2 data are taxonomic (checklist) data, which are handled via the ABCD and DwC standards (primary identifier=taxon name according to the rules of the three International Codes of Biological Nomenclature)
- Type 3 data are environmental biological and ecological data, which are transferred into a highly structured format at data item level (e.g., single measurement) and associated with metadata in e.g. EML or ISO 19139. This type includes functional and phylogenetic trait data; the latter are the subject of the DELTA or SDD standards. (primary identifier=biological concept, e.g. OTU or OFU, with environmental (analysis, measurement) information as main secondary information; or primary identifier=environmental event, with biological information as main secondary information)
- Type 4 data are non-molecular analysis data (data sets and/or data packages) in their original data file format (often RAW format). (These data are accepted if they are well documented, carry a core set of standard-compliant metadata and are appropriate for long-term archiving without further data management.)
- Type 5 data are molecular sequence data, including MIxS-compliant metadata (primary identifier=molecular sequence with geo-information and time as main secondary information)
Source: GFBio
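The five types listed above can be held in a small lookup table, for example to check which types a given standard applies to. The following Python sketch is only an illustration of the list as worded here. The names `DataType`, `GFBIO_DATA_TYPES` and `types_using` are hypothetical and not part of any GFBio software, and the entries simply restate the text.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class DataType:
    """One GFBio data type as described in the list above (hypothetical structure)."""
    code: str
    description: str
    standards: Tuple[str, ...]          # standards named in the text for this type
    primary_identifier: Optional[str]   # None where the text states no identifier


# Entries restate the list above; nothing here comes from a GFBio API.
GFBIO_DATA_TYPES: Dict[str, DataType] = {
    "1": DataType("1", "biodiversity and occurrence data",
                  ("ABCD", "DwC"), "biological (digital) object"),
    "2": DataType("2", "taxonomic (checklist) data",
                  ("ABCD", "DwC"), "taxon name"),
    "3": DataType("3", "environmental biological and ecological data",
                  ("EML", "ISO 19139", "DELTA", "SDD"),
                  "biological concept or environmental event"),
    "4": DataType("4", "non-molecular analysis data in original file format",
                  (), None),
    "5": DataType("5", "molecular sequence data",
                  ("MIxS",), "molecular sequence"),
}


def types_using(standard: str) -> List[str]:
    """Return the codes of all data types for which the text names `standard`."""
    return sorted(code for code, t in GFBIO_DATA_TYPES.items()
                  if standard in t.standards)
```

For instance, `types_using("DwC")` yields the codes of types 1 and 2, the two types that the list associates with the DwC standard.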
In the arachnological context of this database, we understand research data as data from methodical studies on the ecology of spiders. This contrasts with data from accidental or casual sampling of spiders, which are interesting and valuable for faunistics and for knowledge of the occurrence and distribution of species, and are therefore collected in the Atlas of the European Arachnids.