Class: DataObject
An object that primarily consists of symbols that represent information. Files, records, and omics data are examples of data objects.
URI: nmdc:DataObject
classDiagram
class DataObject
click DataObject href "../DataObject"
InformationObject <|-- DataObject
click InformationObject href "../InformationObject"
DataObject : alternative_identifiers
DataObject : compression_type
DataObject : data_category
DataObject --> "0..1" DataCategoryEnum : data_category
click DataCategoryEnum href "../DataCategoryEnum"
DataObject : data_object_type
DataObject --> "0..1" FileTypeEnum : data_object_type
click FileTypeEnum href "../FileTypeEnum"
DataObject : description
DataObject : file_size_bytes
DataObject : id
DataObject : in_manifest
DataObject --> "*" Manifest : in_manifest
click Manifest href "../Manifest"
DataObject : insdc_experiment_identifiers
DataObject : md5_checksum
DataObject : name
DataObject : type
DataObject : url
DataObject : was_generated_by
DataObject --> "0..1" WorkflowExecution : was_generated_by
click WorkflowExecution href "../WorkflowExecution"
Inheritance
- NamedThing
- InformationObject
- DataObject
Slots
Name | Cardinality and Range | Description | Inheritance |
---|---|---|---|
compression_type | 0..1 String | If provided, specifies the compression type | direct |
data_category | 0..1 DataCategoryEnum | The category of the file, such as instrument data from data generation or pro... | direct |
data_object_type | 0..1 FileTypeEnum | The type of file represented by the data object | direct |
file_size_bytes | 0..1 Bytes | Size of the file in bytes | direct |
insdc_experiment_identifiers | * ExternalIdentifier | | direct |
md5_checksum | 0..1 String | MD5 checksum of file (pre-compressed) | direct |
url | 0..1 String | | direct |
was_generated_by | 0..1 WorkflowExecution or DataGeneration | | direct |
in_manifest | * Manifest | one or more combinations of other DataObjects that can be analyzed together | direct |
id | 1 Uriorcurie | A unique identifier for a thing | NamedThing |
name | 1 String | A human readable label for an entity | NamedThing |
description | 1 String | a human-readable description of a thing | NamedThing |
alternative_identifiers | * Uriorcurie | A list of alternative identifiers for the entity | NamedThing |
type | 1 Uriorcurie | the class_uri of the class that has been instantiated | NamedThing |
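As a rough illustration of the slot table above, the following Python sketch builds a DataObject-shaped dictionary and reports which slots with cardinality 1 are missing. The helper name `missing_required_slots`, the sample values, and the example URL are hypothetical; this is not the NMDC validation tooling, which works from the LinkML schema itself.

```python
from __future__ import annotations

# Slots listed with cardinality 1 in the table above.
REQUIRED_SLOTS = ("id", "name", "description", "type")


def missing_required_slots(record: dict) -> list[str]:
    """Return the required slot names that are absent or empty in `record`."""
    return [slot for slot in REQUIRED_SLOTS if not record.get(slot)]


# Hypothetical record: every value here is illustrative, not real NMDC data.
example = {
    "id": "nmdc:dobj-00-abc123",
    "name": "example_reads.fastq.gz",
    "description": "Illustrative record, not real NMDC data",
    "type": "nmdc:DataObject",
    "compression_type": "gzip",  # optional (0..1)
    "file_size_bytes": 1024,  # optional (0..1)
    "url": "https://example.org/example_reads.fastq.gz",  # optional (0..1)
}

print(missing_required_slots(example))
print(missing_required_slots({"id": "nmdc:dobj-00-abc123"}))
```

The check covers presence only. Ranges, enum membership, and identifier patterns would need the full schema.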
Usages
used by | used in | type | used |
---|---|---|---|
NucleotideSequencing | has_output | range | DataObject |
MassSpectrometry | has_output | range | DataObject |
CalibrationInformation | calibration_object | range | DataObject |
Database | data_object_set | range | DataObject |
DataGeneration | has_output | range | DataObject |
Identifier and Mapping Information
Schema Source
- from schema: https://w3id.org/nmdc/nmdc
Mappings
Mapping Type | Mapped Value |
---|---|
self | nmdc:DataObject |
native | nmdc:DataObject |
LinkML Source
Direct
name: DataObject
description: An object that primarily consists of symbols that represent information. Files,
  records, and omics data are examples of data objects.
in_subset:
- data object subset
from_schema: https://w3id.org/nmdc/nmdc
is_a: InformationObject
slots:
- compression_type
- data_category
- data_object_type
- file_size_bytes
- insdc_experiment_identifiers
- md5_checksum
- url
- was_generated_by
- in_manifest
slot_usage:
  name:
    name: name
    required: true
  description:
    name: description
    required: true
  id:
    name: id
    required: true
    structured_pattern:
      syntax: '{id_nmdc_prefix}:dobj-{id_shoulder}-{id_blade}$'
      interpolated: true
  was_generated_by:
    name: was_generated_by
    structured_pattern:
      syntax: ^{id_nmdc_prefix}:(wfmag|wfmb|wfmgan|wfmgas|wfmsa|wfmp|wfmt|wfmtan|wfmtas|wfnom|wfrbt|wfrqc)-{id_shoulder}-{id_blade}{id_version}$|^{id_nmdc_prefix}:(omprc|dgms|dgns)-{id_shoulder}-{id_blade}$
      interpolated: true
class_uri: nmdc:DataObject
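The `structured_pattern` entries above are marked `interpolated: true`, meaning each `{placeholder}` stands for a regex fragment defined elsewhere in the schema. Those definitions are not shown on this page, so the sketch below substitutes assumed fragments (`ASSUMED_SETTINGS`) purely to show how interpolation and matching could work. The helpers `interpolate` and `conforms` and the sample identifiers are hypothetical.

```python
import re

# ASSUMPTION: the real placeholder values live in the NMDC schema settings,
# which this page does not show. These fragments are stand-ins so the sketch runs.
ASSUMED_SETTINGS = {
    "id_nmdc_prefix": "nmdc",
    "id_shoulder": "[0-9][a-z]{0,6}[0-9]",
    "id_blade": "[A-Za-z0-9]+",
    "id_version": r"\.[0-9]+",
}

# Syntax strings copied from the slot_usage entries above.
ID_SYNTAX = "{id_nmdc_prefix}:dobj-{id_shoulder}-{id_blade}$"
WAS_GENERATED_BY_SYNTAX = (
    "^{id_nmdc_prefix}:(wfmag|wfmb|wfmgan|wfmgas|wfmsa|wfmp|wfmt|wfmtan|wfmtas|wfnom|wfrbt|wfrqc)"
    "-{id_shoulder}-{id_blade}{id_version}$"
    "|^{id_nmdc_prefix}:(omprc|dgms|dgns)-{id_shoulder}-{id_blade}$"
)


def interpolate(syntax, settings):
    """Replace each {name} that is a known setting; leave anything else untouched."""
    return re.sub(r"\{(\w+)\}", lambda m: settings.get(m.group(1), m.group(0)), syntax)


def conforms(syntax, value, settings=ASSUMED_SETTINGS):
    """True if `value` matches the interpolated pattern in full."""
    return re.fullmatch(interpolate(syntax, settings), value) is not None


print(interpolate(ID_SYNTAX, ASSUMED_SETTINGS))
print(conforms(ID_SYNTAX, "nmdc:dobj-00-abc123"))
print(conforms(WAS_GENERATED_BY_SYNTAX, "nmdc:wfmgan-11-xyz789.1"))
print(conforms(WAS_GENERATED_BY_SYNTAX, "nmdc:dgns-11-xyz789"))
```

Under these assumed fragments, the first alternative of the `was_generated_by` pattern requires a version suffix on workflow typecodes, while the second (`omprc`, `dgms`, `dgns`) does not accept one.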
Induced
name: DataObject
description: An object that primarily consists of symbols that represent information. Files,
  records, and omics data are examples of data objects.
in_subset:
- data object subset
from_schema: https://w3id.org/nmdc/nmdc
is_a: InformationObject
slot_usage:
  name:
    name: name
    required: true
  description:
    name: description
    required: true
  id:
    name: id
    required: true
    structured_pattern:
      syntax: '{id_nmdc_prefix}:dobj-{id_shoulder}-{id_blade}$'
      interpolated: true
  was_generated_by:
    name: was_generated_by
    structured_pattern:
      syntax: ^{id_nmdc_prefix}:(wfmag|wfmb|wfmgan|wfmgas|wfmsa|wfmp|wfmt|wfmtan|wfmtas|wfnom|wfrbt|wfrqc)-{id_shoulder}-{id_blade}{id_version}$|^{id_nmdc_prefix}:(omprc|dgms|dgns)-{id_shoulder}-{id_blade}$
      interpolated: true
attributes:
  compression_type:
    name: compression_type
    description: If provided, specifies the compression type
    todos:
    - consider setting the range to an enum
    examples:
    - value: gzip
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    alias: compression_type
    owner: DataObject
    domain_of:
    - DataObject
    range: string
  data_category:
    name: data_category
    description: The category of the file, such as instrument data from data generation
      or processed data from a workflow execution.
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    alias: data_category
    owner: DataObject
    domain_of:
    - DataObject
    range: DataCategoryEnum
  data_object_type:
    name: data_object_type
    description: The type of file represented by the data object.
    examples:
    - value: FT ICR-MS Analysis Results
    - value: GC-MS Metabolomics Results
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    alias: data_object_type
    owner: DataObject
    domain_of:
    - DataObject
    range: FileTypeEnum
  file_size_bytes:
    name: file_size_bytes
    description: Size of the file in bytes
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    alias: file_size_bytes
    owner: DataObject
    domain_of:
    - DataObject
    range: bytes
  insdc_experiment_identifiers:
    name: insdc_experiment_identifiers
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    is_a: external_database_identifiers
    mixins:
    - insdc_identifiers
    alias: insdc_experiment_identifiers
    owner: DataObject
    domain_of:
    - NucleotideSequencing
    - DataObject
    range: external_identifier
    multivalued: true
    pattern: ^insdc.sra:(E|D|S)RX[0-9]{6,}$
  md5_checksum:
    name: md5_checksum
    description: MD5 checksum of file (pre-compressed)
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    alias: md5_checksum
    owner: DataObject
    domain_of:
    - DataObject
    range: string
  url:
    name: url
    notes:
    - See issue 207 - this clashes with the mixs field
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    alias: url
    owner: DataObject
    domain_of:
    - ImageValue
    - Protocol
    - DataObject
    range: string
  was_generated_by:
    name: was_generated_by
    from_schema: https://w3id.org/nmdc/nmdc
    mappings:
    - prov:wasGeneratedBy
    rank: 1000
    alias: was_generated_by
    owner: DataObject
    domain_of:
    - FunctionalAnnotationAggMember
    - FunctionalAnnotation
    - DataObject
    range: WorkflowExecution
    structured_pattern:
      syntax: ^{id_nmdc_prefix}:(wfmag|wfmb|wfmgan|wfmgas|wfmsa|wfmp|wfmt|wfmtan|wfmtas|wfnom|wfrbt|wfrqc)-{id_shoulder}-{id_blade}{id_version}$|^{id_nmdc_prefix}:(omprc|dgms|dgns)-{id_shoulder}-{id_blade}$
      interpolated: true
    any_of:
    - range: WorkflowExecution
    - range: DataGeneration
  in_manifest:
    name: in_manifest
    description: one or more combinations of other DataObjects that can be analyzed
      together
    comments:
    - A DataObject can be part of multiple manifests, for example, a DataObject could
      be part of a manifest for a single run of an instrument and a manifest for technical
      replicates of a single sample.
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    alias: in_manifest
    owner: DataObject
    domain_of:
    - DataObject
    range: Manifest
    multivalued: true
  id:
    name: id
    description: A unique identifier for a thing. Must be either a CURIE shorthand
      for a URI or a complete URI
    notes:
    - 'abstracted pattern: prefix:typecode-authshoulder-blade(.version)?(_seqsuffix)?'
    - a minimum length of 3 characters is suggested for typecodes, but 1 or 2 characters
      will be accepted
    - typecodes must correspond 1:1 to a class in the NMDC schema. this will be checked
      via per-class id slot usage assertions
    - minting authority shoulders should probably be enumerated and checked in the
      pattern
    examples:
    - value: nmdc:mgmag-00-x012.1_7_c1
      description: https://github.com/microbiomedata/nmdc-schema/pull/499#discussion_r1018499248
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    identifier: true
    alias: id
    owner: DataObject
    domain_of:
    - NamedThing
    range: uriorcurie
    required: true
    pattern: ^[a-zA-Z0-9][a-zA-Z0-9_\.]+:[a-zA-Z0-9_][a-zA-Z0-9_\-\/\.,]*$
    structured_pattern:
      syntax: '{id_nmdc_prefix}:dobj-{id_shoulder}-{id_blade}$'
      interpolated: true
  name:
    name: name
    description: A human readable label for an entity
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    alias: name
    owner: DataObject
    domain_of:
    - PersonValue
    - NamedThing
    - Protocol
    range: string
    required: true
  description:
    name: description
    description: a human-readable description of a thing
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    slot_uri: dcterms:description
    alias: description
    owner: DataObject
    domain_of:
    - ImageValue
    - NamedThing
    range: string
    required: true
  alternative_identifiers:
    name: alternative_identifiers
    description: A list of alternative identifiers for the entity.
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    alias: alternative_identifiers
    owner: DataObject
    domain_of:
    - MetaboliteIdentification
    - NamedThing
    range: uriorcurie
    multivalued: true
    pattern: ^[a-zA-Z0-9][a-zA-Z0-9_\.]+:[a-zA-Z0-9_][a-zA-Z0-9_\-\/\.,]*$
  type:
    name: type
    description: the class_uri of the class that has been instantiated
    notes:
    - replaces legacy nmdc:type slot
    - makes it easier to read example data files
    - required for polymorphic MongoDB collections
    examples:
    - value: nmdc:Biosample
    - value: nmdc:Study
    from_schema: https://w3id.org/nmdc/nmdc
    see_also:
    - https://github.com/microbiomedata/nmdc-schema/issues/1048
    - https://github.com/microbiomedata/nmdc-schema/issues/1233
    - https://github.com/microbiomedata/nmdc-schema/issues/248
    rank: 1000
    slot_uri: rdf:type
    designates_type: true
    alias: type
    owner: DataObject
    domain_of:
    - EukEval
    - FunctionalAnnotationAggMember
    - MobilePhaseSegment
    - PortionOfSubstance
    - MagBin
    - MetaboliteIdentification
    - PeptideQuantification
    - ProteinQuantification
    - GenomeFeature
    - FunctionalAnnotation
    - AttributeValue
    - NamedThing
    - FailureCategorization
    - Protocol
    - CreditAssociation
    - Doi
    range: uriorcurie
    required: true
class_uri: nmdc:DataObject
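Unlike the interpolated `structured_pattern` entries, the plain `pattern` values in the induced source are complete regular expressions, so they can be tried directly. The sketch below copies the two patterns verbatim and checks a few strings with Python's `re` module. The helper `matches` and the sample INSDC accessions are hypothetical; the sample `id` value is the one given under `examples` in the source.

```python
import re

# Copied verbatim from the `pattern` entries in the induced source.
INSDC_EXPERIMENT_PATTERN = re.compile(r"^insdc.sra:(E|D|S)RX[0-9]{6,}$")
URIORCURIE_PATTERN = re.compile(
    r"^[a-zA-Z0-9][a-zA-Z0-9_\.]+:[a-zA-Z0-9_][a-zA-Z0-9_\-\/\.,]*$"
)


def matches(pattern, value):
    """True if the anchored pattern accepts `value`."""
    return pattern.match(value) is not None


# Hypothetical accession with six digits after the RX infix.
print(matches(INSDC_EXPERIMENT_PATTERN, "insdc.sra:SRX123456"))
# Only five digits, so the {6,} quantifier rejects it.
print(matches(INSDC_EXPERIMENT_PATTERN, "insdc.sra:SRX12345"))
# Example id value taken from the source.
print(matches(URIORCURIE_PATTERN, "nmdc:mgmag-00-x012.1_7_c1"))
```

One detail worth noticing: the `.` in `insdc.sra` is unescaped, so as written the pattern accepts any single character in that position, not only a literal dot.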