Class: DataGeneration
The methods and processes used to generate omics data from a biosample or organism.
Note
This is an abstract class and should not be instantiated directly.
URI: nmdc:DataGeneration
classDiagram
class DataGeneration
click DataGeneration href "../DataGeneration"
PlannedProcess <|-- DataGeneration
click PlannedProcess href "../PlannedProcess"
DataGeneration <|-- NucleotideSequencing
click NucleotideSequencing href "../NucleotideSequencing"
DataGeneration <|-- MassSpectrometry
click MassSpectrometry href "../MassSpectrometry"
DataGeneration : add_date
DataGeneration : alternative_identifiers
DataGeneration : analyte_category
DataGeneration --> "1" AnalyteCategoryEnum : analyte_category
click AnalyteCategoryEnum href "../AnalyteCategoryEnum"
DataGeneration : associated_studies
DataGeneration --> "1..*" Study : associated_studies
click Study href "../Study"
DataGeneration : description
DataGeneration : end_date
DataGeneration : has_failure_categorization
DataGeneration --> "*" FailureCategorization : has_failure_categorization
click FailureCategorization href "../FailureCategorization"
DataGeneration : has_input
DataGeneration --> "1..*" NamedThing : has_input
click NamedThing href "../NamedThing"
DataGeneration : has_output
DataGeneration --> "*" DataObject : has_output
click DataObject href "../DataObject"
DataGeneration : id
DataGeneration : instrument_used
DataGeneration --> "*" Instrument : instrument_used
click Instrument href "../Instrument"
DataGeneration : mod_date
DataGeneration : name
DataGeneration : principal_investigator
DataGeneration --> "0..1" PersonValue : principal_investigator
click PersonValue href "../PersonValue"
DataGeneration : processing_institution
DataGeneration --> "0..1" ProcessingInstitutionEnum : processing_institution
click ProcessingInstitutionEnum href "../ProcessingInstitutionEnum"
DataGeneration : protocol_link
DataGeneration --> "0..1" Protocol : protocol_link
click Protocol href "../Protocol"
DataGeneration : qc_comment
DataGeneration : qc_status
DataGeneration --> "0..1" StatusEnum : qc_status
click StatusEnum href "../StatusEnum"
DataGeneration : start_date
DataGeneration : type
Inheritance
- NamedThing
  - PlannedProcess
    - DataGeneration
      - NucleotideSequencing
      - MassSpectrometry
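The hierarchy and the abstract-class note can be illustrated with a small Python sketch. This is not the generated NMDC Python data model; the `abstract` class flag, the `can_instantiate` helper, and the `nmdc:<ClassName>` convention for `type` are assumptions made for illustration. Only DataGeneration is marked abstract here because that is all this page states.

```python
class NamedThing:
    """Root class; contributes id, name, description and other shared slots."""

    def __init__(self, id: str):
        # Only a flag set directly on the class itself counts, so concrete
        # subclasses of an abstract class remain instantiable.
        if vars(type(self)).get("abstract", False):
            raise TypeError(f"{type(self).__name__} is abstract")
        self.id = id
        # Assumption: the class_uri follows the nmdc:<ClassName> convention.
        self.type = f"nmdc:{type(self).__name__}"


class PlannedProcess(NamedThing):
    """Contributes has_input, has_output, start_date, qc_status, etc."""


class DataGeneration(PlannedProcess):
    """Abstract: mirrors `abstract: true` in the LinkML source."""

    abstract = True


class NucleotideSequencing(DataGeneration):
    """Concrete subclass shown in the class diagram."""


class MassSpectrometry(DataGeneration):
    """Concrete subclass shown in the class diagram."""


def can_instantiate(cls) -> bool:
    """Hypothetical helper: report whether the class accepts direct instances."""
    try:
        cls("nmdc:example-id")  # placeholder identifier, not a real NMDC id
    except TypeError:
        return False
    return True
```

With this sketch, `can_instantiate(DataGeneration)` is false while both subclasses can be created, which is the behaviour the note on abstract classes describes.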
Slots
Name | Cardinality and Range | Description | Inheritance |
---|---|---|---|
add_date | 0..1 String | The date on which the information was added to the database | direct |
analyte_category | 1 AnalyteCategoryEnum | The type of analyte(s) that were measured in the data generation process and ... | direct |
associated_studies | 1..* Study | The study associated with a resource | direct |
instrument_used | * Instrument | What instrument was used during DataGeneration or MaterialProcessing | direct |
mod_date | 0..1 String | The last date on which the database information was modified | direct |
principal_investigator | 0..1 PersonValue | Principal Investigator who led the study and/or generated the dataset | direct |
has_input | 1..* NamedThing or Biosample or ProcessedSample | An input to a process | PlannedProcess |
has_output | * DataObject | An output from a process | PlannedProcess |
processing_institution | 0..1 ProcessingInstitutionEnum | The organization that processed the sample | PlannedProcess |
protocol_link | 0..1 Protocol | | PlannedProcess |
start_date | 0..1 String | The date on which any process or activity was started | PlannedProcess |
end_date | 0..1 String | The date on which any process or activity was ended | PlannedProcess |
qc_status | 0..1 StatusEnum | Stores information about the result of a process (ie the process of sequencin... | PlannedProcess |
qc_comment | 0..1 String | Slot to store additional comments about laboratory or workflow output | PlannedProcess |
has_failure_categorization | * FailureCategorization | | PlannedProcess |
id | 1 Uriorcurie | A unique identifier for a thing | NamedThing |
name | 0..1 String | A human readable label for an entity | NamedThing |
description | 0..1 String | a human-readable description of a thing | NamedThing |
alternative_identifiers | * Uriorcurie | A list of alternative identifiers for the entity | NamedThing |
type | 1 Uriorcurie | the class_uri of the class that has been instantiated | NamedThing |
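The cardinalities in the table above can be read as simple record-level checks: a lower bound of 1 means the slot is required, and `*` or `1..*` means the slot holds a list. The following is a minimal sketch of such a check on a plain dictionary, not the NMDC validator. The function name `check_record`, the example identifiers, the `type` value, and the `analyte_category` value `metagenome` are all assumptions; this page does not list the permissible values of AnalyteCategoryEnum.

```python
# Slots whose cardinality in the table starts at 1 (required).
REQUIRED_SLOTS = ("id", "type", "analyte_category", "associated_studies", "has_input")

# Slots whose cardinality is "*" or "1..*" (multivalued, stored as lists).
MULTIVALUED_SLOTS = (
    "associated_studies",
    "has_input",
    "has_output",
    "instrument_used",
    "has_failure_categorization",
    "alternative_identifiers",
)


def check_record(record: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    for slot in REQUIRED_SLOTS:
        value = record.get(slot)
        # An empty list does not satisfy a "1..*" cardinality.
        if value is None or value == "" or value == []:
            problems.append(f"missing required slot: {slot}")
    for slot in MULTIVALUED_SLOTS:
        if slot in record and not isinstance(record[slot], list):
            problems.append(f"slot should be a list: {slot}")
    return problems


# Hypothetical record for a concrete subclass; every value is a placeholder.
example = {
    "id": "nmdc:dgns-00-example1",
    "type": "nmdc:NucleotideSequencing",  # assumed class_uri of a subclass
    "analyte_category": "metagenome",  # assumed enum value
    "associated_studies": ["nmdc:sty-00-example1"],
    "has_input": ["nmdc:bsm-00-example1"],
}

print(check_record(example))
```

Because DataGeneration is abstract, the example uses the `type` of a concrete subclass rather than `nmdc:DataGeneration`.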
Aliases
- OmicsProcessing
- assay
- omics assay
- sequencing project
- experiment
Identifier and Mapping Information
Schema Source
- from schema: https://w3id.org/nmdc/nmdc
Mappings
Mapping Type | Mapped Value |
---|---|
self | nmdc:DataGeneration |
native | nmdc:DataGeneration |
broad | OBI:0000070, ISA:Assay |
LinkML Source
Direct
name: DataGeneration
description: The methods and processes used to generate omics data from a biosample
  or organism.
alt_descriptions:
  embl.ena:
    source: embl.ena
    description: An experiment contains information about a sequencing experiment
      including library and instrument details.
in_subset:
- sample subset
from_schema: https://w3id.org/nmdc/nmdc
aliases:
- OmicsProcessing
- assay
- omics assay
- sequencing project
- experiment
broad_mappings:
- OBI:0000070
- ISA:Assay
is_a: PlannedProcess
abstract: true
slots:
- add_date
- analyte_category
- associated_studies
- instrument_used
- mod_date
- principal_investigator
slot_usage:
  has_input:
    name: has_input
    required: true
    structured_pattern:
      syntax: '{id_nmdc_prefix}:(bsm|procsm)-{id_shoulder}-{id_blade}$'
      interpolated: true
    any_of:
    - range: Biosample
    - range: ProcessedSample
  associated_studies:
    name: associated_studies
    range: Study
    structured_pattern:
      syntax: '{id_nmdc_prefix}:(sty)-{id_shoulder}-{id_blade}$'
      interpolated: true
  has_output:
    name: has_output
    range: DataObject
    structured_pattern:
      syntax: '{id_nmdc_prefix}:(dobj)-{id_shoulder}-{id_blade}$'
      interpolated: true
class_uri: nmdc:DataGeneration
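The `structured_pattern` entries above use `interpolated: true`, meaning each `{name}` placeholder is replaced by a schema-level setting before the result is used as a regular expression. The sketch below shows that substitution for the `has_input` pattern. The setting values in `SETTINGS` are assumptions, since this page names `id_nmdc_prefix`, `id_shoulder` and `id_blade` but does not define them; the `interpolate` helper and the sample identifiers are likewise hypothetical.

```python
import re

# Assumed setting values; the real ones live in the schema's settings block.
SETTINGS = {
    "id_nmdc_prefix": r"^(nmdc)",
    "id_shoulder": r"([0-9][a-z]{0,6}[0-9])",
    "id_blade": r"([A-Za-z0-9]{1,})",
}


def interpolate(syntax: str, settings: dict[str, str]) -> str:
    """Replace each {name} placeholder in the syntax with its setting value."""
    # The scan runs over the original syntax string only, so braces inside
    # the substituted values (such as {0,6}) are never re-interpreted.
    return re.sub(r"\{(\w+)\}", lambda m: settings[m.group(1)], syntax)


HAS_INPUT_SYNTAX = "{id_nmdc_prefix}:(bsm|procsm)-{id_shoulder}-{id_blade}$"
has_input_regex = re.compile(interpolate(HAS_INPUT_SYNTAX, SETTINGS))


def is_valid_input_id(value: str) -> bool:
    """True when the identifier looks like a Biosample or ProcessedSample id."""
    return has_input_regex.search(value) is not None


print(has_input_regex.pattern)
print(is_valid_input_id("nmdc:bsm-11-abc123"))  # placeholder Biosample id
print(is_valid_input_id("nmdc:sty-11-abc123"))  # Study typecode, not an input
```

The same helper would apply to the `associated_studies` and `has_output` patterns by swapping in the `(sty)` and `(dobj)` syntax strings.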
Induced
name: DataGeneration
description: The methods and processes used to generate omics data from a biosample
  or organism.
alt_descriptions:
  embl.ena:
    source: embl.ena
    description: An experiment contains information about a sequencing experiment
      including library and instrument details.
in_subset:
- sample subset
from_schema: https://w3id.org/nmdc/nmdc
aliases:
- OmicsProcessing
- assay
- omics assay
- sequencing project
- experiment
broad_mappings:
- OBI:0000070
- ISA:Assay
is_a: PlannedProcess
abstract: true
slot_usage:
  has_input:
    name: has_input
    required: true
    structured_pattern:
      syntax: '{id_nmdc_prefix}:(bsm|procsm)-{id_shoulder}-{id_blade}$'
      interpolated: true
    any_of:
    - range: Biosample
    - range: ProcessedSample
  associated_studies:
    name: associated_studies
    range: Study
    structured_pattern:
      syntax: '{id_nmdc_prefix}:(sty)-{id_shoulder}-{id_blade}$'
      interpolated: true
  has_output:
    name: has_output
    range: DataObject
    structured_pattern:
      syntax: '{id_nmdc_prefix}:(dobj)-{id_shoulder}-{id_blade}$'
      interpolated: true
attributes:
  add_date:
    name: add_date
    description: The date on which the information was added to the database.
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    alias: add_date
    owner: DataGeneration
    domain_of:
    - Biosample
    - DataGeneration
    range: string
  analyte_category:
    name: analyte_category
    description: "The type of analyte(s) that were measured in the data generation\
      \ process and analyzed\n in the Workflow Chain\n"
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    alias: analyte_category
    owner: DataGeneration
    domain_of:
    - DataGeneration
    range: AnalyteCategoryEnum
    required: true
  associated_studies:
    name: associated_studies
    description: The study associated with a resource.
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    alias: associated_studies
    owner: DataGeneration
    domain_of:
    - Biosample
    - DataGeneration
    range: Study
    required: true
    multivalued: true
    structured_pattern:
      syntax: '{id_nmdc_prefix}:(sty)-{id_shoulder}-{id_blade}$'
      interpolated: true
  instrument_used:
    name: instrument_used
    description: What instrument was used during DataGeneration or MaterialProcessing.
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    alias: instrument_used
    owner: DataGeneration
    domain_of:
    - MaterialProcessing
    - DataGeneration
    range: Instrument
    multivalued: true
  mod_date:
    name: mod_date
    description: The last date on which the database information was modified.
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    alias: mod_date
    owner: DataGeneration
    domain_of:
    - Biosample
    - DataGeneration
    range: string
  principal_investigator:
    name: principal_investigator
    description: Principal Investigator who led the study and/or generated the dataset.
    from_schema: https://w3id.org/nmdc/nmdc
    aliases:
    - PI
    rank: 1000
    alias: principal_investigator
    owner: DataGeneration
    domain_of:
    - Study
    - DataGeneration
    range: PersonValue
  has_input:
    name: has_input
    description: An input to a process.
    from_schema: https://w3id.org/nmdc/nmdc
    aliases:
    - input
    rank: 1000
    alias: has_input
    owner: DataGeneration
    domain_of:
    - PlannedProcess
    range: NamedThing
    required: true
    multivalued: true
    structured_pattern:
      syntax: '{id_nmdc_prefix}:(bsm|procsm)-{id_shoulder}-{id_blade}$'
      interpolated: true
    any_of:
    - range: Biosample
    - range: ProcessedSample
  has_output:
    name: has_output
    description: An output from a process.
    from_schema: https://w3id.org/nmdc/nmdc
    aliases:
    - output
    rank: 1000
    alias: has_output
    owner: DataGeneration
    domain_of:
    - PlannedProcess
    range: DataObject
    multivalued: true
    structured_pattern:
      syntax: '{id_nmdc_prefix}:(dobj)-{id_shoulder}-{id_blade}$'
      interpolated: true
  processing_institution:
    name: processing_institution
    description: The organization that processed the sample.
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    alias: processing_institution
    owner: DataGeneration
    domain_of:
    - PlannedProcess
    range: ProcessingInstitutionEnum
  protocol_link:
    name: protocol_link
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    alias: protocol_link
    owner: DataGeneration
    domain_of:
    - PlannedProcess
    - Study
    range: Protocol
  start_date:
    name: start_date
    description: The date on which any process or activity was started
    todos:
    - add date string validation pattern
    comments:
    - We are using string representations of dates until all components of our ecosystem
      can handle ISO 8601 dates
    - The date should be formatted as YYYY-MM-DD
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    alias: start_date
    owner: DataGeneration
    domain_of:
    - PlannedProcess
    range: string
  end_date:
    name: end_date
    description: The date on which any process or activity was ended
    todos:
    - add date string validation pattern
    comments:
    - We are using string representations of dates until all components of our ecosystem
      can handle ISO 8601 dates
    - The date should be formatted as YYYY-MM-DD
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    alias: end_date
    owner: DataGeneration
    domain_of:
    - PlannedProcess
    range: string
  qc_status:
    name: qc_status
    description: Stores information about the result of a process (e.g. the process
      of sequencing a library may have a qc_status of 'fail' if not enough data
      was generated)
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    alias: qc_status
    owner: DataGeneration
    domain_of:
    - PlannedProcess
    range: StatusEnum
  qc_comment:
    name: qc_comment
    description: Slot to store additional comments about laboratory or workflow output.
      For workflow output it may describe the particular workflow stage that failed.
      (e.g. Failed at call-stage due to a malformed fastq file).
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    alias: qc_comment
    owner: DataGeneration
    domain_of:
    - PlannedProcess
    range: string
  has_failure_categorization:
    name: has_failure_categorization
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    alias: has_failure_categorization
    owner: DataGeneration
    domain_of:
    - PlannedProcess
    range: FailureCategorization
    multivalued: true
    inlined: true
    inlined_as_list: true
  id:
    name: id
    description: A unique identifier for a thing. Must be either a CURIE shorthand
      for a URI or a complete URI
    notes:
    - 'abstracted pattern: prefix:typecode-authshoulder-blade(.version)?(_seqsuffix)?'
    - a minimum length of 3 characters is suggested for typecodes, but 1 or 2 characters
      will be accepted
    - typecodes must correspond 1:1 to a class in the NMDC schema. this will be checked
      via per-class id slot usage assertions
    - minting authority shoulders should probably be enumerated and checked in the
      pattern
    examples:
    - value: nmdc:mgmag-00-x012.1_7_c1
      description: https://github.com/microbiomedata/nmdc-schema/pull/499#discussion_r1018499248
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    identifier: true
    alias: id
    owner: DataGeneration
    domain_of:
    - NamedThing
    range: uriorcurie
    required: true
    pattern: ^[a-zA-Z0-9][a-zA-Z0-9_\.]+:[a-zA-Z0-9_][a-zA-Z0-9_\-\/\.,]*$
  name:
    name: name
    description: A human readable label for an entity
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    alias: name
    owner: DataGeneration
    domain_of:
    - PersonValue
    - NamedThing
    - Protocol
    range: string
  description:
    name: description
    description: a human-readable description of a thing
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    slot_uri: dcterms:description
    alias: description
    owner: DataGeneration
    domain_of:
    - ImageValue
    - NamedThing
    range: string
  alternative_identifiers:
    name: alternative_identifiers
    description: A list of alternative identifiers for the entity.
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    alias: alternative_identifiers
    owner: DataGeneration
    domain_of:
    - MetaboliteIdentification
    - NamedThing
    range: uriorcurie
    multivalued: true
    pattern: ^[a-zA-Z0-9][a-zA-Z0-9_\.]+:[a-zA-Z0-9_][a-zA-Z0-9_\-\/\.,]*$
  type:
    name: type
    description: the class_uri of the class that has been instantiated
    notes:
    - replaces legacy nmdc:type slot
    - makes it easier to read example data files
    - required for polymorphic MongoDB collections
    examples:
    - value: nmdc:Biosample
    - value: nmdc:Study
    from_schema: https://w3id.org/nmdc/nmdc
    see_also:
    - https://github.com/microbiomedata/nmdc-schema/issues/1048
    - https://github.com/microbiomedata/nmdc-schema/issues/1233
    - https://github.com/microbiomedata/nmdc-schema/issues/248
    rank: 1000
    slot_uri: rdf:type
    designates_type: true
    alias: type
    owner: DataGeneration
    domain_of:
    - EukEval
    - FunctionalAnnotationAggMember
    - MobilePhaseSegment
    - PortionOfSubstance
    - MagBin
    - MetaboliteIdentification
    - PeptideQuantification
    - ProteinQuantification
    - GenomeFeature
    - FunctionalAnnotation
    - AttributeValue
    - NamedThing
    - FailureCategorization
    - Protocol
    - CreditAssociation
    - Doi
    range: uriorcurie
    required: true
class_uri: nmdc:DataGeneration
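The `start_date` and `end_date` slots above are plain strings, with a comment asking for the YYYY-MM-DD format and a todo for a validation pattern that the schema does not yet define. Until such a pattern exists, a client could check the strings itself. The sketch below is one possible approach, not part of the schema; the function name `is_valid_date_string` is hypothetical.

```python
import re
from datetime import datetime

# Shape check: exactly four, two and two ASCII digits separated by hyphens.
DATE_SHAPE = re.compile(r"[0-9]{4}-[0-9]{2}-[0-9]{2}")


def is_valid_date_string(value: str) -> bool:
    """True when value is a real calendar date written as YYYY-MM-DD."""
    # strptime alone would accept unpadded forms such as 2021-3-15,
    # so the shape is checked first.
    if not DATE_SHAPE.fullmatch(value):
        return False
    try:
        datetime.strptime(value, "%Y-%m-%d")
    except ValueError:
        # Right shape but not a real date, for example 2021-02-30.
        return False
    return True


for candidate in ("2021-03-15", "2021-02-30", "2021-3-15", "15/03/2021"):
    print(candidate, is_valid_date_string(candidate))
```

The two-step check is deliberate: the regular expression enforces the exact layout, and the `datetime` parse rejects dates that have the right layout but do not exist on the calendar.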