Class: WorkflowExecution
Represents an instance of an execution of a particular workflow
Note
This is an abstract class and should not be instantiated directly.
classDiagram
class WorkflowExecution
click WorkflowExecution href "../WorkflowExecution"
DataEmitterProcess <|-- WorkflowExecution
click DataEmitterProcess href "../DataEmitterProcess"
WorkflowExecution <|-- AnnotatingWorkflow
click AnnotatingWorkflow href "../AnnotatingWorkflow"
WorkflowExecution <|-- MetagenomeAssembly
click MetagenomeAssembly href "../MetagenomeAssembly"
WorkflowExecution <|-- MetatranscriptomeAssembly
click MetatranscriptomeAssembly href "../MetatranscriptomeAssembly"
WorkflowExecution <|-- MetatranscriptomeExpressionAnalysis
click MetatranscriptomeExpressionAnalysis href "../MetatranscriptomeExpressionAnalysis"
WorkflowExecution <|-- MagsAnalysis
click MagsAnalysis href "../MagsAnalysis"
WorkflowExecution <|-- MetagenomeSequencing
click MetagenomeSequencing href "../MetagenomeSequencing"
WorkflowExecution <|-- ReadQcAnalysis
click ReadQcAnalysis href "../ReadQcAnalysis"
WorkflowExecution <|-- ReadBasedTaxonomyAnalysis
click ReadBasedTaxonomyAnalysis href "../ReadBasedTaxonomyAnalysis"
WorkflowExecution <|-- MetabolomicsAnalysis
click MetabolomicsAnalysis href "../MetabolomicsAnalysis"
WorkflowExecution <|-- NomAnalysis
click NomAnalysis href "../NomAnalysis"
WorkflowExecution : alternative_identifiers
WorkflowExecution : description
WorkflowExecution : end_date
WorkflowExecution : ended_at_time
WorkflowExecution : execution_resource
WorkflowExecution --> "1" ExecutionResourceEnum : execution_resource
click ExecutionResourceEnum href "../ExecutionResourceEnum"
WorkflowExecution : git_url
WorkflowExecution : has_failure_categorization
WorkflowExecution --> "*" FailureCategorization : has_failure_categorization
click FailureCategorization href "../FailureCategorization"
WorkflowExecution : has_input
WorkflowExecution --> "1..*" DataObject : has_input
click DataObject href "../DataObject"
WorkflowExecution : has_output
WorkflowExecution --> "*" DataObject : has_output
click DataObject href "../DataObject"
WorkflowExecution : id
WorkflowExecution : name
WorkflowExecution : processing_institution
WorkflowExecution --> "0..1" ProcessingInstitutionEnum : processing_institution
click ProcessingInstitutionEnum href "../ProcessingInstitutionEnum"
WorkflowExecution : processing_institution_workflow_metadata
WorkflowExecution : protocol_link
WorkflowExecution --> "0..1" Protocol : protocol_link
click Protocol href "../Protocol"
WorkflowExecution : qc_comment
WorkflowExecution : qc_status
WorkflowExecution --> "0..1" StatusEnum : qc_status
click StatusEnum href "../StatusEnum"
WorkflowExecution : start_date
WorkflowExecution : started_at_time
WorkflowExecution : type
WorkflowExecution : version
WorkflowExecution : was_informed_by
WorkflowExecution --> "1..*" DataGeneration : was_informed_by
click DataGeneration href "../DataGeneration"
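Inheritance
- DataEmitterProcess
  - WorkflowExecution
    - AnnotatingWorkflow
    - MetagenomeAssembly
    - MetatranscriptomeAssembly
    - MetatranscriptomeExpressionAnalysis
    - MagsAnalysis
    - MetagenomeSequencing
    - ReadQcAnalysis
    - ReadBasedTaxonomyAnalysis
    - MetabolomicsAnalysis
    - NomAnalysis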
Slots
Name | Cardinality and Range | Description | Inheritance |
---|---|---|---|
ended_at_time | 0..1 String | | direct |
execution_resource | 1 ExecutionResourceEnum | The computing resource or facility where the workflow was executed | direct |
git_url | 1 String | The url that points to the exact github location of a workflow | direct |
started_at_time | 1 String | | direct |
version | 0..1 String | The NMDC release tag for a given workflow release used for data processing | direct |
was_informed_by | 1..* DataGeneration | The primary DataGeneration subclass that the WorkflowExecution subclass depen... | direct |
processing_institution_workflow_metadata | 0..1 String | Information about how workflow results were generated when the processing is ... | direct |
has_input | 1..* DataObject | An input to a process | PlannedProcess |
has_output | * DataObject | An output from a process | PlannedProcess |
processing_institution | 0..1 ProcessingInstitutionEnum | The organization that processed the sample | PlannedProcess |
protocol_link | 0..1 Protocol | | PlannedProcess |
start_date | 0..1 String | The date on which any process or activity was started | PlannedProcess |
end_date | 0..1 String | The date on which any process or activity was ended | PlannedProcess |
qc_status | 0..1 StatusEnum | Stores information about the result of a process (ie the process of sequencin... | PlannedProcess |
qc_comment | 0..1 String | Slot to store additional comments about laboratory or workflow output | PlannedProcess |
has_failure_categorization | * FailureCategorization | | PlannedProcess |
id | 1 Uriorcurie | A unique identifier for a thing | NamedThing |
name | 0..1 String | A human readable label for an entity | NamedThing |
description | 0..1 String | a human-readable description of a thing | NamedThing |
alternative_identifiers | * Uriorcurie | A list of alternative identifiers for the entity | NamedThing |
type | 1 Uriorcurie | the class_uri of the class that has been instantiated | NamedThing |
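As a rough illustration of the cardinalities in the Slots table, the sketch below lists the slots whose cardinality is `1` or `1..*` and reports which of them are missing from a record. The function name `missing_required_slots`, the record layout as a plain Python dict, and every identifier in the example record are assumptions made for illustration; this is not the NMDC validation code, and it checks presence only, not ranges or patterns.

```python
# Slots whose cardinality is 1 or 1..* in the Slots table.
REQUIRED_SLOTS = (
    "id",
    "type",
    "started_at_time",
    "git_url",
    "execution_resource",
    "was_informed_by",
    "has_input",
)


def missing_required_slots(record: dict) -> list:
    """Return the required slots that are absent or empty in `record`."""
    return [slot for slot in REQUIRED_SLOTS if record.get(slot) in (None, "", [])]


# Hypothetical record for a concrete subclass; all ids are placeholders.
example = {
    "id": "nmdc:wfmgas-00-000001.1",
    "type": "nmdc:MetagenomeAssembly",  # assumed class_uri of a subclass
    "started_at_time": "2021-01-10T00:00:00+00:00",
    "git_url": "https://github.com/microbiomedata/mg_annotation/releases/tag/0.1",
    "execution_resource": "NERSC-Cori",
    "was_informed_by": ["nmdc:omprc-00-000001"],
    "has_input": ["nmdc:dobj-00-000001"],
}

print(missing_required_slots(example))
print(missing_required_slots({"id": "nmdc:wfmgas-00-000002.1"}))
```

Because WorkflowExecution is abstract, a real record would always carry the `type` of one of its concrete subclasses.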
Usages
used by | used in | type | used |
---|---|---|---|
Database | workflow_execution_set | range | WorkflowExecution |
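Since `workflow_execution_set` has WorkflowExecution as its range and the class is abstract, the collection holds a mix of concrete subclasses that are told apart by their `type` slot. The sketch below groups record ids by `type`. It assumes a Database serialized as a plain dict, and it assumes that subclass `class_uri` values follow the `nmdc:<ClassName>` form used by `nmdc:WorkflowExecution`; the function name and all ids are hypothetical.

```python
from collections import defaultdict


def group_ids_by_type(database: dict) -> dict:
    """Map each `type` value to the ids of the records that declare it."""
    groups = defaultdict(list)
    for record in database.get("workflow_execution_set", []):
        groups[record["type"]].append(record["id"])
    return dict(groups)


# Hypothetical Database fragment; ids are placeholders, not real NMDC records.
database = {
    "workflow_execution_set": [
        {"id": "nmdc:wfmgas-00-000001.1", "type": "nmdc:MetagenomeAssembly"},
        {"id": "nmdc:wfrqc-00-000001.1", "type": "nmdc:ReadQcAnalysis"},
        {"id": "nmdc:wfmgas-00-000002.1", "type": "nmdc:MetagenomeAssembly"},
    ]
}

print(group_ids_by_type(database))
```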
Aliases
- analysis
Comments
- Each instance of any subclass of WorkflowExecution is a distinct run with its own start and stop times, potentially with different inputs and outputs
Identifier and Mapping Information
Schema Source
- from schema: https://w3id.org/nmdc/nmdc
LinkML Source
Direct
name: WorkflowExecution
description: Represents an instance of an execution of a particular workflow
alt_descriptions:
  embl.ena:
    source: embl.ena
    description: An analysis contains secondary analysis results derived from sequence
      reads (e.g. a genome assembly)
comments:
- Each instance of this (and all other) subclasses of WorkflowExecution is a distinct
  run with start and stop times, potentially with different inputs and outputs
from_schema: https://w3id.org/nmdc/nmdc
aliases:
- analysis
is_a: DataEmitterProcess
abstract: true
slots:
- ended_at_time
- execution_resource
- git_url
- started_at_time
- version
- was_informed_by
- processing_institution_workflow_metadata
slot_usage:
  started_at_time:
    name: started_at_time
    required: true
  git_url:
    name: git_url
    required: true
  has_input:
    name: has_input
    range: DataObject
    required: true
    structured_pattern:
      syntax: '{id_nmdc_prefix}:(dobj)-{id_shoulder}-{id_blade}$'
      interpolated: true
  has_output:
    name: has_output
    range: DataObject
    structured_pattern:
      syntax: '{id_nmdc_prefix}:(dobj)-{id_shoulder}-{id_blade}$'
      interpolated: true
  execution_resource:
    name: execution_resource
    required: true
  was_informed_by:
    name: was_informed_by
    required: true
class_uri: nmdc:WorkflowExecution
rules:
- preconditions:
    slot_conditions:
      qc_status:
        name: qc_status
        equals_string: pass
  postconditions:
    slot_conditions:
      has_output:
        name: has_output
        required: true
  description: If qc_status has a value of pass, then the has_output slot is required.
  title: qc_status_pass_has_output_required
- preconditions:
    slot_conditions:
      qc_status:
        name: qc_status
        value_presence: ABSENT
  postconditions:
    slot_conditions:
      has_output:
        name: has_output
        required: true
  description: If qc_status is not specified, then the has_output slot is required.
  title: qc_status_pass_null_has_output_required
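The two rules at the end of the direct definition say the same thing from two angles: `has_output` becomes required when `qc_status` is `pass`, and also when `qc_status` is absent. A minimal sketch of that logic on a plain dict follows. The function name `violates_output_rule` is hypothetical, the ids are placeholders, and this is only an illustration of the rule descriptions, not how a LinkML validator evaluates them.

```python
def violates_output_rule(record: dict) -> bool:
    """True when has_output is required by the rules but missing or empty.

    has_output is required when qc_status equals "pass" or is not specified.
    """
    qc_status = record.get("qc_status")
    output_required = qc_status is None or qc_status == "pass"
    return output_required and not record.get("has_output")


# Placeholder records; only the two slots the rules look at are shown.
print(violates_output_rule({"qc_status": "pass"}))
print(violates_output_rule({"qc_status": "fail"}))
print(violates_output_rule({"has_output": ["nmdc:dobj-00-000001"]}))
```

Under this reading, a run with a `qc_status` other than `pass` (for example `fail`) may legitimately have no outputs.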
Induced
name: WorkflowExecution
description: Represents an instance of an execution of a particular workflow
alt_descriptions:
  embl.ena:
    source: embl.ena
    description: An analysis contains secondary analysis results derived from sequence
      reads (e.g. a genome assembly)
comments:
- Each instance of this (and all other) subclasses of WorkflowExecution is a distinct
  run with start and stop times, potentially with different inputs and outputs
from_schema: https://w3id.org/nmdc/nmdc
aliases:
- analysis
is_a: DataEmitterProcess
abstract: true
slot_usage:
  started_at_time:
    name: started_at_time
    required: true
  git_url:
    name: git_url
    required: true
  has_input:
    name: has_input
    range: DataObject
    required: true
    structured_pattern:
      syntax: '{id_nmdc_prefix}:(dobj)-{id_shoulder}-{id_blade}$'
      interpolated: true
  has_output:
    name: has_output
    range: DataObject
    structured_pattern:
      syntax: '{id_nmdc_prefix}:(dobj)-{id_shoulder}-{id_blade}$'
      interpolated: true
  execution_resource:
    name: execution_resource
    required: true
  was_informed_by:
    name: was_informed_by
    required: true
attributes:
  ended_at_time:
    name: ended_at_time
    notes:
    - 'The regex for ISO-8601 format was taken from here: https://www.myintervals.com/blog/2009/05/20/iso-8601-date-validation-that-doesnt-suck/
      It may not be complete, but it is good enough for now.'
    from_schema: https://w3id.org/nmdc/nmdc
    mappings:
    - prov:endedAtTime
    rank: 1000
    alias: ended_at_time
    owner: WorkflowExecution
    domain_of:
    - WorkflowExecution
    range: string
    pattern: ^([\+-]?\d{4}(?!\d{2}\b))((-?)((0[1-9]|1[0-2])(\3([12]\d|0[1-9]|3[01]))?|W([0-4]\d|5[0-2])(-?[1-7])?|(00[1-9]|0[1-9]\d|[12]\d{2}|3([0-5]\d|6[1-6])))([T\s]((([01]\d|2[0-3])((:?)[0-5]\d)?|24\:?00)([\.,]\d+(?!:))?)?(\17[0-5]\d([\.,]\d+)?)?([zZ]|([\+-])([01]\d|2[0-3]):?([0-5]\d)?)?)?)?$
  execution_resource:
    name: execution_resource
    description: The computing resource or facility where the workflow was executed.
    examples:
    - value: NERSC-Cori
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    alias: execution_resource
    owner: WorkflowExecution
    domain_of:
    - WorkflowExecution
    range: ExecutionResourceEnum
    required: true
  git_url:
    name: git_url
    description: The url that points to the exact github location of a workflow.
    examples:
    - value: https://github.com/microbiomedata/mg_annotation/releases/tag/0.1
    - value: https://github.com/microbiomedata/metaMS/blob/master/metaMS/gcmsWorkflow.py
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    alias: git_url
    owner: WorkflowExecution
    domain_of:
    - WorkflowExecution
    range: string
    required: true
  started_at_time:
    name: started_at_time
    notes:
    - 'The regex for ISO-8601 format was taken from here: https://www.myintervals.com/blog/2009/05/20/iso-8601-date-validation-that-doesnt-suck/
      It may not be complete, but it is good enough for now.'
    from_schema: https://w3id.org/nmdc/nmdc
    mappings:
    - prov:startedAtTime
    rank: 1000
    alias: started_at_time
    owner: WorkflowExecution
    domain_of:
    - WorkflowExecution
    range: string
    required: true
    pattern: ^([\+-]?\d{4}(?!\d{2}\b))((-?)((0[1-9]|1[0-2])(\3([12]\d|0[1-9]|3[01]))?|W([0-4]\d|5[0-2])(-?[1-7])?|(00[1-9]|0[1-9]\d|[12]\d{2}|3([0-5]\d|6[1-6])))([T\s]((([01]\d|2[0-3])((:?)[0-5]\d)?|24\:?00)([\.,]\d+(?!:))?)?(\17[0-5]\d([\.,]\d+)?)?([zZ]|([\+-])([01]\d|2[0-3]):?([0-5]\d)?)?)?)?$
  version:
    name: version
    description: The NMDC release tag for a given workflow release used for data processing.
      If workflows are processed externally, as denoted by processing_institution,
      this value represents the best mapping between a processing institution's (e.g.,
      JGI) workflow metadata and a NMDC tagged release.
    examples:
    - value: v1.2.0
    from_schema: https://w3id.org/nmdc/nmdc
    broad_mappings:
    - NCIT:C182117
    rank: 1000
    alias: version
    owner: WorkflowExecution
    domain_of:
    - WorkflowExecution
    range: string
  was_informed_by:
    name: was_informed_by
    description: The primary DataGeneration subclass that the WorkflowExecution subclass
      depends on.
    comments:
    - For version 1 of the proteomics workflow there are input files both from the
      NucleotideSequencing and MassSpectrometry, the MassSpectrometry record is considered
      the primary class to reference.
    from_schema: https://w3id.org/nmdc/nmdc
    structured_aliases:
      was_informed_by:
        literal_form: was_informed_by
        predicate: EXACT_SYNONYM
        contexts:
        - https://bitbucket.org/berkeleylab/jgi-jat/macros/nmdc_metadata.yaml
    narrow_mappings:
    - prov:wasInformedBy
    rank: 1000
    alias: was_informed_by
    owner: WorkflowExecution
    domain_of:
    - WorkflowExecution
    range: DataGeneration
    required: true
    multivalued: true
  processing_institution_workflow_metadata:
    name: processing_institution_workflow_metadata
    description: Information about how workflow results were generated when the processing
      is done by an external organization (e.g., JGI) such as software tool name and
      version or pipeline name and version.
    examples:
    - value: metaspades v. 3.15.2
    - value: IMG Annotation Pipeline v.5.0.25
    from_schema: https://w3id.org/nmdc/nmdc
    mappings:
    - NCIT:C165211
    rank: 1000
    alias: processing_institution_workflow_metadata
    owner: WorkflowExecution
    domain_of:
    - WorkflowExecution
    range: string
  has_input:
    name: has_input
    description: An input to a process.
    from_schema: https://w3id.org/nmdc/nmdc
    aliases:
    - input
    rank: 1000
    alias: has_input
    owner: WorkflowExecution
    domain_of:
    - PlannedProcess
    range: DataObject
    required: true
    multivalued: true
    structured_pattern:
      syntax: '{id_nmdc_prefix}:(dobj)-{id_shoulder}-{id_blade}$'
      interpolated: true
  has_output:
    name: has_output
    description: An output from a process.
    from_schema: https://w3id.org/nmdc/nmdc
    aliases:
    - output
    rank: 1000
    alias: has_output
    owner: WorkflowExecution
    domain_of:
    - PlannedProcess
    range: DataObject
    multivalued: true
    structured_pattern:
      syntax: '{id_nmdc_prefix}:(dobj)-{id_shoulder}-{id_blade}$'
      interpolated: true
  processing_institution:
    name: processing_institution
    description: The organization that processed the sample.
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    alias: processing_institution
    owner: WorkflowExecution
    domain_of:
    - PlannedProcess
    range: ProcessingInstitutionEnum
  protocol_link:
    name: protocol_link
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    alias: protocol_link
    owner: WorkflowExecution
    domain_of:
    - Configuration
    - PlannedProcess
    - Study
    range: Protocol
  start_date:
    name: start_date
    description: The date on which any process or activity was started
    todos:
    - add date string validation pattern
    comments:
    - We are using string representations of dates until all components of our ecosystem
      can handle ISO 8601 dates
    - The date should be formatted as YYYY-MM-DD
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    alias: start_date
    owner: WorkflowExecution
    domain_of:
    - PlannedProcess
    range: string
  end_date:
    name: end_date
    description: The date on which any process or activity was ended
    todos:
    - add date string validation pattern
    comments:
    - We are using string representations of dates until all components of our ecosystem
      can handle ISO 8601 dates
    - The date should be formatted as YYYY-MM-DD
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    alias: end_date
    owner: WorkflowExecution
    domain_of:
    - PlannedProcess
    range: string
  qc_status:
    name: qc_status
    description: Stores information about the result of a process (ie the process
      of sequencing a library may have for qc_status of 'fail' if not enough data
      was generated)
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    alias: qc_status
    owner: WorkflowExecution
    domain_of:
    - PlannedProcess
    range: StatusEnum
  qc_comment:
    name: qc_comment
    description: Slot to store additional comments about laboratory or workflow output.
      For workflow output it may describe the particular workflow stage that failed.
      (ie Failed at call-stage due to a malformed fastq file).
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    alias: qc_comment
    owner: WorkflowExecution
    domain_of:
    - PlannedProcess
    range: string
  has_failure_categorization:
    name: has_failure_categorization
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    alias: has_failure_categorization
    owner: WorkflowExecution
    domain_of:
    - PlannedProcess
    range: FailureCategorization
    multivalued: true
    inlined: true
    inlined_as_list: true
  id:
    name: id
    description: A unique identifier for a thing. Must be either a CURIE shorthand
      for a URI or a complete URI
    notes:
    - 'abstracted pattern: prefix:typecode-authshoulder-blade(.version)?(_seqsuffix)?'
    - a minimum length of 3 characters is suggested for typecodes, but 1 or 2 characters
      will be accepted
    - typecodes must correspond 1:1 to a class in the NMDC schema. this will be checked
      via per-class id slot usage assertions
    - minting authority shoulders should probably be enumerated and checked in the
      pattern
    examples:
    - value: nmdc:mgmag-00-x012.1_7_c1
      description: https://github.com/microbiomedata/nmdc-schema/pull/499#discussion_r1018499248
    from_schema: https://w3id.org/nmdc/nmdc
    structured_aliases:
      workflow_execution_id:
        literal_form: workflow_execution_id
        predicate: NARROW_SYNONYM
        contexts:
        - https://bitbucket.org/berkeleylab/jgi-jat/macros/nmdc_metadata.yaml
      data_object_id:
        literal_form: data_object_id
        predicate: NARROW_SYNONYM
        contexts:
        - https://bitbucket.org/berkeleylab/jgi-jat/macros/nmdc_metadata.yaml
    rank: 1000
    identifier: true
    alias: id
    owner: WorkflowExecution
    domain_of:
    - NamedThing
    range: uriorcurie
    required: true
    pattern: ^[a-zA-Z0-9][a-zA-Z0-9_\.]+:[a-zA-Z0-9_][a-zA-Z0-9_\-\/\.,]*$
  name:
    name: name
    description: A human readable label for an entity
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    alias: name
    owner: WorkflowExecution
    domain_of:
    - PersonValue
    - NamedThing
    - Protocol
    range: string
  description:
    name: description
    description: a human-readable description of a thing
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    slot_uri: dcterms:description
    alias: description
    owner: WorkflowExecution
    domain_of:
    - ImageValue
    - NamedThing
    range: string
  alternative_identifiers:
    name: alternative_identifiers
    description: A list of alternative identifiers for the entity.
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    alias: alternative_identifiers
    owner: WorkflowExecution
    domain_of:
    - MetaboliteIdentification
    - NamedThing
    range: uriorcurie
    multivalued: true
    pattern: ^[a-zA-Z0-9][a-zA-Z0-9_\.]+:[a-zA-Z0-9_][a-zA-Z0-9_\-\/\.,\(\)\=\#]*$
  type:
    name: type
    description: the class_uri of the class that has been instantiated
    notes:
    - makes it easier to read example data files
    - required for polymorphic MongoDB collections
    examples:
    - value: nmdc:Biosample
    - value: nmdc:Study
    from_schema: https://w3id.org/nmdc/nmdc
    see_also:
    - https://github.com/microbiomedata/nmdc-schema/issues/1048
    - https://github.com/microbiomedata/nmdc-schema/issues/1233
    - https://github.com/microbiomedata/nmdc-schema/issues/248
    structured_aliases:
      workflow_execution_class:
        literal_form: workflow_execution_class
        predicate: NARROW_SYNONYM
        contexts:
        - https://bitbucket.org/berkeleylab/jgi-jat/macros/nmdc_metadata.yaml
    rank: 1000
    slot_uri: rdf:type
    designates_type: true
    alias: type
    owner: WorkflowExecution
    domain_of:
    - EukEval
    - FunctionalAnnotationAggMember
    - PeptideQuantification
    - ProteinQuantification
    - MobilePhaseSegment
    - PortionOfSubstance
    - MagBin
    - MetaboliteIdentification
    - GenomeFeature
    - FunctionalAnnotation
    - AttributeValue
    - NamedThing
    - OntologyRelation
    - FailureCategorization
    - Protocol
    - CreditAssociation
    - Doi
    range: uriorcurie
    required: true
class_uri: nmdc:WorkflowExecution
rules:
- preconditions:
    slot_conditions:
      qc_status:
        name: qc_status
        equals_string: pass
  postconditions:
    slot_conditions:
      has_output:
        name: has_output
        required: true
  description: If qc_status has a value of pass, then the has_output slot is required.
  title: qc_status_pass_has_output_required
- preconditions:
    slot_conditions:
      qc_status:
        name: qc_status
        value_presence: ABSENT
  postconditions:
    slot_conditions:
      has_output:
        name: has_output
        required: true
  description: If qc_status is not specified, then the has_output slot is required.
  title: qc_status_pass_null_has_output_required
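The `pattern` values in the induced definition can be tried out directly with Python's `re` module. The sketch below copies the ISO-8601 pattern shared by `started_at_time` and `ended_at_time` and the `id` pattern verbatim from the listing, split across string literals only for readability. The helper names `is_valid_timestamp` and `is_valid_id` are hypothetical, and the sketch assumes that Python's regex dialect treats these patterns the same way the schema's own validators do, which may not hold in every edge case.

```python
import re

# ISO-8601 pattern from started_at_time / ended_at_time, copied verbatim.
ISO_8601_PATTERN = re.compile(
    r"^([\+-]?\d{4}(?!\d{2}\b))((-?)((0[1-9]|1[0-2])(\3([12]\d|0[1-9]|3[01]))?"
    r"|W([0-4]\d|5[0-2])(-?[1-7])?|(00[1-9]|0[1-9]\d|[12]\d{2}|3([0-5]\d|6[1-6])))"
    r"([T\s]((([01]\d|2[0-3])((:?)[0-5]\d)?|24\:?00)([\.,]\d+(?!:))?)?"
    r"(\17[0-5]\d([\.,]\d+)?)?([zZ]|([\+-])([01]\d|2[0-3]):?([0-5]\d)?)?)?)?$"
)

# Pattern from the id slot, copied verbatim.
ID_PATTERN = re.compile(r"^[a-zA-Z0-9][a-zA-Z0-9_\.]+:[a-zA-Z0-9_][a-zA-Z0-9_\-\/\.,]*$")


def is_valid_timestamp(value: str) -> bool:
    """Check a started_at_time / ended_at_time string against the slot pattern."""
    return ISO_8601_PATTERN.match(value) is not None


def is_valid_id(value: str) -> bool:
    """Check an id string against the generic id slot pattern."""
    return ID_PATTERN.match(value) is not None


for candidate in ("2021-01-10T00:00:00+00:00", "2021-01-10", "not-a-date"):
    print(candidate, is_valid_timestamp(candidate))

# The id example value is taken from the id slot's examples above.
print(is_valid_id("nmdc:mgmag-00-x012.1_7_c1"))
```

Note that the generic `id` pattern is looser than the `structured_pattern` on `has_input` and `has_output`, which additionally fixes the `dobj` typecode and depends on schema settings not shown on this page.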