
Class: WorkflowExecution

Represents an instance of an execution of a particular workflow

Note

This is an abstract class and should not be instantiated directly.

URI: nmdc:WorkflowExecution

classDiagram
    class WorkflowExecution
    click WorkflowExecution href "../WorkflowExecution"
    DataEmitterProcess <|-- WorkflowExecution
    click DataEmitterProcess href "../DataEmitterProcess"
    WorkflowExecution <|-- AnnotatingWorkflow
    click AnnotatingWorkflow href "../AnnotatingWorkflow"
    WorkflowExecution <|-- MetagenomeAssembly
    click MetagenomeAssembly href "../MetagenomeAssembly"
    WorkflowExecution <|-- MetatranscriptomeAssembly
    click MetatranscriptomeAssembly href "../MetatranscriptomeAssembly"
    WorkflowExecution <|-- MetatranscriptomeExpressionAnalysis
    click MetatranscriptomeExpressionAnalysis href "../MetatranscriptomeExpressionAnalysis"
    WorkflowExecution <|-- MagsAnalysis
    click MagsAnalysis href "../MagsAnalysis"
    WorkflowExecution <|-- MetagenomeSequencing
    click MetagenomeSequencing href "../MetagenomeSequencing"
    WorkflowExecution <|-- ReadQcAnalysis
    click ReadQcAnalysis href "../ReadQcAnalysis"
    WorkflowExecution <|-- ReadBasedTaxonomyAnalysis
    click ReadBasedTaxonomyAnalysis href "../ReadBasedTaxonomyAnalysis"
    WorkflowExecution <|-- MetabolomicsAnalysis
    click MetabolomicsAnalysis href "../MetabolomicsAnalysis"
    WorkflowExecution <|-- NomAnalysis
    click NomAnalysis href "../NomAnalysis"
    WorkflowExecution : alternative_identifiers
    WorkflowExecution : description
    WorkflowExecution : end_date
    WorkflowExecution : ended_at_time
    WorkflowExecution : execution_resource
    WorkflowExecution --> "1" ExecutionResourceEnum : execution_resource
    click ExecutionResourceEnum href "../ExecutionResourceEnum"
    WorkflowExecution : git_url
    WorkflowExecution : has_failure_categorization
    WorkflowExecution --> "*" FailureCategorization : has_failure_categorization
    click FailureCategorization href "../FailureCategorization"
    WorkflowExecution : has_input
    WorkflowExecution --> "1..*" DataObject : has_input
    click DataObject href "../DataObject"
    WorkflowExecution : has_output
    WorkflowExecution --> "*" DataObject : has_output
    click DataObject href "../DataObject"
    WorkflowExecution : id
    WorkflowExecution : name
    WorkflowExecution : processing_institution
    WorkflowExecution --> "0..1" ProcessingInstitutionEnum : processing_institution
    click ProcessingInstitutionEnum href "../ProcessingInstitutionEnum"
    WorkflowExecution : processing_institution_workflow_metadata
    WorkflowExecution : protocol_link
    WorkflowExecution --> "0..1" Protocol : protocol_link
    click Protocol href "../Protocol"
    WorkflowExecution : qc_comment
    WorkflowExecution : qc_status
    WorkflowExecution --> "0..1" StatusEnum : qc_status
    click StatusEnum href "../StatusEnum"
    WorkflowExecution : start_date
    WorkflowExecution : started_at_time
    WorkflowExecution : type
    WorkflowExecution : version
    WorkflowExecution : was_informed_by
    WorkflowExecution --> "1..*" DataGeneration : was_informed_by
    click DataGeneration href "../DataGeneration"

Inheritance

Slots

| Name | Cardinality and Range | Description | Inheritance |
| --- | --- | --- | --- |
| ended_at_time | 0..1 String | | direct |
| execution_resource | 1 ExecutionResourceEnum | The computing resource or facility where the workflow was executed | direct |
| git_url | 1 String | The url that points to the exact github location of a workflow | direct |
| started_at_time | 1 String | | direct |
| version | 0..1 String | The NMDC release tag for a given workflow release used for data processing | direct |
| was_informed_by | 1..* DataGeneration | The primary DataGeneration subclass that the WorkflowExecution subclass depen... | direct |
| processing_institution_workflow_metadata | 0..1 String | Information about how workflow results were generated when the processing is ... | direct |
| has_input | 1..* DataObject | An input to a process | PlannedProcess |
| has_output | * DataObject | An output from a process | PlannedProcess |
| processing_institution | 0..1 ProcessingInstitutionEnum | The organization that processed the sample | PlannedProcess |
| protocol_link | 0..1 Protocol | | PlannedProcess |
| start_date | 0..1 String | The date on which any process or activity was started | PlannedProcess |
| end_date | 0..1 String | The date on which any process or activity was ended | PlannedProcess |
| qc_status | 0..1 StatusEnum | Stores information about the result of a process (ie the process of sequencin... | PlannedProcess |
| qc_comment | 0..1 String | Slot to store additional comments about laboratory or workflow output | PlannedProcess |
| has_failure_categorization | * FailureCategorization | | PlannedProcess |
| id | 1 Uriorcurie | A unique identifier for a thing | NamedThing |
| name | 0..1 String | A human readable label for an entity | NamedThing |
| description | 0..1 String | a human-readable description of a thing | NamedThing |
| alternative_identifiers | * Uriorcurie | A list of alternative identifiers for the entity | NamedThing |
| type | 1 Uriorcurie | the class_uri of the class that has been instantiated | NamedThing |
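
As a rough illustration of the cardinalities listed above, the following sketch checks that a record carries every slot marked `1` or `1..*`. The helper name `missing_required_slots` and all identifier values in the sample record are hypothetical placeholders, not real NMDC data; the `execution_resource` and `git_url` values are taken from the schema's own examples. This is not the NMDC validator, only a minimal stand-in for the "required" part of the table.

```python
# Slots whose cardinality in the table above is 1 or 1..* (i.e. required).
REQUIRED_SLOTS = (
    "id",
    "type",
    "execution_resource",
    "git_url",
    "started_at_time",
    "has_input",
    "was_informed_by",
)


def missing_required_slots(record):
    """Return the required slot names that are absent or empty in `record`."""
    missing = []
    for slot in REQUIRED_SLOTS:
        value = record.get(slot)
        # Treat None, empty strings, and empty lists as "not provided".
        if value is None or value == "" or value == []:
            missing.append(slot)
    return missing


# Hypothetical record for a concrete subclass (WorkflowExecution itself is abstract).
# The id, has_input, and was_informed_by values are made-up placeholders.
example_record = {
    "id": "nmdc:placeholder-00-000001",
    "type": "nmdc:MetagenomeAssembly",
    "execution_resource": "NERSC-Cori",
    "git_url": "https://github.com/microbiomedata/mg_annotation/releases/tag/0.1",
    "started_at_time": "2021-01-10T00:00:00+00:00",
    "has_input": ["nmdc:dobj-00-000001"],
    "was_informed_by": ["nmdc:placeholder-00-000002"],
}

print(missing_required_slots(example_record))
print(missing_required_slots({"id": "nmdc:placeholder-00-000001"}))
```

Optional slots such as `version` or `qc_status` are deliberately ignored here; range and pattern checks would need the full schema.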

Usages

| used by | used in | type | used |
| --- | --- | --- | --- |
| Database | workflow_execution_set | range | WorkflowExecution |

Aliases

  • analysis

Comments

  • Each instance of this (and all other) subclasses of WorkflowExecution is a distinct run with start and stop times, potentially with different inputs and outputs

Identifier and Mapping Information

Schema Source

Mappings

Mapping Type Mapped Value

LinkML Source

Direct

name: WorkflowExecution
description: Represents an instance of an execution of a particular workflow
alt_descriptions:
  embl.ena:
    source: embl.ena
    description: An analysis contains secondary analysis results derived from sequence
      reads (e.g. a genome assembly)
comments:
- Each instance of this (and all other) subclasses of WorkflowExecution is a distinct
  run with start and stop times, potentially with different inputs and outputs
from_schema: https://w3id.org/nmdc/nmdc
aliases:
- analysis
is_a: DataEmitterProcess
abstract: true
slots:
- ended_at_time
- execution_resource
- git_url
- started_at_time
- version
- was_informed_by
- processing_institution_workflow_metadata
slot_usage:
  started_at_time:
    name: started_at_time
    required: true
  git_url:
    name: git_url
    required: true
  has_input:
    name: has_input
    range: DataObject
    required: true
    structured_pattern:
      syntax: '{id_nmdc_prefix}:(dobj)-{id_shoulder}-{id_blade}$'
      interpolated: true
  has_output:
    name: has_output
    range: DataObject
    structured_pattern:
      syntax: '{id_nmdc_prefix}:(dobj)-{id_shoulder}-{id_blade}$'
      interpolated: true
  execution_resource:
    name: execution_resource
    required: true
  was_informed_by:
    name: was_informed_by
    required: true
class_uri: nmdc:WorkflowExecution
rules:
- preconditions:
    slot_conditions:
      qc_status:
        name: qc_status
        equals_string: pass
  postconditions:
    slot_conditions:
      has_output:
        name: has_output
        required: true
  description: If qc_status has a value of pass, then the has_output slot is required.
  title: qc_status_pass_has_output_required
- preconditions:
    slot_conditions:
      qc_status:
        name: qc_status
        value_presence: ABSENT
  postconditions:
    slot_conditions:
      has_output:
        name: has_output
        required: true
  description: If qc_status is not specified, then the has_output slot is required.
  title: qc_status_pass_null_has_output_required
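
The two rules above amount to one condition: `has_output` must be present whenever `qc_status` is `pass` or is not given at all. A minimal sketch of that logic, assuming records are plain dictionaries; the function names are hypothetical and this is not how LinkML itself evaluates rules:

```python
def has_output_required(record):
    """True when either rule's precondition holds.

    Rule 1: qc_status equals the string 'pass'.
    Rule 2: qc_status is absent.
    """
    qc_status = record.get("qc_status")
    return qc_status is None or qc_status == "pass"


def violates_output_rules(record):
    """True when has_output is required by the rules but missing or empty."""
    outputs = record.get("has_output") or []
    return has_output_required(record) and len(outputs) == 0


# A run with qc_status 'fail' may omit has_output without breaking either rule.
print(violates_output_rules({"qc_status": "fail"}))
# A run with no qc_status and no outputs breaks the second rule.
print(violates_output_rules({}))
```

The data object identifiers passed in `has_output` would be whatever the run produced; only their presence matters for these two rules.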

Induced

name: WorkflowExecution
description: Represents an instance of an execution of a particular workflow
alt_descriptions:
  embl.ena:
    source: embl.ena
    description: An analysis contains secondary analysis results derived from sequence
      reads (e.g. a genome assembly)
comments:
- Each instance of this (and all other) subclasses of WorkflowExecution is a distinct
  run with start and stop times, potentially with different inputs and outputs
from_schema: https://w3id.org/nmdc/nmdc
aliases:
- analysis
is_a: DataEmitterProcess
abstract: true
slot_usage:
  started_at_time:
    name: started_at_time
    required: true
  git_url:
    name: git_url
    required: true
  has_input:
    name: has_input
    range: DataObject
    required: true
    structured_pattern:
      syntax: '{id_nmdc_prefix}:(dobj)-{id_shoulder}-{id_blade}$'
      interpolated: true
  has_output:
    name: has_output
    range: DataObject
    structured_pattern:
      syntax: '{id_nmdc_prefix}:(dobj)-{id_shoulder}-{id_blade}$'
      interpolated: true
  execution_resource:
    name: execution_resource
    required: true
  was_informed_by:
    name: was_informed_by
    required: true
attributes:
  ended_at_time:
    name: ended_at_time
    notes:
    - 'The regex for ISO-8601 format was taken from here: https://www.myintervals.com/blog/2009/05/20/iso-8601-date-validation-that-doesnt-suck/
      It may not be complete, but it is good enough for now.'
    from_schema: https://w3id.org/nmdc/nmdc
    mappings:
    - prov:endedAtTime
    rank: 1000
    alias: ended_at_time
    owner: WorkflowExecution
    domain_of:
    - WorkflowExecution
    range: string
    pattern: ^([\+-]?\d{4}(?!\d{2}\b))((-?)((0[1-9]|1[0-2])(\3([12]\d|0[1-9]|3[01]))?|W([0-4]\d|5[0-2])(-?[1-7])?|(00[1-9]|0[1-9]\d|[12]\d{2}|3([0-5]\d|6[1-6])))([T\s]((([01]\d|2[0-3])((:?)[0-5]\d)?|24\:?00)([\.,]\d+(?!:))?)?(\17[0-5]\d([\.,]\d+)?)?([zZ]|([\+-])([01]\d|2[0-3]):?([0-5]\d)?)?)?)?$
  execution_resource:
    name: execution_resource
    description: The computing resource or facility where the workflow was executed.
    examples:
    - value: NERSC-Cori
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    alias: execution_resource
    owner: WorkflowExecution
    domain_of:
    - WorkflowExecution
    range: ExecutionResourceEnum
    required: true
  git_url:
    name: git_url
    description: The url that points to the exact github location of a workflow.
    examples:
    - value: https://github.com/microbiomedata/mg_annotation/releases/tag/0.1
    - value: https://github.com/microbiomedata/metaMS/blob/master/metaMS/gcmsWorkflow.py
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    alias: git_url
    owner: WorkflowExecution
    domain_of:
    - WorkflowExecution
    range: string
    required: true
  started_at_time:
    name: started_at_time
    notes:
    - 'The regex for ISO-8601 format was taken from here: https://www.myintervals.com/blog/2009/05/20/iso-8601-date-validation-that-doesnt-suck/
      It may not be complete, but it is good enough for now.'
    from_schema: https://w3id.org/nmdc/nmdc
    mappings:
    - prov:startedAtTime
    rank: 1000
    alias: started_at_time
    owner: WorkflowExecution
    domain_of:
    - WorkflowExecution
    range: string
    required: true
    pattern: ^([\+-]?\d{4}(?!\d{2}\b))((-?)((0[1-9]|1[0-2])(\3([12]\d|0[1-9]|3[01]))?|W([0-4]\d|5[0-2])(-?[1-7])?|(00[1-9]|0[1-9]\d|[12]\d{2}|3([0-5]\d|6[1-6])))([T\s]((([01]\d|2[0-3])((:?)[0-5]\d)?|24\:?00)([\.,]\d+(?!:))?)?(\17[0-5]\d([\.,]\d+)?)?([zZ]|([\+-])([01]\d|2[0-3]):?([0-5]\d)?)?)?)?$
  version:
    name: version
    description: The NMDC release tag for a given workflow release used for data processing.
      If workflows are processed externally, as denoted by processing_institution,
      this value represents the best mapping between a processing institution's (e.g.,
      JGI) workflow metadata and a NMDC tagged release.
    examples:
    - value: v1.2.0
    from_schema: https://w3id.org/nmdc/nmdc
    broad_mappings:
    - NCIT:C182117
    rank: 1000
    alias: version
    owner: WorkflowExecution
    domain_of:
    - WorkflowExecution
    range: string
  was_informed_by:
    name: was_informed_by
    description: The primary DataGeneration subclass that the WorkflowExecution subclass
      depends on.
    comments:
    - For version 1 of the proteomics workflow there are input files both from the
      NucleotideSequencing and MassSpectrometry, the MassSpectrometry record is considered
      the primary class to reference.
    from_schema: https://w3id.org/nmdc/nmdc
    structured_aliases:
      was_informed_by:
        literal_form: was_informed_by
        predicate: EXACT_SYNONYM
        contexts:
        - https://bitbucket.org/berkeleylab/jgi-jat/macros/nmdc_metadata.yaml
    narrow_mappings:
    - prov:wasInformedBy
    rank: 1000
    alias: was_informed_by
    owner: WorkflowExecution
    domain_of:
    - WorkflowExecution
    range: DataGeneration
    required: true
    multivalued: true
  processing_institution_workflow_metadata:
    name: processing_institution_workflow_metadata
    description: Information about how workflow results were generated when the processing
      is done by an external organization (e.g., JGI) such as software tool name and
      version or pipeline name and version.
    examples:
    - value: metaspades v. 3.15.2
    - value: IMG Annotation Pipeline v.5.0.25
    from_schema: https://w3id.org/nmdc/nmdc
    mappings:
    - NCIT:C165211
    rank: 1000
    alias: processing_institution_workflow_metadata
    owner: WorkflowExecution
    domain_of:
    - WorkflowExecution
    range: string
  has_input:
    name: has_input
    description: An input to a process.
    from_schema: https://w3id.org/nmdc/nmdc
    aliases:
    - input
    rank: 1000
    alias: has_input
    owner: WorkflowExecution
    domain_of:
    - PlannedProcess
    range: DataObject
    required: true
    multivalued: true
    structured_pattern:
      syntax: '{id_nmdc_prefix}:(dobj)-{id_shoulder}-{id_blade}$'
      interpolated: true
  has_output:
    name: has_output
    description: An output from a process.
    from_schema: https://w3id.org/nmdc/nmdc
    aliases:
    - output
    rank: 1000
    alias: has_output
    owner: WorkflowExecution
    domain_of:
    - PlannedProcess
    range: DataObject
    multivalued: true
    structured_pattern:
      syntax: '{id_nmdc_prefix}:(dobj)-{id_shoulder}-{id_blade}$'
      interpolated: true
  processing_institution:
    name: processing_institution
    description: The organization that processed the sample.
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    alias: processing_institution
    owner: WorkflowExecution
    domain_of:
    - PlannedProcess
    range: ProcessingInstitutionEnum
  protocol_link:
    name: protocol_link
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    alias: protocol_link
    owner: WorkflowExecution
    domain_of:
    - Configuration
    - PlannedProcess
    - Study
    range: Protocol
  start_date:
    name: start_date
    description: The date on which any process or activity was started
    todos:
    - add date string validation pattern
    comments:
    - We are using string representations of dates until all components of our ecosystem
      can handle ISO 8601 dates
    - The date should be formatted as YYYY-MM-DD
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    alias: start_date
    owner: WorkflowExecution
    domain_of:
    - PlannedProcess
    range: string
  end_date:
    name: end_date
    description: The date on which any process or activity was ended
    todos:
    - add date string validation pattern
    comments:
    - We are using string representations of dates until all components of our ecosystem
      can handle ISO 8601 dates
    - The date should be formatted as YYYY-MM-DD
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    alias: end_date
    owner: WorkflowExecution
    domain_of:
    - PlannedProcess
    range: string
  qc_status:
    name: qc_status
    description: Stores information about the result of a process (e.g., the process
      of sequencing a library may have a qc_status of 'fail' if not enough data
      was generated)
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    alias: qc_status
    owner: WorkflowExecution
    domain_of:
    - PlannedProcess
    range: StatusEnum
  qc_comment:
    name: qc_comment
    description: Slot to store additional comments about laboratory or workflow output.
      For workflow output it may describe the particular workflow stage that failed.
      (e.g., Failed at call-stage due to a malformed fastq file).
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    alias: qc_comment
    owner: WorkflowExecution
    domain_of:
    - PlannedProcess
    range: string
  has_failure_categorization:
    name: has_failure_categorization
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    alias: has_failure_categorization
    owner: WorkflowExecution
    domain_of:
    - PlannedProcess
    range: FailureCategorization
    multivalued: true
    inlined: true
    inlined_as_list: true
  id:
    name: id
    description: A unique identifier for a thing. Must be either a CURIE shorthand
      for a URI or a complete URI
    notes:
    - 'abstracted pattern: prefix:typecode-authshoulder-blade(.version)?(_seqsuffix)?'
    - a minimum length of 3 characters is suggested for typecodes, but 1 or 2 characters
      will be accepted
    - typecodes must correspond 1:1 to a class in the NMDC schema. this will be checked
      via per-class id slot usage assertions
    - minting authority shoulders should probably be enumerated and checked in the
      pattern
    examples:
    - value: nmdc:mgmag-00-x012.1_7_c1
      description: https://github.com/microbiomedata/nmdc-schema/pull/499#discussion_r1018499248
    from_schema: https://w3id.org/nmdc/nmdc
    structured_aliases:
      workflow_execution_id:
        literal_form: workflow_execution_id
        predicate: NARROW_SYNONYM
        contexts:
        - https://bitbucket.org/berkeleylab/jgi-jat/macros/nmdc_metadata.yaml
      data_object_id:
        literal_form: data_object_id
        predicate: NARROW_SYNONYM
        contexts:
        - https://bitbucket.org/berkeleylab/jgi-jat/macros/nmdc_metadata.yaml
    rank: 1000
    identifier: true
    alias: id
    owner: WorkflowExecution
    domain_of:
    - NamedThing
    range: uriorcurie
    required: true
    pattern: ^[a-zA-Z0-9][a-zA-Z0-9_\.]+:[a-zA-Z0-9_][a-zA-Z0-9_\-\/\.,]*$
  name:
    name: name
    description: A human readable label for an entity
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    alias: name
    owner: WorkflowExecution
    domain_of:
    - PersonValue
    - NamedThing
    - Protocol
    range: string
  description:
    name: description
    description: a human-readable description of a thing
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    slot_uri: dcterms:description
    alias: description
    owner: WorkflowExecution
    domain_of:
    - ImageValue
    - NamedThing
    range: string
  alternative_identifiers:
    name: alternative_identifiers
    description: A list of alternative identifiers for the entity.
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    alias: alternative_identifiers
    owner: WorkflowExecution
    domain_of:
    - MetaboliteIdentification
    - NamedThing
    range: uriorcurie
    multivalued: true
    pattern: ^[a-zA-Z0-9][a-zA-Z0-9_\.]+:[a-zA-Z0-9_][a-zA-Z0-9_\-\/\.,\(\)\=\#]*$
  type:
    name: type
    description: the class_uri of the class that has been instantiated
    notes:
    - makes it easier to read example data files
    - required for polymorphic MongoDB collections
    examples:
    - value: nmdc:Biosample
    - value: nmdc:Study
    from_schema: https://w3id.org/nmdc/nmdc
    see_also:
    - https://github.com/microbiomedata/nmdc-schema/issues/1048
    - https://github.com/microbiomedata/nmdc-schema/issues/1233
    - https://github.com/microbiomedata/nmdc-schema/issues/248
    structured_aliases:
      workflow_execution_class:
        literal_form: workflow_execution_class
        predicate: NARROW_SYNONYM
        contexts:
        - https://bitbucket.org/berkeleylab/jgi-jat/macros/nmdc_metadata.yaml
    rank: 1000
    slot_uri: rdf:type
    designates_type: true
    alias: type
    owner: WorkflowExecution
    domain_of:
    - EukEval
    - FunctionalAnnotationAggMember
    - PeptideQuantification
    - ProteinQuantification
    - MobilePhaseSegment
    - PortionOfSubstance
    - MagBin
    - MetaboliteIdentification
    - GenomeFeature
    - FunctionalAnnotation
    - AttributeValue
    - NamedThing
    - OntologyRelation
    - FailureCategorization
    - Protocol
    - CreditAssociation
    - Doi
    range: uriorcurie
    required: true
class_uri: nmdc:WorkflowExecution
rules:
- preconditions:
    slot_conditions:
      qc_status:
        name: qc_status
        equals_string: pass
  postconditions:
    slot_conditions:
      has_output:
        name: has_output
        required: true
  description: If qc_status has a value of pass, then the has_output slot is required.
  title: qc_status_pass_has_output_required
- preconditions:
    slot_conditions:
      qc_status:
        name: qc_status
        value_presence: ABSENT
  postconditions:
    slot_conditions:
      has_output:
        name: has_output
        required: true
  description: If qc_status is not specified, then the has_output slot is required.
  title: qc_status_pass_null_has_output_required
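
The `pattern` values on `id`, `started_at_time`, and `ended_at_time` in the induced definition are ordinary regular expressions, so they can be tried out directly. The sketch below copies both patterns verbatim from the YAML and applies them with Python's `re` module. The identifier tested is the schema's own example value; the timestamp strings are illustrative inputs chosen here, not values from NMDC data.

```python
import re

# Pattern from the `id` attribute, copied verbatim.
ID_PATTERN = re.compile(
    r"^[a-zA-Z0-9][a-zA-Z0-9_\.]+:[a-zA-Z0-9_][a-zA-Z0-9_\-\/\.,]*$"
)

# Pattern shared by `started_at_time` and `ended_at_time`, copied verbatim.
TIMESTAMP_PATTERN = re.compile(
    r"^([\+-]?\d{4}(?!\d{2}\b))((-?)((0[1-9]|1[0-2])(\3([12]\d|0[1-9]|3[01]))?"
    r"|W([0-4]\d|5[0-2])(-?[1-7])?|(00[1-9]|0[1-9]\d|[12]\d{2}|3([0-5]\d|6[1-6])))"
    r"([T\s]((([01]\d|2[0-3])((:?)[0-5]\d)?|24\:?00)([\.,]\d+(?!:))?)?"
    r"(\17[0-5]\d([\.,]\d+)?)?([zZ]|([\+-])([01]\d|2[0-3]):?([0-5]\d)?)?)?)?$"
)


def is_valid_id(value):
    """Check a string against the `id` pattern."""
    return ID_PATTERN.match(value) is not None


def is_valid_timestamp(value):
    """Check a string against the ISO-8601 pattern used by the *_at_time slots."""
    return TIMESTAMP_PATTERN.match(value) is not None


print(is_valid_id("nmdc:mgmag-00-x012.1_7_c1"))           # example value from the schema
print(is_valid_timestamp("2021-01-10T00:00:00+00:00"))    # illustrative timestamp
print(is_valid_timestamp("yesterday"))                    # not ISO-8601
```

The pieces of the timestamp pattern are split across adjacent raw strings only for line length; Python concatenates them into the same single expression given in the YAML. As the schema's own note says, the expression may not cover every ISO-8601 form.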