Class: MetagenomeAnnotation
A workflow execution activity that provides functional and structural annotation of assembled metagenome contigs
URI: nmdc:MetagenomeAnnotation
classDiagram
class MetagenomeAnnotation
click MetagenomeAnnotation href "../MetagenomeAnnotation"
WorkflowExecution <|-- MetagenomeAnnotation
click WorkflowExecution href "../WorkflowExecution"
MetagenomeAnnotation : alternative_identifiers
MetagenomeAnnotation : description
MetagenomeAnnotation : end_date
MetagenomeAnnotation : ended_at_time
MetagenomeAnnotation : execution_resource
MetagenomeAnnotation --> "1" ExecutionResourceEnum : execution_resource
click ExecutionResourceEnum href "../ExecutionResourceEnum"
MetagenomeAnnotation : git_url
MetagenomeAnnotation : gold_analysis_project_identifiers
MetagenomeAnnotation : has_failure_categorization
MetagenomeAnnotation --> "*" FailureCategorization : has_failure_categorization
click FailureCategorization href "../FailureCategorization"
MetagenomeAnnotation : has_input
MetagenomeAnnotation --> "1..*" NamedThing : has_input
click NamedThing href "../NamedThing"
MetagenomeAnnotation : has_output
MetagenomeAnnotation --> "*" NamedThing : has_output
click NamedThing href "../NamedThing"
MetagenomeAnnotation : id
MetagenomeAnnotation : img_identifiers
MetagenomeAnnotation : name
MetagenomeAnnotation : processing_institution
MetagenomeAnnotation --> "0..1" ProcessingInstitutionEnum : processing_institution
click ProcessingInstitutionEnum href "../ProcessingInstitutionEnum"
MetagenomeAnnotation : protocol_link
MetagenomeAnnotation --> "0..1" Protocol : protocol_link
click Protocol href "../Protocol"
MetagenomeAnnotation : qc_comment
MetagenomeAnnotation : qc_status
MetagenomeAnnotation --> "0..1" StatusEnum : qc_status
click StatusEnum href "../StatusEnum"
MetagenomeAnnotation : start_date
MetagenomeAnnotation : started_at_time
MetagenomeAnnotation : type
MetagenomeAnnotation : version
MetagenomeAnnotation : was_informed_by
MetagenomeAnnotation --> "1" DataGeneration : was_informed_by
click DataGeneration href "../DataGeneration"
Inheritance
- NamedThing
    - PlannedProcess
        - WorkflowExecution
            - MetagenomeAnnotation
Slots
Name | Cardinality and Range | Description | Inheritance |
---|---|---|---|
img_identifiers | * ExternalIdentifier | A list of identifiers that relate the biosample to records in the IMG databas... | direct |
gold_analysis_project_identifiers | * ExternalIdentifier | identifiers for corresponding analysis projects in GOLD | direct |
ended_at_time | 0..1 String | | WorkflowExecution |
execution_resource | 1 ExecutionResourceEnum | The computing resource or facility where the workflow was executed | WorkflowExecution |
git_url | 1 String | The url that points to the exact github location of a workflow | WorkflowExecution |
started_at_time | 1 String | | WorkflowExecution |
version | 0..1 String | | WorkflowExecution |
was_informed_by | 1 DataGeneration | | WorkflowExecution |
has_input | 1..* NamedThing | An input to a process | PlannedProcess |
has_output | * NamedThing | An output from a process | PlannedProcess |
processing_institution | 0..1 ProcessingInstitutionEnum | The organization that processed the sample | PlannedProcess |
protocol_link | 0..1 Protocol | | PlannedProcess |
start_date | 0..1 String | The date on which any process or activity was started | PlannedProcess |
end_date | 0..1 String | The date on which any process or activity was ended | PlannedProcess |
qc_status | 0..1 StatusEnum | Stores information about the result of a process (ie the process of sequencin... | PlannedProcess |
qc_comment | 0..1 String | Slot to store additional comments about laboratory or workflow output | PlannedProcess |
has_failure_categorization | * FailureCategorization | | PlannedProcess |
id | 1 Uriorcurie | A unique identifier for a thing | NamedThing |
name | 0..1 String | A human readable label for an entity | NamedThing |
description | 0..1 String | a human-readable description of a thing | NamedThing |
alternative_identifiers | * Uriorcurie | A list of alternative identifiers for the entity | NamedThing |
type | 1 Uriorcurie | the class_uri of the class that has been instantiated | NamedThing |
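The cardinalities in the Slots table imply which slots a record must carry: every slot listed with cardinality `1` or `1..*` is mandatory. The following is a minimal Python sketch of that presence check, assuming records are handled as plain dictionaries. The helper name `missing_required_slots` and the identifier values in the example record are hypothetical placeholders, not taken from the schema; only the `execution_resource` and `git_url` values come from the examples on this page.

```python
# Slots with cardinality "1" or "1..*" in the Slots table.
REQUIRED_SLOTS = (
    "id",
    "type",
    "execution_resource",
    "git_url",
    "started_at_time",
    "was_informed_by",
    "has_input",
)


def missing_required_slots(record: dict) -> list[str]:
    """Return the required slots that are absent or empty in `record`."""
    return [slot for slot in REQUIRED_SLOTS if not record.get(slot)]


# Hypothetical record: the identifiers and timestamp are illustrative only.
example_record = {
    "id": "nmdc:wfmgan-11-abc123.1",
    "type": "nmdc:MetagenomeAnnotation",
    "execution_resource": "NERSC-Cori",
    "git_url": "https://github.com/microbiomedata/mg_annotation/releases/tag/0.1",
    "started_at_time": "2021-01-10T00:00:00+00:00",
    "was_informed_by": "nmdc:omprc-11-xyz789",
    "has_input": ["nmdc:dobj-11-in0001"],
}

print(missing_required_slots(example_record))
```

This only checks that required slots are present. It does not validate ranges, enumerations, or identifier patterns, which a full schema validator would also enforce.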
Usages
used by | used in | type | used |
---|---|---|---|
FunctionalAnnotationAggMember | metagenome_annotation_id | any_of[range] | MetagenomeAnnotation |
FunctionalAnnotation | was_generated_by | range | MetagenomeAnnotation |
Identifier and Mapping Information
Schema Source
- from schema: https://w3id.org/nmdc/nmdc
Mappings
Mapping Type | Mapped Value |
---|---|
self | nmdc:MetagenomeAnnotation |
native | nmdc:MetagenomeAnnotation |
LinkML Source
Direct
name: MetagenomeAnnotation
description: A workflow execution activity that provides functional and structural
  annotation of assembled metagenome contigs
in_subset:
- workflow subset
from_schema: https://w3id.org/nmdc/nmdc
is_a: WorkflowExecution
slots:
- img_identifiers
- gold_analysis_project_identifiers
slot_usage:
  id:
    name: id
    required: true
    structured_pattern:
      syntax: '{id_nmdc_prefix}:wfmgan-{id_shoulder}-{id_blade}{id_version}$'
      interpolated: true
  img_identifiers:
    name: img_identifiers
    maximum_cardinality: 1
  was_informed_by:
    name: was_informed_by
    structured_pattern:
      syntax: '{id_nmdc_prefix}:(omprc|dgns)-{id_shoulder}-{id_blade}$'
      interpolated: true
  gold_analysis_project_identifiers:
    name: gold_analysis_project_identifiers
    structured_pattern:
      syntax: ^gold:Ga[0-9]+$
      interpolated: true
class_uri: nmdc:MetagenomeAnnotation
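The `structured_pattern` entries in the direct definition set `interpolated: true`, meaning the `{...}` placeholders are filled in from schema-level settings before the result is used as a regular expression. Those settings are not shown on this page, so the values in the sketch below are assumptions chosen for illustration; the schema's own settings are the authoritative definitions. The helper `interpolate` and the identifiers being matched are hypothetical as well.

```python
import re

# ASSUMED setting values: the real definitions live in the schema's settings
# and are not shown on this page.
ASSUMED_SETTINGS = {
    "id_nmdc_prefix": r"^(nmdc)",
    "id_shoulder": r"([0-9][a-z]{0,6}[0-9])",
    "id_blade": r"([A-Za-z0-9]{1,})",
    "id_version": r"(\.[0-9]{1,})",
}


def interpolate(syntax: str, settings: dict) -> str:
    """Replace each {name} placeholder in `syntax` with its setting value."""
    for name, value in settings.items():
        syntax = syntax.replace("{" + name + "}", value)
    return syntax


# Syntax strings copied from the slot_usage entries on this page.
ID_SYNTAX = "{id_nmdc_prefix}:wfmgan-{id_shoulder}-{id_blade}{id_version}$"
WAS_INFORMED_BY_SYNTAX = "{id_nmdc_prefix}:(omprc|dgns)-{id_shoulder}-{id_blade}$"

id_regex = re.compile(interpolate(ID_SYNTAX, ASSUMED_SETTINGS))
informed_by_regex = re.compile(interpolate(WAS_INFORMED_BY_SYNTAX, ASSUMED_SETTINGS))

# Hypothetical identifiers, used only to exercise the patterns.
print(bool(id_regex.match("nmdc:wfmgan-11-abc123.1")))
print(bool(informed_by_regex.match("nmdc:omprc-11-xyz789")))
print(bool(id_regex.match("nmdc:wfmgan-11-abc123")))
```

Under the assumed settings, the last identifier is rejected because it lacks a version suffix. Whether the version is actually mandatory depends on the real `id_version` setting.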
Induced
name: MetagenomeAnnotation
description: A workflow execution activity that provides functional and structural
  annotation of assembled metagenome contigs
in_subset:
- workflow subset
from_schema: https://w3id.org/nmdc/nmdc
is_a: WorkflowExecution
slot_usage:
  id:
    name: id
    required: true
    structured_pattern:
      syntax: '{id_nmdc_prefix}:wfmgan-{id_shoulder}-{id_blade}{id_version}$'
      interpolated: true
  img_identifiers:
    name: img_identifiers
    maximum_cardinality: 1
  was_informed_by:
    name: was_informed_by
    structured_pattern:
      syntax: '{id_nmdc_prefix}:(omprc|dgns)-{id_shoulder}-{id_blade}$'
      interpolated: true
  gold_analysis_project_identifiers:
    name: gold_analysis_project_identifiers
    structured_pattern:
      syntax: ^gold:Ga[0-9]+$
      interpolated: true
attributes:
  img_identifiers:
    name: img_identifiers
    description: A list of identifiers that relate the biosample to records in the
      IMG database.
    title: IMG Identifiers
    todos:
    - add is_a or mixin modeling, like other external_database_identifiers
    - what class would IMG records belong to?! Are they Studies, Biosamples, or something
      else?
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    is_a: external_database_identifiers
    alias: img_identifiers
    owner: MetagenomeAnnotation
    domain_of:
    - MetagenomeAnnotation
    - Biosample
    - MetatranscriptomeAnnotation
    - MetatranscriptomeExpressionAnalysis
    - MagsAnalysis
    range: external_identifier
    multivalued: true
    pattern: ^img\.taxon:[a-zA-Z0-9_][a-zA-Z0-9_\/\.]*$
    maximum_cardinality: 1
  gold_analysis_project_identifiers:
    name: gold_analysis_project_identifiers
    description: identifiers for corresponding analysis projects in GOLD
    examples:
    - value: https://bioregistry.io/gold:Ga0526289
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    is_a: analysis_identifiers
    mixins:
    - gold_identifiers
    alias: gold_analysis_project_identifiers
    owner: MetagenomeAnnotation
    domain_of:
    - MetagenomeAnnotation
    - MetatranscriptomeAnnotation
    range: external_identifier
    multivalued: true
    pattern: ^gold:Ga[0-9]+$
    structured_pattern:
      syntax: ^gold:Ga[0-9]+$
      interpolated: true
  ended_at_time:
    name: ended_at_time
    notes:
    - 'The regex for ISO-8601 format was taken from here: https://www.myintervals.com/blog/2009/05/20/iso-8601-date-validation-that-doesnt-suck/
      It may not be complete, but it is good enough for now.'
    from_schema: https://w3id.org/nmdc/nmdc
    mappings:
    - prov:endedAtTime
    rank: 1000
    alias: ended_at_time
    owner: MetagenomeAnnotation
    domain_of:
    - WorkflowExecution
    range: string
    pattern: ^([\+-]?\d{4}(?!\d{2}\b))((-?)((0[1-9]|1[0-2])(\3([12]\d|0[1-9]|3[01]))?|W([0-4]\d|5[0-2])(-?[1-7])?|(00[1-9]|0[1-9]\d|[12]\d{2}|3([0-5]\d|6[1-6])))([T\s]((([01]\d|2[0-3])((:?)[0-5]\d)?|24\:?00)([\.,]\d+(?!:))?)?(\17[0-5]\d([\.,]\d+)?)?([zZ]|([\+-])([01]\d|2[0-3]):?([0-5]\d)?)?)?)?$
  execution_resource:
    name: execution_resource
    description: The computing resource or facility where the workflow was executed.
    examples:
    - value: NERSC-Cori
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    alias: execution_resource
    owner: MetagenomeAnnotation
    domain_of:
    - WorkflowExecution
    range: ExecutionResourceEnum
    required: true
  git_url:
    name: git_url
    description: The url that points to the exact github location of a workflow.
    examples:
    - value: https://github.com/microbiomedata/mg_annotation/releases/tag/0.1
    - value: https://github.com/microbiomedata/metaMS/blob/master/metaMS/gcmsWorkflow.py
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    alias: git_url
    owner: MetagenomeAnnotation
    domain_of:
    - WorkflowExecution
    range: string
    required: true
  started_at_time:
    name: started_at_time
    notes:
    - 'The regex for ISO-8601 format was taken from here: https://www.myintervals.com/blog/2009/05/20/iso-8601-date-validation-that-doesnt-suck/
      It may not be complete, but it is good enough for now.'
    from_schema: https://w3id.org/nmdc/nmdc
    mappings:
    - prov:startedAtTime
    rank: 1000
    alias: started_at_time
    owner: MetagenomeAnnotation
    domain_of:
    - WorkflowExecution
    range: string
    required: true
    pattern: ^([\+-]?\d{4}(?!\d{2}\b))((-?)((0[1-9]|1[0-2])(\3([12]\d|0[1-9]|3[01]))?|W([0-4]\d|5[0-2])(-?[1-7])?|(00[1-9]|0[1-9]\d|[12]\d{2}|3([0-5]\d|6[1-6])))([T\s]((([01]\d|2[0-3])((:?)[0-5]\d)?|24\:?00)([\.,]\d+(?!:))?)?(\17[0-5]\d([\.,]\d+)?)?([zZ]|([\+-])([01]\d|2[0-3]):?([0-5]\d)?)?)?)?$
  version:
    name: version
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    alias: version
    owner: MetagenomeAnnotation
    domain_of:
    - WorkflowExecution
    range: string
  was_informed_by:
    name: was_informed_by
    from_schema: https://w3id.org/nmdc/nmdc
    mappings:
    - prov:wasInformedBy
    rank: 1000
    alias: was_informed_by
    owner: MetagenomeAnnotation
    domain_of:
    - WorkflowExecution
    range: DataGeneration
    required: true
    structured_pattern:
      syntax: '{id_nmdc_prefix}:(omprc|dgns)-{id_shoulder}-{id_blade}$'
      interpolated: true
  has_input:
    name: has_input
    description: An input to a process.
    from_schema: https://w3id.org/nmdc/nmdc
    aliases:
    - input
    rank: 1000
    alias: has_input
    owner: MetagenomeAnnotation
    domain_of:
    - PlannedProcess
    range: NamedThing
    required: true
    multivalued: true
    structured_pattern:
      syntax: '{id_nmdc_prefix}:(dobj)-{id_shoulder}-{id_blade}$'
      interpolated: true
  has_output:
    name: has_output
    description: An output from a process.
    from_schema: https://w3id.org/nmdc/nmdc
    aliases:
    - output
    rank: 1000
    alias: has_output
    owner: MetagenomeAnnotation
    domain_of:
    - PlannedProcess
    range: NamedThing
    multivalued: true
    structured_pattern:
      syntax: '{id_nmdc_prefix}:(dobj)-{id_shoulder}-{id_blade}$'
      interpolated: true
  processing_institution:
    name: processing_institution
    description: The organization that processed the sample.
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    alias: processing_institution
    owner: MetagenomeAnnotation
    domain_of:
    - PlannedProcess
    range: ProcessingInstitutionEnum
  protocol_link:
    name: protocol_link
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    alias: protocol_link
    owner: MetagenomeAnnotation
    domain_of:
    - PlannedProcess
    - Study
    range: Protocol
  start_date:
    name: start_date
    description: The date on which any process or activity was started
    todos:
    - add date string validation pattern
    comments:
    - We are using string representations of dates until all components of our ecosystem
      can handle ISO 8601 dates
    - The date should be formatted as YYYY-MM-DD
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    alias: start_date
    owner: MetagenomeAnnotation
    domain_of:
    - PlannedProcess
    range: string
  end_date:
    name: end_date
    description: The date on which any process or activity was ended
    todos:
    - add date string validation pattern
    comments:
    - We are using string representations of dates until all components of our ecosystem
      can handle ISO 8601 dates
    - The date should be formatted as YYYY-MM-DD
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    alias: end_date
    owner: MetagenomeAnnotation
    domain_of:
    - PlannedProcess
    range: string
  qc_status:
    name: qc_status
    description: Stores information about the result of a process (ie the process
      of sequencing a library may have for qc_status of 'fail' if not enough data
      was generated)
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    alias: qc_status
    owner: MetagenomeAnnotation
    domain_of:
    - PlannedProcess
    range: StatusEnum
  qc_comment:
    name: qc_comment
    description: Slot to store additional comments about laboratory or workflow output.
      For workflow output it may describe the particular workflow stage that failed.
      (ie Failed at call-stage due to a malformed fastq file).
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    alias: qc_comment
    owner: MetagenomeAnnotation
    domain_of:
    - PlannedProcess
    range: string
  has_failure_categorization:
    name: has_failure_categorization
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    alias: has_failure_categorization
    owner: MetagenomeAnnotation
    domain_of:
    - PlannedProcess
    range: FailureCategorization
    multivalued: true
    inlined: true
    inlined_as_list: true
  id:
    name: id
    description: A unique identifier for a thing. Must be either a CURIE shorthand
      for a URI or a complete URI
    notes:
    - 'abstracted pattern: prefix:typecode-authshoulder-blade(.version)?(_seqsuffix)?'
    - a minimum length of 3 characters is suggested for typecodes, but 1 or 2 characters
      will be accepted
    - typecodes must correspond 1:1 to a class in the NMDC schema. this will be checked
      via per-class id slot usage assertions
    - minting authority shoulders should probably be enumerated and checked in the
      pattern
    examples:
    - value: nmdc:mgmag-00-x012.1_7_c1
      description: https://github.com/microbiomedata/nmdc-schema/pull/499#discussion_r1018499248
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    identifier: true
    alias: id
    owner: MetagenomeAnnotation
    domain_of:
    - NamedThing
    range: uriorcurie
    required: true
    pattern: ^[a-zA-Z0-9][a-zA-Z0-9_\.]+:[a-zA-Z0-9_][a-zA-Z0-9_\-\/\.,]*$
    structured_pattern:
      syntax: '{id_nmdc_prefix}:wfmgan-{id_shoulder}-{id_blade}{id_version}$'
      interpolated: true
  name:
    name: name
    description: A human readable label for an entity
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    alias: name
    owner: MetagenomeAnnotation
    domain_of:
    - PersonValue
    - NamedThing
    - Protocol
    range: string
  description:
    name: description
    description: a human-readable description of a thing
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    slot_uri: dcterms:description
    alias: description
    owner: MetagenomeAnnotation
    domain_of:
    - ImageValue
    - NamedThing
    range: string
  alternative_identifiers:
    name: alternative_identifiers
    description: A list of alternative identifiers for the entity.
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    alias: alternative_identifiers
    owner: MetagenomeAnnotation
    domain_of:
    - MetaboliteIdentification
    - NamedThing
    range: uriorcurie
    multivalued: true
    pattern: ^[a-zA-Z0-9][a-zA-Z0-9_\.]+:[a-zA-Z0-9_][a-zA-Z0-9_\-\/\.,]*$
  type:
    name: type
    description: the class_uri of the class that has been instantiated
    notes:
    - replaces legacy nmdc:type slot
    - makes it easier to read example data files
    - required for polymorphic MongoDB collections
    examples:
    - value: nmdc:Biosample
    - value: nmdc:Study
    from_schema: https://w3id.org/nmdc/nmdc
    see_also:
    - https://github.com/microbiomedata/nmdc-schema/issues/1048
    - https://github.com/microbiomedata/nmdc-schema/issues/1233
    - https://github.com/microbiomedata/nmdc-schema/issues/248
    rank: 1000
    slot_uri: rdf:type
    designates_type: true
    alias: type
    owner: MetagenomeAnnotation
    domain_of:
    - EukEval
    - FunctionalAnnotationAggMember
    - MobilePhaseSegment
    - PortionOfSubstance
    - MagBin
    - MetaboliteIdentification
    - PeptideQuantification
    - ProteinQuantification
    - GenomeFeature
    - FunctionalAnnotation
    - AttributeValue
    - NamedThing
    - FailureCategorization
    - Protocol
    - CreditAssociation
    - Doi
    range: uriorcurie
    required: true
class_uri: nmdc:MetagenomeAnnotation
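Slots that carry a plain `pattern` can be checked directly with a regular-expression engine. The sketch below copies three patterns from the induced definition and applies them with Python's `re` module. The helper `matches` is hypothetical, and the `img.taxon` value is an illustrative placeholder; the other test values are the examples listed in the slot definitions.

```python
import re

# Patterns copied verbatim from the induced slot definitions.
PATTERNS = {
    "img_identifiers": r"^img\.taxon:[a-zA-Z0-9_][a-zA-Z0-9_\/\.]*$",
    "gold_analysis_project_identifiers": r"^gold:Ga[0-9]+$",
    "id": r"^[a-zA-Z0-9][a-zA-Z0-9_\.]+:[a-zA-Z0-9_][a-zA-Z0-9_\-\/\.,]*$",
}


def matches(slot: str, value: str) -> bool:
    """Return True if `value` satisfies the pattern recorded for `slot`."""
    return re.match(PATTERNS[slot], value) is not None


# CURIE form of the GOLD example value.
print(matches("gold_analysis_project_identifiers", "gold:Ga0526289"))
# URL form exactly as written in the slot's `examples` entry.
print(matches("gold_analysis_project_identifiers",
              "https://bioregistry.io/gold:Ga0526289"))
# Example value from the `id` slot definition.
print(matches("id", "nmdc:mgmag-00-x012.1_7_c1"))
# Hypothetical IMG taxon identifier.
print(matches("img_identifiers", "img.taxon:3300012345"))
```

Because `^gold:Ga[0-9]+$` is anchored at both ends, the bare CURIE matches while the bioregistry URL form of the same example does not. The generic `id` pattern is also looser than the class-specific `structured_pattern`, so a value passing this check is not necessarily a valid `MetagenomeAnnotation` identifier.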