Class: MetagenomeAssembly
A workflow execution activity that converts sequencing reads into an assembled metagenome.
classDiagram
class MetagenomeAssembly
click MetagenomeAssembly href "../MetagenomeAssembly"
WorkflowExecution <|-- MetagenomeAssembly
click WorkflowExecution href "../WorkflowExecution"
MetagenomeAssembly : alternative_identifiers
MetagenomeAssembly : asm_score
MetagenomeAssembly : contig_bp
MetagenomeAssembly : contigs
MetagenomeAssembly : ctg_l50
MetagenomeAssembly : ctg_l90
MetagenomeAssembly : ctg_logsum
MetagenomeAssembly : ctg_max
MetagenomeAssembly : ctg_n50
MetagenomeAssembly : ctg_n90
MetagenomeAssembly : ctg_powsum
MetagenomeAssembly : description
MetagenomeAssembly : end_date
MetagenomeAssembly : ended_at_time
MetagenomeAssembly : execution_resource
MetagenomeAssembly --> "1" ExecutionResourceEnum : execution_resource
click ExecutionResourceEnum href "../ExecutionResourceEnum"
MetagenomeAssembly : gap_pct
MetagenomeAssembly : gc_avg
MetagenomeAssembly : gc_std
MetagenomeAssembly : git_url
MetagenomeAssembly : has_failure_categorization
MetagenomeAssembly --> "*" FailureCategorization : has_failure_categorization
click FailureCategorization href "../FailureCategorization"
MetagenomeAssembly : has_input
MetagenomeAssembly --> "1..*" NamedThing : has_input
click NamedThing href "../NamedThing"
MetagenomeAssembly : has_output
MetagenomeAssembly --> "*" NamedThing : has_output
click NamedThing href "../NamedThing"
MetagenomeAssembly : id
MetagenomeAssembly : insdc_assembly_identifiers
MetagenomeAssembly : name
MetagenomeAssembly : num_aligned_reads
MetagenomeAssembly : num_input_reads
MetagenomeAssembly : processing_institution
MetagenomeAssembly --> "0..1" ProcessingInstitutionEnum : processing_institution
click ProcessingInstitutionEnum href "../ProcessingInstitutionEnum"
MetagenomeAssembly : protocol_link
MetagenomeAssembly --> "0..1" Protocol : protocol_link
click Protocol href "../Protocol"
MetagenomeAssembly : qc_comment
MetagenomeAssembly : qc_status
MetagenomeAssembly --> "0..1" StatusEnum : qc_status
click StatusEnum href "../StatusEnum"
MetagenomeAssembly : scaf_bp
MetagenomeAssembly : scaf_l50
MetagenomeAssembly : scaf_l90
MetagenomeAssembly : scaf_l_gt50k
MetagenomeAssembly : scaf_logsum
MetagenomeAssembly : scaf_max
MetagenomeAssembly : scaf_n50
MetagenomeAssembly : scaf_n90
MetagenomeAssembly : scaf_n_gt50k
MetagenomeAssembly : scaf_pct_gt50k
MetagenomeAssembly : scaf_powsum
MetagenomeAssembly : scaffolds
MetagenomeAssembly : start_date
MetagenomeAssembly : started_at_time
MetagenomeAssembly : type
MetagenomeAssembly : version
MetagenomeAssembly : was_informed_by
MetagenomeAssembly --> "1" DataGeneration : was_informed_by
click DataGeneration href "../DataGeneration"
Inheritance
- NamedThing
- PlannedProcess
- WorkflowExecution
- MetagenomeAssembly
Slots
Name | Cardinality and Range | Description | Inheritance |
---|---|---|---|
asm_score | 0..1 Float | A score for comparing metagenomic assembly quality from the same sample | direct |
scaffolds | 0..1 Float | Total sequence count of all scaffolds | direct |
scaf_logsum | 0..1 Float | The sum of the (length*log(length)) of all scaffolds, times some constant | direct |
scaf_powsum | 0..1 Float | Powersum of all scaffolds is the same as logsum except that it uses the sum o... | direct |
scaf_max | 0..1 Float | Maximum scaffold length | direct |
scaf_bp | 0..1 Float | Total size in bp of all scaffolds | direct |
scaf_n50 | 0..1 Float | Given a set of scaffolds, each with its own length, the N50 count is defined ... | direct |
scaf_n90 | 0..1 Float | Given a set of scaffolds, each with its own length, the N90 count is defined ... | direct |
scaf_l50 | 0..1 Float | Given a set of scaffolds, the L50 is defined as the sequence length of the sh... | direct |
scaf_l90 | 0..1 Float | The L90 statistic is less than or equal to the L50 statistic; it is the lengt... | direct |
scaf_n_gt50k | 0..1 Float | Total sequence count of scaffolds greater than 50 KB | direct |
scaf_l_gt50k | 0..1 Float | Total size in bp of all scaffolds greater than 50 KB | direct |
scaf_pct_gt50k | 0..1 Float | Total sequence size percentage of scaffolds greater than 50 KB | direct |
contigs | 0..1 Float | Total sequence count of all contigs | direct |
contig_bp | 0..1 Float | Total size in bp of all contigs | direct |
ctg_n50 | 0..1 Float | Given a set of contigs, each with its own length, the N50 count is defined as... | direct |
ctg_l50 | 0..1 Float | Given a set of contigs, the L50 is defined as the sequence length of the shor... | direct |
ctg_n90 | 0..1 Float | Given a set of contigs, each with its own length, the N90 count is defined as... | direct |
ctg_l90 | 0..1 Float | The L90 statistic is less than or equal to the L50 statistic; it is the lengt... | direct |
ctg_logsum | 0..1 Float | The sum of the (length*log(length)) of all contigs, times some constant | direct |
ctg_powsum | 0..1 Float | Powersum of all contigs is the same as logsum except that it uses the sum of ... | direct |
ctg_max | 0..1 Float | Maximum contig length | direct |
gap_pct | 0..1 Float | The gap size percentage of all scaffolds | direct |
gc_std | 0..1 Float | Standard deviation of GC content of all contigs | direct |
gc_avg | 0..1 Float | Average of GC content of all contigs | direct |
num_input_reads | 0..1 Float | The number of input reads used for assembly | direct |
num_aligned_reads | 0..1 Float | The number of input reads aligned to the assembled contigs | direct |
insdc_assembly_identifiers | 0..1 String | | direct |
ended_at_time | 0..1 String | | WorkflowExecution |
execution_resource | 1 ExecutionResourceEnum | The computing resource or facility where the workflow was executed | WorkflowExecution |
git_url | 1 String | The URL that points to the exact GitHub location of a workflow | WorkflowExecution |
started_at_time | 1 String | | WorkflowExecution |
version | 0..1 String | | WorkflowExecution |
was_informed_by | 1 DataGeneration | | WorkflowExecution |
has_input | 1..* NamedThing | An input to a process | PlannedProcess |
has_output | * NamedThing | An output from a process | PlannedProcess |
processing_institution | 0..1 ProcessingInstitutionEnum | The organization that processed the sample | PlannedProcess |
protocol_link | 0..1 Protocol | | PlannedProcess |
start_date | 0..1 String | The date on which any process or activity was started | PlannedProcess |
end_date | 0..1 String | The date on which any process or activity was ended | PlannedProcess |
qc_status | 0..1 StatusEnum | Stores information about the result of a process (ie the process of sequencin... | PlannedProcess |
qc_comment | 0..1 String | Slot to store additional comments about laboratory or workflow output | PlannedProcess |
has_failure_categorization | * FailureCategorization | | PlannedProcess |
id | 1 Uriorcurie | A unique identifier for a thing | NamedThing |
name | 0..1 String | A human readable label for an entity | NamedThing |
description | 0..1 String | a human-readable description of a thing | NamedThing |
alternative_identifiers | * Uriorcurie | A list of alternative identifiers for the entity | NamedThing |
type | 1 Uriorcurie | the class_uri of the class that has been instantiated | NamedThing |
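The scaffold statistics in the table can be made concrete with a small worked example. The sketch below is not the code that NMDC workflows use to populate these slots; it only follows the slot descriptions on this page, in which N50 and N90 are sequence counts and L50 and L90 are sequence lengths. The function name `scaffold_stats` and its `threshold` parameter are hypothetical, and two points are assumptions of the sketch: "genome size" is taken to mean the total assembled length, and "greater than 50 KB" is taken to mean strictly more than 50,000 bp.

```python
from typing import Dict, List


def scaffold_stats(lengths: List[int], threshold: int = 50_000) -> Dict[str, float]:
    """Summarize scaffold lengths using the slot names from the Slots table.

    Assumptions of this sketch: "genome size" means the total assembled
    length, and "50 KB" means strictly more than 50,000 bp.
    """
    if not lengths:
        raise ValueError("at least one scaffold length is required")

    ordered = sorted(lengths, reverse=True)  # longest scaffold first
    total = sum(ordered)

    def count_and_length(percent: int):
        # Add scaffolds from longest to shortest until the running sum
        # reaches the requested percentage of the total length.
        running = 0
        for count, length in enumerate(ordered, start=1):
            running += length
            if running * 100 >= percent * total:
                return count, length
        return len(ordered), ordered[-1]  # fallback, not reached when percent <= 100

    n50, l50 = count_and_length(50)
    n90, l90 = count_and_length(90)
    long_scaffolds = [x for x in ordered if x > threshold]

    # Every slot has range Float, so the values are returned as floats.
    return {
        "scaffolds": float(len(ordered)),
        "scaf_bp": float(total),
        "scaf_max": float(ordered[0]),
        "scaf_n50": float(n50),
        "scaf_l50": float(l50),
        "scaf_n90": float(n90),
        "scaf_l90": float(l90),
        "scaf_n_gt50k": float(len(long_scaffolds)),
        "scaf_l_gt50k": float(sum(long_scaffolds)),
        "scaf_pct_gt50k": 100.0 * sum(long_scaffolds) / total,
    }
```

For the toy input `[100, 80, 60, 40, 20]` the total length is 300, so the two longest scaffolds (100 + 80 = 180) already cover half of it: the N50 count is 2 and the L50 length is 80. The same function applies to contig lengths if the keys are renamed to the `ctg_` slots.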
Comments
- instances of this class may use a de novo assembly strategy in most or all cases relevant to NMDC
Identifier and Mapping Information
Schema Source
- from schema: https://w3id.org/nmdc/nmdc
Mappings
Mapping Type | Mapped Value |
---|---|
self | nmdc:MetagenomeAssembly |
native | nmdc:MetagenomeAssembly |
LinkML Source
Direct
name: MetagenomeAssembly
description: A workflow execution activity that converts sequencing reads into an
  assembled metagenome.
comments:
- instances of this class may use a de novo assembly strategy in most or all cases
  relevant to NMDC
in_subset:
- workflow subset
from_schema: https://w3id.org/nmdc/nmdc
is_a: WorkflowExecution
slots:
- asm_score
- scaffolds
- scaf_logsum
- scaf_powsum
- scaf_max
- scaf_bp
- scaf_n50
- scaf_n90
- scaf_l50
- scaf_l90
- scaf_n_gt50k
- scaf_l_gt50k
- scaf_pct_gt50k
- contigs
- contig_bp
- ctg_n50
- ctg_l50
- ctg_n90
- ctg_l90
- ctg_logsum
- ctg_powsum
- ctg_max
- gap_pct
- gc_std
- gc_avg
- num_input_reads
- num_aligned_reads
- insdc_assembly_identifiers
slot_usage:
  id:
    name: id
    required: true
    structured_pattern:
      syntax: '{id_nmdc_prefix}:wfmgas-{id_shoulder}-{id_blade}{id_version}$'
      interpolated: true
  was_informed_by:
    name: was_informed_by
    structured_pattern:
      syntax: '{id_nmdc_prefix}:(omprc|dgns)-{id_shoulder}-{id_blade}$'
      interpolated: true
class_uri: nmdc:MetagenomeAssembly
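The `structured_pattern` entries in `slot_usage` are templates: with `interpolated: true`, each `{placeholder}` is replaced by a regular-expression fragment defined in the schema's settings. Those settings do not appear on this page, so the fragments in `ASSUMED_SETTINGS` below are assumptions chosen for illustration, as are the helper name `interpolate` and the sample identifiers. The sketch only shows how a template could be expanded and applied; it is not how LinkML itself performs the interpolation.

```python
import re

# Hypothetical placeholder values. The real ones are defined in the
# schema's settings, which this page does not show.
ASSUMED_SETTINGS = {
    "id_nmdc_prefix": r"^(nmdc)",
    "id_shoulder": r"([0-9][a-z]{0,6}[0-9])",
    "id_blade": r"([A-Za-z0-9]{1,})",
    "id_version": r"(\.[0-9]{1,})",
}

# Syntax strings copied from slot_usage above.
ID_SYNTAX = "{id_nmdc_prefix}:wfmgas-{id_shoulder}-{id_blade}{id_version}$"
WAS_INFORMED_BY_SYNTAX = "{id_nmdc_prefix}:(omprc|dgns)-{id_shoulder}-{id_blade}$"


def interpolate(syntax: str, settings: dict) -> str:
    """Replace each {placeholder} in a structured_pattern syntax string."""
    def substitute(match: re.Match) -> str:
        return settings[match.group(1)]

    # The replacement is a function, so the fragments are inserted literally.
    return re.sub(r"\{(\w+)\}", substitute, syntax)


id_regex = re.compile(interpolate(ID_SYNTAX, ASSUMED_SETTINGS))
informed_regex = re.compile(interpolate(WAS_INFORMED_BY_SYNTAX, ASSUMED_SETTINGS))

# Made-up identifiers, used only to exercise the patterns.
print(bool(id_regex.search("nmdc:wfmgas-11-abc123.1")))     # typecode wfmgas plus a version suffix
print(bool(informed_regex.search("nmdc:omprc-11-xyz789")))  # omprc is one of the two accepted typecodes
```

Under these assumed fragments the `id` pattern requires the `wfmgas` typecode and a version suffix, while `was_informed_by` accepts only identifiers with the `omprc` or `dgns` typecode.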
Induced
name: MetagenomeAssembly
description: A workflow execution activity that converts sequencing reads into an
assembled metagenome.
comments:
- instances of this class may use a de novo assembly strategy in most or all cases
relevant to NMDC
in_subset:
- workflow subset
from_schema: https://w3id.org/nmdc/nmdc
is_a: WorkflowExecution
slot_usage:
id:
name: id
required: true
structured_pattern:
syntax: '{id_nmdc_prefix}:wfmgas-{id_shoulder}-{id_blade}{id_version}$'
interpolated: true
was_informed_by:
name: was_informed_by
structured_pattern:
syntax: '{id_nmdc_prefix}:(omprc|dgns)-{id_shoulder}-{id_blade}$'
interpolated: true
attributes:
asm_score:
name: asm_score
description: A score for comparing metagenomic assembly quality from the same sample.
from_schema: https://w3id.org/nmdc/nmdc
rank: 1000
is_a: metagenome_assembly_parameter
alias: asm_score
owner: MetagenomeAssembly
domain_of:
- MetagenomeAssembly
- MetatranscriptomeAssembly
range: float
scaffolds:
name: scaffolds
description: Total sequence count of all scaffolds.
from_schema: https://w3id.org/nmdc/nmdc
rank: 1000
is_a: metagenome_assembly_parameter
alias: scaffolds
owner: MetagenomeAssembly
domain_of:
- MetagenomeAssembly
- MetatranscriptomeAssembly
range: float
scaf_logsum:
name: scaf_logsum
description: The sum of the (length*log(length)) of all scaffolds, times some
constant. The score increases with contiguity.
from_schema: https://w3id.org/nmdc/nmdc
rank: 1000
is_a: metagenome_assembly_parameter
alias: scaf_logsum
owner: MetagenomeAssembly
domain_of:
- MetagenomeAssembly
- MetatranscriptomeAssembly
range: float
scaf_powsum:
name: scaf_powsum
description: Powersum of all scaffolds is the same as logsum except that it uses
the sum of (length*(length^P)) for some power P (default P=0.25).
from_schema: https://w3id.org/nmdc/nmdc
rank: 1000
is_a: metagenome_assembly_parameter
alias: scaf_powsum
owner: MetagenomeAssembly
domain_of:
- MetagenomeAssembly
- MetatranscriptomeAssembly
range: float
scaf_max:
name: scaf_max
description: Maximum scaffold length.
from_schema: https://w3id.org/nmdc/nmdc
rank: 1000
is_a: metagenome_assembly_parameter
alias: scaf_max
owner: MetagenomeAssembly
domain_of:
- MetagenomeAssembly
- MetatranscriptomeAssembly
range: float
scaf_bp:
name: scaf_bp
description: Total size in bp of all scaffolds.
from_schema: https://w3id.org/nmdc/nmdc
rank: 1000
is_a: metagenome_assembly_parameter
alias: scaf_bp
owner: MetagenomeAssembly
domain_of:
- MetagenomeAssembly
- MetatranscriptomeAssembly
range: float
scaf_n50:
name: scaf_n50
description: Given a set of scaffolds, each with its own length, the N50 count
is defined as the smallest number of scaffolds whose length sum makes up half
of genome size.
from_schema: https://w3id.org/nmdc/nmdc
rank: 1000
is_a: metagenome_assembly_parameter
alias: scaf_n50
owner: MetagenomeAssembly
domain_of:
- MetagenomeAssembly
- MetatranscriptomeAssembly
range: float
scaf_n90:
name: scaf_n90
description: Given a set of scaffolds, each with its own length, the N90 count
is defined as the smallest number of scaffolds whose length sum makes up 90%
of genome size.
from_schema: https://w3id.org/nmdc/nmdc
rank: 1000
is_a: metagenome_assembly_parameter
alias: scaf_n90
owner: MetagenomeAssembly
domain_of:
- MetagenomeAssembly
- MetatranscriptomeAssembly
range: float
scaf_l50:
name: scaf_l50
description: Given a set of scaffolds, the L50 is defined as the sequence length
of the shortest scaffold at 50% of the total genome length.
from_schema: https://w3id.org/nmdc/nmdc
rank: 1000
is_a: metagenome_assembly_parameter
alias: scaf_l50
owner: MetagenomeAssembly
domain_of:
- MetagenomeAssembly
- MetatranscriptomeAssembly
range: float
scaf_l90:
name: scaf_l90
description: The L90 statistic is less than or equal to the L50 statistic; it
is the length for which the collection of all scaffolds of that length or longer
contains at least 90% of the sum of the lengths of all scaffolds.
from_schema: https://w3id.org/nmdc/nmdc
rank: 1000
is_a: metagenome_assembly_parameter
alias: scaf_l90
owner: MetagenomeAssembly
domain_of:
- MetagenomeAssembly
- MetatranscriptomeAssembly
range: float
scaf_n_gt50k:
name: scaf_n_gt50k
description: Total sequence count of scaffolds greater than 50 KB.
from_schema: https://w3id.org/nmdc/nmdc
rank: 1000
is_a: metagenome_assembly_parameter
alias: scaf_n_gt50k
owner: MetagenomeAssembly
domain_of:
- MetagenomeAssembly
- MetatranscriptomeAssembly
range: float
scaf_l_gt50k:
name: scaf_l_gt50k
description: Total size in bp of all scaffolds greater than 50 KB.
from_schema: https://w3id.org/nmdc/nmdc
rank: 1000
is_a: metagenome_assembly_parameter
alias: scaf_l_gt50k
owner: MetagenomeAssembly
domain_of:
- MetagenomeAssembly
- MetatranscriptomeAssembly
range: float
scaf_pct_gt50k:
name: scaf_pct_gt50k
description: Total sequence size percentage of scaffolds greater than 50 KB.
from_schema: https://w3id.org/nmdc/nmdc
rank: 1000
is_a: metagenome_assembly_parameter
alias: scaf_pct_gt50k
owner: MetagenomeAssembly
domain_of:
- MetagenomeAssembly
- MetatranscriptomeAssembly
range: float
contigs:
name: contigs
description: Total sequence count of all contigs.
from_schema: https://w3id.org/nmdc/nmdc
rank: 1000
is_a: metagenome_assembly_parameter
alias: contigs
owner: MetagenomeAssembly
domain_of:
- MetagenomeAssembly
- MetatranscriptomeAssembly
range: float
contig_bp:
name: contig_bp
description: Total size in bp of all contigs.
from_schema: https://w3id.org/nmdc/nmdc
rank: 1000
is_a: metagenome_assembly_parameter
alias: contig_bp
owner: MetagenomeAssembly
domain_of:
- MetagenomeAssembly
- MetatranscriptomeAssembly
range: float
ctg_n50:
name: ctg_n50
description: Given a set of contigs, each with its own length, the N50 count is
defined as the smallest number of contigs whose length sum makes up half of
genome size.
from_schema: https://w3id.org/nmdc/nmdc
rank: 1000
is_a: metagenome_assembly_parameter
alias: ctg_n50
owner: MetagenomeAssembly
domain_of:
- MetagenomeAssembly
- MetatranscriptomeAssembly
range: float
ctg_l50:
name: ctg_l50
description: Given a set of contigs, the L50 is defined as the sequence length
of the shortest contig at 50% of the total genome length.
from_schema: https://w3id.org/nmdc/nmdc
rank: 1000
is_a: metagenome_assembly_parameter
alias: ctg_l50
owner: MetagenomeAssembly
domain_of:
- MetagenomeAssembly
- MetatranscriptomeAssembly
range: float
ctg_n90:
name: ctg_n90
description: Given a set of contigs, each with its own length, the N90 count is
defined as the smallest number of contigs whose length sum makes up 90% of genome
size.
from_schema: https://w3id.org/nmdc/nmdc
rank: 1000
is_a: metagenome_assembly_parameter
alias: ctg_n90
owner: MetagenomeAssembly
domain_of:
- MetagenomeAssembly
- MetatranscriptomeAssembly
range: float
ctg_l90:
name: ctg_l90
description: The L90 statistic is less than or equal to the L50 statistic; it
is the length for which the collection of all contigs of that length or longer
contains at least 90% of the sum of the lengths of all contigs.
from_schema: https://w3id.org/nmdc/nmdc
rank: 1000
is_a: metagenome_assembly_parameter
alias: ctg_l90
owner: MetagenomeAssembly
domain_of:
- MetagenomeAssembly
- MetatranscriptomeAssembly
range: float
ctg_logsum:
name: ctg_logsum
description: The sum of the (length*log(length)) of all contigs, times some
constant. The score increases with contiguity.
from_schema: https://w3id.org/nmdc/nmdc
rank: 1000
is_a: metagenome_assembly_parameter
alias: ctg_logsum
owner: MetagenomeAssembly
domain_of:
- MetagenomeAssembly
- MetatranscriptomeAssembly
range: float
ctg_powsum:
name: ctg_powsum
description: Powersum of all contigs is the same as logsum except that it uses
the sum of (length*(length^P)) for some power P (default P=0.25).
from_schema: https://w3id.org/nmdc/nmdc
rank: 1000
is_a: metagenome_assembly_parameter
alias: ctg_powsum
owner: MetagenomeAssembly
domain_of:
- MetagenomeAssembly
- MetatranscriptomeAssembly
range: float
ctg_max:
name: ctg_max
description: Maximum contig length.
from_schema: https://w3id.org/nmdc/nmdc
rank: 1000
is_a: metagenome_assembly_parameter
alias: ctg_max
owner: MetagenomeAssembly
domain_of:
- MetagenomeAssembly
- MetatranscriptomeAssembly
range: float
gap_pct:
name: gap_pct
description: The gap size percentage of all scaffolds.
from_schema: https://w3id.org/nmdc/nmdc
rank: 1000
is_a: metagenome_assembly_parameter
alias: gap_pct
owner: MetagenomeAssembly
domain_of:
- MetagenomeAssembly
- MetatranscriptomeAssembly
range: float
gc_std:
name: gc_std
description: Standard deviation of GC content of all contigs.
from_schema: https://w3id.org/nmdc/nmdc
rank: 1000
is_a: metagenome_assembly_parameter
alias: gc_std
owner: MetagenomeAssembly
domain_of:
- MetagenomeAssembly
- MetatranscriptomeAssembly
range: float
gc_avg:
name: gc_avg
description: Average of GC content of all contigs.
from_schema: https://w3id.org/nmdc/nmdc
rank: 1000
is_a: metagenome_assembly_parameter
alias: gc_avg
owner: MetagenomeAssembly
domain_of:
- MetagenomeAssembly
- MetatranscriptomeAssembly
range: float
num_input_reads:
name: num_input_reads
description: The number of input reads used for assembly.
from_schema: https://w3id.org/nmdc/nmdc
rank: 1000
is_a: metagenome_assembly_parameter
alias: num_input_reads
owner: MetagenomeAssembly
domain_of:
- MetagenomeAssembly
- MetatranscriptomeAssembly
range: float
num_aligned_reads:
name: num_aligned_reads
description: The number of input reads aligned to the assembled contigs.
from_schema: https://w3id.org/nmdc/nmdc
rank: 1000
is_a: metagenome_assembly_parameter
alias: num_aligned_reads
owner: MetagenomeAssembly
domain_of:
- MetagenomeAssembly
- MetatranscriptomeAssembly
range: float
insdc_assembly_identifiers:
name: insdc_assembly_identifiers
from_schema: https://w3id.org/nmdc/nmdc
rank: 1000
is_a: assembly_identifiers
mixins:
- insdc_identifiers
alias: insdc_assembly_identifiers
owner: MetagenomeAssembly
domain_of:
- MetagenomeAssembly
- MetatranscriptomeAssembly
range: string
pattern: ^insdc.sra:[A-Z]+[0-9]+(\.[0-9]+)?$
ended_at_time:
name: ended_at_time
notes:
- 'The regex for ISO-8601 format was taken from here: https://www.myintervals.com/blog/2009/05/20/iso-8601-date-validation-that-doesnt-suck/
It may not be complete, but it is good enough for now.'
from_schema: https://w3id.org/nmdc/nmdc
mappings:
- prov:endedAtTime
rank: 1000
alias: ended_at_time
owner: MetagenomeAssembly
domain_of:
- WorkflowExecution
range: string
pattern: ^([\+-]?\d{4}(?!\d{2}\b))((-?)((0[1-9]|1[0-2])(\3([12]\d|0[1-9]|3[01]))?|W([0-4]\d|5[0-2])(-?[1-7])?|(00[1-9]|0[1-9]\d|[12]\d{2}|3([0-5]\d|6[1-6])))([T\s]((([01]\d|2[0-3])((:?)[0-5]\d)?|24\:?00)([\.,]\d+(?!:))?)?(\17[0-5]\d([\.,]\d+)?)?([zZ]|([\+-])([01]\d|2[0-3]):?([0-5]\d)?)?)?)?$
execution_resource:
name: execution_resource
description: The computing resource or facility where the workflow was executed.
examples:
- value: NERSC-Cori
from_schema: https://w3id.org/nmdc/nmdc
rank: 1000
alias: execution_resource
owner: MetagenomeAssembly
domain_of:
- WorkflowExecution
range: ExecutionResourceEnum
required: true
git_url:
name: git_url
description: The URL that points to the exact GitHub location of a workflow.
examples:
- value: https://github.com/microbiomedata/mg_annotation/releases/tag/0.1
- value: https://github.com/microbiomedata/metaMS/blob/master/metaMS/gcmsWorkflow.py
from_schema: https://w3id.org/nmdc/nmdc
rank: 1000
alias: git_url
owner: MetagenomeAssembly
domain_of:
- WorkflowExecution
range: string
required: true
started_at_time:
name: started_at_time
notes:
- 'The regex for ISO-8601 format was taken from here: https://www.myintervals.com/blog/2009/05/20/iso-8601-date-validation-that-doesnt-suck/
It may not be complete, but it is good enough for now.'
from_schema: https://w3id.org/nmdc/nmdc
mappings:
- prov:startedAtTime
rank: 1000
alias: started_at_time
owner: MetagenomeAssembly
domain_of:
- WorkflowExecution
range: string
required: true
pattern: ^([\+-]?\d{4}(?!\d{2}\b))((-?)((0[1-9]|1[0-2])(\3([12]\d|0[1-9]|3[01]))?|W([0-4]\d|5[0-2])(-?[1-7])?|(00[1-9]|0[1-9]\d|[12]\d{2}|3([0-5]\d|6[1-6])))([T\s]((([01]\d|2[0-3])((:?)[0-5]\d)?|24\:?00)([\.,]\d+(?!:))?)?(\17[0-5]\d([\.,]\d+)?)?([zZ]|([\+-])([01]\d|2[0-3]):?([0-5]\d)?)?)?)?$
version:
name: version
from_schema: https://w3id.org/nmdc/nmdc
rank: 1000
alias: version
owner: MetagenomeAssembly
domain_of:
- WorkflowExecution
range: string
was_informed_by:
name: was_informed_by
from_schema: https://w3id.org/nmdc/nmdc
mappings:
- prov:wasInformedBy
rank: 1000
alias: was_informed_by
owner: MetagenomeAssembly
domain_of:
- WorkflowExecution
range: DataGeneration
required: true
structured_pattern:
syntax: '{id_nmdc_prefix}:(omprc|dgns)-{id_shoulder}-{id_blade}$'
interpolated: true
has_input:
name: has_input
description: An input to a process.
from_schema: https://w3id.org/nmdc/nmdc
aliases:
- input
rank: 1000
alias: has_input
owner: MetagenomeAssembly
domain_of:
- PlannedProcess
range: NamedThing
required: true
multivalued: true
structured_pattern:
syntax: '{id_nmdc_prefix}:(dobj)-{id_shoulder}-{id_blade}$'
interpolated: true
has_output:
name: has_output
description: An output from a process.
from_schema: https://w3id.org/nmdc/nmdc
aliases:
- output
rank: 1000
alias: has_output
owner: MetagenomeAssembly
domain_of:
- PlannedProcess
range: NamedThing
multivalued: true
structured_pattern:
syntax: '{id_nmdc_prefix}:(dobj)-{id_shoulder}-{id_blade}$'
interpolated: true
processing_institution:
name: processing_institution
description: The organization that processed the sample.
from_schema: https://w3id.org/nmdc/nmdc
rank: 1000
alias: processing_institution
owner: MetagenomeAssembly
domain_of:
- PlannedProcess
range: ProcessingInstitutionEnum
protocol_link:
name: protocol_link
from_schema: https://w3id.org/nmdc/nmdc
rank: 1000
alias: protocol_link
owner: MetagenomeAssembly
domain_of:
- PlannedProcess
- Study
range: Protocol
start_date:
name: start_date
description: The date on which any process or activity was started
todos:
- add date string validation pattern
comments:
- We are using string representations of dates until all components of our ecosystem
can handle ISO 8601 dates
- The date should be formatted as YYYY-MM-DD
from_schema: https://w3id.org/nmdc/nmdc
rank: 1000
alias: start_date
owner: MetagenomeAssembly
domain_of:
- PlannedProcess
range: string
end_date:
name: end_date
description: The date on which any process or activity was ended
todos:
- add date string validation pattern
comments:
- We are using string representations of dates until all components of our ecosystem
can handle ISO 8601 dates
- The date should be formatted as YYYY-MM-DD
from_schema: https://w3id.org/nmdc/nmdc
rank: 1000
alias: end_date
owner: MetagenomeAssembly
domain_of:
- PlannedProcess
range: string
qc_status:
name: qc_status
description: Stores information about the result of a process (e.g., the process
of sequencing a library may have a qc_status of 'fail' if not enough data
was generated)
from_schema: https://w3id.org/nmdc/nmdc
rank: 1000
alias: qc_status
owner: MetagenomeAssembly
domain_of:
- PlannedProcess
range: StatusEnum
qc_comment:
name: qc_comment
description: Slot to store additional comments about laboratory or workflow output.
For workflow output it may describe the particular workflow stage that failed
(e.g., 'Failed at call-stage due to a malformed fastq file').
from_schema: https://w3id.org/nmdc/nmdc
rank: 1000
alias: qc_comment
owner: MetagenomeAssembly
domain_of:
- PlannedProcess
range: string
has_failure_categorization:
name: has_failure_categorization
from_schema: https://w3id.org/nmdc/nmdc
rank: 1000
alias: has_failure_categorization
owner: MetagenomeAssembly
domain_of:
- PlannedProcess
range: FailureCategorization
multivalued: true
inlined: true
inlined_as_list: true
id:
name: id
description: A unique identifier for a thing. Must be either a CURIE shorthand
for a URI or a complete URI
notes:
- 'abstracted pattern: prefix:typecode-authshoulder-blade(.version)?(_seqsuffix)?'
- a minimum length of 3 characters is suggested for typecodes, but 1 or 2 characters
will be accepted
- typecodes must correspond 1:1 to a class in the NMDC schema. this will be checked
via per-class id slot usage assertions
- minting authority shoulders should probably be enumerated and checked in the
pattern
examples:
- value: nmdc:mgmag-00-x012.1_7_c1
description: https://github.com/microbiomedata/nmdc-schema/pull/499#discussion_r1018499248
from_schema: https://w3id.org/nmdc/nmdc
rank: 1000
identifier: true
alias: id
owner: MetagenomeAssembly
domain_of:
- NamedThing
range: uriorcurie
required: true
pattern: ^[a-zA-Z0-9][a-zA-Z0-9_\.]+:[a-zA-Z0-9_][a-zA-Z0-9_\-\/\.,]*$
structured_pattern:
syntax: '{id_nmdc_prefix}:wfmgas-{id_shoulder}-{id_blade}{id_version}$'
interpolated: true
name:
name: name
description: A human readable label for an entity
from_schema: https://w3id.org/nmdc/nmdc
rank: 1000
alias: name
owner: MetagenomeAssembly
domain_of:
- PersonValue
- NamedThing
- Protocol
range: string
description:
name: description
description: a human-readable description of a thing
from_schema: https://w3id.org/nmdc/nmdc
rank: 1000
slot_uri: dcterms:description
alias: description
owner: MetagenomeAssembly
domain_of:
- ImageValue
- NamedThing
range: string
alternative_identifiers:
name: alternative_identifiers
description: A list of alternative identifiers for the entity.
from_schema: https://w3id.org/nmdc/nmdc
rank: 1000
alias: alternative_identifiers
owner: MetagenomeAssembly
domain_of:
- MetaboliteIdentification
- NamedThing
range: uriorcurie
multivalued: true
pattern: ^[a-zA-Z0-9][a-zA-Z0-9_\.]+:[a-zA-Z0-9_][a-zA-Z0-9_\-\/\.,]*$
type:
name: type
description: the class_uri of the class that has been instantiated
notes:
- replaces legacy nmdc:type slot
- makes it easier to read example data files
- required for polymorphic MongoDB collections
examples:
- value: nmdc:Biosample
- value: nmdc:Study
from_schema: https://w3id.org/nmdc/nmdc
see_also:
- https://github.com/microbiomedata/nmdc-schema/issues/1048
- https://github.com/microbiomedata/nmdc-schema/issues/1233
- https://github.com/microbiomedata/nmdc-schema/issues/248
rank: 1000
slot_uri: rdf:type
designates_type: true
alias: type
owner: MetagenomeAssembly
domain_of:
- EukEval
- FunctionalAnnotationAggMember
- MobilePhaseSegment
- PortionOfSubstance
- MagBin
- MetaboliteIdentification
- PeptideQuantification
- ProteinQuantification
- GenomeFeature
- FunctionalAnnotation
- AttributeValue
- NamedThing
- FailureCategorization
- Protocol
- CreditAssociation
- Doi
range: uriorcurie
required: true
class_uri: nmdc:MetagenomeAssembly
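The cardinalities and patterns above imply a few checks that any MetagenomeAssembly record has to pass. The sketch below is a rough pre-check written for illustration, not a replacement for validating against the schema with LinkML tooling. The required slots are those listed with cardinality 1 or 1..* in the Slots table, and the INSDC pattern is copied from the `insdc_assembly_identifiers` attribute. The function name `check_record` and every value in `example_record` are hypothetical, apart from the `execution_resource` and `git_url` values, which reuse examples given on this page.

```python
import re
from typing import Any, Dict, List

# Slots whose cardinality in the Slots table is "1" or "1..*".
REQUIRED_SLOTS = [
    "id",
    "type",
    "execution_resource",
    "git_url",
    "started_at_time",
    "was_informed_by",
    "has_input",
]

# Pattern copied from the insdc_assembly_identifiers attribute.
INSDC_ASSEMBLY_PATTERN = re.compile(r"^insdc.sra:[A-Z]+[0-9]+(\.[0-9]+)?$")


def check_record(record: Dict[str, Any]) -> List[str]:
    """Return a list of problems found in a candidate record (empty if none)."""
    problems: List[str] = []

    # A required slot counts as missing if it is absent or empty.
    for slot in REQUIRED_SLOTS:
        if record.get(slot) in (None, "", []):
            problems.append(f"missing required slot: {slot}")

    # The INSDC identifier is optional, but must match the pattern if present.
    insdc = record.get("insdc_assembly_identifiers")
    if insdc is not None and not INSDC_ASSEMBLY_PATTERN.search(insdc):
        problems.append(f"insdc_assembly_identifiers does not match pattern: {insdc}")

    return problems


# Hypothetical record; the identifiers and timestamp are made up.
example_record = {
    "id": "nmdc:wfmgas-11-abc123.1",
    "type": "nmdc:MetagenomeAssembly",
    "execution_resource": "NERSC-Cori",
    "git_url": "https://github.com/microbiomedata/mg_annotation/releases/tag/0.1",
    "started_at_time": "2021-01-10T00:00:00+00:00",
    "was_informed_by": "nmdc:omprc-11-xyz789",
    "has_input": ["nmdc:dobj-11-inp001"],
    "insdc_assembly_identifiers": "insdc.sra:SRR1234567",
}

print(check_record(example_record))
```

The check deliberately ignores enum membership, the ISO-8601 timestamp pattern and the structured identifier patterns, all of which a full schema validation would also enforce.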