Class: MetatranscriptomeAssembly

URI: nmdc:MetatranscriptomeAssembly

classDiagram
    class MetatranscriptomeAssembly
    click MetatranscriptomeAssembly href "../MetatranscriptomeAssembly"
    WorkflowExecution <|-- MetatranscriptomeAssembly
    click WorkflowExecution href "../WorkflowExecution"
    MetatranscriptomeAssembly : alternative_identifiers
    MetatranscriptomeAssembly : asm_score
    MetatranscriptomeAssembly : contig_bp
    MetatranscriptomeAssembly : contigs
    MetatranscriptomeAssembly : ctg_l50
    MetatranscriptomeAssembly : ctg_l90
    MetatranscriptomeAssembly : ctg_logsum
    MetatranscriptomeAssembly : ctg_max
    MetatranscriptomeAssembly : ctg_n50
    MetatranscriptomeAssembly : ctg_n90
    MetatranscriptomeAssembly : ctg_powsum
    MetatranscriptomeAssembly : description
    MetatranscriptomeAssembly : end_date
    MetatranscriptomeAssembly : ended_at_time
    MetatranscriptomeAssembly : execution_resource
    MetatranscriptomeAssembly --> "1" ExecutionResourceEnum : execution_resource
    click ExecutionResourceEnum href "../ExecutionResourceEnum"
    MetatranscriptomeAssembly : gap_pct
    MetatranscriptomeAssembly : gc_avg
    MetatranscriptomeAssembly : gc_std
    MetatranscriptomeAssembly : git_url
    MetatranscriptomeAssembly : has_failure_categorization
    MetatranscriptomeAssembly --> "*" FailureCategorization : has_failure_categorization
    click FailureCategorization href "../FailureCategorization"
    MetatranscriptomeAssembly : has_input
    MetatranscriptomeAssembly --> "1..*" NamedThing : has_input
    click NamedThing href "../NamedThing"
    MetatranscriptomeAssembly : has_output
    MetatranscriptomeAssembly --> "*" NamedThing : has_output
    click NamedThing href "../NamedThing"
    MetatranscriptomeAssembly : id
    MetatranscriptomeAssembly : insdc_assembly_identifiers
    MetatranscriptomeAssembly : name
    MetatranscriptomeAssembly : num_aligned_reads
    MetatranscriptomeAssembly : num_input_reads
    MetatranscriptomeAssembly : processing_institution
    MetatranscriptomeAssembly --> "0..1" ProcessingInstitutionEnum : processing_institution
    click ProcessingInstitutionEnum href "../ProcessingInstitutionEnum"
    MetatranscriptomeAssembly : protocol_link
    MetatranscriptomeAssembly --> "0..1" Protocol : protocol_link
    click Protocol href "../Protocol"
    MetatranscriptomeAssembly : qc_comment
    MetatranscriptomeAssembly : qc_status
    MetatranscriptomeAssembly --> "0..1" StatusEnum : qc_status
    click StatusEnum href "../StatusEnum"
    MetatranscriptomeAssembly : scaf_bp
    MetatranscriptomeAssembly : scaf_l50
    MetatranscriptomeAssembly : scaf_l90
    MetatranscriptomeAssembly : scaf_l_gt50k
    MetatranscriptomeAssembly : scaf_logsum
    MetatranscriptomeAssembly : scaf_max
    MetatranscriptomeAssembly : scaf_n50
    MetatranscriptomeAssembly : scaf_n90
    MetatranscriptomeAssembly : scaf_n_gt50k
    MetatranscriptomeAssembly : scaf_pct_gt50k
    MetatranscriptomeAssembly : scaf_powsum
    MetatranscriptomeAssembly : scaffolds
    MetatranscriptomeAssembly : start_date
    MetatranscriptomeAssembly : started_at_time
    MetatranscriptomeAssembly : type
    MetatranscriptomeAssembly : version
    MetatranscriptomeAssembly : was_informed_by
    MetatranscriptomeAssembly --> "1" DataGeneration : was_informed_by
    click DataGeneration href "../DataGeneration"

Inheritance

Slots

| Name | Cardinality and Range | Description | Inheritance |
| --- | --- | --- | --- |
| asm_score | 0..1 Float | A score for comparing metagenomic assembly quality from the same sample | direct |
| scaffolds | 0..1 Float | Total sequence count of all scaffolds | direct |
| scaf_logsum | 0..1 Float | The sum of the (length*log(length)) of all scaffolds, times some constant | direct |
| scaf_powsum | 0..1 Float | Powersum of all scaffolds is the same as logsum except that it uses the sum o... | direct |
| scaf_max | 0..1 Float | Maximum scaffold length | direct |
| scaf_bp | 0..1 Float | Total size in bp of all scaffolds | direct |
| scaf_n50 | 0..1 Float | Given a set of scaffolds, each with its own length, the N50 count is defined ... | direct |
| scaf_n90 | 0..1 Float | Given a set of scaffolds, each with its own length, the N90 count is defined ... | direct |
| scaf_l50 | 0..1 Float | Given a set of scaffolds, the L50 is defined as the sequence length of the sh... | direct |
| scaf_l90 | 0..1 Float | The L90 statistic is less than or equal to the L50 statistic; it is the lengt... | direct |
| scaf_n_gt50k | 0..1 Float | Total sequence count of scaffolds greater than 50 KB | direct |
| scaf_l_gt50k | 0..1 Float | Total size in bp of all scaffolds greater than 50 KB | direct |
| scaf_pct_gt50k | 0..1 Float | Total sequence size percentage of scaffolds greater than 50 KB | direct |
| contigs | 0..1 Float | Total sequence count of all contigs | direct |
| contig_bp | 0..1 Float | Total size in bp of all contigs | direct |
| ctg_n50 | 0..1 Float | Given a set of contigs, each with its own length, the N50 count is defined as... | direct |
| ctg_l50 | 0..1 Float | Given a set of contigs, the L50 is defined as the sequence length of the shor... | direct |
| ctg_n90 | 0..1 Float | Given a set of contigs, each with its own length, the N90 count is defined as... | direct |
| ctg_l90 | 0..1 Float | The L90 statistic is less than or equal to the L50 statistic; it is the lengt... | direct |
| ctg_logsum | 0..1 Float | The sum of the (length*log(length)) of all contigs, times some constant | direct |
| ctg_powsum | 0..1 Float | Powersum of all contigs is the same as logsum except that it uses the sum of ... | direct |
| ctg_max | 0..1 Float | Maximum contig length | direct |
| gap_pct | 0..1 Float | The gap size percentage of all scaffolds | direct |
| gc_std | 0..1 Float | Standard deviation of GC content of all contigs | direct |
| gc_avg | 0..1 Float | Average of GC content of all contigs | direct |
| num_input_reads | 0..1 Float | The number of input reads used for assembly | direct |
| num_aligned_reads | 0..1 Float | The number of input reads aligned to assembled contigs | direct |
| insdc_assembly_identifiers | 0..1 String | | direct |
| ended_at_time | 0..1 String | | WorkflowExecution |
| execution_resource | 1 ExecutionResourceEnum | The computing resource or facility where the workflow was executed | WorkflowExecution |
| git_url | 1 String | The URL that points to the exact GitHub location of a workflow | WorkflowExecution |
| started_at_time | 1 String | | WorkflowExecution |
| version | 0..1 String | | WorkflowExecution |
| was_informed_by | 1 DataGeneration | | WorkflowExecution |
| has_input | 1..* NamedThing | An input to a process | PlannedProcess |
| has_output | * NamedThing | An output from a process | PlannedProcess |
| processing_institution | 0..1 ProcessingInstitutionEnum | The organization that processed the sample | PlannedProcess |
| protocol_link | 0..1 Protocol | | PlannedProcess |
| start_date | 0..1 String | The date on which any process or activity was started | PlannedProcess |
| end_date | 0..1 String | The date on which any process or activity was ended | PlannedProcess |
| qc_status | 0..1 StatusEnum | Stores information about the result of a process (ie the process of sequencin... | PlannedProcess |
| qc_comment | 0..1 String | Slot to store additional comments about laboratory or workflow output | PlannedProcess |
| has_failure_categorization | * FailureCategorization | | PlannedProcess |
| id | 1 Uriorcurie | A unique identifier for a thing | NamedThing |
| name | 0..1 String | A human readable label for an entity | NamedThing |
| description | 0..1 String | a human-readable description of a thing | NamedThing |
| alternative_identifiers | * Uriorcurie | A list of alternative identifiers for the entity | NamedThing |
| type | 1 Uriorcurie | the class_uri of the class that has been instantiated | NamedThing |
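
The cardinalities listed above imply a small set of required slots (cardinality `1` or `1..*`) and a few multivalued ones (`*` or `1..*`). The following is a minimal Python sketch of a cardinality check on a plain dictionary, not the validation NMDC itself performs. The function name `check_cardinality` and every identifier value in the example record are hypothetical and chosen only for illustration.

```python
# Slots whose cardinality in the slots listing is "1" or "1..*".
REQUIRED_SLOTS = (
    "id",
    "type",
    "execution_resource",
    "git_url",
    "started_at_time",
    "was_informed_by",
    "has_input",
)

# Slots whose cardinality is "*" or "1..*", so values are expected as lists.
MULTIVALUED_SLOTS = (
    "has_input",
    "has_output",
    "has_failure_categorization",
    "alternative_identifiers",
)


def check_cardinality(record):
    """Return a list of cardinality problems found in record (empty if none)."""
    problems = []
    for slot in REQUIRED_SLOTS:
        value = record.get(slot)
        if value is None or value == "" or value == []:
            problems.append(f"missing required slot: {slot}")
    for slot in MULTIVALUED_SLOTS:
        if slot in record and not isinstance(record[slot], list):
            problems.append(f"slot must be a list: {slot}")
    return problems


# Hypothetical record: the identifiers, URL, and timestamp are invented.
example = {
    "id": "nmdc:wfmtas-11-abc123.1",
    "type": "nmdc:MetatranscriptomeAssembly",
    "execution_resource": "NERSC-Cori",
    "git_url": "https://example.org/hypothetical-workflow",
    "started_at_time": "2024-01-15T10:30:00Z",
    "was_informed_by": "nmdc:omprc-11-xyz789",
    "has_input": ["nmdc:dobj-11-inp001"],
    "contigs": 1234.0,
}

print(check_cardinality(example))
```

The check covers only presence and list-ness; ranges, enumerations, and patterns would need a full LinkML validator.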

Identifier and Mapping Information

Schema Source

Mappings

| Mapping Type | Mapped Value |
| --- | --- |
| self | nmdc:MetatranscriptomeAssembly |
| native | nmdc:MetatranscriptomeAssembly |

LinkML Source

Direct

name: MetatranscriptomeAssembly
in_subset:
- workflow subset
from_schema: https://w3id.org/nmdc/nmdc
is_a: WorkflowExecution
slots:
- asm_score
- scaffolds
- scaf_logsum
- scaf_powsum
- scaf_max
- scaf_bp
- scaf_n50
- scaf_n90
- scaf_l50
- scaf_l90
- scaf_n_gt50k
- scaf_l_gt50k
- scaf_pct_gt50k
- contigs
- contig_bp
- ctg_n50
- ctg_l50
- ctg_n90
- ctg_l90
- ctg_logsum
- ctg_powsum
- ctg_max
- gap_pct
- gc_std
- gc_avg
- num_input_reads
- num_aligned_reads
- insdc_assembly_identifiers
slot_usage:
  id:
    name: id
    required: true
    structured_pattern:
      syntax: '{id_nmdc_prefix}:wfmtas-{id_shoulder}-{id_blade}{id_version}$'
      interpolated: true
  was_informed_by:
    name: was_informed_by
    structured_pattern:
      syntax: '{id_nmdc_prefix}:(omprc|dgns)-{id_shoulder}-{id_blade}$'
      interpolated: true
class_uri: nmdc:MetatranscriptomeAssembly
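
The `structured_pattern` entries above are marked `interpolated: true`, meaning each `{name}` placeholder is replaced by a regular-expression fragment defined in the schema's settings. Those settings are not shown on this page, so the sketch below uses assumed stand-in values (`ASSUMED_SETTINGS`) and a hypothetical helper `interpolate` to show how the `id` and `was_informed_by` patterns might expand. The test identifiers are invented.

```python
import re

# Assumed stand-ins for the schema-level settings. The real fragments are
# defined elsewhere in the NMDC schema and may differ from these.
ASSUMED_SETTINGS = {
    "id_nmdc_prefix": r"^(nmdc)",
    "id_shoulder": r"([0-9][a-z]{0,6}[0-9])",
    "id_blade": r"([A-Za-z0-9]{1,})",
    "id_version": r"(\.[0-9]{1,})",
}


def interpolate(syntax, settings):
    """Replace each {name} placeholder with its regex fragment from settings."""
    return re.sub(r"\{(\w+)\}", lambda m: settings[m.group(1)], syntax)


# Syntax strings copied from the slot_usage section of the class definition.
ID_SYNTAX = "{id_nmdc_prefix}:wfmtas-{id_shoulder}-{id_blade}{id_version}$"
INFORMED_BY_SYNTAX = "{id_nmdc_prefix}:(omprc|dgns)-{id_shoulder}-{id_blade}$"

id_pattern = re.compile(interpolate(ID_SYNTAX, ASSUMED_SETTINGS))
informed_by_pattern = re.compile(interpolate(INFORMED_BY_SYNTAX, ASSUMED_SETTINGS))

# Hypothetical identifiers, used only to exercise the expanded patterns.
print(bool(id_pattern.search("nmdc:wfmtas-11-abc123.1")))
print(bool(informed_by_pattern.search("nmdc:omprc-11-xyz789")))
```

Under these assumed settings the `wfmtas` typecode and a trailing version suffix are both required for `id`, while `was_informed_by` accepts either the `omprc` or the `dgns` typecode.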

Induced

name: MetatranscriptomeAssembly
in_subset:
- workflow subset
from_schema: https://w3id.org/nmdc/nmdc
is_a: WorkflowExecution
slot_usage:
  id:
    name: id
    required: true
    structured_pattern:
      syntax: '{id_nmdc_prefix}:wfmtas-{id_shoulder}-{id_blade}{id_version}$'
      interpolated: true
  was_informed_by:
    name: was_informed_by
    structured_pattern:
      syntax: '{id_nmdc_prefix}:(omprc|dgns)-{id_shoulder}-{id_blade}$'
      interpolated: true
attributes:
  asm_score:
    name: asm_score
    description: A score for comparing metagenomic assembly quality from the same sample.
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    is_a: metagenome_assembly_parameter
    alias: asm_score
    owner: MetatranscriptomeAssembly
    domain_of:
    - MetagenomeAssembly
    - MetatranscriptomeAssembly
    range: float
  scaffolds:
    name: scaffolds
    description: Total sequence count of all scaffolds.
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    is_a: metagenome_assembly_parameter
    alias: scaffolds
    owner: MetatranscriptomeAssembly
    domain_of:
    - MetagenomeAssembly
    - MetatranscriptomeAssembly
    range: float
  scaf_logsum:
    name: scaf_logsum
    description: The sum of the (length*log(length)) of all scaffolds, times some
      constant. The score increases as contiguity increases.
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    is_a: metagenome_assembly_parameter
    alias: scaf_logsum
    owner: MetatranscriptomeAssembly
    domain_of:
    - MetagenomeAssembly
    - MetatranscriptomeAssembly
    range: float
  scaf_powsum:
    name: scaf_powsum
    description: Powersum of all scaffolds is the same as logsum except that it uses
      the sum of (length*(length^P)) for some power P (default P=0.25).
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    is_a: metagenome_assembly_parameter
    alias: scaf_powsum
    owner: MetatranscriptomeAssembly
    domain_of:
    - MetagenomeAssembly
    - MetatranscriptomeAssembly
    range: float
  scaf_max:
    name: scaf_max
    description: Maximum scaffold length.
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    is_a: metagenome_assembly_parameter
    alias: scaf_max
    owner: MetatranscriptomeAssembly
    domain_of:
    - MetagenomeAssembly
    - MetatranscriptomeAssembly
    range: float
  scaf_bp:
    name: scaf_bp
    description: Total size in bp of all scaffolds.
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    is_a: metagenome_assembly_parameter
    alias: scaf_bp
    owner: MetatranscriptomeAssembly
    domain_of:
    - MetagenomeAssembly
    - MetatranscriptomeAssembly
    range: float
  scaf_n50:
    name: scaf_n50
    description: Given a set of scaffolds, each with its own length, the N50 count
      is defined as the smallest number of scaffolds whose length sum makes up half
      of genome size.
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    is_a: metagenome_assembly_parameter
    alias: scaf_n50
    owner: MetatranscriptomeAssembly
    domain_of:
    - MetagenomeAssembly
    - MetatranscriptomeAssembly
    range: float
  scaf_n90:
    name: scaf_n90
    description: Given a set of scaffolds, each with its own length, the N90 count
      is defined as the smallest number of scaffolds whose length sum makes up 90%
      of genome size.
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    is_a: metagenome_assembly_parameter
    alias: scaf_n90
    owner: MetatranscriptomeAssembly
    domain_of:
    - MetagenomeAssembly
    - MetatranscriptomeAssembly
    range: float
  scaf_l50:
    name: scaf_l50
    description: Given a set of scaffolds, the L50 is defined as the sequence length
      of the shortest scaffold at 50% of the total genome length.
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    is_a: metagenome_assembly_parameter
    alias: scaf_l50
    owner: MetatranscriptomeAssembly
    domain_of:
    - MetagenomeAssembly
    - MetatranscriptomeAssembly
    range: float
  scaf_l90:
    name: scaf_l90
    description: The L90 statistic is less than or equal to the L50 statistic; it
      is the length for which the collection of all scaffolds of that length or longer
      contains at least 90% of the sum of the lengths of all scaffolds.
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    is_a: metagenome_assembly_parameter
    alias: scaf_l90
    owner: MetatranscriptomeAssembly
    domain_of:
    - MetagenomeAssembly
    - MetatranscriptomeAssembly
    range: float
  scaf_n_gt50k:
    name: scaf_n_gt50k
    description: Total sequence count of scaffolds greater than 50 KB.
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    is_a: metagenome_assembly_parameter
    alias: scaf_n_gt50k
    owner: MetatranscriptomeAssembly
    domain_of:
    - MetagenomeAssembly
    - MetatranscriptomeAssembly
    range: float
  scaf_l_gt50k:
    name: scaf_l_gt50k
    description: Total size in bp of all scaffolds greater than 50 KB.
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    is_a: metagenome_assembly_parameter
    alias: scaf_l_gt50k
    owner: MetatranscriptomeAssembly
    domain_of:
    - MetagenomeAssembly
    - MetatranscriptomeAssembly
    range: float
  scaf_pct_gt50k:
    name: scaf_pct_gt50k
    description: Total sequence size percentage of scaffolds greater than 50 KB.
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    is_a: metagenome_assembly_parameter
    alias: scaf_pct_gt50k
    owner: MetatranscriptomeAssembly
    domain_of:
    - MetagenomeAssembly
    - MetatranscriptomeAssembly
    range: float
  contigs:
    name: contigs
    description: Total sequence count of all contigs.
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    is_a: metagenome_assembly_parameter
    alias: contigs
    owner: MetatranscriptomeAssembly
    domain_of:
    - MetagenomeAssembly
    - MetatranscriptomeAssembly
    range: float
  contig_bp:
    name: contig_bp
    description: Total size in bp of all contigs.
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    is_a: metagenome_assembly_parameter
    alias: contig_bp
    owner: MetatranscriptomeAssembly
    domain_of:
    - MetagenomeAssembly
    - MetatranscriptomeAssembly
    range: float
  ctg_n50:
    name: ctg_n50
    description: Given a set of contigs, each with its own length, the N50 count is
      defined as the smallest number of contigs whose length sum makes up half of
      genome size.
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    is_a: metagenome_assembly_parameter
    alias: ctg_n50
    owner: MetatranscriptomeAssembly
    domain_of:
    - MetagenomeAssembly
    - MetatranscriptomeAssembly
    range: float
  ctg_l50:
    name: ctg_l50
    description: Given a set of contigs, the L50 is defined as the sequence length
      of the shortest contig at 50% of the total genome length.
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    is_a: metagenome_assembly_parameter
    alias: ctg_l50
    owner: MetatranscriptomeAssembly
    domain_of:
    - MetagenomeAssembly
    - MetatranscriptomeAssembly
    range: float
  ctg_n90:
    name: ctg_n90
    description: Given a set of contigs, each with its own length, the N90 count is
      defined as the smallest number of contigs whose length sum makes up 90% of genome
      size.
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    is_a: metagenome_assembly_parameter
    alias: ctg_n90
    owner: MetatranscriptomeAssembly
    domain_of:
    - MetagenomeAssembly
    - MetatranscriptomeAssembly
    range: float
  ctg_l90:
    name: ctg_l90
    description: The L90 statistic is less than or equal to the L50 statistic; it
      is the length for which the collection of all contigs of that length or longer
      contains at least 90% of the sum of the lengths of all contigs.
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    is_a: metagenome_assembly_parameter
    alias: ctg_l90
    owner: MetatranscriptomeAssembly
    domain_of:
    - MetagenomeAssembly
    - MetatranscriptomeAssembly
    range: float
  ctg_logsum:
    name: ctg_logsum
    description: The sum of the (length*log(length)) of all contigs, times some
      constant. The score increases as contiguity increases.
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    is_a: metagenome_assembly_parameter
    alias: ctg_logsum
    owner: MetatranscriptomeAssembly
    domain_of:
    - MetagenomeAssembly
    - MetatranscriptomeAssembly
    range: float
  ctg_powsum:
    name: ctg_powsum
    description: Powersum of all contigs is the same as logsum except that it uses
      the sum of (length*(length^P)) for some power P (default P=0.25).
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    is_a: metagenome_assembly_parameter
    alias: ctg_powsum
    owner: MetatranscriptomeAssembly
    domain_of:
    - MetagenomeAssembly
    - MetatranscriptomeAssembly
    range: float
  ctg_max:
    name: ctg_max
    description: Maximum contig length.
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    is_a: metagenome_assembly_parameter
    alias: ctg_max
    owner: MetatranscriptomeAssembly
    domain_of:
    - MetagenomeAssembly
    - MetatranscriptomeAssembly
    range: float
  gap_pct:
    name: gap_pct
    description: The gap size percentage of all scaffolds.
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    is_a: metagenome_assembly_parameter
    alias: gap_pct
    owner: MetatranscriptomeAssembly
    domain_of:
    - MetagenomeAssembly
    - MetatranscriptomeAssembly
    range: float
  gc_std:
    name: gc_std
    description: Standard deviation of GC content of all contigs.
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    is_a: metagenome_assembly_parameter
    alias: gc_std
    owner: MetatranscriptomeAssembly
    domain_of:
    - MetagenomeAssembly
    - MetatranscriptomeAssembly
    range: float
  gc_avg:
    name: gc_avg
    description: Average of GC content of all contigs.
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    is_a: metagenome_assembly_parameter
    alias: gc_avg
    owner: MetatranscriptomeAssembly
    domain_of:
    - MetagenomeAssembly
    - MetatranscriptomeAssembly
    range: float
  num_input_reads:
    name: num_input_reads
    description: The number of input reads used for assembly.
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    is_a: metagenome_assembly_parameter
    alias: num_input_reads
    owner: MetatranscriptomeAssembly
    domain_of:
    - MetagenomeAssembly
    - MetatranscriptomeAssembly
    range: float
  num_aligned_reads:
    name: num_aligned_reads
    description: The number of input reads aligned to assembled contigs.
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    is_a: metagenome_assembly_parameter
    alias: num_aligned_reads
    owner: MetatranscriptomeAssembly
    domain_of:
    - MetagenomeAssembly
    - MetatranscriptomeAssembly
    range: float
  insdc_assembly_identifiers:
    name: insdc_assembly_identifiers
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    is_a: assembly_identifiers
    mixins:
    - insdc_identifiers
    alias: insdc_assembly_identifiers
    owner: MetatranscriptomeAssembly
    domain_of:
    - MetagenomeAssembly
    - MetatranscriptomeAssembly
    range: string
    pattern: ^insdc.sra:[A-Z]+[0-9]+(\.[0-9]+)?$
  ended_at_time:
    name: ended_at_time
    notes:
    - 'The regex for ISO-8601 format was taken from here: https://www.myintervals.com/blog/2009/05/20/iso-8601-date-validation-that-doesnt-suck/
      It may not be complete, but it is good enough for now.'
    from_schema: https://w3id.org/nmdc/nmdc
    mappings:
    - prov:endedAtTime
    rank: 1000
    alias: ended_at_time
    owner: MetatranscriptomeAssembly
    domain_of:
    - WorkflowExecution
    range: string
    pattern: ^([\+-]?\d{4}(?!\d{2}\b))((-?)((0[1-9]|1[0-2])(\3([12]\d|0[1-9]|3[01]))?|W([0-4]\d|5[0-2])(-?[1-7])?|(00[1-9]|0[1-9]\d|[12]\d{2}|3([0-5]\d|6[1-6])))([T\s]((([01]\d|2[0-3])((:?)[0-5]\d)?|24\:?00)([\.,]\d+(?!:))?)?(\17[0-5]\d([\.,]\d+)?)?([zZ]|([\+-])([01]\d|2[0-3]):?([0-5]\d)?)?)?)?$
  execution_resource:
    name: execution_resource
    description: The computing resource or facility where the workflow was executed.
    examples:
    - value: NERSC-Cori
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    alias: execution_resource
    owner: MetatranscriptomeAssembly
    domain_of:
    - WorkflowExecution
    range: ExecutionResourceEnum
    required: true
  git_url:
    name: git_url
    description: The URL that points to the exact GitHub location of a workflow.
    examples:
    - value: https://github.com/microbiomedata/mg_annotation/releases/tag/0.1
    - value: https://github.com/microbiomedata/metaMS/blob/master/metaMS/gcmsWorkflow.py
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    alias: git_url
    owner: MetatranscriptomeAssembly
    domain_of:
    - WorkflowExecution
    range: string
    required: true
  started_at_time:
    name: started_at_time
    notes:
    - 'The regex for ISO-8601 format was taken from here: https://www.myintervals.com/blog/2009/05/20/iso-8601-date-validation-that-doesnt-suck/
      It may not be complete, but it is good enough for now.'
    from_schema: https://w3id.org/nmdc/nmdc
    mappings:
    - prov:startedAtTime
    rank: 1000
    alias: started_at_time
    owner: MetatranscriptomeAssembly
    domain_of:
    - WorkflowExecution
    range: string
    required: true
    pattern: ^([\+-]?\d{4}(?!\d{2}\b))((-?)((0[1-9]|1[0-2])(\3([12]\d|0[1-9]|3[01]))?|W([0-4]\d|5[0-2])(-?[1-7])?|(00[1-9]|0[1-9]\d|[12]\d{2}|3([0-5]\d|6[1-6])))([T\s]((([01]\d|2[0-3])((:?)[0-5]\d)?|24\:?00)([\.,]\d+(?!:))?)?(\17[0-5]\d([\.,]\d+)?)?([zZ]|([\+-])([01]\d|2[0-3]):?([0-5]\d)?)?)?)?$
  version:
    name: version
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    alias: version
    owner: MetatranscriptomeAssembly
    domain_of:
    - WorkflowExecution
    range: string
  was_informed_by:
    name: was_informed_by
    from_schema: https://w3id.org/nmdc/nmdc
    mappings:
    - prov:wasInformedBy
    rank: 1000
    alias: was_informed_by
    owner: MetatranscriptomeAssembly
    domain_of:
    - WorkflowExecution
    range: DataGeneration
    required: true
    structured_pattern:
      syntax: '{id_nmdc_prefix}:(omprc|dgns)-{id_shoulder}-{id_blade}$'
      interpolated: true
  has_input:
    name: has_input
    description: An input to a process.
    from_schema: https://w3id.org/nmdc/nmdc
    aliases:
    - input
    rank: 1000
    alias: has_input
    owner: MetatranscriptomeAssembly
    domain_of:
    - PlannedProcess
    range: NamedThing
    required: true
    multivalued: true
    structured_pattern:
      syntax: '{id_nmdc_prefix}:(dobj)-{id_shoulder}-{id_blade}$'
      interpolated: true
  has_output:
    name: has_output
    description: An output from a process.
    from_schema: https://w3id.org/nmdc/nmdc
    aliases:
    - output
    rank: 1000
    alias: has_output
    owner: MetatranscriptomeAssembly
    domain_of:
    - PlannedProcess
    range: NamedThing
    multivalued: true
    structured_pattern:
      syntax: '{id_nmdc_prefix}:(dobj)-{id_shoulder}-{id_blade}$'
      interpolated: true
  processing_institution:
    name: processing_institution
    description: The organization that processed the sample.
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    alias: processing_institution
    owner: MetatranscriptomeAssembly
    domain_of:
    - PlannedProcess
    range: ProcessingInstitutionEnum
  protocol_link:
    name: protocol_link
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    alias: protocol_link
    owner: MetatranscriptomeAssembly
    domain_of:
    - PlannedProcess
    - Study
    range: Protocol
  start_date:
    name: start_date
    description: The date on which any process or activity was started
    todos:
    - add date string validation pattern
    comments:
    - We are using string representations of dates until all components of our ecosystem
      can handle ISO 8601 dates
    - The date should be formatted as YYYY-MM-DD
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    alias: start_date
    owner: MetatranscriptomeAssembly
    domain_of:
    - PlannedProcess
    range: string
  end_date:
    name: end_date
    description: The date on which any process or activity was ended
    todos:
    - add date string validation pattern
    comments:
    - We are using string representations of dates until all components of our ecosystem
      can handle ISO 8601 dates
    - The date should be formatted as YYYY-MM-DD
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    alias: end_date
    owner: MetatranscriptomeAssembly
    domain_of:
    - PlannedProcess
    range: string
  qc_status:
    name: qc_status
    description: Stores information about the result of a process (e.g., the process
      of sequencing a library may have a qc_status of 'fail' if not enough data
      was generated)
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    alias: qc_status
    owner: MetatranscriptomeAssembly
    domain_of:
    - PlannedProcess
    range: StatusEnum
  qc_comment:
    name: qc_comment
    description: Slot to store additional comments about laboratory or workflow output.
      For workflow output it may describe the particular workflow stage that failed
      (e.g., failed at call-stage due to a malformed fastq file).
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    alias: qc_comment
    owner: MetatranscriptomeAssembly
    domain_of:
    - PlannedProcess
    range: string
  has_failure_categorization:
    name: has_failure_categorization
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    alias: has_failure_categorization
    owner: MetatranscriptomeAssembly
    domain_of:
    - PlannedProcess
    range: FailureCategorization
    multivalued: true
    inlined: true
    inlined_as_list: true
  id:
    name: id
    description: A unique identifier for a thing. Must be either a CURIE shorthand
      for a URI or a complete URI
    notes:
    - 'abstracted pattern: prefix:typecode-authshoulder-blade(.version)?(_seqsuffix)?'
    - a minimum length of 3 characters is suggested for typecodes, but 1 or 2 characters
      will be accepted
    - typecodes must correspond 1:1 to a class in the NMDC schema. this will be checked
      via per-class id slot usage assertions
    - minting authority shoulders should probably be enumerated and checked in the
      pattern
    examples:
    - value: nmdc:mgmag-00-x012.1_7_c1
      description: https://github.com/microbiomedata/nmdc-schema/pull/499#discussion_r1018499248
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    identifier: true
    alias: id
    owner: MetatranscriptomeAssembly
    domain_of:
    - NamedThing
    range: uriorcurie
    required: true
    pattern: ^[a-zA-Z0-9][a-zA-Z0-9_\.]+:[a-zA-Z0-9_][a-zA-Z0-9_\-\/\.,]*$
    structured_pattern:
      syntax: '{id_nmdc_prefix}:wfmtas-{id_shoulder}-{id_blade}{id_version}$'
      interpolated: true
  name:
    name: name
    description: A human readable label for an entity
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    alias: name
    owner: MetatranscriptomeAssembly
    domain_of:
    - PersonValue
    - NamedThing
    - Protocol
    range: string
  description:
    name: description
    description: a human-readable description of a thing
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    slot_uri: dcterms:description
    alias: description
    owner: MetatranscriptomeAssembly
    domain_of:
    - ImageValue
    - NamedThing
    range: string
  alternative_identifiers:
    name: alternative_identifiers
    description: A list of alternative identifiers for the entity.
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    alias: alternative_identifiers
    owner: MetatranscriptomeAssembly
    domain_of:
    - MetaboliteIdentification
    - NamedThing
    range: uriorcurie
    multivalued: true
    pattern: ^[a-zA-Z0-9][a-zA-Z0-9_\.]+:[a-zA-Z0-9_][a-zA-Z0-9_\-\/\.,]*$
  type:
    name: type
    description: the class_uri of the class that has been instantiated
    notes:
    - replaces legacy nmdc:type slot
    - makes it easier to read example data files
    - required for polymorphic MongoDB collections
    examples:
    - value: nmdc:Biosample
    - value: nmdc:Study
    from_schema: https://w3id.org/nmdc/nmdc
    see_also:
    - https://github.com/microbiomedata/nmdc-schema/issues/1048
    - https://github.com/microbiomedata/nmdc-schema/issues/1233
    - https://github.com/microbiomedata/nmdc-schema/issues/248
    rank: 1000
    slot_uri: rdf:type
    designates_type: true
    alias: type
    owner: MetatranscriptomeAssembly
    domain_of:
    - EukEval
    - FunctionalAnnotationAggMember
    - MobilePhaseSegment
    - PortionOfSubstance
    - MagBin
    - MetaboliteIdentification
    - PeptideQuantification
    - ProteinQuantification
    - GenomeFeature
    - FunctionalAnnotation
    - AttributeValue
    - NamedThing
    - FailureCategorization
    - Protocol
    - CreditAssociation
    - Doi
    range: uriorcurie
    required: true
class_uri: nmdc:MetatranscriptomeAssembly
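
The assembly statistics described in the attributes above can be illustrated with a short Python sketch. It follows the convention used in these slot descriptions, where N50/N90 are sequence counts and L50/L90 are sequence lengths. Several points are assumptions rather than facts from the schema: "genome size" is taken to mean the total assembled length, the logarithm is taken to be natural, the unspecified constant defaults to 1.0, and the function names are hypothetical.

```python
import math


def count_and_length_at(lengths, fraction):
    """Return (count, length) at the given fraction of total assembled length.

    Sequences are taken longest first. count is the smallest number of
    sequences whose lengths sum to at least fraction of the total (the N
    statistic as described here); length is the length of the shortest
    sequence in that set (the L statistic as described here).
    """
    ordered = sorted(lengths, reverse=True)
    target = fraction * sum(ordered)
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= target:
            return count, length
    raise ValueError("lengths must be non-empty")


def logsum(lengths, constant=1.0):
    """Sum of length*log(length), times a constant (assumed 1.0, natural log)."""
    return constant * sum(n * math.log(n) for n in lengths)


def powsum(lengths, power=0.25):
    """Sum of length*(length^P), with the documented default P=0.25."""
    return sum(n * n ** power for n in lengths)


# Hypothetical contig lengths in bp, chosen so the arithmetic is easy to follow.
contig_lengths = [100, 80, 60, 40, 20]

ctg_n50, ctg_l50 = count_and_length_at(contig_lengths, 0.5)
ctg_n90, ctg_l90 = count_and_length_at(contig_lengths, 0.9)
print(ctg_n50, ctg_l50, ctg_n90, ctg_l90)
print(logsum(contig_lengths), powsum(contig_lengths))
```

For these lengths the total is 300 bp, so two contigs (100 + 80) are enough to reach half of it and four contigs (280 bp) are needed to reach 90%, which also shows why the L90 length can never exceed the L50 length.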