Class: Metagenome-Assembled Genome analysis activity (MagsAnalysis)

A workflow execution activity that uses computational binning tools to group assembled contigs into genomes

URI: nmdc:MagsAnalysis

classDiagram
    class MagsAnalysis
    click MagsAnalysis href "../MagsAnalysis"
    WorkflowExecution <|-- MagsAnalysis
    click WorkflowExecution href "../WorkflowExecution"
    MagsAnalysis : alternative_identifiers
    MagsAnalysis : binned_contig_num
    MagsAnalysis : description
    MagsAnalysis : end_date
    MagsAnalysis : ended_at_time
    MagsAnalysis : execution_resource
    MagsAnalysis --> "1" ExecutionResourceEnum : execution_resource
    click ExecutionResourceEnum href "../ExecutionResourceEnum"
    MagsAnalysis : git_url
    MagsAnalysis : has_failure_categorization
    MagsAnalysis --> "*" FailureCategorization : has_failure_categorization
    click FailureCategorization href "../FailureCategorization"
    MagsAnalysis : has_input
    MagsAnalysis --> "1..*" NamedThing : has_input
    click NamedThing href "../NamedThing"
    MagsAnalysis : has_output
    MagsAnalysis --> "*" NamedThing : has_output
    click NamedThing href "../NamedThing"
    MagsAnalysis : id
    MagsAnalysis : img_identifiers
    MagsAnalysis : input_contig_num
    MagsAnalysis : low_depth_contig_num
    MagsAnalysis : mags_list
    MagsAnalysis --> "*" MagBin : mags_list
    click MagBin href "../MagBin"
    MagsAnalysis : name
    MagsAnalysis : processing_institution
    MagsAnalysis --> "0..1" ProcessingInstitutionEnum : processing_institution
    click ProcessingInstitutionEnum href "../ProcessingInstitutionEnum"
    MagsAnalysis : protocol_link
    MagsAnalysis --> "0..1" Protocol : protocol_link
    click Protocol href "../Protocol"
    MagsAnalysis : qc_comment
    MagsAnalysis : qc_status
    MagsAnalysis --> "0..1" StatusEnum : qc_status
    click StatusEnum href "../StatusEnum"
    MagsAnalysis : start_date
    MagsAnalysis : started_at_time
    MagsAnalysis : too_short_contig_num
    MagsAnalysis : type
    MagsAnalysis : unbinned_contig_num
    MagsAnalysis : version
    MagsAnalysis : was_informed_by
    MagsAnalysis --> "1" DataGeneration : was_informed_by
    click DataGeneration href "../DataGeneration"

Inheritance

Slots

| Name | Cardinality and Range | Description | Inheritance |
| --- | --- | --- | --- |
| binned_contig_num | 0..1 Integer | Number of contigs that ended up in a medium or high quality bin | direct |
| input_contig_num | 0..1 Integer | Total number of input contigs | direct |
| low_depth_contig_num | 0..1 Integer | Number of contigs which were excluded from binning for depth of coverage | direct |
| mags_list | * MagBin | Contains detailed information about each metagenome-assembled genome | direct |
| too_short_contig_num | 0..1 Integer | Number of contigs which were excluded from binning for length | direct |
| unbinned_contig_num | 0..1 Integer | Number of contigs which did not end up in a medium or high quality bin | direct |
| img_identifiers | * ExternalIdentifier | A list of identifiers that relate the biosample to records in the IMG databas... | direct |
| ended_at_time | 0..1 String | | WorkflowExecution |
| execution_resource | 1 ExecutionResourceEnum | The computing resource or facility where the workflow was executed | WorkflowExecution |
| git_url | 1 String | The url that points to the exact github location of a workflow | WorkflowExecution |
| started_at_time | 1 String | | WorkflowExecution |
| version | 0..1 String | | WorkflowExecution |
| was_informed_by | 1 DataGeneration | | WorkflowExecution |
| has_input | 1..* NamedThing | An input to a process | PlannedProcess |
| has_output | * NamedThing | An output from a process | PlannedProcess |
| processing_institution | 0..1 ProcessingInstitutionEnum | The organization that processed the sample | PlannedProcess |
| protocol_link | 0..1 Protocol | | PlannedProcess |
| start_date | 0..1 String | The date on which any process or activity was started | PlannedProcess |
| end_date | 0..1 String | The date on which any process or activity was ended | PlannedProcess |
| qc_status | 0..1 StatusEnum | Stores information about the result of a process (ie the process of sequencin... | PlannedProcess |
| qc_comment | 0..1 String | Slot to store additional comments about laboratory or workflow output | PlannedProcess |
| has_failure_categorization | * FailureCategorization | | PlannedProcess |
| id | 1 Uriorcurie | A unique identifier for a thing | NamedThing |
| name | 0..1 String | A human readable label for an entity | NamedThing |
| description | 0..1 String | a human-readable description of a thing | NamedThing |
| alternative_identifiers | * Uriorcurie | A list of alternative identifiers for the entity | NamedThing |
| type | 1 Uriorcurie | the class_uri of the class that has been instantiated | NamedThing |
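
The cardinalities listed for the slots can be read as a checklist for a single record. The following is a minimal sketch, not the NMDC validator: it treats a record as a plain Python dict, flags slots with cardinality 1 or 1..* that are missing or empty, and checks that the contig counts are non-negative integers (the induced source on this page gives them `minimum_value: 0`). The helper name `check_mags_record` and all sample values are hypothetical.

```python
# Slots whose cardinality on this page is 1 or 1..* (that is, required).
REQUIRED_SLOTS = [
    "id", "type", "execution_resource", "git_url",
    "started_at_time", "was_informed_by", "has_input",
]
# Optional counters (cardinality 0..1, range Integer).
COUNT_SLOTS = [
    "binned_contig_num", "input_contig_num", "low_depth_contig_num",
    "too_short_contig_num", "unbinned_contig_num",
]


def check_mags_record(record: dict) -> list[str]:
    """Return human-readable problems; an empty list means none were found."""
    problems = []
    for slot in REQUIRED_SLOTS:
        if record.get(slot) in (None, "", []):
            problems.append(f"missing required slot: {slot}")
    for slot in COUNT_SLOTS:
        value = record.get(slot)
        if value is None:
            continue  # optional slot that was not supplied
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(value, bool) or not isinstance(value, int) or value < 0:
            problems.append(f"{slot} must be a non-negative integer")
    return problems


# Hypothetical record; every identifier and value here is illustrative only.
example = {
    "id": "nmdc:wfmag-11-abc123.1",
    "type": "nmdc:MagsAnalysis",
    "execution_resource": "NERSC-Cori",
    "git_url": "https://example.org/hypothetical/workflow",
    "started_at_time": "2021-01-10T00:00:00+00:00",
    "was_informed_by": "nmdc:omprc-11-abc123",
    "has_input": ["nmdc:dobj-11-abc123"],
    "input_contig_num": 100,
    "binned_contig_num": 40,
}
print(check_mags_record(example))
```

This sketch only covers presence and the counter ranges; enum membership, identifier patterns, and the nested `mags_list` entries are left to a real schema validator.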

Identifier and Mapping Information

Schema Source

from schema: https://w3id.org/nmdc/nmdc

Mappings

| Mapping Type | Mapped Value |
| --- | --- |
| self | nmdc:MagsAnalysis |
| native | nmdc:MagsAnalysis |

LinkML Source

Direct

name: MagsAnalysis
description: A workflow execution activity that uses computational binning tools to
  group assembled contigs into genomes
title: Metagenome-Assembled Genome analysis activity
from_schema: https://w3id.org/nmdc/nmdc
is_a: WorkflowExecution
slots:
- binned_contig_num
- input_contig_num
- low_depth_contig_num
- mags_list
- too_short_contig_num
- unbinned_contig_num
- img_identifiers
slot_usage:
  id:
    name: id
    required: true
    structured_pattern:
      syntax: '{id_nmdc_prefix}:wfmag-{id_shoulder}-{id_blade}{id_version}$'
      interpolated: true
  img_identifiers:
    name: img_identifiers
    maximum_cardinality: 1
  was_informed_by:
    name: was_informed_by
    structured_pattern:
      syntax: '{id_nmdc_prefix}:(omprc|dgns)-{id_shoulder}-{id_blade}$'
      interpolated: true
class_uri: nmdc:MagsAnalysis
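
The `structured_pattern` entries above are templates: with `interpolated: true`, each name in braces is replaced by a regular-expression fragment defined in the schema's settings, which this page does not show. The sketch below illustrates that substitution in Python. The fragment values in `ASSUMED_SETTINGS` are assumptions chosen for illustration, and the helper `interpolate` is hypothetical; the real fragments may differ.

```python
import re

# ASSUMPTION: illustrative placeholder fragments. The real values live in the
# schema's settings and are not shown on this page.
ASSUMED_SETTINGS = {
    "id_nmdc_prefix": r"^(nmdc)",
    "id_shoulder": r"([0-9][a-z]{0,6}[0-9])",
    "id_blade": r"([A-Za-z0-9]{1,})",
    "id_version": r"(\.[0-9]{1,})",
}

# Templates copied from slot_usage above.
ID_SYNTAX = "{id_nmdc_prefix}:wfmag-{id_shoulder}-{id_blade}{id_version}$"
WAS_INFORMED_BY_SYNTAX = "{id_nmdc_prefix}:(omprc|dgns)-{id_shoulder}-{id_blade}$"


def interpolate(syntax: str, settings: dict) -> str:
    """Replace each {name} placeholder with its regex fragment."""
    return re.sub(r"\{(\w+)\}", lambda m: settings[m.group(1)], syntax)


ID_RE = re.compile(interpolate(ID_SYNTAX, ASSUMED_SETTINGS))
WAS_INFORMED_BY_RE = re.compile(interpolate(WAS_INFORMED_BY_SYNTAX, ASSUMED_SETTINGS))

# Hypothetical identifiers, used only to exercise the patterns.
print(bool(ID_RE.search("nmdc:wfmag-11-abc123.1")))
print(bool(WAS_INFORMED_BY_RE.search("nmdc:omprc-11-abc123")))
print(bool(WAS_INFORMED_BY_RE.search("nmdc:dobj-11-abc123")))
```

Because the typecode `wfmag` is a literal in the template, an identifier with any other typecode cannot match the `id` pattern, whatever the fragment values turn out to be. Whether the version suffix is mandatory depends on the `id_version` fragment; under the assumed value it is.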

Induced

name: MagsAnalysis
description: A workflow execution activity that uses computational binning tools to
  group assembled contigs into genomes
title: Metagenome-Assembled Genome analysis activity
from_schema: https://w3id.org/nmdc/nmdc
is_a: WorkflowExecution
slot_usage:
  id:
    name: id
    required: true
    structured_pattern:
      syntax: '{id_nmdc_prefix}:wfmag-{id_shoulder}-{id_blade}{id_version}$'
      interpolated: true
  img_identifiers:
    name: img_identifiers
    maximum_cardinality: 1
  was_informed_by:
    name: was_informed_by
    structured_pattern:
      syntax: '{id_nmdc_prefix}:(omprc|dgns)-{id_shoulder}-{id_blade}$'
      interpolated: true
attributes:
  binned_contig_num:
    name: binned_contig_num
    description: Number of contigs that ended up in a medium or high quality bin.
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    alias: binned_contig_num
    owner: MagsAnalysis
    domain_of:
    - MagsAnalysis
    range: integer
    minimum_value: 0
  input_contig_num:
    name: input_contig_num
    description: Total number of input contigs.
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    alias: input_contig_num
    owner: MagsAnalysis
    domain_of:
    - MagsAnalysis
    range: integer
    minimum_value: 0
  low_depth_contig_num:
    name: low_depth_contig_num
    description: Number of contigs which were excluded from binning for depth of coverage.
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    alias: low_depth_contig_num
    owner: MagsAnalysis
    domain_of:
    - MagsAnalysis
    range: integer
    minimum_value: 0
  mags_list:
    name: mags_list
    description: Contains detailed information about each metagenome-assembled genome.
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    alias: mags_list
    owner: MagsAnalysis
    domain_of:
    - MagsAnalysis
    range: MagBin
    multivalued: true
    inlined: true
    inlined_as_list: true
  too_short_contig_num:
    name: too_short_contig_num
    description: Number of contigs which were excluded from binning for length.
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    alias: too_short_contig_num
    owner: MagsAnalysis
    domain_of:
    - MagsAnalysis
    range: integer
    minimum_value: 0
  unbinned_contig_num:
    name: unbinned_contig_num
    description: Number of contigs which did not end up in a medium or high quality
      bin.
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    alias: unbinned_contig_num
    owner: MagsAnalysis
    domain_of:
    - MagsAnalysis
    range: integer
    minimum_value: 0
  img_identifiers:
    name: img_identifiers
    description: A list of identifiers that relate the biosample to records in the
      IMG database.
    title: IMG Identifiers
    todos:
    - add is_a or mixin modeling, like other external_database_identifiers
    - what class would IMG records belong to?! Are they Studies, Biosamples, or something
      else?
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    is_a: external_database_identifiers
    alias: img_identifiers
    owner: MagsAnalysis
    domain_of:
    - MetagenomeAnnotation
    - Biosample
    - MetatranscriptomeAnnotation
    - MetatranscriptomeExpressionAnalysis
    - MagsAnalysis
    range: external_identifier
    multivalued: true
    pattern: ^img\.taxon:[a-zA-Z0-9_][a-zA-Z0-9_\/\.]*$
    maximum_cardinality: 1
  ended_at_time:
    name: ended_at_time
    notes:
    - 'The regex for ISO-8601 format was taken from here: https://www.myintervals.com/blog/2009/05/20/iso-8601-date-validation-that-doesnt-suck/
      It may not be complete, but it is good enough for now.'
    from_schema: https://w3id.org/nmdc/nmdc
    mappings:
    - prov:endedAtTime
    rank: 1000
    alias: ended_at_time
    owner: MagsAnalysis
    domain_of:
    - WorkflowExecution
    range: string
    pattern: ^([\+-]?\d{4}(?!\d{2}\b))((-?)((0[1-9]|1[0-2])(\3([12]\d|0[1-9]|3[01]))?|W([0-4]\d|5[0-2])(-?[1-7])?|(00[1-9]|0[1-9]\d|[12]\d{2}|3([0-5]\d|6[1-6])))([T\s]((([01]\d|2[0-3])((:?)[0-5]\d)?|24\:?00)([\.,]\d+(?!:))?)?(\17[0-5]\d([\.,]\d+)?)?([zZ]|([\+-])([01]\d|2[0-3]):?([0-5]\d)?)?)?)?$
  execution_resource:
    name: execution_resource
    description: The computing resource or facility where the workflow was executed.
    examples:
    - value: NERSC-Cori
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    alias: execution_resource
    owner: MagsAnalysis
    domain_of:
    - WorkflowExecution
    range: ExecutionResourceEnum
    required: true
  git_url:
    name: git_url
    description: The url that points to the exact github location of a workflow.
    examples:
    - value: https://github.com/microbiomedata/mg_annotation/releases/tag/0.1
    - value: https://github.com/microbiomedata/metaMS/blob/master/metaMS/gcmsWorkflow.py
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    alias: git_url
    owner: MagsAnalysis
    domain_of:
    - WorkflowExecution
    range: string
    required: true
  started_at_time:
    name: started_at_time
    notes:
    - 'The regex for ISO-8601 format was taken from here: https://www.myintervals.com/blog/2009/05/20/iso-8601-date-validation-that-doesnt-suck/
      It may not be complete, but it is good enough for now.'
    from_schema: https://w3id.org/nmdc/nmdc
    mappings:
    - prov:startedAtTime
    rank: 1000
    alias: started_at_time
    owner: MagsAnalysis
    domain_of:
    - WorkflowExecution
    range: string
    required: true
    pattern: ^([\+-]?\d{4}(?!\d{2}\b))((-?)((0[1-9]|1[0-2])(\3([12]\d|0[1-9]|3[01]))?|W([0-4]\d|5[0-2])(-?[1-7])?|(00[1-9]|0[1-9]\d|[12]\d{2}|3([0-5]\d|6[1-6])))([T\s]((([01]\d|2[0-3])((:?)[0-5]\d)?|24\:?00)([\.,]\d+(?!:))?)?(\17[0-5]\d([\.,]\d+)?)?([zZ]|([\+-])([01]\d|2[0-3]):?([0-5]\d)?)?)?)?$
  version:
    name: version
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    alias: version
    owner: MagsAnalysis
    domain_of:
    - WorkflowExecution
    range: string
  was_informed_by:
    name: was_informed_by
    from_schema: https://w3id.org/nmdc/nmdc
    mappings:
    - prov:wasInformedBy
    rank: 1000
    alias: was_informed_by
    owner: MagsAnalysis
    domain_of:
    - WorkflowExecution
    range: DataGeneration
    required: true
    structured_pattern:
      syntax: '{id_nmdc_prefix}:(omprc|dgns)-{id_shoulder}-{id_blade}$'
      interpolated: true
  has_input:
    name: has_input
    description: An input to a process.
    from_schema: https://w3id.org/nmdc/nmdc
    aliases:
    - input
    rank: 1000
    alias: has_input
    owner: MagsAnalysis
    domain_of:
    - PlannedProcess
    range: NamedThing
    required: true
    multivalued: true
    structured_pattern:
      syntax: '{id_nmdc_prefix}:(dobj)-{id_shoulder}-{id_blade}$'
      interpolated: true
  has_output:
    name: has_output
    description: An output from a process.
    from_schema: https://w3id.org/nmdc/nmdc
    aliases:
    - output
    rank: 1000
    alias: has_output
    owner: MagsAnalysis
    domain_of:
    - PlannedProcess
    range: NamedThing
    multivalued: true
    structured_pattern:
      syntax: '{id_nmdc_prefix}:(dobj)-{id_shoulder}-{id_blade}$'
      interpolated: true
  processing_institution:
    name: processing_institution
    description: The organization that processed the sample.
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    alias: processing_institution
    owner: MagsAnalysis
    domain_of:
    - PlannedProcess
    range: ProcessingInstitutionEnum
  protocol_link:
    name: protocol_link
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    alias: protocol_link
    owner: MagsAnalysis
    domain_of:
    - PlannedProcess
    - Study
    range: Protocol
  start_date:
    name: start_date
    description: The date on which any process or activity was started
    todos:
    - add date string validation pattern
    comments:
    - We are using string representations of dates until all components of our ecosystem
      can handle ISO 8601 dates
    - The date should be formatted as YYYY-MM-DD
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    alias: start_date
    owner: MagsAnalysis
    domain_of:
    - PlannedProcess
    range: string
  end_date:
    name: end_date
    description: The date on which any process or activity was ended
    todos:
    - add date string validation pattern
    comments:
    - We are using string representations of dates until all components of our ecosystem
      can handle ISO 8601 dates
    - The date should be formatted as YYYY-MM-DD
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    alias: end_date
    owner: MagsAnalysis
    domain_of:
    - PlannedProcess
    range: string
  qc_status:
    name: qc_status
    description: Stores information about the result of a process (e.g., the process
      of sequencing a library may have a qc_status of 'fail' if not enough data
      was generated)
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    alias: qc_status
    owner: MagsAnalysis
    domain_of:
    - PlannedProcess
    range: StatusEnum
  qc_comment:
    name: qc_comment
    description: Slot to store additional comments about laboratory or workflow output.
      For workflow output it may describe the particular workflow stage that failed.
      (e.g., Failed at call-stage due to a malformed fastq file).
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    alias: qc_comment
    owner: MagsAnalysis
    domain_of:
    - PlannedProcess
    range: string
  has_failure_categorization:
    name: has_failure_categorization
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    alias: has_failure_categorization
    owner: MagsAnalysis
    domain_of:
    - PlannedProcess
    range: FailureCategorization
    multivalued: true
    inlined: true
    inlined_as_list: true
  id:
    name: id
    description: A unique identifier for a thing. Must be either a CURIE shorthand
      for a URI or a complete URI
    notes:
    - 'abstracted pattern: prefix:typecode-authshoulder-blade(.version)?(_seqsuffix)?'
    - a minimum length of 3 characters is suggested for typecodes, but 1 or 2 characters
      will be accepted
    - typecodes must correspond 1:1 to a class in the NMDC schema. this will be checked
      via per-class id slot usage assertions
    - minting authority shoulders should probably be enumerated and checked in the
      pattern
    examples:
    - value: nmdc:mgmag-00-x012.1_7_c1
      description: https://github.com/microbiomedata/nmdc-schema/pull/499#discussion_r1018499248
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    identifier: true
    alias: id
    owner: MagsAnalysis
    domain_of:
    - NamedThing
    range: uriorcurie
    required: true
    pattern: ^[a-zA-Z0-9][a-zA-Z0-9_\.]+:[a-zA-Z0-9_][a-zA-Z0-9_\-\/\.,]*$
    structured_pattern:
      syntax: '{id_nmdc_prefix}:wfmag-{id_shoulder}-{id_blade}{id_version}$'
      interpolated: true
  name:
    name: name
    description: A human readable label for an entity
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    alias: name
    owner: MagsAnalysis
    domain_of:
    - PersonValue
    - NamedThing
    - Protocol
    range: string
  description:
    name: description
    description: a human-readable description of a thing
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    slot_uri: dcterms:description
    alias: description
    owner: MagsAnalysis
    domain_of:
    - ImageValue
    - NamedThing
    range: string
  alternative_identifiers:
    name: alternative_identifiers
    description: A list of alternative identifiers for the entity.
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    alias: alternative_identifiers
    owner: MagsAnalysis
    domain_of:
    - MetaboliteIdentification
    - NamedThing
    range: uriorcurie
    multivalued: true
    pattern: ^[a-zA-Z0-9][a-zA-Z0-9_\.]+:[a-zA-Z0-9_][a-zA-Z0-9_\-\/\.,\(\)\=\#]*$
  type:
    name: type
    description: the class_uri of the class that has been instantiated
    notes:
    - replaces legacy nmdc:type slot
    - makes it easier to read example data files
    - required for polymorphic MongoDB collections
    examples:
    - value: nmdc:Biosample
    - value: nmdc:Study
    from_schema: https://w3id.org/nmdc/nmdc
    see_also:
    - https://github.com/microbiomedata/nmdc-schema/issues/1048
    - https://github.com/microbiomedata/nmdc-schema/issues/1233
    - https://github.com/microbiomedata/nmdc-schema/issues/248
    rank: 1000
    slot_uri: rdf:type
    designates_type: true
    alias: type
    owner: MagsAnalysis
    domain_of:
    - EukEval
    - FunctionalAnnotationAggMember
    - PeptideQuantification
    - ProteinQuantification
    - MobilePhaseSegment
    - PortionOfSubstance
    - MagBin
    - MetaboliteIdentification
    - GenomeFeature
    - FunctionalAnnotation
    - AttributeValue
    - NamedThing
    - OntologyRelation
    - FailureCategorization
    - Protocol
    - CreditAssociation
    - Doi
    range: uriorcurie
    required: true
class_uri: nmdc:MagsAnalysis
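
The `pattern` values in the induced source are ordinary regular expressions, so they can be tried out directly. The sketch below copies the ISO-8601 pattern from `started_at_time` / `ended_at_time` and the `img_identifiers` pattern into Python's `re` module. The helper names and the sample values are hypothetical, and Python's regex dialect is assumed to be close enough to the one the schema tooling uses for these cases.

```python
import re

# Copied verbatim from started_at_time / ended_at_time, split across lines
# for readability only; the concatenated string is the original pattern.
ISO_8601 = re.compile(
    r"^([\+-]?\d{4}(?!\d{2}\b))((-?)((0[1-9]|1[0-2])(\3([12]\d|0[1-9]|3[01]))?"
    r"|W([0-4]\d|5[0-2])(-?[1-7])?|(00[1-9]|0[1-9]\d|[12]\d{2}|3([0-5]\d|6[1-6])))"
    r"([T\s]((([01]\d|2[0-3])((:?)[0-5]\d)?|24\:?00)([\.,]\d+(?!:))?)?"
    r"(\17[0-5]\d([\.,]\d+)?)?([zZ]|([\+-])([01]\d|2[0-3]):?([0-5]\d)?)?)?)?$"
)

# Copied verbatim from img_identifiers.
IMG_TAXON = re.compile(r"^img\.taxon:[a-zA-Z0-9_][a-zA-Z0-9_\/\.]*$")


def is_valid_timestamp(value: str) -> bool:
    """True if the value satisfies the ISO-8601 pattern above."""
    return ISO_8601.search(value) is not None


def check_img_identifiers(values: list[str]) -> bool:
    """img_identifiers is multivalued but capped by maximum_cardinality: 1."""
    return len(values) <= 1 and all(IMG_TAXON.search(v) for v in values)


# Hypothetical sample values.
print(is_valid_timestamp("2021-01-10T00:00:00+00:00"))
print(is_valid_timestamp("2021-13-01"))
print(check_img_identifiers(["img.taxon:3300012345"]))
```

As the note in the source says, the timestamp regex may not be complete, so a match here should be read as "passes this pattern" rather than "is a valid ISO-8601 value".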