
Class: Read quality control analysis activity (ReadQcAnalysis)

A workflow execution activity that performs quality control on raw Illumina reads including quality trimming, artifact removal, linker trimming, adapter trimming, spike-in removal, and human/cat/dog/mouse/microbe contaminant removal

URI: nmdc:ReadQcAnalysis

classDiagram
  class ReadQcAnalysis
  click ReadQcAnalysis href "../ReadQcAnalysis"
  WorkflowExecution <|-- ReadQcAnalysis
  click WorkflowExecution href "../WorkflowExecution"
  ReadQcAnalysis : alternative_identifiers
  ReadQcAnalysis : description
  ReadQcAnalysis : end_date
  ReadQcAnalysis : ended_at_time
  ReadQcAnalysis : execution_resource
  ReadQcAnalysis --> "1" ExecutionResourceEnum : execution_resource
  click ExecutionResourceEnum href "../ExecutionResourceEnum"
  ReadQcAnalysis : git_url
  ReadQcAnalysis : has_failure_categorization
  ReadQcAnalysis --> "*" FailureCategorization : has_failure_categorization
  click FailureCategorization href "../FailureCategorization"
  ReadQcAnalysis : has_input
  ReadQcAnalysis --> "1..*" NamedThing : has_input
  click NamedThing href "../NamedThing"
  ReadQcAnalysis : has_output
  ReadQcAnalysis --> "*" NamedThing : has_output
  click NamedThing href "../NamedThing"
  ReadQcAnalysis : id
  ReadQcAnalysis : input_base_count
  ReadQcAnalysis : input_read_bases
  ReadQcAnalysis : input_read_count
  ReadQcAnalysis : name
  ReadQcAnalysis : output_base_count
  ReadQcAnalysis : output_read_bases
  ReadQcAnalysis : output_read_count
  ReadQcAnalysis : processing_institution
  ReadQcAnalysis --> "0..1" ProcessingInstitutionEnum : processing_institution
  click ProcessingInstitutionEnum href "../ProcessingInstitutionEnum"
  ReadQcAnalysis : protocol_link
  ReadQcAnalysis --> "0..1" Protocol : protocol_link
  click Protocol href "../Protocol"
  ReadQcAnalysis : qc_comment
  ReadQcAnalysis : qc_status
  ReadQcAnalysis --> "0..1" StatusEnum : qc_status
  click StatusEnum href "../StatusEnum"
  ReadQcAnalysis : start_date
  ReadQcAnalysis : started_at_time
  ReadQcAnalysis : type
  ReadQcAnalysis : version
  ReadQcAnalysis : was_informed_by
  ReadQcAnalysis --> "1" DataGeneration : was_informed_by
  click DataGeneration href "../DataGeneration"

Inheritance

Slots

| Name | Cardinality and Range | Description | Inheritance |
| --- | --- | --- | --- |
| input_base_count | 0..1 Float | The nucleotide base count number of input reads for QC analysis | direct |
| input_read_bases | 0..1 Float | TODO | direct |
| input_read_count | 0..1 Float | The sequence count number of input reads for QC analysis | direct |
| output_base_count | 0..1 Float | After QC analysis nucleotide base count number | direct |
| output_read_bases | 0..1 Float | TODO | direct |
| output_read_count | 0..1 Float | After QC analysis sequence count number | direct |
| ended_at_time | 0..1 String | | WorkflowExecution |
| execution_resource | 1 ExecutionResourceEnum | The computing resource or facility where the workflow was executed | WorkflowExecution |
| git_url | 1 String | The url that points to the exact github location of a workflow | WorkflowExecution |
| started_at_time | 1 String | | WorkflowExecution |
| version | 0..1 String | | WorkflowExecution |
| was_informed_by | 1 DataGeneration | | WorkflowExecution |
| has_input | 1..* NamedThing | An input to a process | PlannedProcess |
| has_output | * NamedThing | An output from a process | PlannedProcess |
| processing_institution | 0..1 ProcessingInstitutionEnum | The organization that processed the sample | PlannedProcess |
| protocol_link | 0..1 Protocol | | PlannedProcess |
| start_date | 0..1 String | The date on which any process or activity was started | PlannedProcess |
| end_date | 0..1 String | The date on which any process or activity was ended | PlannedProcess |
| qc_status | 0..1 StatusEnum | Stores information about the result of a process (ie the process of sequencin... | PlannedProcess |
| qc_comment | 0..1 String | Slot to store additional comments about laboratory or workflow output | PlannedProcess |
| has_failure_categorization | * FailureCategorization | | PlannedProcess |
| id | 1 Uriorcurie | A unique identifier for a thing | NamedThing |
| name | 0..1 String | A human readable label for an entity | NamedThing |
| description | 0..1 String | a human-readable description of a thing | NamedThing |
| alternative_identifiers | * Uriorcurie | A list of alternative identifiers for the entity | NamedThing |
| type | 1 Uriorcurie | the class_uri of the class that has been instantiated | NamedThing |
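
The cardinalities listed for these slots imply that a ReadQcAnalysis record needs seven slots: every slot with cardinality 1 or 1..*. As a minimal sketch (not part of the NMDC tooling), the check below lists which of those slots are missing from a plain Python dictionary. The helper name `missing_required_slots` and every identifier and URL in `example_record` are hypothetical and only illustrate the shape of a record.

```python
# Slots with cardinality 1 or 1..* in the Slots listing.
REQUIRED_SLOTS = (
    "id",
    "type",
    "execution_resource",
    "git_url",
    "started_at_time",
    "was_informed_by",
    "has_input",
)


def missing_required_slots(record: dict) -> list:
    """Return the required slot names that are absent or empty in record."""
    return [slot for slot in REQUIRED_SLOTS if record.get(slot) in (None, "", [])]


# Hypothetical record: the identifiers and the git_url are made up for illustration.
example_record = {
    "id": "nmdc:wfrqc-11-abc123.1",
    "type": "nmdc:ReadQcAnalysis",
    "execution_resource": "NERSC-Cori",
    "git_url": "https://example.org/hypothetical-readqc-workflow",
    "started_at_time": "2021-08-05T14:48:00Z",
    "was_informed_by": "nmdc:omprc-11-xyz789",
    "has_input": ["nmdc:dobj-11-raw001"],
    "input_read_count": 1000.0,
    "output_read_count": 950.0,
}

print(missing_required_slots(example_record))  # prints []
```

This only checks presence; it does not validate ranges, enumerations, or identifier patterns.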

Identifier and Mapping Information

Schema Source

Mappings

| Mapping Type | Mapped Value |
| --- | --- |
| self | nmdc:ReadQcAnalysis |
| native | nmdc:ReadQcAnalysis |

LinkML Source

Direct

name: ReadQcAnalysis
description: A workflow execution activity that performs quality control on raw Illumina
  reads including quality trimming, artifact removal, linker trimming, adapter trimming,
  spike-in removal, and human/cat/dog/mouse/microbe contaminant removal
title: Read quality control analysis activity
from_schema: https://w3id.org/nmdc/nmdc
is_a: WorkflowExecution
slots:
- input_base_count
- input_read_bases
- input_read_count
- output_base_count
- output_read_bases
- output_read_count
slot_usage:
  id:
    name: id
    required: true
    structured_pattern:
      syntax: '{id_nmdc_prefix}:wfrqc-{id_shoulder}-{id_blade}{id_version}$'
      interpolated: true
  was_informed_by:
    name: was_informed_by
    structured_pattern:
      syntax: '{id_nmdc_prefix}:(omprc|dgns)-{id_shoulder}-{id_blade}$'
      interpolated: true
class_uri: nmdc:ReadQcAnalysis
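
The `structured_pattern` entries above are marked `interpolated: true`, so each `{placeholder}` is replaced by a regular-expression fragment defined in the schema's settings before the pattern is applied. Those setting values are not shown on this page. The sketch below uses assumed values (`ASSUMED_SETTINGS`) only to illustrate how the interpolation works; check the schema's settings for the real fragments. The helper names `interpolate` and `matches` are hypothetical.

```python
import re

# ASSUMPTION: these fragments are illustrative stand-ins for the schema settings,
# which are not shown on this page.
ASSUMED_SETTINGS = {
    "id_nmdc_prefix": r"^(nmdc)",
    "id_shoulder": r"([0-9][a-z]{0,6}[0-9])",
    "id_blade": r"([A-Za-z0-9]{1,})",
    "id_version": r"(\.[0-9]{1,})",
}

# Syntax strings copied from the slot_usage block.
ID_SYNTAX = "{id_nmdc_prefix}:wfrqc-{id_shoulder}-{id_blade}{id_version}$"
WAS_INFORMED_BY_SYNTAX = "{id_nmdc_prefix}:(omprc|dgns)-{id_shoulder}-{id_blade}$"


def interpolate(syntax: str, settings: dict) -> str:
    """Replace each {name} placeholder with the matching settings fragment."""
    return re.sub(r"\{(\w+)\}", lambda m: settings[m.group(1)], syntax)


def matches(syntax: str, value: str, settings: dict = ASSUMED_SETTINGS) -> bool:
    """Check a value against the interpolated pattern (which carries its own anchors)."""
    return re.search(interpolate(syntax, settings), value) is not None


print(matches(ID_SYNTAX, "nmdc:wfrqc-11-abc123.1"))             # prints True
print(matches(WAS_INFORMED_BY_SYNTAX, "nmdc:bsm-11-abc123"))    # prints False
```

Under these assumed fragments, the `id` pattern requires the `wfrqc` typecode and a version suffix, while `was_informed_by` accepts only the `omprc` or `dgns` typecodes.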

Induced

name: ReadQcAnalysis
description: A workflow execution activity that performs quality control on raw Illumina
  reads including quality trimming, artifact removal, linker trimming, adapter trimming,
  spike-in removal, and human/cat/dog/mouse/microbe contaminant removal
title: Read quality control analysis activity
from_schema: https://w3id.org/nmdc/nmdc
is_a: WorkflowExecution
slot_usage:
  id:
    name: id
    required: true
    structured_pattern:
      syntax: '{id_nmdc_prefix}:wfrqc-{id_shoulder}-{id_blade}{id_version}$'
      interpolated: true
  was_informed_by:
    name: was_informed_by
    structured_pattern:
      syntax: '{id_nmdc_prefix}:(omprc|dgns)-{id_shoulder}-{id_blade}$'
      interpolated: true
attributes:
  input_base_count:
    name: input_base_count
    description: The nucleotide base count number of input reads for QC analysis.
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    is_a: read_qc_analysis_statistic
    alias: input_base_count
    owner: ReadQcAnalysis
    domain_of:
    - ReadQcAnalysis
    range: float
  input_read_bases:
    name: input_read_bases
    description: 'TODO      '
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    alias: input_read_bases
    owner: ReadQcAnalysis
    domain_of:
    - ReadQcAnalysis
    range: float
  input_read_count:
    name: input_read_count
    description: The sequence count number of input reads for QC analysis.
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    is_a: read_qc_analysis_statistic
    alias: input_read_count
    owner: ReadQcAnalysis
    domain_of:
    - ReadQcAnalysis
    range: float
  output_base_count:
    name: output_base_count
    description: After QC analysis nucleotide base count number.
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    is_a: read_qc_analysis_statistic
    alias: output_base_count
    owner: ReadQcAnalysis
    domain_of:
    - ReadQcAnalysis
    range: float
  output_read_bases:
    name: output_read_bases
    description: TODO
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    alias: output_read_bases
    owner: ReadQcAnalysis
    domain_of:
    - ReadQcAnalysis
    range: float
  output_read_count:
    name: output_read_count
    description: After QC analysis sequence count number.
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    is_a: read_qc_analysis_statistic
    alias: output_read_count
    owner: ReadQcAnalysis
    domain_of:
    - ReadQcAnalysis
    range: float
  ended_at_time:
    name: ended_at_time
    notes:
    - 'The regex for ISO-8601 format was taken from here: https://www.myintervals.com/blog/2009/05/20/iso-8601-date-validation-that-doesnt-suck/
      It may not be complete, but it is good enough for now.'
    from_schema: https://w3id.org/nmdc/nmdc
    mappings:
    - prov:endedAtTime
    rank: 1000
    alias: ended_at_time
    owner: ReadQcAnalysis
    domain_of:
    - WorkflowExecution
    range: string
    pattern: ^([\+-]?\d{4}(?!\d{2}\b))((-?)((0[1-9]|1[0-2])(\3([12]\d|0[1-9]|3[01]))?|W([0-4]\d|5[0-2])(-?[1-7])?|(00[1-9]|0[1-9]\d|[12]\d{2}|3([0-5]\d|6[1-6])))([T\s]((([01]\d|2[0-3])((:?)[0-5]\d)?|24\:?00)([\.,]\d+(?!:))?)?(\17[0-5]\d([\.,]\d+)?)?([zZ]|([\+-])([01]\d|2[0-3]):?([0-5]\d)?)?)?)?$
  execution_resource:
    name: execution_resource
    description: The computing resource or facility where the workflow was executed.
    examples:
    - value: NERSC-Cori
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    alias: execution_resource
    owner: ReadQcAnalysis
    domain_of:
    - WorkflowExecution
    range: ExecutionResourceEnum
    required: true
  git_url:
    name: git_url
    description: The url that points to the exact github location of a workflow.
    examples:
    - value: https://github.com/microbiomedata/mg_annotation/releases/tag/0.1
    - value: https://github.com/microbiomedata/metaMS/blob/master/metaMS/gcmsWorkflow.py
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    alias: git_url
    owner: ReadQcAnalysis
    domain_of:
    - WorkflowExecution
    range: string
    required: true
  started_at_time:
    name: started_at_time
    notes:
    - 'The regex for ISO-8601 format was taken from here: https://www.myintervals.com/blog/2009/05/20/iso-8601-date-validation-that-doesnt-suck/
      It may not be complete, but it is good enough for now.'
    from_schema: https://w3id.org/nmdc/nmdc
    mappings:
    - prov:startedAtTime
    rank: 1000
    alias: started_at_time
    owner: ReadQcAnalysis
    domain_of:
    - WorkflowExecution
    range: string
    required: true
    pattern: ^([\+-]?\d{4}(?!\d{2}\b))((-?)((0[1-9]|1[0-2])(\3([12]\d|0[1-9]|3[01]))?|W([0-4]\d|5[0-2])(-?[1-7])?|(00[1-9]|0[1-9]\d|[12]\d{2}|3([0-5]\d|6[1-6])))([T\s]((([01]\d|2[0-3])((:?)[0-5]\d)?|24\:?00)([\.,]\d+(?!:))?)?(\17[0-5]\d([\.,]\d+)?)?([zZ]|([\+-])([01]\d|2[0-3]):?([0-5]\d)?)?)?)?$
  version:
    name: version
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    alias: version
    owner: ReadQcAnalysis
    domain_of:
    - WorkflowExecution
    range: string
  was_informed_by:
    name: was_informed_by
    from_schema: https://w3id.org/nmdc/nmdc
    mappings:
    - prov:wasInformedBy
    rank: 1000
    alias: was_informed_by
    owner: ReadQcAnalysis
    domain_of:
    - WorkflowExecution
    range: DataGeneration
    required: true
    structured_pattern:
      syntax: '{id_nmdc_prefix}:(omprc|dgns)-{id_shoulder}-{id_blade}$'
      interpolated: true
  has_input:
    name: has_input
    description: An input to a process.
    from_schema: https://w3id.org/nmdc/nmdc
    aliases:
    - input
    rank: 1000
    alias: has_input
    owner: ReadQcAnalysis
    domain_of:
    - PlannedProcess
    range: NamedThing
    required: true
    multivalued: true
    structured_pattern:
      syntax: '{id_nmdc_prefix}:(dobj)-{id_shoulder}-{id_blade}$'
      interpolated: true
  has_output:
    name: has_output
    description: An output from a process.
    from_schema: https://w3id.org/nmdc/nmdc
    aliases:
    - output
    rank: 1000
    alias: has_output
    owner: ReadQcAnalysis
    domain_of:
    - PlannedProcess
    range: NamedThing
    multivalued: true
    structured_pattern:
      syntax: '{id_nmdc_prefix}:(dobj)-{id_shoulder}-{id_blade}$'
      interpolated: true
  processing_institution:
    name: processing_institution
    description: The organization that processed the sample.
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    alias: processing_institution
    owner: ReadQcAnalysis
    domain_of:
    - PlannedProcess
    range: ProcessingInstitutionEnum
  protocol_link:
    name: protocol_link
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    alias: protocol_link
    owner: ReadQcAnalysis
    domain_of:
    - PlannedProcess
    - Study
    range: Protocol
  start_date:
    name: start_date
    description: The date on which any process or activity was started
    todos:
    - add date string validation pattern
    comments:
    - We are using string representations of dates until all components of our ecosystem
      can handle ISO 8601 dates
    - The date should be formatted as YYYY-MM-DD
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    alias: start_date
    owner: ReadQcAnalysis
    domain_of:
    - PlannedProcess
    range: string
  end_date:
    name: end_date
    description: The date on which any process or activity was ended
    todos:
    - add date string validation pattern
    comments:
    - We are using string representations of dates until all components of our ecosystem
      can handle ISO 8601 dates
    - The date should be formatted as YYYY-MM-DD
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    alias: end_date
    owner: ReadQcAnalysis
    domain_of:
    - PlannedProcess
    range: string
  qc_status:
    name: qc_status
    description: Stores information about the result of a process (i.e. the process
      of sequencing a library may have a qc_status of 'fail' if not enough data
      was generated)
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    alias: qc_status
    owner: ReadQcAnalysis
    domain_of:
    - PlannedProcess
    range: StatusEnum
  qc_comment:
    name: qc_comment
    description: Slot to store additional comments about laboratory or workflow output.
      For workflow output it may describe the particular workflow stage that failed.
      (ie Failed at call-stage due to a malformed fastq file).
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    alias: qc_comment
    owner: ReadQcAnalysis
    domain_of:
    - PlannedProcess
    range: string
  has_failure_categorization:
    name: has_failure_categorization
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    alias: has_failure_categorization
    owner: ReadQcAnalysis
    domain_of:
    - PlannedProcess
    range: FailureCategorization
    multivalued: true
    inlined: true
    inlined_as_list: true
  id:
    name: id
    description: A unique identifier for a thing. Must be either a CURIE shorthand
      for a URI or a complete URI
    notes:
    - 'abstracted pattern: prefix:typecode-authshoulder-blade(.version)?(_seqsuffix)?'
    - a minimum length of 3 characters is suggested for typecodes, but 1 or 2 characters
      will be accepted
    - typecodes must correspond 1:1 to a class in the NMDC schema. this will be checked
      via per-class id slot usage assertions
    - minting authority shoulders should probably be enumerated and checked in the
      pattern
    examples:
    - value: nmdc:mgmag-00-x012.1_7_c1
      description: https://github.com/microbiomedata/nmdc-schema/pull/499#discussion_r1018499248
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    identifier: true
    alias: id
    owner: ReadQcAnalysis
    domain_of:
    - NamedThing
    range: uriorcurie
    required: true
    pattern: ^[a-zA-Z0-9][a-zA-Z0-9_\.]+:[a-zA-Z0-9_][a-zA-Z0-9_\-\/\.,]*$
    structured_pattern:
      syntax: '{id_nmdc_prefix}:wfrqc-{id_shoulder}-{id_blade}{id_version}$'
      interpolated: true
  name:
    name: name
    description: A human readable label for an entity
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    alias: name
    owner: ReadQcAnalysis
    domain_of:
    - PersonValue
    - NamedThing
    - Protocol
    range: string
  description:
    name: description
    description: a human-readable description of a thing
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    slot_uri: dcterms:description
    alias: description
    owner: ReadQcAnalysis
    domain_of:
    - ImageValue
    - NamedThing
    range: string
  alternative_identifiers:
    name: alternative_identifiers
    description: A list of alternative identifiers for the entity.
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    alias: alternative_identifiers
    owner: ReadQcAnalysis
    domain_of:
    - MetaboliteIdentification
    - NamedThing
    range: uriorcurie
    multivalued: true
    pattern: ^[a-zA-Z0-9][a-zA-Z0-9_\.]+:[a-zA-Z0-9_][a-zA-Z0-9_\-\/\.,\(\)\=\#]*$
  type:
    name: type
    description: the class_uri of the class that has been instantiated
    notes:
    - replaces legacy nmdc:type slot
    - makes it easier to read example data files
    - required for polymorphic MongoDB collections
    examples:
    - value: nmdc:Biosample
    - value: nmdc:Study
    from_schema: https://w3id.org/nmdc/nmdc
    see_also:
    - https://github.com/microbiomedata/nmdc-schema/issues/1048
    - https://github.com/microbiomedata/nmdc-schema/issues/1233
    - https://github.com/microbiomedata/nmdc-schema/issues/248
    rank: 1000
    slot_uri: rdf:type
    designates_type: true
    alias: type
    owner: ReadQcAnalysis
    domain_of:
    - EukEval
    - FunctionalAnnotationAggMember
    - PeptideQuantification
    - ProteinQuantification
    - MobilePhaseSegment
    - PortionOfSubstance
    - MagBin
    - MetaboliteIdentification
    - GenomeFeature
    - FunctionalAnnotation
    - AttributeValue
    - NamedThing
    - OntologyRelation
    - FailureCategorization
    - Protocol
    - CreditAssociation
    - Doi
    range: uriorcurie
    required: true
class_uri: nmdc:ReadQcAnalysis
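
The `started_at_time` and `ended_at_time` slots in the induced definition carry the same ISO-8601 `pattern`. As a rough sketch of how a client might pre-check timestamps before submitting a record, the snippet below compiles that pattern, copied verbatim from the YAML, with Python's `re` module. The helper name `is_valid_timestamp` is hypothetical, and as the schema notes say, the regex may not cover every ISO-8601 form.

```python
import re

# Pattern copied verbatim from the started_at_time / ended_at_time slots.
ISO_8601_PATTERN = re.compile(
    r"^([\+-]?\d{4}(?!\d{2}\b))((-?)((0[1-9]|1[0-2])(\3([12]\d|0[1-9]|3[01]))?"
    r"|W([0-4]\d|5[0-2])(-?[1-7])?|(00[1-9]|0[1-9]\d|[12]\d{2}|3([0-5]\d|6[1-6])))"
    r"([T\s]((([01]\d|2[0-3])((:?)[0-5]\d)?|24\:?00)([\.,]\d+(?!:))?)?"
    r"(\17[0-5]\d([\.,]\d+)?)?([zZ]|([\+-])([01]\d|2[0-3]):?([0-5]\d)?)?)?)?$"
)


def is_valid_timestamp(value: str) -> bool:
    """Return True if value matches the schema's ISO-8601 pattern."""
    return ISO_8601_PATTERN.search(value) is not None


print(is_valid_timestamp("2021-08-05T14:48:00Z"))  # prints True
print(is_valid_timestamp("not-a-date"))            # prints False
```

The pattern is split across adjacent raw-string literals only for readability; Python concatenates them into the single expression given in the schema.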