Class: NucleotideSequencing
A DataGeneration in which the sequence of DNA or RNA molecules is generated.
URI: nmdc:NucleotideSequencing
classDiagram
  class NucleotideSequencing
  click NucleotideSequencing href "../NucleotideSequencing"
    DataGeneration <|-- NucleotideSequencing
      click DataGeneration href "../DataGeneration"
  NucleotideSequencing : add_date
  NucleotideSequencing : alternative_identifiers
  NucleotideSequencing : analyte_category
      NucleotideSequencing --> "1" NucleotideSequencingEnum : analyte_category
    click NucleotideSequencingEnum href "../NucleotideSequencingEnum"
  NucleotideSequencing : associated_studies
      NucleotideSequencing --> "1..*" Study : associated_studies
    click Study href "../Study"
  NucleotideSequencing : description
  NucleotideSequencing : end_date
  NucleotideSequencing : gold_sequencing_project_identifiers
  NucleotideSequencing : has_failure_categorization
      NucleotideSequencing --> "*" FailureCategorization : has_failure_categorization
    click FailureCategorization href "../FailureCategorization"
  NucleotideSequencing : has_input
      NucleotideSequencing --> "1..*" Sample : has_input
    click Sample href "../Sample"
  NucleotideSequencing : has_output
      NucleotideSequencing --> "*" DataObject : has_output
    click DataObject href "../DataObject"
  NucleotideSequencing : id
  NucleotideSequencing : insdc_bioproject_identifiers
  NucleotideSequencing : insdc_experiment_identifiers
  NucleotideSequencing : instrument_instance_specifier
  NucleotideSequencing : instrument_used
      NucleotideSequencing --> "*" Instrument : instrument_used
    click Instrument href "../Instrument"
  NucleotideSequencing : mod_date
  NucleotideSequencing : name
  NucleotideSequencing : ncbi_project_name
  NucleotideSequencing : principal_investigator
      NucleotideSequencing --> "0..1" PersonValue : principal_investigator
    click PersonValue href "../PersonValue"
  NucleotideSequencing : processing_institution
      NucleotideSequencing --> "0..1" ProcessingInstitutionEnum : processing_institution
    click ProcessingInstitutionEnum href "../ProcessingInstitutionEnum"
  NucleotideSequencing : protocol_link
      NucleotideSequencing --> "0..1" Protocol : protocol_link
    click Protocol href "../Protocol"
  NucleotideSequencing : qc_comment
  NucleotideSequencing : qc_status
      NucleotideSequencing --> "0..1" StatusEnum : qc_status
    click StatusEnum href "../StatusEnum"
  NucleotideSequencing : start_date
  NucleotideSequencing : type
Inheritance
- NamedThing
    - PlannedProcess
        - DataEmitterProcess
            - DataGeneration
                - NucleotideSequencing
Slots
| Name | Cardinality and Range | Description | Inheritance | 
|---|---|---|---|
| gold_sequencing_project_identifiers | * ExternalIdentifier | identifiers for corresponding sequencing project in GOLD | direct | 
| insdc_bioproject_identifiers | * ExternalIdentifier | identifiers for corresponding project in INSDC Bioproject | direct | 
| insdc_experiment_identifiers | * ExternalIdentifier | | direct | 
| ncbi_project_name | 0..1 String | | direct | 
| add_date | 0..1 String | The date on which the information was added to the database | DataGeneration | 
| analyte_category | 1 NucleotideSequencingEnum | The type of analyte(s) that were measured in the data generation process | DataGeneration | 
| associated_studies | 1..* Study | The study associated with a resource | DataGeneration | 
| instrument_used | * Instrument | What instrument was used during DataGeneration or MaterialProcessing | DataGeneration | 
| mod_date | 0..1 String | The last date on which the database information was modified | DataGeneration | 
| principal_investigator | 0..1 PersonValue | Principal Investigator who led the study and/or generated the dataset | DataGeneration | 
| instrument_instance_specifier | 0..1 String | A unique value that identifies an individual instrument instance, such as a s... | DataGeneration | 
| has_input | 1..* Sample | An input to a process | PlannedProcess | 
| has_output | * DataObject | An output from a process | PlannedProcess | 
| processing_institution | 0..1 ProcessingInstitutionEnum | The organization that processed the sample | PlannedProcess | 
| protocol_link | 0..1 Protocol | | PlannedProcess | 
| start_date | 0..1 String | The date on which any process or activity was started | PlannedProcess | 
| end_date | 0..1 String | The date on which any process or activity was ended | PlannedProcess | 
| qc_status | 0..1 StatusEnum | Stores information about the result of a process (ie the process of sequencin... | PlannedProcess | 
| qc_comment | 0..1 String | Slot to store additional comments about laboratory or workflow output | PlannedProcess | 
| has_failure_categorization | * FailureCategorization | | PlannedProcess | 
| id | 1 Uriorcurie | A unique identifier for a thing | NamedThing | 
| name | 0..1 String | A human readable label for an entity | NamedThing | 
| description | 0..1 String | a human-readable description of a thing | NamedThing | 
| alternative_identifiers | * Uriorcurie | A list of alternative identifiers for the entity | NamedThing | 
| type | 1 Uriorcurie | the class_uri of the class that has been instantiated | NamedThing | 
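
The cardinalities in the table above mark five slots as required (`id`, `type`, `analyte_category`, `associated_studies`, `has_input`). A minimal Python sketch of a record and a required-slot check could look like the following. The identifier values and the `metagenome` analyte category are illustrative assumptions, not values taken from this page, and `missing_required` is a hypothetical helper rather than part of any NMDC tooling.

```python
from typing import List

# Slots whose cardinality in the slots table is "1" or "1..*".
REQUIRED_SLOTS = ("id", "type", "analyte_category", "associated_studies", "has_input")


def missing_required(record: dict) -> List[str]:
    """Return the required slot names that are absent or empty in `record`."""
    missing = []
    for slot in REQUIRED_SLOTS:
        value = record.get(slot)
        if value is None or value == "" or value == []:
            missing.append(slot)
    return missing


# Hypothetical record: every identifier below is made up for illustration.
example = {
    "id": "nmdc:dgns-11-abc123",
    "type": "nmdc:NucleotideSequencing",
    "analyte_category": "metagenome",  # assumed NucleotideSequencingEnum value
    "associated_studies": ["nmdc:sty-11-abc123"],  # 1..* Study
    "has_input": ["nmdc:bsm-11-abc123"],  # 1..* Sample
}

print(missing_required(example))
```

This only checks presence. Range and pattern checks would normally be left to a schema-aware validator.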
Usages
Comments
- For example, data generated from an Illumina or Pacific Biosciences instrument.
Identifier and Mapping Information
Schema Source
- from schema: https://w3id.org/nmdc/nmdc
Mappings
| Mapping Type | Mapped Value | 
|---|---|
LinkML Source
Direct
name: NucleotideSequencing
description: A DataGeneration in which the sequence of DNA or RNA molecules is generated.
comments:
- For example data generated from an Illumina or Pacific Biosciences instrument.
from_schema: https://w3id.org/nmdc/nmdc
is_a: DataGeneration
slots:
- gold_sequencing_project_identifiers
- insdc_bioproject_identifiers
- insdc_experiment_identifiers
- ncbi_project_name
slot_usage:
  id:
    name: id
    structured_pattern:
      syntax: '{id_nmdc_prefix}:(dgns|omprc)-{id_shoulder}-{id_blade}$'
      interpolated: true
  analyte_category:
    name: analyte_category
    range: NucleotideSequencingEnum
class_uri: nmdc:NucleotideSequencing
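
The `structured_pattern` on `id` is a template whose `{...}` placeholders are filled in from schema settings that are not shown on this page. The Python sketch below illustrates how such a template could be interpolated into a regular expression. The three placeholder expansions in `SETTINGS` are stand-in assumptions, and `interpolate` and `is_valid_id` are hypothetical helpers.

```python
import re

# Assumed expansions of the placeholders. The real values are defined in the
# schema's settings, which this page does not show.
SETTINGS = {
    "id_nmdc_prefix": r"^(nmdc)",
    "id_shoulder": r"([0-9][a-z]{0,6}[0-9])",
    "id_blade": r"([A-Za-z0-9]+)",
}

# Template copied from the slot_usage for `id`.
SYNTAX = "{id_nmdc_prefix}:(dgns|omprc)-{id_shoulder}-{id_blade}$"


def interpolate(syntax: str, settings: dict) -> str:
    """Replace each {name} placeholder with its regex fragment."""
    return re.sub(r"\{(\w+)\}", lambda m: settings[m.group(1)], syntax)


ID_PATTERN = re.compile(interpolate(SYNTAX, SETTINGS))


def is_valid_id(value: str) -> bool:
    """True if `value` matches the interpolated NucleotideSequencing id pattern."""
    return ID_PATTERN.search(value) is not None


print(is_valid_id("nmdc:dgns-11-abc123"))  # hypothetical identifier
print(is_valid_id("nmdc:sty-11-abc123"))   # wrong typecode for this class
```

Under these assumptions, only the `dgns` and `omprc` typecodes are accepted for this class, which is the part of the template that the page itself states.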
Induced
name: NucleotideSequencing
description: A DataGeneration in which the sequence of DNA or RNA molecules is generated.
comments:
- For example data generated from an Illumina or Pacific Biosciences instrument.
from_schema: https://w3id.org/nmdc/nmdc
is_a: DataGeneration
slot_usage:
  id:
    name: id
    structured_pattern:
      syntax: '{id_nmdc_prefix}:(dgns|omprc)-{id_shoulder}-{id_blade}$'
      interpolated: true
  analyte_category:
    name: analyte_category
    range: NucleotideSequencingEnum
attributes:
  gold_sequencing_project_identifiers:
    name: gold_sequencing_project_identifiers
    description: identifiers for corresponding sequencing project in GOLD
    examples:
    - value: gold:Gp0108335
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    is_a: omics_processing_identifiers
    mixins:
    - gold_identifiers
    alias: gold_sequencing_project_identifiers
    owner: NucleotideSequencing
    domain_of:
    - NucleotideSequencing
    range: external_identifier
    multivalued: true
    pattern: ^gold:Gp[0-9]+$
  insdc_bioproject_identifiers:
    name: insdc_bioproject_identifiers
    description: identifiers for corresponding project in INSDC Bioproject
    comments:
    - these are distinct IDs from INSDC SRA/ENA project identifiers, but are usually(?)
      one to one
    examples:
    - value: bioproject:PRJNA366857
      description: Avena fatua rhizosphere microbial communities - H1_Rhizo_Litter_2
        metatranscriptome
    from_schema: https://w3id.org/nmdc/nmdc
    see_also:
    - https://www.ncbi.nlm.nih.gov/bioproject/
    - https://www.ddbj.nig.ac.jp/bioproject/index-e.html
    aliases:
    - NCBI bioproject identifiers
    - DDBJ bioproject identifiers
    rank: 1000
    is_a: study_identifiers
    mixins:
    - insdc_identifiers
    alias: insdc_bioproject_identifiers
    owner: NucleotideSequencing
    domain_of:
    - NucleotideSequencing
    - Study
    range: external_identifier
    multivalued: true
    pattern: ^bioproject:PRJ[DEN][A-Z][0-9]+$
  insdc_experiment_identifiers:
    name: insdc_experiment_identifiers
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    is_a: external_database_identifiers
    mixins:
    - insdc_identifiers
    alias: insdc_experiment_identifiers
    owner: NucleotideSequencing
    domain_of:
    - NucleotideSequencing
    - DataObject
    range: external_identifier
    multivalued: true
    pattern: ^insdc.sra:(E|D|S)RX[0-9]{6,}$
  ncbi_project_name:
    name: ncbi_project_name
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    alias: ncbi_project_name
    owner: NucleotideSequencing
    domain_of:
    - NucleotideSequencing
    range: string
  add_date:
    name: add_date
    description: The date on which the information was added to the database.
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    alias: add_date
    owner: NucleotideSequencing
    domain_of:
    - Biosample
    - DataGeneration
    range: string
  analyte_category:
    name: analyte_category
    description: 'The type of analyte(s) that were measured in the data generation
      process
      '
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    alias: analyte_category
    owner: NucleotideSequencing
    domain_of:
    - DataGeneration
    range: NucleotideSequencingEnum
    required: true
  associated_studies:
    name: associated_studies
    description: The study associated with a resource.
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    alias: associated_studies
    owner: NucleotideSequencing
    domain_of:
    - Biosample
    - DataGeneration
    range: Study
    required: true
    multivalued: true
    structured_pattern:
      syntax: '{id_nmdc_prefix}:(sty)-{id_shoulder}-{id_blade}$'
      interpolated: true
  instrument_used:
    name: instrument_used
    description: What instrument was used during DataGeneration or MaterialProcessing.
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    alias: instrument_used
    owner: NucleotideSequencing
    domain_of:
    - MaterialProcessing
    - DataGeneration
    range: Instrument
    multivalued: true
    structured_pattern:
      syntax: '{id_nmdc_prefix}:inst-{id_shoulder}-{id_blade}$'
      interpolated: true
  mod_date:
    name: mod_date
    description: The last date on which the database information was modified.
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    alias: mod_date
    owner: NucleotideSequencing
    domain_of:
    - Biosample
    - DataGeneration
    range: string
  principal_investigator:
    name: principal_investigator
    description: Principal Investigator who led the study and/or generated the dataset.
    from_schema: https://w3id.org/nmdc/nmdc
    aliases:
    - PI
    rank: 1000
    alias: principal_investigator
    owner: NucleotideSequencing
    domain_of:
    - Study
    - DataGeneration
    range: PersonValue
  instrument_instance_specifier:
    name: instrument_instance_specifier
    description: A unique value that identifies an individual instrument instance,
      such as a serial number or similar identifiers assigned by the manufacturer
      or user.
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    alias: instrument_instance_specifier
    owner: NucleotideSequencing
    domain_of:
    - DataGeneration
    range: string
  has_input:
    name: has_input
    description: An input to a process.
    from_schema: https://w3id.org/nmdc/nmdc
    aliases:
    - input
    rank: 1000
    alias: has_input
    owner: NucleotideSequencing
    domain_of:
    - PlannedProcess
    range: Sample
    required: true
    multivalued: true
    structured_pattern:
      syntax: '{id_nmdc_prefix}:(bsm|procsm)-{id_shoulder}-{id_blade}$'
      interpolated: true
  has_output:
    name: has_output
    description: An output from a process.
    from_schema: https://w3id.org/nmdc/nmdc
    aliases:
    - output
    rank: 1000
    alias: has_output
    owner: NucleotideSequencing
    domain_of:
    - PlannedProcess
    range: DataObject
    multivalued: true
    structured_pattern:
      syntax: '{id_nmdc_prefix}:(dobj)-{id_shoulder}-{id_blade}$'
      interpolated: true
  processing_institution:
    name: processing_institution
    description: The organization that processed the sample.
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    alias: processing_institution
    owner: NucleotideSequencing
    domain_of:
    - PlannedProcess
    range: ProcessingInstitutionEnum
  protocol_link:
    name: protocol_link
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    alias: protocol_link
    owner: NucleotideSequencing
    domain_of:
    - Configuration
    - PlannedProcess
    - Study
    range: Protocol
  start_date:
    name: start_date
    description: The date on which any process or activity was started
    todos:
    - add date string validation pattern
    comments:
    - We are using string representations of dates until all components of our ecosystem
      can handle ISO 8601 dates
    - The date should be formatted as YYYY-MM-DD
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    alias: start_date
    owner: NucleotideSequencing
    domain_of:
    - PlannedProcess
    range: string
  end_date:
    name: end_date
    description: The date on which any process or activity was ended
    todos:
    - add date string validation pattern
    comments:
    - We are using string representations of dates until all components of our ecosystem
      can handle ISO 8601 dates
    - The date should be formatted as YYYY-MM-DD
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    alias: end_date
    owner: NucleotideSequencing
    domain_of:
    - PlannedProcess
    range: string
  qc_status:
    name: qc_status
    description: Stores information about the result of a process (e.g. the process
      of sequencing a library may have a qc_status of 'fail' if not enough data
      was generated)
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    alias: qc_status
    owner: NucleotideSequencing
    domain_of:
    - PlannedProcess
    range: StatusEnum
  qc_comment:
    name: qc_comment
    description: Slot to store additional comments about laboratory or workflow output.
      For workflow output it may describe the particular workflow stage that failed.
      (e.g. failed at call-stage due to a malformed fastq file).
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    alias: qc_comment
    owner: NucleotideSequencing
    domain_of:
    - PlannedProcess
    range: string
  has_failure_categorization:
    name: has_failure_categorization
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    alias: has_failure_categorization
    owner: NucleotideSequencing
    domain_of:
    - PlannedProcess
    range: FailureCategorization
    multivalued: true
    inlined: true
    inlined_as_list: true
  id:
    name: id
    description: A unique identifier for a thing. Must be either a CURIE shorthand
      for a URI or a complete URI
    notes:
    - 'abstracted pattern: prefix:typecode-authshoulder-blade(.version)?(_seqsuffix)?'
    - a minimum length of 3 characters is suggested for typecodes, but 1 or 2 characters
      will be accepted
    - typecodes must correspond 1:1 to a class in the NMDC schema. this will be checked
      via per-class id slot usage assertions
    - minting authority shoulders should probably be enumerated and checked in the
      pattern
    examples:
    - value: nmdc:mgmag-00-x012.1_7_c1
      description: https://github.com/microbiomedata/nmdc-schema/pull/499#discussion_r1018499248
    from_schema: https://w3id.org/nmdc/nmdc
    structured_aliases:
      workflow_execution_id:
        literal_form: workflow_execution_id
        predicate: NARROW_SYNONYM
        contexts:
        - https://bitbucket.org/berkeleylab/jgi-jat/macros/nmdc_metadata.yaml
      data_object_id:
        literal_form: data_object_id
        predicate: NARROW_SYNONYM
        contexts:
        - https://bitbucket.org/berkeleylab/jgi-jat/macros/nmdc_metadata.yaml
    rank: 1000
    identifier: true
    alias: id
    owner: NucleotideSequencing
    domain_of:
    - NamedThing
    range: uriorcurie
    required: true
    pattern: ^[a-zA-Z0-9][a-zA-Z0-9_\.]+:[a-zA-Z0-9_][a-zA-Z0-9_\-\/\.,]*$
    structured_pattern:
      syntax: '{id_nmdc_prefix}:(dgns|omprc)-{id_shoulder}-{id_blade}$'
      interpolated: true
  name:
    name: name
    description: A human readable label for an entity
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    alias: name
    owner: NucleotideSequencing
    domain_of:
    - PersonValue
    - NamedThing
    - Protocol
    range: string
  description:
    name: description
    description: a human-readable description of a thing
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    slot_uri: dcterms:description
    alias: description
    owner: NucleotideSequencing
    domain_of:
    - ImageValue
    - NamedThing
    - Protocol
    range: string
  alternative_identifiers:
    name: alternative_identifiers
    description: A list of alternative identifiers for the entity.
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    alias: alternative_identifiers
    owner: NucleotideSequencing
    domain_of:
    - MetaboliteIdentification
    - NamedThing
    range: uriorcurie
    multivalued: true
    pattern: ^[a-zA-Z0-9][a-zA-Z0-9_\.]+:[a-zA-Z0-9_][a-zA-Z0-9_\-\/\.,\(\)\=\#]*$
  type:
    name: type
    description: the class_uri of the class that has been instantiated
    notes:
    - makes it easier to read example data files
    - required for polymorphic MongoDB collections
    examples:
    - value: nmdc:Biosample
    - value: nmdc:Study
    from_schema: https://w3id.org/nmdc/nmdc
    see_also:
    - https://github.com/microbiomedata/nmdc-schema/issues/1048
    - https://github.com/microbiomedata/nmdc-schema/issues/1233
    - https://github.com/microbiomedata/nmdc-schema/issues/248
    structured_aliases:
      workflow_execution_class:
        literal_form: workflow_execution_class
        predicate: NARROW_SYNONYM
        contexts:
        - https://bitbucket.org/berkeleylab/jgi-jat/macros/nmdc_metadata.yaml
    rank: 1000
    slot_uri: rdf:type
    designates_type: true
    alias: type
    owner: NucleotideSequencing
    domain_of:
    - EukEval
    - FunctionalAnnotationAggMember
    - PeptideQuantification
    - ProteinQuantification
    - MobilePhaseSegment
    - PortionOfSubstance
    - MagBin
    - MetaboliteIdentification
    - GenomeFeature
    - FunctionalAnnotation
    - AttributeValue
    - NamedThing
    - OntologyRelation
    - FailureCategorization
    - Protocol
    - CreditAssociation
    - Doi
    range: uriorcurie
    required: true
class_uri: nmdc:NucleotideSequencing
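
The induced definition above attaches regular-expression patterns to the three external identifier slots and asks for `start_date` and `end_date` to be formatted as YYYY-MM-DD. A minimal Python sketch of both checks follows. The patterns are copied verbatim from this page, while `invalid_identifiers` and `is_iso_date` are hypothetical helpers, not part of any NMDC library.

```python
import re
from datetime import datetime

# Patterns copied verbatim from the induced slot definitions.
# Note that the "." in "insdc.sra" is unescaped there, so it matches any character.
SLOT_PATTERNS = {
    "gold_sequencing_project_identifiers": re.compile(r"^gold:Gp[0-9]+$"),
    "insdc_bioproject_identifiers": re.compile(r"^bioproject:PRJ[DEN][A-Z][0-9]+$"),
    "insdc_experiment_identifiers": re.compile(r"^insdc.sra:(E|D|S)RX[0-9]{6,}$"),
}


def invalid_identifiers(record: dict) -> dict:
    """Map each pattern-constrained slot to the values that fail its pattern."""
    problems = {}
    for slot, pattern in SLOT_PATTERNS.items():
        bad = [v for v in record.get(slot, []) if not pattern.search(v)]
        if bad:
            problems[slot] = bad
    return problems


def is_iso_date(value: str) -> bool:
    """True if `value` is a real calendar date written strictly as YYYY-MM-DD."""
    if not re.fullmatch(r"\d{4}-\d{2}-\d{2}", value):
        return False
    try:
        datetime.strptime(value, "%Y-%m-%d")
    except ValueError:
        return False
    return True


record = {
    "gold_sequencing_project_identifiers": ["gold:Gp0108335"],   # example from this page
    "insdc_bioproject_identifiers": ["bioproject:PRJNA366857"],  # example from this page
    "start_date": "2021-03-15",                                  # hypothetical date
}
print(invalid_identifiers(record))
print(is_iso_date(record["start_date"]))
```

The explicit `re.fullmatch` guard is there because `strptime` on its own also accepts unpadded values such as `2021-3-15`.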