Class: DataGeneration
The methods and processes used to generate omics data from a biosample or organism.
Note
This is an abstract class and should not be instantiated directly.
URI: nmdc:DataGeneration
classDiagram
  class DataGeneration
  click DataGeneration href "../DataGeneration"
    DataEmitterProcess <|-- DataGeneration
      click DataEmitterProcess href "../DataEmitterProcess"
    DataGeneration <|-- NucleotideSequencing
      click NucleotideSequencing href "../NucleotideSequencing"
    DataGeneration <|-- MassSpectrometry
      click MassSpectrometry href "../MassSpectrometry"
  DataGeneration : add_date
  DataGeneration : alternative_identifiers
  DataGeneration : analyte_category
  DataGeneration : associated_studies
      DataGeneration --> "1..*" Study : associated_studies
    click Study href "../Study"
  DataGeneration : description
  DataGeneration : end_date
  DataGeneration : has_failure_categorization
      DataGeneration --> "*" FailureCategorization : has_failure_categorization
    click FailureCategorization href "../FailureCategorization"
  DataGeneration : has_input
      DataGeneration --> "1..*" Sample : has_input
    click Sample href "../Sample"
  DataGeneration : has_output
      DataGeneration --> "*" DataObject : has_output
    click DataObject href "../DataObject"
  DataGeneration : id
  DataGeneration : instrument_instance_specifier
  DataGeneration : instrument_used
      DataGeneration --> "*" Instrument : instrument_used
    click Instrument href "../Instrument"
  DataGeneration : mod_date
  DataGeneration : name
  DataGeneration : principal_investigator
      DataGeneration --> "0..1" PersonValue : principal_investigator
    click PersonValue href "../PersonValue"
  DataGeneration : processing_institution
      DataGeneration --> "0..1" ProcessingInstitutionEnum : processing_institution
    click ProcessingInstitutionEnum href "../ProcessingInstitutionEnum"
  DataGeneration : protocol_link
      DataGeneration --> "0..1" Protocol : protocol_link
    click Protocol href "../Protocol"
  DataGeneration : qc_comment
  DataGeneration : qc_status
      DataGeneration --> "0..1" StatusEnum : qc_status
    click StatusEnum href "../StatusEnum"
  DataGeneration : start_date
  DataGeneration : type
Inheritance
- DataEmitterProcess
    - DataGeneration
        - NucleotideSequencing
        - MassSpectrometry
Slots
| Name | Cardinality and Range | Description | Inheritance | 
|---|---|---|---|
| add_date | 0..1 String | The date on which the information was added to the database | direct | 
| analyte_category | 1 String | The type of analyte(s) that were measured in the data generation process | direct | 
| associated_studies | 1..* Study | The study associated with a resource | direct | 
| instrument_used | * Instrument | What instrument was used during DataGeneration or MaterialProcessing | direct | 
| mod_date | 0..1 String | The last date on which the database information was modified | direct | 
| principal_investigator | 0..1 PersonValue | Principal Investigator who led the study and/or generated the dataset | direct | 
| instrument_instance_specifier | 0..1 String | A unique value that identifies an individual instrument instance, such as a s... | direct | 
| has_input | 1..* Sample | An input to a process | PlannedProcess | 
| has_output | * DataObject | An output from a process | PlannedProcess | 
| processing_institution | 0..1 ProcessingInstitutionEnum | The organization that processed the sample | PlannedProcess | 
| protocol_link | 0..1 Protocol |  | PlannedProcess | 
| start_date | 0..1 String | The date on which any process or activity was started | PlannedProcess | 
| end_date | 0..1 String | The date on which any process or activity was ended | PlannedProcess | 
| qc_status | 0..1 StatusEnum | Stores information about the result of a process (ie the process of sequencin... | PlannedProcess | 
| qc_comment | 0..1 String | Slot to store additional comments about laboratory or workflow output | PlannedProcess | 
| has_failure_categorization | * FailureCategorization |  | PlannedProcess | 
| id | 1 Uriorcurie | A unique identifier for a thing | NamedThing | 
| name | 0..1 String | A human readable label for an entity | NamedThing | 
| description | 0..1 String | a human-readable description of a thing | NamedThing | 
| alternative_identifiers | * Uriorcurie | A list of alternative identifiers for the entity | NamedThing | 
| type | 1 Uriorcurie | the class_uri of the class that has been instantiated | NamedThing | 
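
The cardinalities in the table above can be checked mechanically before a record is submitted. The following is a minimal sketch, not the NMDC validator: the function name `missing_or_malformed` is hypothetical, the example identifiers and the `analyte_category` value are made up, and `nmdc:NucleotideSequencing` is assumed to be the class URI of a concrete subclass.

```python
# Required slots taken from the Slots table: (minimum count, multivalued?).
REQUIRED_SLOTS = {
    "id": (1, False),
    "type": (1, False),
    "analyte_category": (1, False),
    "associated_studies": (1, True),
    "has_input": (1, True),
}


def missing_or_malformed(record: dict) -> list:
    """Return a list of cardinality problems found in a candidate record."""
    problems = []
    for slot, (minimum, multivalued) in REQUIRED_SLOTS.items():
        value = record.get(slot)
        if value is None:
            problems.append(f"{slot}: required slot is missing")
        elif multivalued:
            if not isinstance(value, list):
                problems.append(f"{slot}: expected a list")
            elif len(value) < minimum:
                problems.append(f"{slot}: needs at least {minimum} value(s)")
        elif isinstance(value, list):
            problems.append(f"{slot}: expected a single value")
    return problems


# Hypothetical record: every identifier and value below is illustrative only.
example = {
    "id": "nmdc:dgns-00-abc123",
    "type": "nmdc:NucleotideSequencing",  # assumed URI of a concrete subclass
    "analyte_category": "metagenome",
    "associated_studies": ["nmdc:sty-00-abc123"],
    "has_input": ["nmdc:bsm-00-abc123"],
}

print(missing_or_malformed(example))
print(missing_or_malformed({"id": "nmdc:dgns-00-abc123", "has_input": []}))
```

Optional slots such as `start_date` or `qc_status` are deliberately ignored here; the sketch only covers slots whose minimum cardinality is 1.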
Usages
| used by | used in | type | used | 
|---|---|---|---|
| Database | data_generation_set | range | DataGeneration | 
| AnnotatingWorkflow | was_informed_by | range | DataGeneration | 
| WorkflowExecution | was_informed_by | range | DataGeneration | 
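
The usages above mean that `DataGeneration` records are stored in `Database.data_generation_set` and referenced from workflow records through `was_informed_by`. A rough sketch of following such a reference in an in-memory database dump is shown below. The helper name `resolve_was_informed_by`, the sample identifiers, and the choice to accept either a single reference or a list are assumptions, since this page does not state the cardinality of `was_informed_by`.

```python
def resolve_was_informed_by(database: dict, workflow_execution: dict) -> list:
    """Return the DataGeneration records that a workflow execution points at."""
    # Index the collection by id so each reference is a dictionary lookup.
    by_id = {dg["id"]: dg for dg in database.get("data_generation_set", [])}
    refs = workflow_execution.get("was_informed_by", [])
    if isinstance(refs, str):  # tolerate a single-valued reference
        refs = [refs]
    # Unresolvable references are skipped rather than raising an error.
    return [by_id[ref] for ref in refs if ref in by_id]


# Hypothetical data: identifiers are illustrative, not real NMDC records.
database = {
    "data_generation_set": [
        {"id": "nmdc:dgns-00-abc123", "type": "nmdc:NucleotideSequencing"},
        {"id": "nmdc:dgms-00-def456", "type": "nmdc:MassSpectrometry"},
    ]
}
workflow = {"id": "nmdc:wf-00-xyz789", "was_informed_by": "nmdc:dgns-00-abc123"}

print(resolve_was_informed_by(database, workflow))
```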
Aliases
- OmicsProcessing
- assay
- omics assay
- sequencing project
- experiment
Identifier and Mapping Information
Schema Source
- from schema: https://w3id.org/nmdc/nmdc
Mappings
| Mapping Type | Mapped Value | 
|---|---|
| broad | OBI:0000070, ISA:Assay | 
LinkML Source
Direct
name: DataGeneration
description: The methods and processes used to generate omics data from a biosample
  or organism.
alt_descriptions:
  embl.ena:
    source: embl.ena
    description: An experiment contains information about a sequencing experiment
      including library and instrument details.
from_schema: https://w3id.org/nmdc/nmdc
aliases:
- OmicsProcessing
- assay
- omics assay
- sequencing project
- experiment
broad_mappings:
- OBI:0000070
- ISA:Assay
is_a: DataEmitterProcess
abstract: true
slots:
- add_date
- analyte_category
- associated_studies
- instrument_used
- mod_date
- principal_investigator
- instrument_instance_specifier
slot_usage:
  has_input:
    name: has_input
    range: Sample
    required: true
    structured_pattern:
      syntax: '{id_nmdc_prefix}:(bsm|procsm)-{id_shoulder}-{id_blade}$'
      interpolated: true
  associated_studies:
    name: associated_studies
    range: Study
    structured_pattern:
      syntax: '{id_nmdc_prefix}:(sty)-{id_shoulder}-{id_blade}$'
      interpolated: true
  has_output:
    name: has_output
    range: DataObject
    structured_pattern:
      syntax: '{id_nmdc_prefix}:(dobj)-{id_shoulder}-{id_blade}$'
      interpolated: true
class_uri: nmdc:DataGeneration
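
The `structured_pattern` entries above use `interpolated: true`, meaning each `{name}` placeholder is replaced by a regular-expression fragment defined elsewhere in the schema. The sketch below illustrates the idea for the `has_input` pattern. The three fragment values in `SETTINGS` are assumptions made for illustration, because the schema's settings block is not shown on this page, and the `interpolate` helper is hypothetical rather than LinkML's own implementation.

```python
import re

# ASSUMED fragment values; the real ones live in the schema's settings.
SETTINGS = {
    "id_nmdc_prefix": r"^(nmdc)",
    "id_shoulder": r"([0-9][a-z]{0,6}[0-9])",
    "id_blade": r"([A-Za-z0-9]{1,})",
}


def interpolate(syntax: str, settings: dict) -> str:
    """Replace each {name} placeholder with its regex fragment."""
    # A function replacement is inserted literally, so backslashes and
    # braces inside the fragments are not reinterpreted by re.sub.
    return re.sub(r"\{(\w+)\}", lambda m: settings[m.group(1)], syntax)


HAS_INPUT = interpolate(
    "{id_nmdc_prefix}:(bsm|procsm)-{id_shoulder}-{id_blade}$", SETTINGS
)

print(HAS_INPUT)
# Hypothetical identifiers: a biosample id is accepted, a study id is not.
print(bool(re.search(HAS_INPUT, "nmdc:bsm-11-abc123")))
print(bool(re.search(HAS_INPUT, "nmdc:sty-11-abc123")))
```

Under these assumed settings, only identifiers with the `bsm` or `procsm` typecode satisfy the `has_input` constraint, which matches the `range: Sample` restriction in `slot_usage`.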
Induced
name: DataGeneration
description: The methods and processes used to generate omics data from a biosample
  or organism.
alt_descriptions:
  embl.ena:
    source: embl.ena
    description: An experiment contains information about a sequencing experiment
      including library and instrument details.
from_schema: https://w3id.org/nmdc/nmdc
aliases:
- OmicsProcessing
- assay
- omics assay
- sequencing project
- experiment
broad_mappings:
- OBI:0000070
- ISA:Assay
is_a: DataEmitterProcess
abstract: true
slot_usage:
  has_input:
    name: has_input
    range: Sample
    required: true
    structured_pattern:
      syntax: '{id_nmdc_prefix}:(bsm|procsm)-{id_shoulder}-{id_blade}$'
      interpolated: true
  associated_studies:
    name: associated_studies
    range: Study
    structured_pattern:
      syntax: '{id_nmdc_prefix}:(sty)-{id_shoulder}-{id_blade}$'
      interpolated: true
  has_output:
    name: has_output
    range: DataObject
    structured_pattern:
      syntax: '{id_nmdc_prefix}:(dobj)-{id_shoulder}-{id_blade}$'
      interpolated: true
attributes:
  add_date:
    name: add_date
    description: The date on which the information was added to the database.
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    alias: add_date
    owner: DataGeneration
    domain_of:
    - Biosample
    - DataGeneration
    range: string
  analyte_category:
    name: analyte_category
    description: 'The type of analyte(s) that were measured in the data generation
      process
      '
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    alias: analyte_category
    owner: DataGeneration
    domain_of:
    - DataGeneration
    range: string
    required: true
  associated_studies:
    name: associated_studies
    description: The study associated with a resource.
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    alias: associated_studies
    owner: DataGeneration
    domain_of:
    - Biosample
    - DataGeneration
    range: Study
    required: true
    multivalued: true
    structured_pattern:
      syntax: '{id_nmdc_prefix}:(sty)-{id_shoulder}-{id_blade}$'
      interpolated: true
  instrument_used:
    name: instrument_used
    description: What instrument was used during DataGeneration or MaterialProcessing.
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    alias: instrument_used
    owner: DataGeneration
    domain_of:
    - MaterialProcessing
    - DataGeneration
    range: Instrument
    multivalued: true
    structured_pattern:
      syntax: '{id_nmdc_prefix}:inst-{id_shoulder}-{id_blade}$'
      interpolated: true
  mod_date:
    name: mod_date
    description: The last date on which the database information was modified.
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    alias: mod_date
    owner: DataGeneration
    domain_of:
    - Biosample
    - DataGeneration
    range: string
  principal_investigator:
    name: principal_investigator
    description: Principal Investigator who led the study and/or generated the dataset.
    from_schema: https://w3id.org/nmdc/nmdc
    aliases:
    - PI
    rank: 1000
    alias: principal_investigator
    owner: DataGeneration
    domain_of:
    - Study
    - DataGeneration
    range: PersonValue
  instrument_instance_specifier:
    name: instrument_instance_specifier
    description: A unique value that identifies an individual instrument instance,
      such as a serial number or similar identifiers assigned by the manufacturer
      or user.
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    alias: instrument_instance_specifier
    owner: DataGeneration
    domain_of:
    - DataGeneration
    range: string
  has_input:
    name: has_input
    description: An input to a process.
    from_schema: https://w3id.org/nmdc/nmdc
    aliases:
    - input
    rank: 1000
    alias: has_input
    owner: DataGeneration
    domain_of:
    - PlannedProcess
    range: Sample
    required: true
    multivalued: true
    structured_pattern:
      syntax: '{id_nmdc_prefix}:(bsm|procsm)-{id_shoulder}-{id_blade}$'
      interpolated: true
  has_output:
    name: has_output
    description: An output from a process.
    from_schema: https://w3id.org/nmdc/nmdc
    aliases:
    - output
    rank: 1000
    alias: has_output
    owner: DataGeneration
    domain_of:
    - PlannedProcess
    range: DataObject
    multivalued: true
    structured_pattern:
      syntax: '{id_nmdc_prefix}:(dobj)-{id_shoulder}-{id_blade}$'
      interpolated: true
  processing_institution:
    name: processing_institution
    description: The organization that processed the sample.
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    alias: processing_institution
    owner: DataGeneration
    domain_of:
    - PlannedProcess
    range: ProcessingInstitutionEnum
  protocol_link:
    name: protocol_link
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    alias: protocol_link
    owner: DataGeneration
    domain_of:
    - Configuration
    - PlannedProcess
    - Study
    range: Protocol
  start_date:
    name: start_date
    description: The date on which any process or activity was started
    todos:
    - add date string validation pattern
    comments:
    - We are using string representations of dates until all components of our ecosystem
      can handle ISO 8601 dates
    - The date should be formatted as YYYY-MM-DD
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    alias: start_date
    owner: DataGeneration
    domain_of:
    - PlannedProcess
    range: string
  end_date:
    name: end_date
    description: The date on which any process or activity was ended
    todos:
    - add date string validation pattern
    comments:
    - We are using string representations of dates until all components of our ecosystem
      can handle ISO 8601 dates
    - The date should be formatted as YYYY-MM-DD
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    alias: end_date
    owner: DataGeneration
    domain_of:
    - PlannedProcess
    range: string
  qc_status:
    name: qc_status
    description: Stores information about the result of a process (e.g., the process
      of sequencing a library may have a qc_status of 'fail' if not enough data
      was generated)
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    alias: qc_status
    owner: DataGeneration
    domain_of:
    - PlannedProcess
    range: StatusEnum
  qc_comment:
    name: qc_comment
    description: Slot to store additional comments about laboratory or workflow output.
      For workflow output it may describe the particular workflow stage that failed
      (e.g., failed at call-stage due to a malformed FASTQ file).
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    alias: qc_comment
    owner: DataGeneration
    domain_of:
    - PlannedProcess
    range: string
  has_failure_categorization:
    name: has_failure_categorization
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    alias: has_failure_categorization
    owner: DataGeneration
    domain_of:
    - PlannedProcess
    range: FailureCategorization
    multivalued: true
    inlined: true
    inlined_as_list: true
  id:
    name: id
    description: A unique identifier for a thing. Must be either a CURIE shorthand
      for a URI or a complete URI
    notes:
    - 'abstracted pattern: prefix:typecode-authshoulder-blade(.version)?(_seqsuffix)?'
    - a minimum length of 3 characters is suggested for typecodes, but 1 or 2 characters
      will be accepted
    - typecodes must correspond 1:1 to a class in the NMDC schema. this will be checked
      via per-class id slot usage assertions
    - minting authority shoulders should probably be enumerated and checked in the
      pattern
    examples:
    - value: nmdc:mgmag-00-x012.1_7_c1
      description: https://github.com/microbiomedata/nmdc-schema/pull/499#discussion_r1018499248
    from_schema: https://w3id.org/nmdc/nmdc
    structured_aliases:
      workflow_execution_id:
        literal_form: workflow_execution_id
        predicate: NARROW_SYNONYM
        contexts:
        - https://bitbucket.org/berkeleylab/jgi-jat/macros/nmdc_metadata.yaml
      data_object_id:
        literal_form: data_object_id
        predicate: NARROW_SYNONYM
        contexts:
        - https://bitbucket.org/berkeleylab/jgi-jat/macros/nmdc_metadata.yaml
    rank: 1000
    identifier: true
    alias: id
    owner: DataGeneration
    domain_of:
    - NamedThing
    range: uriorcurie
    required: true
    pattern: ^[a-zA-Z0-9][a-zA-Z0-9_\.]+:[a-zA-Z0-9_][a-zA-Z0-9_\-\/\.,]*$
  name:
    name: name
    description: A human readable label for an entity
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    alias: name
    owner: DataGeneration
    domain_of:
    - PersonValue
    - NamedThing
    - Protocol
    range: string
  description:
    name: description
    description: a human-readable description of a thing
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    slot_uri: dcterms:description
    alias: description
    owner: DataGeneration
    domain_of:
    - ImageValue
    - NamedThing
    - Protocol
    range: string
  alternative_identifiers:
    name: alternative_identifiers
    description: A list of alternative identifiers for the entity.
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    alias: alternative_identifiers
    owner: DataGeneration
    domain_of:
    - MetaboliteIdentification
    - NamedThing
    range: uriorcurie
    multivalued: true
    pattern: ^[a-zA-Z0-9][a-zA-Z0-9_\.]+:[a-zA-Z0-9_][a-zA-Z0-9_\-\/\.,\(\)\=\#]*$
  type:
    name: type
    description: the class_uri of the class that has been instantiated
    notes:
    - makes it easier to read example data files
    - required for polymorphic MongoDB collections
    examples:
    - value: nmdc:Biosample
    - value: nmdc:Study
    from_schema: https://w3id.org/nmdc/nmdc
    see_also:
    - https://github.com/microbiomedata/nmdc-schema/issues/1048
    - https://github.com/microbiomedata/nmdc-schema/issues/1233
    - https://github.com/microbiomedata/nmdc-schema/issues/248
    structured_aliases:
      workflow_execution_class:
        literal_form: workflow_execution_class
        predicate: NARROW_SYNONYM
        contexts:
        - https://bitbucket.org/berkeleylab/jgi-jat/macros/nmdc_metadata.yaml
    rank: 1000
    slot_uri: rdf:type
    designates_type: true
    alias: type
    owner: DataGeneration
    domain_of:
    - EukEval
    - FunctionalAnnotationAggMember
    - PeptideQuantification
    - ProteinQuantification
    - MobilePhaseSegment
    - PortionOfSubstance
    - MagBin
    - MetaboliteIdentification
    - GenomeFeature
    - FunctionalAnnotation
    - AttributeValue
    - NamedThing
    - OntologyRelation
    - FailureCategorization
    - Protocol
    - CreditAssociation
    - Doi
    range: uriorcurie
    required: true
class_uri: nmdc:DataGeneration
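
Two of the constraints in the induced definition are easy to test outside the schema tooling: the `pattern` on `id`, and the `YYYY-MM-DD` convention that the comments on `start_date` and `end_date` ask for but do not yet enforce. The following sketch copies the `id` regular expression verbatim from the definition above; the helper names `is_valid_id` and `is_iso_date` are hypothetical, and the date check is one possible reading of the outstanding "add date string validation pattern" todo, not the schema's eventual rule.

```python
import re
from datetime import date

# Copied from the `pattern` of the `id` slot above.
ID_PATTERN = re.compile(
    r"^[a-zA-Z0-9][a-zA-Z0-9_\.]+:[a-zA-Z0-9_][a-zA-Z0-9_\-\/\.,]*$"
)


def is_valid_id(value: str) -> bool:
    """True when value satisfies the generic CURIE-style id pattern."""
    return ID_PATTERN.match(value) is not None


def is_iso_date(value: str) -> bool:
    """True when value is a real calendar date written exactly as YYYY-MM-DD."""
    if not re.fullmatch(r"[0-9]{4}-[0-9]{2}-[0-9]{2}", value):
        return False
    try:
        date.fromisoformat(value)  # rejects impossible dates such as Feb 30
    except ValueError:
        return False
    return True


# The id example is the one listed under `examples` for the `id` slot.
print(is_valid_id("nmdc:mgmag-00-x012.1_7_c1"))
print(is_iso_date("2021-02-03"), is_iso_date("2021-02-30"))
```

Note that the generic `id` pattern only checks the overall CURIE shape; typecode-specific checks come from the per-slot `structured_pattern` constraints.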