Class: Metatranscriptome Sequence Data (Non-Interleaved) (MetatranscriptomeSequencingNonInterleavedDataInterface)

Interface for non-interleaved metatranscriptome sequencing data

URI: nmdc_sub_schema:MetatranscriptomeSequencingNonInterleavedDataInterface

classDiagram
    class MetatranscriptomeSequencingNonInterleavedDataInterface
    click MetatranscriptomeSequencingNonInterleavedDataInterface href "../MetatranscriptomeSequencingNonInterleavedDataInterface/"
    DhMultiviewCommonColumnsMixin <|-- MetatranscriptomeSequencingNonInterleavedDataInterface
    click DhMultiviewCommonColumnsMixin href "../DhMultiviewCommonColumnsMixin/"
    MetatranscriptomeSequencingNonInterleavedDataInterface : analysis_type
    MetatranscriptomeSequencingNonInterleavedDataInterface --> "1..*" AnalysisTypeEnum : analysis_type
    click AnalysisTypeEnum href "../AnalysisTypeEnum/"
    MetatranscriptomeSequencingNonInterleavedDataInterface : insdc_bioproject_identifiers
    MetatranscriptomeSequencingNonInterleavedDataInterface : insdc_experiment_identifiers
    MetatranscriptomeSequencingNonInterleavedDataInterface : model
    MetatranscriptomeSequencingNonInterleavedDataInterface --> "1" IlluminaInstrumentModelEnum : model
    click IlluminaInstrumentModelEnum href "../IlluminaInstrumentModelEnum/"
    MetatranscriptomeSequencingNonInterleavedDataInterface : processing_institution
    MetatranscriptomeSequencingNonInterleavedDataInterface --> "0..1" ProcessingInstitutionEnum : processing_institution
    click ProcessingInstitutionEnum href "../ProcessingInstitutionEnum/"
    MetatranscriptomeSequencingNonInterleavedDataInterface : protocol_link
    MetatranscriptomeSequencingNonInterleavedDataInterface : read_1_md5_checksum
    MetatranscriptomeSequencingNonInterleavedDataInterface : read_1_url
    MetatranscriptomeSequencingNonInterleavedDataInterface : read_2_md5_checksum
    MetatranscriptomeSequencingNonInterleavedDataInterface : read_2_url
    MetatranscriptomeSequencingNonInterleavedDataInterface : samp_name
    MetatranscriptomeSequencingNonInterleavedDataInterface : source_mat_id

Inheritance

MetatranscriptomeSequencingNonInterleavedDataInterface [DhMultiviewCommonColumnsMixin]

Slots

Name	Cardinality and Range	Description	Inheritance
read_1_url	1 String	URL for FASTQ file of read 1 of a pair of reads	direct
read_1_md5_checksum	0..1 String	MD5 checksum of file in "read 1 FASTQ"	direct
read_2_url	1 String	URL for FASTQ file of read 2 of a pair of reads	direct
read_2_md5_checksum	0..1 String	MD5 checksum of file in "read 2 FASTQ"	direct
model	1 IlluminaInstrumentModelEnum	The model of the Illumina sequencing instrument used to generate the data	direct
insdc_bioproject_identifiers	0..1 String	identifiers for corresponding project in INSDC Bioproject	direct
insdc_experiment_identifiers	0..1 String	If multiple identifiers are provided, separate them with a semicolon	direct
processing_institution	0..1 ProcessingInstitutionEnum	The organization that processed the sample	direct
protocol_link	0..1 String	A URL to a description of the sequencing protocol used to generate the data	direct
analysis_type	1..* AnalysisTypeEnum	Select all the data types associated or available for this biosample	DhMultiviewCommonColumnsMixin
samp_name	1 String	A local identifier or name for the material sample collected	DhMultiviewCommonColumnsMixin
source_mat_id	0..1 String	A globally unique identifier assigned to the parent sample or sample that is ...	DhMultiviewCommonColumnsMixin
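
The cardinalities in the table above imply that five slots must always be filled in (`read_1_url`, `read_2_url`, `model`, `analysis_type`, `samp_name`). As a minimal sketch, assuming a submission row is held as a plain Python dictionary keyed by slot name, a presence check might look like the following. The function name `missing_required` and the example row (including its URLs and the `analysis_type` value) are illustrative assumptions, not part of the schema.

```python
from typing import List

# Slots whose cardinality is "1" or "1..*" in the Slots table.
REQUIRED_SLOTS = ("read_1_url", "read_2_url", "model", "analysis_type", "samp_name")


def missing_required(record: dict) -> List[str]:
    """Return the required slot names that are absent or empty in `record`."""
    missing = []
    for slot in REQUIRED_SLOTS:
        value = record.get(slot)
        # Treat None, "" and [] as missing; analysis_type is multivalued (1..*).
        if value is None or value == "" or value == []:
            missing.append(slot)
    return missing


# Hypothetical row: the URLs and the analysis_type value are placeholders.
example = {
    "samp_name": "Rock core CB1178(5-6) from NSW",
    "analysis_type": ["metatranscriptomics"],
    "read_1_url": "https://example.org/sample_R1.fastq.gz",
    "read_2_url": "https://example.org/sample_R2.fastq.gz",
}

print(missing_required(example))  # ['model']
```

This only checks presence. Enum membership and the per-slot patterns are separate checks.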

Usages

used by	used in	type	used
SampleData	metatranscriptome_sequencing_non_interleaved_data	range	MetatranscriptomeSequencingNonInterleavedDataInterface

Identifier and Mapping Information

Schema Source

from schema: https://example.com/nmdc_submission_schema

Mappings

Mapping Type	Mapped Value
self	nmdc_sub_schema:MetatranscriptomeSequencingNonInterleavedDataInterface
native	nmdc_sub_schema:MetatranscriptomeSequencingNonInterleavedDataInterface

LinkML Source

Direct

name: MetatranscriptomeSequencingNonInterleavedDataInterface
description: Interface for non-interleaved metatranscriptome sequencing data
title: Metatranscriptome Sequence Data (Non-Interleaved)
from_schema: https://example.com/nmdc_submission_schema
mixins:
- DhMultiviewCommonColumnsMixin
slots:
- read_1_url
- read_1_md5_checksum
- read_2_url
- read_2_md5_checksum
- model
- insdc_bioproject_identifiers
- insdc_experiment_identifiers
- processing_institution
- protocol_link
slot_usage:
  insdc_bioproject_identifiers:
    name: insdc_bioproject_identifiers
    multivalued: false
  insdc_experiment_identifiers:
    name: insdc_experiment_identifiers
    multivalued: false
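
Because `slot_usage` sets `multivalued: false` for both INSDC identifier slots, each slot holds one string, and the patterns listed for these slots in the Induced section accept exactly one identifier. The sketch below copies those two patterns and tests a few values with Python's `re` module. The helper name `matches` and the experiment accessions are hypothetical, chosen only to exercise the patterns.

```python
import re

# Patterns copied verbatim from the Induced section of this page.
BIOPROJECT = re.compile(r"^bioproject:PRJ[DEN][A-Z][0-9]+$")
EXPERIMENT = re.compile(r"^insdc.sra:(E|D|S)RX[0-9]{6,}$")


def matches(pattern: "re.Pattern", value: str) -> bool:
    """True if `value` satisfies `pattern` (which carries its own ^ and $ anchors)."""
    return pattern.match(value) is not None


# The example value documented for insdc_bioproject_identifiers.
print(matches(BIOPROJECT, "bioproject:PRJNA366857"))  # True

# A semicolon-separated pair is rejected: the pattern allows a single identifier.
print(matches(BIOPROJECT, "bioproject:PRJNA366857; bioproject:PRJNA366858"))  # False

# Made-up accessions: at least six digits are needed after the RX prefix.
print(matches(EXPERIMENT, "insdc.sra:SRX1234567"))  # True
print(matches(EXPERIMENT, "insdc.sra:SRX12345"))    # False
```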

Induced

name: MetatranscriptomeSequencingNonInterleavedDataInterface
description: Interface for non-interleaved metatranscriptome sequencing data
title: Metatranscriptome Sequence Data (Non-Interleaved)
from_schema: https://example.com/nmdc_submission_schema
mixins:
- DhMultiviewCommonColumnsMixin
slot_usage:
  insdc_bioproject_identifiers:
    name: insdc_bioproject_identifiers
    multivalued: false
  insdc_experiment_identifiers:
    name: insdc_experiment_identifiers
    multivalued: false
attributes:
  read_1_url:
    name: read_1_url
    description: URL for FASTQ file of read 1 of a pair of reads.
    title: read 1 FASTQ
    comments:
    - If multiple runs were performed, separate each URL with a semi-colon.
    - External data URLs should be available for at least a year. If you would like
      NMDC to submit your data to an appropriate raw data repository on your behalf,
      please contact us at microbiomedata.science@gmail.com.
    from_schema: https://example.com/nmdc_submission_schema
    rank: 10
    alias: read_1_url
    owner: MetatranscriptomeSequencingNonInterleavedDataInterface
    domain_of:
    - MetagenomeSequencingNonInterleavedDataInterface
    - MetatranscriptomeSequencingNonInterleavedDataInterface
    slot_group: data_files_section
    range: string
    required: true
    pattern: ^https://[^\s;]+(?:\s*;\s*https://[^\s;]+)*$
  read_1_md5_checksum:
    name: read_1_md5_checksum
    description: MD5 checksum of file in "read 1 FASTQ".
    title: read 1 FASTQ MD5
    comments:
    - If multiple runs were performed, separate each checksum with a semi-colon. The
      number of checksums should match the number of URLs in "read 1 FASTQ".
    from_schema: https://example.com/nmdc_submission_schema
    rank: 11
    alias: read_1_md5_checksum
    owner: MetatranscriptomeSequencingNonInterleavedDataInterface
    domain_of:
    - MetagenomeSequencingNonInterleavedDataInterface
    - MetatranscriptomeSequencingNonInterleavedDataInterface
    slot_group: data_files_section
    range: string
    pattern: ^[a-fA-F0-9]{32}(?:\s*;\s*[a-fA-F0-9]{32})*$
  read_2_url:
    name: read_2_url
    description: URL for FASTQ file of read 2 of a pair of reads.
    title: read 2 FASTQ
    comments:
    - If multiple runs were performed, separate each URL with a semi-colon.
    - External data URLs should be available for at least a year. If you would like
      NMDC to submit your data to an appropriate raw data repository on your behalf,
      please contact us at microbiomedata.science@gmail.com.
    from_schema: https://example.com/nmdc_submission_schema
    rank: 12
    alias: read_2_url
    owner: MetatranscriptomeSequencingNonInterleavedDataInterface
    domain_of:
    - MetagenomeSequencingNonInterleavedDataInterface
    - MetatranscriptomeSequencingNonInterleavedDataInterface
    slot_group: data_files_section
    range: string
    required: true
    pattern: ^https://[^\s;]+(?:\s*;\s*https://[^\s;]+)*$
  read_2_md5_checksum:
    name: read_2_md5_checksum
    description: MD5 checksum of file in "read 2 FASTQ".
    title: read 2 FASTQ MD5
    comments:
    - If multiple runs were performed, separate each checksum with a semi-colon. The
      number of checksums should match the number of URLs in "read 2 FASTQ".
    from_schema: https://example.com/nmdc_submission_schema
    rank: 13
    alias: read_2_md5_checksum
    owner: MetatranscriptomeSequencingNonInterleavedDataInterface
    domain_of:
    - MetagenomeSequencingNonInterleavedDataInterface
    - MetatranscriptomeSequencingNonInterleavedDataInterface
    slot_group: data_files_section
    range: string
    pattern: ^[a-fA-F0-9]{32}(?:\s*;\s*[a-fA-F0-9]{32})*$
  model:
    name: model
    description: The model of the Illumina sequencing instrument used to generate
      the data.
    title: instrument model
    from_schema: https://example.com/nmdc_submission_schema
    rank: 2
    alias: model
    owner: MetatranscriptomeSequencingNonInterleavedDataInterface
    domain_of:
    - MetagenomeSequencingNonInterleavedDataInterface
    - MetagenomeSequencingInterleavedDataInterface
    - MetatranscriptomeSequencingNonInterleavedDataInterface
    - MetatranscriptomeSequencingInterleavedDataInterface
    slot_group: sequencing_section
    range: IlluminaInstrumentModelEnum
    required: true
  insdc_bioproject_identifiers:
    name: insdc_bioproject_identifiers
    description: identifiers for corresponding project in INSDC Bioproject
    title: INSDC bioproject identifier
    comments:
    - these are distinct IDs from INSDC SRA/ENA project identifiers, but are usually(?)
      one to one
    examples:
    - value: bioproject:PRJNA366857
    from_schema: https://example.com/nmdc_submission_schema
    see_also:
    - https://www.ncbi.nlm.nih.gov/bioproject/
    - https://www.ddbj.nig.ac.jp/bioproject/index-e.html
    aliases:
    - NCBI bioproject identifiers
    - DDBJ bioproject identifiers
    rank: 5
    is_a: study_identifiers
    mixins:
    - insdc_identifiers
    alias: insdc_bioproject_identifiers
    owner: MetatranscriptomeSequencingNonInterleavedDataInterface
    domain_of:
    - MetagenomeSequencingNonInterleavedDataInterface
    - MetagenomeSequencingInterleavedDataInterface
    - MetatranscriptomeSequencingNonInterleavedDataInterface
    - MetatranscriptomeSequencingInterleavedDataInterface
    slot_group: sequencing_section
    range: string
    multivalued: false
    pattern: ^bioproject:PRJ[DEN][A-Z][0-9]+$
  insdc_experiment_identifiers:
    name: insdc_experiment_identifiers
    description: If multiple identifiers are provided, separate them with a semicolon.
      The number of identifiers must match the number of sequencing files.
    title: INSDC experiment identifiers
    from_schema: https://example.com/nmdc_submission_schema
    rank: 6
    is_a: external_database_identifiers
    mixins:
    - insdc_identifiers
    alias: insdc_experiment_identifiers
    owner: MetatranscriptomeSequencingNonInterleavedDataInterface
    domain_of:
    - MetagenomeSequencingNonInterleavedDataInterface
    - MetagenomeSequencingInterleavedDataInterface
    - MetatranscriptomeSequencingNonInterleavedDataInterface
    - MetatranscriptomeSequencingInterleavedDataInterface
    slot_group: sequencing_section
    range: string
    multivalued: false
    pattern: ^insdc.sra:(E|D|S)RX[0-9]{6,}$
  processing_institution:
    name: processing_institution
    description: The organization that processed the sample.
    title: processing institution
    from_schema: https://example.com/nmdc_submission_schema
    rank: 3
    alias: processing_institution
    owner: MetatranscriptomeSequencingNonInterleavedDataInterface
    domain_of:
    - MetagenomeSequencingNonInterleavedDataInterface
    - MetagenomeSequencingInterleavedDataInterface
    - MetatranscriptomeSequencingNonInterleavedDataInterface
    - MetatranscriptomeSequencingInterleavedDataInterface
    slot_group: sequencing_section
    range: ProcessingInstitutionEnum
  protocol_link:
    name: protocol_link
    description: A URL to a description of the sequencing protocol used to generate
      the data.
    title: protocol
    from_schema: https://example.com/nmdc_submission_schema
    rank: 4
    alias: protocol_link
    owner: MetatranscriptomeSequencingNonInterleavedDataInterface
    domain_of:
    - MetagenomeSequencingNonInterleavedDataInterface
    - MetagenomeSequencingInterleavedDataInterface
    - MetatranscriptomeSequencingNonInterleavedDataInterface
    - MetatranscriptomeSequencingInterleavedDataInterface
    slot_group: sequencing_section
    range: string
  analysis_type:
    name: analysis_type
    description: Select all the data types associated or available for this biosample
    title: analysis/data type
    comments:
    - MIxS:investigation_type was included as a `see_also` but that term doesn't resolve
      any more
    examples:
    - value: metagenomics; metabolomics; metaproteomics
    from_schema: https://example.com/nmdc_submission_schema
    rank: 3
    alias: analysis_type
    owner: MetatranscriptomeSequencingNonInterleavedDataInterface
    domain_of:
    - DhMultiviewCommonColumnsMixin
    slot_group: sample_id_section
    range: AnalysisTypeEnum
    required: true
    multivalued: true
  samp_name:
    name: samp_name
    annotations:
      Preferred_unit:
        tag: Preferred_unit
        value: ''
    description: A local identifier or name for the material sample collected.
      Refers to the original material collected or to any derived sub-samples.
    title: sample name
    comments:
    - It can have any format, but we suggest that you make it concise, unique and
      consistent within your lab, and as informative as possible.
    examples:
    - value: Rock core CB1178(5-6) from NSW
    from_schema: https://example.com/nmdc_submission_schema
    rank: 1
    keywords:
    - sample
    slot_uri: MIXS:0001107
    identifier: true
    alias: samp_name
    owner: MetatranscriptomeSequencingNonInterleavedDataInterface
    domain_of:
    - DhMultiviewCommonColumnsMixin
    slot_group: sample_id_section
    range: string
    required: true
  source_mat_id:
    name: source_mat_id
    annotations:
      Expected_value:
        tag: Expected_value
        value: 'for cultures of microorganisms: identifiers for two culture collections;
          for other material a unique arbitrary identifier'
    description: A globally unique identifier assigned to the parent sample or sample
      that is the source of this sample.
    title: source material identifier
    todos:
    - Currently, the comments say to use UUIDs. However, if we implement assigning
      NMDC identifiers with the minter we don't need to require a GUID. It can be an
      optional field to fill out only if they already have a resolvable ID.
    comments:
    - Identifiers must be prefixed. Possible FAIR prefixes are IGSNs (http://www.geosamples.org/getigsn),
      NCBI biosample accession numbers, ARK identifiers (https://arks.org/). These
      IDs enable linking to derived analytes and subsamples. If you have not assigned
      FAIR identifiers to your samples, you can generate UUIDs (https://www.uuidgenerator.net/).
    - 'Identifiers must be prefixed. Possible FAIR prefixes are: `igsn` for International Generic
      Sample Numbers (http://www.geosamples.org/getigsn), `biosample` for NCBI biosample
      accession IDs, `gold` for GOLD identifiers.'
    examples:
    - value: igsn:AU1243
    - value: biosample:SAMEA2397676
    from_schema: https://example.com/nmdc_submission_schema
    rank: 2
    keywords:
    - identifier
    - material
    - source
    slot_uri: MIXS:0000026
    alias: source_mat_id
    owner: MetatranscriptomeSequencingNonInterleavedDataInterface
    domain_of:
    - DhMultiviewCommonColumnsMixin
    slot_group: sample_id_section
    range: string
    multivalued: false
    pattern: ^igsn:[a-zA-Z0-9]+|biosample:SAMN[a-zA-Z0-9]+|biosample:SAME[a-zA-Z0-9]+|biosample:SAMJ[a-zA-Z0-9]+|gold:Gb[0-9]+$
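
The read-file slots above combine a pattern with a rule stated only in comments: when several runs are listed, the number of checksums should match the number of URLs. The sketch below shows one way to check both with Python's `re` module, under the assumption that values arrive as plain strings. The function names, the example URLs, and the `GROUPED` pattern are hypothetical. `GROUPED` is not part of the schema and is included only to show how the `source_mat_id` pattern behaves as written, where `^` applies only to the first alternative and `$` only to the last.

```python
import re
from typing import List, Optional

# Patterns copied verbatim from the listing above.
URL_LIST = re.compile(r"^https://[^\s;]+(?:\s*;\s*https://[^\s;]+)*$")
MD5_LIST = re.compile(r"^[a-fA-F0-9]{32}(?:\s*;\s*[a-fA-F0-9]{32})*$")
SOURCE_MAT_ID = re.compile(
    r"^igsn:[a-zA-Z0-9]+|biosample:SAMN[a-zA-Z0-9]+|biosample:SAME[a-zA-Z0-9]+"
    r"|biosample:SAMJ[a-zA-Z0-9]+|gold:Gb[0-9]+$"
)
# Hypothetical stricter variant (NOT in the schema): one group, anchored as a whole.
GROUPED = re.compile(
    r"^(?:igsn:[a-zA-Z0-9]+|biosample:SAMN[a-zA-Z0-9]+|biosample:SAME[a-zA-Z0-9]+"
    r"|biosample:SAMJ[a-zA-Z0-9]+|gold:Gb[0-9]+)$"
)


def split_values(value: str) -> List[str]:
    """Split a semicolon-separated slot value into trimmed items."""
    return [part.strip() for part in value.split(";")]


def check_read_files(urls: str, checksums: Optional[str]) -> List[str]:
    """Return a list of problems for one read slot and its optional MD5 slot."""
    problems = []
    if not URL_LIST.match(urls):
        problems.append("URL list does not match the slot pattern")
    if checksums is not None:
        if not MD5_LIST.match(checksums):
            problems.append("checksum list does not match the slot pattern")
        elif len(split_values(checksums)) != len(split_values(urls)):
            problems.append("number of checksums differs from number of URLs")
    return problems


# Two placeholder URLs but only one checksum: the count rule is violated.
two_runs = "https://example.org/run1_R1.fastq.gz; https://example.org/run2_R1.fastq.gz"
print(check_read_files(two_runs, "d41d8cd98f00b204e9800998ecf8427e"))
# ['number of checksums differs from number of URLs']

# The pattern as written accepts trailing text after an igsn identifier.
print(bool(SOURCE_MAT_ID.search("igsn:AU1243 trailing text")))  # True
print(bool(GROUPED.search("igsn:AU1243 trailing text")))        # False
```

Whether a given validator applies these patterns with search or full-match semantics is not stated on this page, so the `source_mat_id` result above describes Python's `re.search` only.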