Class: Metagenome Sequence Data (Non-Interleaved) (MetagenomeSequencingNonInterleavedDataInterface)
Interface for non-interleaved metagenome sequencing data
URI: nmdc_sub_schema:MetagenomeSequencingNonInterleavedDataInterface
classDiagram
class MetagenomeSequencingNonInterleavedDataInterface
click MetagenomeSequencingNonInterleavedDataInterface href "../MetagenomeSequencingNonInterleavedDataInterface"
DhMultiviewCommonColumnsMixin <|-- MetagenomeSequencingNonInterleavedDataInterface
click DhMultiviewCommonColumnsMixin href "../DhMultiviewCommonColumnsMixin"
MetagenomeSequencingNonInterleavedDataInterface : analysis_type
MetagenomeSequencingNonInterleavedDataInterface --> "1..*" AnalysisTypeEnum : analysis_type
click AnalysisTypeEnum href "../AnalysisTypeEnum"
MetagenomeSequencingNonInterleavedDataInterface : insdc_bioproject_identifiers
MetagenomeSequencingNonInterleavedDataInterface : insdc_experiment_identifiers
MetagenomeSequencingNonInterleavedDataInterface : model
MetagenomeSequencingNonInterleavedDataInterface --> "1" IlluminaInstrumentModelEnum : model
click IlluminaInstrumentModelEnum href "../IlluminaInstrumentModelEnum"
MetagenomeSequencingNonInterleavedDataInterface : processing_institution
MetagenomeSequencingNonInterleavedDataInterface --> "0..1" ProcessingInstitutionEnum : processing_institution
click ProcessingInstitutionEnum href "../ProcessingInstitutionEnum"
MetagenomeSequencingNonInterleavedDataInterface : protocol_link
MetagenomeSequencingNonInterleavedDataInterface : read_1_md5_checksum
MetagenomeSequencingNonInterleavedDataInterface : read_1_url
MetagenomeSequencingNonInterleavedDataInterface : read_2_md5_checksum
MetagenomeSequencingNonInterleavedDataInterface : read_2_url
MetagenomeSequencingNonInterleavedDataInterface : samp_name
MetagenomeSequencingNonInterleavedDataInterface : source_mat_id
Inheritance
- MetagenomeSequencingNonInterleavedDataInterface [DhMultiviewCommonColumnsMixin]
Slots
Name | Cardinality and Range | Description | Inheritance |
---|---|---|---|
read_1_url | 1 String | URL for FASTQ file of read 1 of a pair of reads | direct |
read_1_md5_checksum | 0..1 String | MD5 checksum of file in "read 1 FASTQ" | direct |
read_2_url | 1 String | URL for FASTQ file of read 2 of a pair of reads | direct |
read_2_md5_checksum | 0..1 String | MD5 checksum of file in "read 2 FASTQ" | direct |
model | 1 IlluminaInstrumentModelEnum | The model of the Illumina sequencing instrument used to generate the data | direct |
processing_institution | 0..1 ProcessingInstitutionEnum | The organization that processed the sample | direct |
protocol_link | 0..1 String | A URL to a description of the sequencing protocol used to generate the data | direct |
insdc_bioproject_identifiers | 0..1 String | Identifiers for corresponding project in INSDC Bioproject | direct |
insdc_experiment_identifiers | 0..1 String | If multiple identifiers are provided, separate them with a semicolon | direct |
analysis_type | 1..* AnalysisTypeEnum | Select all the data types associated or available for this biosample | DhMultiviewCommonColumnsMixin |
samp_name | 1 String | A local identifier or name for the material sample collected | DhMultiviewCommonColumnsMixin |
source_mat_id | 0..1 String | A globally unique identifier assigned to the biological sample | DhMultiviewCommonColumnsMixin |
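The cardinalities in the Slots table imply that five slots must be filled in for every row: `read_1_url`, `read_2_url`, `model`, `analysis_type`, and `samp_name`. The following is a minimal sketch of that check, assuming a row is held as a plain Python dictionary keyed by slot name. The function name `missing_required` and the dictionary representation are illustrative assumptions, not part of the schema or of any NMDC tooling.

```python
from typing import Any, Dict, List

# Slots whose cardinality in the Slots table is 1 or 1..* (i.e. required).
REQUIRED_SLOTS = ("read_1_url", "read_2_url", "model", "analysis_type", "samp_name")


def missing_required(row: Dict[str, Any]) -> List[str]:
    """Return the required slot names that are absent or blank in `row`.

    Hypothetical helper: a row is assumed to be a dict keyed by slot name.
    """
    missing = []
    for slot in REQUIRED_SLOTS:
        value = row.get(slot)
        is_blank_string = isinstance(value, str) and not value.strip()
        is_empty_list = isinstance(value, list) and len(value) == 0
        if value is None or is_blank_string or is_empty_list:
            missing.append(slot)
    return missing
```

An empty result means every required slot has a value. The sketch does not check that `model` or `analysis_type` values belong to their enumerations.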
Usages
used by | used in | type | used |
---|---|---|---|
SampleData | metagenome_sequencing_non_interleaved_data | range | MetagenomeSequencingNonInterleavedDataInterface |
Identifier and Mapping Information
Schema Source
- from schema: https://example.com/nmdc_submission_schema
Mappings
Mapping Type | Mapped Value |
---|---|
self | nmdc_sub_schema:MetagenomeSequencingNonInterleavedDataInterface |
native | nmdc_sub_schema:MetagenomeSequencingNonInterleavedDataInterface |
LinkML Source
Direct
name: MetagenomeSequencingNonInterleavedDataInterface
description: Interface for non-interleaved metagenome sequencing data
title: Metagenome Sequence Data (Non-Interleaved)
from_schema: https://example.com/nmdc_submission_schema
mixins:
- DhMultiviewCommonColumnsMixin
slots:
- read_1_url
- read_1_md5_checksum
- read_2_url
- read_2_md5_checksum
- model
- processing_institution
- protocol_link
- insdc_bioproject_identifiers
- insdc_experiment_identifiers
slot_usage:
model:
name: model
description: The model of the Illumina sequencing instrument used to generate
the data.
title: instrument model
from_schema: https://w3id.org/nmdc/nmdc
rank: 2
owner: Instrument
domain_of:
- Instrument
- MetagenomeSequencingNonInterleavedDataInterface
- MetagenomeSequencingInterleavedDataInterface
- MetatranscriptomeSequencingNonInterleavedDataInterface
- MetatranscriptomeSequencingInterleavedDataInterface
slot_group: sequencing_section
range: IlluminaInstrumentModelEnum
required: true
processing_institution:
name: processing_institution
description: The organization that processed the sample.
title: processing institution
from_schema: https://w3id.org/nmdc/nmdc
rank: 3
owner: NucleotideSequencing
domain_of:
- PlannedProcess
- MetagenomeSequencingNonInterleavedDataInterface
- MetagenomeSequencingInterleavedDataInterface
- MetatranscriptomeSequencingNonInterleavedDataInterface
- MetatranscriptomeSequencingInterleavedDataInterface
slot_group: sequencing_section
range: ProcessingInstitutionEnum
protocol_link:
name: protocol_link
description: A URL to a description of the sequencing protocol used to generate
the data.
title: protocol
from_schema: https://w3id.org/nmdc/nmdc
rank: 4
owner: NucleotideSequencing
domain_of:
- PlannedProcess
- Study
- MetagenomeSequencingNonInterleavedDataInterface
- MetagenomeSequencingInterleavedDataInterface
- MetatranscriptomeSequencingNonInterleavedDataInterface
- MetatranscriptomeSequencingInterleavedDataInterface
slot_group: sequencing_section
range: string
multivalued: false
insdc_bioproject_identifiers:
name: insdc_bioproject_identifiers
description: identifiers for corresponding project in INSDC Bioproject
title: INSDC bioproject identifier
comments:
- these are distinct IDs from INSDC SRA/ENA project identifiers, but are usually(?)
one to one
examples:
- value: bioproject:PRJNA366857
from_schema: https://w3id.org/nmdc/nmdc
see_also:
- https://www.ncbi.nlm.nih.gov/bioproject/
- https://www.ddbj.nig.ac.jp/bioproject/index-e.html
aliases:
- NCBI bioproject identifiers
- DDBJ bioproject identifiers
rank: 5
is_a: study_identifiers
mixins:
- insdc_identifiers
owner: NucleotideSequencing
domain_of:
- NucleotideSequencing
- Study
- MetagenomeSequencingNonInterleavedDataInterface
- MetagenomeSequencingInterleavedDataInterface
- MetatranscriptomeSequencingNonInterleavedDataInterface
- MetatranscriptomeSequencingInterleavedDataInterface
slot_group: sequencing_section
range: string
multivalued: false
pattern: ^bioproject:PRJ[DEN][A-Z][0-9]+$
insdc_experiment_identifiers:
name: insdc_experiment_identifiers
description: If multiple identifiers are provided, separate them with a semicolon.
The number of identifiers must match the number of sequencing files.
title: INSDC experiment identifiers
from_schema: https://w3id.org/nmdc/nmdc
rank: 6
is_a: external_database_identifiers
mixins:
- insdc_identifiers
owner: NucleotideSequencing
domain_of:
- NucleotideSequencing
- DataObject
- MetagenomeSequencingNonInterleavedDataInterface
- MetagenomeSequencingInterleavedDataInterface
- MetatranscriptomeSequencingNonInterleavedDataInterface
- MetatranscriptomeSequencingInterleavedDataInterface
slot_group: sequencing_section
range: string
multivalued: false
pattern: ^insdc.sra:(E|D|S)RX[0-9]{6,}$
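The two `pattern` values above can be tried out directly. The sketch below copies them verbatim into Python's `re` module, assuming Python's regular-expression semantics are close enough to those of whatever engine validates the schema. The function names are hypothetical.

```python
import re

# Patterns copied verbatim from the slot_usage entries above.
BIOPROJECT_PATTERN = re.compile(r"^bioproject:PRJ[DEN][A-Z][0-9]+$")
EXPERIMENT_PATTERN = re.compile(r"^insdc.sra:(E|D|S)RX[0-9]{6,}$")


def is_valid_bioproject(value: str) -> bool:
    """True if `value` matches the insdc_bioproject_identifiers pattern."""
    return BIOPROJECT_PATTERN.match(value) is not None


def is_valid_experiment(value: str) -> bool:
    """True if `value` matches the insdc_experiment_identifiers pattern."""
    return EXPERIMENT_PATTERN.match(value) is not None
```

Two details are visible from the patterns themselves. The experiment pattern is anchored at both ends, so a semicolon-separated list of identifiers does not match it as a whole, even though the slot description allows several identifiers. The `.` in `insdc.sra` is unescaped, so it matches any single character rather than only a literal dot.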
Induced
name: MetagenomeSequencingNonInterleavedDataInterface
description: Interface for non-interleaved metagenome sequencing data
title: Metagenome Sequence Data (Non-Interleaved)
from_schema: https://example.com/nmdc_submission_schema
mixins:
- DhMultiviewCommonColumnsMixin
slot_usage:
model:
name: model
description: The model of the Illumina sequencing instrument used to generate
the data.
title: instrument model
from_schema: https://w3id.org/nmdc/nmdc
rank: 2
owner: Instrument
domain_of:
- Instrument
- MetagenomeSequencingNonInterleavedDataInterface
- MetagenomeSequencingInterleavedDataInterface
- MetatranscriptomeSequencingNonInterleavedDataInterface
- MetatranscriptomeSequencingInterleavedDataInterface
slot_group: sequencing_section
range: IlluminaInstrumentModelEnum
required: true
processing_institution:
name: processing_institution
description: The organization that processed the sample.
title: processing institution
from_schema: https://w3id.org/nmdc/nmdc
rank: 3
owner: NucleotideSequencing
domain_of:
- PlannedProcess
- MetagenomeSequencingNonInterleavedDataInterface
- MetagenomeSequencingInterleavedDataInterface
- MetatranscriptomeSequencingNonInterleavedDataInterface
- MetatranscriptomeSequencingInterleavedDataInterface
slot_group: sequencing_section
range: ProcessingInstitutionEnum
protocol_link:
name: protocol_link
description: A URL to a description of the sequencing protocol used to generate
the data.
title: protocol
from_schema: https://w3id.org/nmdc/nmdc
rank: 4
owner: NucleotideSequencing
domain_of:
- PlannedProcess
- Study
- MetagenomeSequencingNonInterleavedDataInterface
- MetagenomeSequencingInterleavedDataInterface
- MetatranscriptomeSequencingNonInterleavedDataInterface
- MetatranscriptomeSequencingInterleavedDataInterface
slot_group: sequencing_section
range: string
multivalued: false
insdc_bioproject_identifiers:
name: insdc_bioproject_identifiers
description: identifiers for corresponding project in INSDC Bioproject
title: INSDC bioproject identifier
comments:
- these are distinct IDs from INSDC SRA/ENA project identifiers, but are usually(?)
one to one
examples:
- value: bioproject:PRJNA366857
from_schema: https://w3id.org/nmdc/nmdc
see_also:
- https://www.ncbi.nlm.nih.gov/bioproject/
- https://www.ddbj.nig.ac.jp/bioproject/index-e.html
aliases:
- NCBI bioproject identifiers
- DDBJ bioproject identifiers
rank: 5
is_a: study_identifiers
mixins:
- insdc_identifiers
owner: NucleotideSequencing
domain_of:
- NucleotideSequencing
- Study
- MetagenomeSequencingNonInterleavedDataInterface
- MetagenomeSequencingInterleavedDataInterface
- MetatranscriptomeSequencingNonInterleavedDataInterface
- MetatranscriptomeSequencingInterleavedDataInterface
slot_group: sequencing_section
range: string
multivalued: false
pattern: ^bioproject:PRJ[DEN][A-Z][0-9]+$
insdc_experiment_identifiers:
name: insdc_experiment_identifiers
description: If multiple identifiers are provided, separate them with a semicolon.
The number of identifiers must match the number of sequencing files.
title: INSDC experiment identifiers
from_schema: https://w3id.org/nmdc/nmdc
rank: 6
is_a: external_database_identifiers
mixins:
- insdc_identifiers
owner: NucleotideSequencing
domain_of:
- NucleotideSequencing
- DataObject
- MetagenomeSequencingNonInterleavedDataInterface
- MetagenomeSequencingInterleavedDataInterface
- MetatranscriptomeSequencingNonInterleavedDataInterface
- MetatranscriptomeSequencingInterleavedDataInterface
slot_group: sequencing_section
range: string
multivalued: false
pattern: ^insdc.sra:(E|D|S)RX[0-9]{6,}$
attributes:
read_1_url:
name: read_1_url
description: URL for FASTQ file of read 1 of a pair of reads.
title: read 1 FASTQ
comments:
- If multiple runs were performed, separate each URL with a semi-colon.
- External data URLs should be available for at least a year. If you would like
NMDC to submit your data to an appropriate raw data repository on your behalf,
please contact us at microbiomedata.science@gmail.com.
from_schema: https://example.com/nmdc_submission_schema
rank: 10
alias: read_1_url
owner: MetagenomeSequencingNonInterleavedDataInterface
domain_of:
- MetagenomeSequencingNonInterleavedDataInterface
- MetatranscriptomeSequencingNonInterleavedDataInterface
slot_group: data_files_section
range: string
required: true
multivalued: false
read_1_md5_checksum:
name: read_1_md5_checksum
description: MD5 checksum of file in "read 1 FASTQ".
title: read 1 FASTQ MD5
comments:
- If multiple runs were performed, separate each checksum with a semi-colon. The
number of checksums should match the number of URLs in "read 1 FASTQ".
from_schema: https://example.com/nmdc_submission_schema
rank: 11
alias: read_1_md5_checksum
owner: MetagenomeSequencingNonInterleavedDataInterface
domain_of:
- MetagenomeSequencingNonInterleavedDataInterface
- MetatranscriptomeSequencingNonInterleavedDataInterface
slot_group: data_files_section
range: string
multivalued: false
read_2_url:
name: read_2_url
description: URL for FASTQ file of read 2 of a pair of reads.
title: read 2 FASTQ
comments:
- If multiple runs were performed, separate each URL with a semi-colon.
- External data URLs should be available for at least a year. If you would like
NMDC to submit your data to an appropriate raw data repository on your behalf,
please contact us at microbiomedata.science@gmail.com.
from_schema: https://example.com/nmdc_submission_schema
rank: 12
alias: read_2_url
owner: MetagenomeSequencingNonInterleavedDataInterface
domain_of:
- MetagenomeSequencingNonInterleavedDataInterface
- MetatranscriptomeSequencingNonInterleavedDataInterface
slot_group: data_files_section
range: string
required: true
multivalued: false
read_2_md5_checksum:
name: read_2_md5_checksum
description: MD5 checksum of file in "read 2 FASTQ".
title: read 2 FASTQ MD5
comments:
- If multiple runs were performed, separate each checksum with a semi-colon. The
number of checksums should match the number of URLs in "read 2 FASTQ".
from_schema: https://example.com/nmdc_submission_schema
rank: 13
alias: read_2_md5_checksum
owner: MetagenomeSequencingNonInterleavedDataInterface
domain_of:
- MetagenomeSequencingNonInterleavedDataInterface
- MetatranscriptomeSequencingNonInterleavedDataInterface
slot_group: data_files_section
range: string
multivalued: false
model:
name: model
description: The model of the Illumina sequencing instrument used to generate
the data.
title: instrument model
from_schema: https://w3id.org/nmdc/nmdc
rank: 2
alias: model
owner: MetagenomeSequencingNonInterleavedDataInterface
domain_of:
- Instrument
- MetagenomeSequencingNonInterleavedDataInterface
- MetagenomeSequencingInterleavedDataInterface
- MetatranscriptomeSequencingNonInterleavedDataInterface
- MetatranscriptomeSequencingInterleavedDataInterface
slot_group: sequencing_section
range: IlluminaInstrumentModelEnum
required: true
processing_institution:
name: processing_institution
description: The organization that processed the sample.
title: processing institution
from_schema: https://w3id.org/nmdc/nmdc
rank: 3
alias: processing_institution
owner: MetagenomeSequencingNonInterleavedDataInterface
domain_of:
- PlannedProcess
- MetagenomeSequencingNonInterleavedDataInterface
- MetagenomeSequencingInterleavedDataInterface
- MetatranscriptomeSequencingNonInterleavedDataInterface
- MetatranscriptomeSequencingInterleavedDataInterface
slot_group: sequencing_section
range: ProcessingInstitutionEnum
protocol_link:
name: protocol_link
description: A URL to a description of the sequencing protocol used to generate
the data.
title: protocol
from_schema: https://w3id.org/nmdc/nmdc
rank: 4
alias: protocol_link
owner: MetagenomeSequencingNonInterleavedDataInterface
domain_of:
- PlannedProcess
- Study
- MetagenomeSequencingNonInterleavedDataInterface
- MetagenomeSequencingInterleavedDataInterface
- MetatranscriptomeSequencingNonInterleavedDataInterface
- MetatranscriptomeSequencingInterleavedDataInterface
slot_group: sequencing_section
range: string
multivalued: false
insdc_bioproject_identifiers:
name: insdc_bioproject_identifiers
description: identifiers for corresponding project in INSDC Bioproject
title: INSDC bioproject identifier
comments:
- these are distinct IDs from INSDC SRA/ENA project identifiers, but are usually(?)
one to one
examples:
- value: bioproject:PRJNA366857
from_schema: https://w3id.org/nmdc/nmdc
see_also:
- https://www.ncbi.nlm.nih.gov/bioproject/
- https://www.ddbj.nig.ac.jp/bioproject/index-e.html
aliases:
- NCBI bioproject identifiers
- DDBJ bioproject identifiers
rank: 5
is_a: study_identifiers
mixins:
- insdc_identifiers
alias: insdc_bioproject_identifiers
owner: MetagenomeSequencingNonInterleavedDataInterface
domain_of:
- NucleotideSequencing
- Study
- MetagenomeSequencingNonInterleavedDataInterface
- MetagenomeSequencingInterleavedDataInterface
- MetatranscriptomeSequencingNonInterleavedDataInterface
- MetatranscriptomeSequencingInterleavedDataInterface
slot_group: sequencing_section
range: string
multivalued: false
pattern: ^bioproject:PRJ[DEN][A-Z][0-9]+$
insdc_experiment_identifiers:
name: insdc_experiment_identifiers
description: If multiple identifiers are provided, separate them with a semicolon.
The number of identifiers must match the number of sequencing files.
title: INSDC experiment identifiers
from_schema: https://w3id.org/nmdc/nmdc
rank: 6
is_a: external_database_identifiers
mixins:
- insdc_identifiers
alias: insdc_experiment_identifiers
owner: MetagenomeSequencingNonInterleavedDataInterface
domain_of:
- NucleotideSequencing
- DataObject
- MetagenomeSequencingNonInterleavedDataInterface
- MetagenomeSequencingInterleavedDataInterface
- MetatranscriptomeSequencingNonInterleavedDataInterface
- MetatranscriptomeSequencingInterleavedDataInterface
slot_group: sequencing_section
range: string
multivalued: false
pattern: ^insdc.sra:(E|D|S)RX[0-9]{6,}$
analysis_type:
name: analysis_type
description: Select all the data types associated or available for this biosample
title: analysis/data type
examples:
- value: metagenomics; metabolomics; metaproteomics
from_schema: https://w3id.org/nmdc/nmdc
see_also:
- MIxS:investigation_type
rank: 3
alias: analysis_type
owner: MetagenomeSequencingNonInterleavedDataInterface
domain_of:
- Biosample
- DhMultiviewCommonColumnsMixin
slot_group: sample_id_section
range: AnalysisTypeEnum
required: true
recommended: false
multivalued: true
samp_name:
name: samp_name
annotations:
expected_value:
tag: expected_value
value: text
description: A local identifier or name for the material sample collected.
Refers to the original material collected or to any derived sub-samples.
title: sample name
comments:
- It can have any format, but we suggest that you make it concise, unique and
consistent within your lab, and as informative as possible.
examples:
- value: Rock core CB1178(5-6) from NSW
from_schema: https://w3id.org/nmdc/nmdc
aliases:
- sample name
rank: 1
is_a: investigation field
string_serialization: '{text}'
slot_uri: MIXS:0001107
identifier: true
alias: samp_name
owner: MetagenomeSequencingNonInterleavedDataInterface
domain_of:
- Biosample
- DhMultiviewCommonColumnsMixin
slot_group: sample_id_section
range: string
required: true
multivalued: false
source_mat_id:
name: source_mat_id
annotations:
expected_value:
tag: expected_value
value: 'for cultures of microorganisms: identifiers for two culture collections;
for other material a unique arbitrary identifier'
description: A globally unique identifier assigned to the biological sample.
title: source material identifier
todos:
- Currently, the comments say to use UUIDs. However, if we implement assigning
NMDC identifiers with the minter we don't need to require a GUID. It can be an
optional field to fill out only if they already have a resolvable ID.
notes:
- The source material IS the Globally Unique ID
comments:
- Identifiers must be prefixed. Possible FAIR prefixes are IGSNs (http://www.geosamples.org/getigsn),
NCBI biosample accession numbers, ARK identifiers (https://arks.org/). These
IDs enable linking to derived analytes and subsamples. If you have not assigned
FAIR identifiers to your samples, you can generate UUIDs (https://www.uuidgenerator.net/).
examples:
- value: IGSN:AU1243
- value: UUID:24f1467a-40f4-11ed-b878-0242ac120002
from_schema: https://w3id.org/nmdc/nmdc
aliases:
- source material identifiers
rank: 2
is_a: nucleic acid sequence source field
string_serialization: '{text}:{text}'
slot_uri: MIXS:0000026
alias: source_mat_id
owner: MetagenomeSequencingNonInterleavedDataInterface
domain_of:
- Biosample
- DhMultiviewCommonColumnsMixin
slot_group: sample_id_section
range: string
multivalued: false
pattern: '[^\:\n\r]+\:[^\:\n\r]+'
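The comments on `read_1_url`, `read_1_md5_checksum`, and their read 2 counterparts say that multiple runs are separated with semicolons and that the number of checksums should match the number of URLs. A minimal sketch of that consistency check follows. The helper names are hypothetical, and the 32-hexadecimal-character test reflects the usual hex form of an MD5 digest rather than anything the schema itself enforces.

```python
import re
from typing import List

# An MD5 digest written in hexadecimal is 32 hex characters long.
MD5_HEX = re.compile(r"^[0-9a-fA-F]{32}$")


def split_values(cell: str) -> List[str]:
    """Split a semicolon-separated cell into trimmed, non-empty values."""
    return [part.strip() for part in cell.split(";") if part.strip()]


def check_read_files(urls_cell: str, checksums_cell: str = "") -> List[str]:
    """Return a list of problems found; an empty list means none were found.

    Checksums are optional (cardinality 0..1), so an empty checksum cell
    is accepted; a non-empty one must pair up with the URLs one to one.
    """
    problems = []
    urls = split_values(urls_cell)
    checksums = split_values(checksums_cell)
    if not urls:
        problems.append("no URL given")
    if checksums and len(checksums) != len(urls):
        problems.append(f"{len(checksums)} checksum(s) for {len(urls)} URL(s)")
    for checksum in checksums:
        if not MD5_HEX.match(checksum):
            problems.append(f"not a 32-character hex MD5: {checksum}")
    return problems
```

The same function can be called once for the read 1 pair of cells and once for the read 2 pair. It does not fetch the files or recompute their checksums.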