
Class: Organism

A material entity that is a living or once-living individual. Organism instances represent the biological identity of what is in a sample, not the sample itself. Sub-species identity (strain, cultivar, lab isolate) is captured by slots on this class rather than via a separate Strain subclass.

URI: nmdc:Organism

classDiagram
    class Organism
    click Organism href "../Organism"
    MaterialEntity <|-- Organism
    click MaterialEntity href "../MaterialEntity"
    Organism : alternative_identifiers
    Organism : classified_as
    Organism --> "*" NcbiTaxon : classified_as
    click NcbiTaxon href "../NcbiTaxon"
    Organism : description
    Organism : estimated_size
    Organism : gc_content
    Organism --> "0..1" QuantityValue : gc_content
    click QuantityValue href "../QuantityValue"
    Organism : id
    Organism : isolate_name
    Organism : name
    Organism : organism_genus
    Organism : organism_species
    Organism : ref_biomaterial
    Organism --> "0..1" TextValue : ref_biomaterial
    click TextValue href "../TextValue"
    Organism : strain_name
    Organism : type

Inheritance

Slots

| Name | Cardinality and Range | Description | Inheritance |
| --- | --- | --- | --- |
| classified_as | * NcbiTaxon | Taxonomic classification of this organism | direct |
| organism_genus | 0..1 String | Genus of the organism | direct |
| organism_species | 0..1 String | Species of the organism | direct |
| strain_name | 0..1 String | Strain or cultivar name of the organism | direct |
| isolate_name | 0..1 String | Isolate or mutant name | direct |
| estimated_size | 0..1 String | Estimated genome size, as integer base pairs | direct |
| gc_content | 0..1 QuantityValue | Estimated GC content as a percentage | direct |
| ref_biomaterial | 0..1 TextValue | Reference for the organism, preferentially a DOI when a primary publication o... | direct |
| id | 1 Uriorcurie | A unique identifier for a thing | NamedThing |
| name | 0..1 String | A human readable label for an entity | NamedThing |
| description | 0..1 String | a human-readable description of a thing | NamedThing |
| alternative_identifiers | * Uriorcurie | A list of alternative identifiers for the entity | NamedThing |
| type | 1 Uriorcurie | the class_uri of the class that has been instantiated | NamedThing |
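
The cardinalities listed above can be illustrated with a rough Python sketch that holds an Organism record as a plain dict and checks the required slots (`id`, `type`), the single-valued slots, and the 0 to 100 range that the LinkML source notes for `gc_content`. The helper `check_organism` and the identifier `nmdc:orgn-99-abc123` are hypothetical, and this is not the NMDC validator; the other values come from the examples in the LinkML source.

```python
# Slots with cardinality 1 in the Slots table.
REQUIRED = ("id", "type")
# Slots with cardinality * (multivalued); every other slot holds at most one value.
MULTIVALUED = ("classified_as", "alternative_identifiers")


def check_organism(record: dict) -> list:
    """Return a list of problems found; an empty list means the checks passed."""
    problems = []
    for slot in REQUIRED:
        if not record.get(slot):
            problems.append(f"missing required slot: {slot}")
    for slot, value in record.items():
        if slot in MULTIVALUED:
            if not isinstance(value, list):
                problems.append(f"{slot} should be a list")
        elif isinstance(value, list):
            problems.append(f"{slot} should hold a single value")
    gc = record.get("gc_content")
    if isinstance(gc, dict):
        number = gc.get("has_numeric_value")
        if number is None or not 0 <= number <= 100:
            problems.append("gc_content.has_numeric_value should be within 0-100")
    return problems


organism = {
    "id": "nmdc:orgn-99-abc123",  # hypothetical identifier
    "type": "nmdc:Organism",
    "organism_genus": "Shewanella",
    "organism_species": "loihica",
    "strain_name": "PV-4",
    "gc_content": {
        "type": "nmdc:QuantityValue",
        "has_numeric_value": 54.0,
        "has_unit": "%",
    },
}

print(check_organism(organism))  # prints []
```

Returning a list of problems rather than raising on the first one lets a caller report every issue in a record at once.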

Usages

| used by | used in | type | used |
| --- | --- | --- | --- |
| Database | organism_set | range | Organism |
| OrganismSample | expected_organism | range | Organism |

Comments

  • Organism instances are stored in organism_set. An Organism is not a sample — it is the biological entity that an OrganismSample is expected to contain, linked via expected_organism. Sub-species identity (strain_name, isolate_name) is captured directly on Organism.
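
The relationship described in this comment can be sketched in Python as a lookup from a sample to its Organism. This is a minimal illustration under stated assumptions: that `expected_organism` stores the Organism `id` as a reference (the source only states that its range is Organism), that `nmdc:OrganismSample` is the sample's type value, and that the identifiers shown are placeholders. The helper `resolve_expected_organism` is hypothetical.

```python
# A fragment shaped like a Database: organism_set holds Organism records.
database = {
    "organism_set": [
        {
            "id": "nmdc:orgn-99-abc123",  # hypothetical identifier
            "type": "nmdc:Organism",
            "name": "Shewanella loihica PV-4",
            "strain_name": "PV-4",
        },
    ],
}

# Hypothetical OrganismSample; assumes expected_organism stores the Organism id.
sample = {
    "id": "nmdc:example-sample-1",  # placeholder, not a real NMDC id pattern
    "type": "nmdc:OrganismSample",  # assumed class_uri
    "expected_organism": "nmdc:orgn-99-abc123",
}


def resolve_expected_organism(sample: dict, database: dict):
    """Look up the Organism that a sample is expected to contain, or None."""
    by_id = {org["id"]: org for org in database.get("organism_set", [])}
    return by_id.get(sample.get("expected_organism"))


organism = resolve_expected_organism(sample, database)
print(organism["strain_name"] if organism else "no matching Organism")  # prints PV-4
```

Sub-species identity is read from the Organism record, not from the sample, which mirrors the separation the comment draws between the two.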

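See Also

  • https://github.com/microbiomedata/nmdc-schema/issues/2959
  • https://github.com/microbiomedata/nmdc-schema/issues/2803
  • https://github.com/microbiomedata/nmdc-schema/issues/2971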

Identifier and Mapping Information

Schema Source

  • from schema: https://w3id.org/nmdc/nmdc

Mappings

| Mapping Type | Mapped Value |
| --- | --- |
| exact | COB:0000022 |

LinkML Source

Direct

name: Organism
description: A material entity that is a living or once-living individual. Organism
  instances represent the biological identity of what is in a sample, not the sample
  itself. Sub-species identity (strain, cultivar, lab isolate) is captured by slots
  on this class rather than via a separate Strain subclass.
notes:
- 'DEBATED — `estimated_size` and `gc_content` placement on Organism. Montana argues
  these are analyte properties measured during sample QC (like concentration or absorbance)
  rather than stable organism properties, and belong only in submission-schema. Counterargument:
  genome size and GC% are reproducible biological properties of the organism that
  are useful for downstream data integration. Keeping on Organism pending resolution.'
comments:
- Organism instances are stored in organism_set. An Organism is not a sample — it
  is the biological entity that an OrganismSample is expected to contain, linked via
  expected_organism. Sub-species identity (strain_name, isolate_name) is captured
  directly on Organism.
from_schema: https://w3id.org/nmdc/nmdc
see_also:
- https://github.com/microbiomedata/nmdc-schema/issues/2959
- https://github.com/microbiomedata/nmdc-schema/issues/2803
- https://github.com/microbiomedata/nmdc-schema/issues/2971
exact_mappings:
- COB:0000022
is_a: MaterialEntity
slots:
- classified_as
- organism_genus
- organism_species
- strain_name
- isolate_name
- estimated_size
- gc_content
- ref_biomaterial
slot_usage:
  id:
    name: id
    required: true
    structured_pattern:
      syntax: '{id_nmdc_prefix}:orgn-{id_shoulder}-{id_blade}$'
      interpolated: true
  classified_as:
    name: classified_as
    description: 'Taxonomic classification of this organism. Narrowed from the global
      OntologyClass range (defined on the slot itself) to NcbiTaxon, since organism
      identity at NMDC is anchored to NCBI Taxonomy. Per #3016 — the broader pattern
      is to narrow `classified_as` to `NcbiTaxon` on all organism-oriented classes
      via slot_usage.'
    range: NcbiTaxon
  estimated_size:
    name: estimated_size
    description: Estimated genome size, as integer base pairs. Reuses MIxS estimated_size
      (MIXS:0000024). The JGI isolate field reports in megabases (Mb); values must
      be converted to the MIxS integer-bp representation before validation and storage.
      The submission portal should auto-populate the "bp" suffix and enforce integer
      input.
    structured_aliases:
    - literal_form: Estimated Genome Size (Mb)
      predicate: BROAD_SYNONYM
      contexts:
      - https://jgi.doe.gov/isolate-submission-form/v19
    structured_pattern:
      syntax: ^[0-9]+ bp$
      interpolated: false
  ref_biomaterial:
    name: ref_biomaterial
    description: Reference for the organism, preferentially a DOI when a primary publication
      or genome report exists; PMID and URL are also accepted per the MIxS ref_biomaterial
      pattern (`{PMID}|{DOI}|{URL}`). Reuses MIxS ref_biomaterial (MIXS:0000025).
    comments:
    - The MIxS pattern accepts DOI, PMID, or URL. DOI is preferred when available
      — it gives a stable reference to the publication or genome report. See the `associated_dois`
      pattern elsewhere in the NMDC schema for DOI-structured alternatives.
    - JGI "Reference Genome" submissions sometimes carry non-publication identifiers
      such as IMG or Phytozome IDs, which do not match the MIxS pattern. Those are
      out of scope for this slot and should be captured separately (see `gold_organism_identifiers`
      and `insdc_nucleotide_identifiers` for genome / assembly references).
    - The MIxS name ref_biomaterial may be renamed in a future MIxS release. See ongoing
      MIxS renaming work.
    examples:
    - description: DOI form (preferred when a primary publication exists)
      object:
        type: nmdc:TextValue
        has_raw_value: doi:10.1016/j.syapm.2018.01.009
    - description: PubMed ID form
      object:
        type: nmdc:TextValue
        has_raw_value: PMID:24296464
    - description: URL form (e.g. NCBI Genome record)
      object:
        type: nmdc:TextValue
        has_raw_value: https://www.ncbi.nlm.nih.gov/datasets/genome/GCA_000016065.1/
    structured_aliases:
    - literal_form: Reference Genome
      predicate: RELATED_SYNONYM
      contexts:
      - https://jgi.doe.gov/isolate-submission-form/v19
class_uri: nmdc:Organism
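
The `estimated_size` slot_usage above asks for megabase values from the JGI form to be converted to the integer-bp string form before validation. A minimal Python sketch of that conversion follows, assuming 1 Mb = 1,000,000 bp and a decimal string as input. The function names are hypothetical; the regular expression is copied from the slot's `structured_pattern`.

```python
import re
from decimal import Decimal

# Pattern copied from the estimated_size structured_pattern.
ESTIMATED_SIZE_PATTERN = re.compile(r"^[0-9]+ bp$")


def megabases_to_estimated_size(megabases: str) -> str:
    """Convert a megabase value such as "0.3" to the integer-bp string form."""
    # Decimal avoids binary floating-point rounding when scaling by one million.
    base_pairs = Decimal(megabases) * 1_000_000
    if base_pairs < 0:
        raise ValueError("genome size cannot be negative")
    if base_pairs != base_pairs.to_integral_value():
        raise ValueError("value does not resolve to a whole number of base pairs")
    return f"{int(base_pairs)} bp"


def is_valid_estimated_size(value: str) -> bool:
    """Check a string against the slot's pattern."""
    return ESTIMATED_SIZE_PATTERN.fullmatch(value) is not None


size = megabases_to_estimated_size("0.3")
print(size, is_valid_estimated_size(size))  # prints: 300000 bp True
```

A value still expressed in megabases, such as `"0.3 Mb"`, fails `is_valid_estimated_size`, which is the case the description warns about.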

Induced

name: Organism
description: A material entity that is a living or once-living individual. Organism
  instances represent the biological identity of what is in a sample, not the sample
  itself. Sub-species identity (strain, cultivar, lab isolate) is captured by slots
  on this class rather than via a separate Strain subclass.
notes:
- 'DEBATED — `estimated_size` and `gc_content` placement on Organism. Montana argues
  these are analyte properties measured during sample QC (like concentration or absorbance)
  rather than stable organism properties, and belong only in submission-schema. Counterargument:
  genome size and GC% are reproducible biological properties of the organism that
  are useful for downstream data integration. Keeping on Organism pending resolution.'
comments:
- Organism instances are stored in organism_set. An Organism is not a sample — it
  is the biological entity that an OrganismSample is expected to contain, linked via
  expected_organism. Sub-species identity (strain_name, isolate_name) is captured
  directly on Organism.
from_schema: https://w3id.org/nmdc/nmdc
see_also:
- https://github.com/microbiomedata/nmdc-schema/issues/2959
- https://github.com/microbiomedata/nmdc-schema/issues/2803
- https://github.com/microbiomedata/nmdc-schema/issues/2971
exact_mappings:
- COB:0000022
is_a: MaterialEntity
slot_usage:
  id:
    name: id
    required: true
    structured_pattern:
      syntax: '{id_nmdc_prefix}:orgn-{id_shoulder}-{id_blade}$'
      interpolated: true
  classified_as:
    name: classified_as
    description: 'Taxonomic classification of this organism. Narrowed from the global
      OntologyClass range (defined on the slot itself) to NcbiTaxon, since organism
      identity at NMDC is anchored to NCBI Taxonomy. Per #3016 — the broader pattern
      is to narrow `classified_as` to `NcbiTaxon` on all organism-oriented classes
      via slot_usage.'
    range: NcbiTaxon
  estimated_size:
    name: estimated_size
    description: Estimated genome size, as integer base pairs. Reuses MIxS estimated_size
      (MIXS:0000024). The JGI isolate field reports in megabases (Mb); values must
      be converted to the MIxS integer-bp representation before validation and storage.
      The submission portal should auto-populate the "bp" suffix and enforce integer
      input.
    structured_aliases:
    - literal_form: Estimated Genome Size (Mb)
      predicate: BROAD_SYNONYM
      contexts:
      - https://jgi.doe.gov/isolate-submission-form/v19
    structured_pattern:
      syntax: ^[0-9]+ bp$
      interpolated: false
  ref_biomaterial:
    name: ref_biomaterial
    description: Reference for the organism, preferentially a DOI when a primary publication
      or genome report exists; PMID and URL are also accepted per the MIxS ref_biomaterial
      pattern (`{PMID}|{DOI}|{URL}`). Reuses MIxS ref_biomaterial (MIXS:0000025).
    comments:
    - The MIxS pattern accepts DOI, PMID, or URL. DOI is preferred when available
      — it gives a stable reference to the publication or genome report. See the `associated_dois`
      pattern elsewhere in the NMDC schema for DOI-structured alternatives.
    - JGI "Reference Genome" submissions sometimes carry non-publication identifiers
      such as IMG or Phytozome IDs, which do not match the MIxS pattern. Those are
      out of scope for this slot and should be captured separately (see `gold_organism_identifiers`
      and `insdc_nucleotide_identifiers` for genome / assembly references).
    - The MIxS name ref_biomaterial may be renamed in a future MIxS release. See ongoing
      MIxS renaming work.
    examples:
    - description: DOI form (preferred when a primary publication exists)
      object:
        type: nmdc:TextValue
        has_raw_value: doi:10.1016/j.syapm.2018.01.009
    - description: PubMed ID form
      object:
        type: nmdc:TextValue
        has_raw_value: PMID:24296464
    - description: URL form (e.g. NCBI Genome record)
      object:
        type: nmdc:TextValue
        has_raw_value: https://www.ncbi.nlm.nih.gov/datasets/genome/GCA_000016065.1/
    structured_aliases:
    - literal_form: Reference Genome
      predicate: RELATED_SYNONYM
      contexts:
      - https://jgi.doe.gov/isolate-submission-form/v19
attributes:
  classified_as:
    name: classified_as
    description: 'Taxonomic classification of this organism. Narrowed from the global
      OntologyClass range (defined on the slot itself) to NcbiTaxon, since organism
      identity at NMDC is anchored to NCBI Taxonomy. Per #3016 — the broader pattern
      is to narrow `classified_as` to `NcbiTaxon` on all organism-oriented classes
      via slot_usage.'
    comments:
    - 'Taxonomy-oriented uses (e.g. on Organism) should point to NcbiTaxon instances.
      OrganismSample reaches taxonomy indirectly via expected_organism.classified_as.
      The global range stays OntologyClass; narrowing to NcbiTaxon via slot_usage
      is tracked in #3016.'
    from_schema: https://w3id.org/nmdc/nmdc
    see_also:
    - https://github.com/microbiomedata/nmdc-schema/issues/2959
    narrow_mappings:
    - biolink:in_taxon
    rank: 1000
    alias: classified_as
    owner: Organism
    domain_of:
    - Organism
    range: NcbiTaxon
    multivalued: true
    inlined: true
    inlined_as_list: true
  organism_genus:
    name: organism_genus
    description: Genus of the organism.
    comments:
    - Free-text submitter-provided genus name. For an ontology-grounded classification,
      use `classified_as` with a NcbiTaxon instance on the parent Organism class.
    examples:
    - value: Shewanella
      description: GOLD organism_v2 Go0000189 (Shewanella loihica PV-4, queried 2026-04-21)
    - value: Ruegeria
      description: GOLD organism_v2 Go0000514 (Ruegeria pomeroyi DSS-3, queried 2026-04-21)
    - value: Campylobacter
      description: GOLD organism_v2 (Go0000058, queried 2026-04-14)
    in_subset:
    - jgi_isolate
    from_schema: https://w3id.org/nmdc/nmdc
    structured_aliases:
    - literal_form: Genus
      predicate: EXACT_SYNONYM
      contexts:
      - https://jgi.doe.gov/isolate-submission-form/v19
    rank: 1000
    alias: organism_genus
    owner: Organism
    domain_of:
    - Organism
    range: string
  organism_species:
    name: organism_species
    description: Species of the organism.
    comments:
    - Free-text submitter-provided species name. For an ontology-grounded classification,
      use `classified_as` with a NcbiTaxon instance on the parent Organism class.
    examples:
    - value: loihica
      description: GOLD organism_v2 Go0000189 (Shewanella loihica PV-4, queried 2026-04-21)
    - value: pomeroyi
      description: GOLD organism_v2 Go0000514 (Ruegeria pomeroyi DSS-3, queried 2026-04-21)
    - value: sp.
      description: GOLD organism_v2.species (n=37 records, queried 2026-04-30) — used
        when the isolate has not yet been assigned a species name
    in_subset:
    - jgi_isolate
    from_schema: https://w3id.org/nmdc/nmdc
    structured_aliases:
    - literal_form: Species
      predicate: EXACT_SYNONYM
      contexts:
      - https://jgi.doe.gov/isolate-submission-form/v19
    rank: 1000
    alias: organism_species
    owner: Organism
    domain_of:
    - Organism
    range: string
  strain_name:
    name: strain_name
    description: Strain or cultivar name of the organism.
    comments:
    - 'Microbial strain identifiers and plant cultivar names (governed by the International
      Code of Nomenclature for Cultivated Plants, ICNCP) are nomenclaturally distinct,
      but this slot accepts both for now to match the JGI Isolate (NA) v19 form''s
      combined "Strain or cultivar" field. A separate `cultivar_name` slot may be
      added if a plant-specific use case emerges; see #3056.'
    - MIxS `subspecf_gen_lin` (MIXS:0000020) covers this concept along with cultivar,
      serovar, biotype, ecotype, and other sub-species lineage types in a single slot
      using a rank-prefix encoding (e.g. "strain:PV-4"). NMDC splits the concept into
      separate slots; this slot covers the strain rank specifically.
    examples:
    - value: PV-4
      description: GOLD organism_v2 Go0000189 (Shewanella loihica PV-4, queried 2026-04-21)
    - value: DSS-3
      description: GOLD organism_v2 Go0000514 (Ruegeria pomeroyi DSS-3, queried 2026-04-21)
    - value: DSM 6724
      description: GOLD organism_v2 Dictyoglomus turgidum (Go0000002, queried 2026-04-14)
    in_subset:
    - jgi_isolate
    from_schema: https://w3id.org/nmdc/nmdc
    structured_aliases:
    - literal_form: Strain or cultivar
      predicate: EXACT_SYNONYM
      contexts:
      - https://jgi.doe.gov/isolate-submission-form/v19
    related_mappings:
    - MIXS:0000020
    rank: 1000
    alias: strain_name
    owner: Organism
    domain_of:
    - Organism
    range: string
  isolate_name:
    name: isolate_name
    description: Isolate or mutant name.
    comments:
    - MIxS `subspecf_gen_lin` (MIXS:0000020) covers this concept along with strain,
      cultivar, serovar, biotype, ecotype, and other sub-species lineage types in
      a single slot using a rank-prefix encoding. NMDC uses a separate slot for the
      isolate rank specifically.
    examples:
    - value: Bd21-3
      description: GOLD dw_sample_taxonomy_info.isolate (n=260 records, queried 2026-04-30)
        — Brachypodium distachyon Bd21-3 reference accession
    - value: MR164
      description: GOLD dw_sample_taxonomy_info.isolate (n=555 records, queried 2026-04-30)
    - value: Isolate
      description: GOLD dw_sample_taxonomy_info.isolate (n=918 records, queried 2026-04-30)
        — generic placeholder used when no specific mutant/isolate name is recorded
    in_subset:
    - jgi_isolate
    from_schema: https://w3id.org/nmdc/nmdc
    structured_aliases:
    - literal_form: Isolate
      predicate: EXACT_SYNONYM
      contexts:
      - https://jgi.doe.gov/isolate-submission-form/v19
    related_mappings:
    - MIXS:0000020
    rank: 1000
    alias: isolate_name
    owner: Organism
    domain_of:
    - Organism
    range: string
  estimated_size:
    name: estimated_size
    annotations:
      Expected_value:
        tag: Expected_value
        value: number of base pairs
    description: Estimated genome size, as integer base pairs. Reuses MIxS estimated_size
      (MIXS:0000024). The JGI isolate field reports in megabases (Mb); values must
      be converted to the MIxS integer-bp representation before validation and storage.
      The submission portal should auto-populate the "bp" suffix and enforce integer
      input.
    title: estimated size
    examples:
    - value: 300000 bp
    from_schema: https://w3id.org/nmdc/nmdc
    structured_aliases:
    - literal_form: Estimated Genome Size (Mb)
      predicate: BROAD_SYNONYM
      contexts:
      - https://jgi.doe.gov/isolate-submission-form/v19
    rank: 1000
    keywords:
    - size
    string_serialization: '{integer} bp'
    slot_uri: MIXS:0000024
    alias: estimated_size
    owner: Organism
    domain_of:
    - Organism
    range: string
    structured_pattern:
      syntax: ^[0-9]+ bp$
      interpolated: false
  gc_content:
    name: gc_content
    annotations:
      storage_units:
        tag: storage_units
        value: '%'
    description: Estimated GC content as a percentage.
    comments:
    - Expected `has_numeric_value` range is 0–100 (percentage units).
    examples:
    - description: GOLD project Gp0000189 (Shewanella loihica PV-4, queried 2026-04-21)
      object:
        type: nmdc:QuantityValue
        has_numeric_value: 54.0
        has_unit: '%'
    - description: GOLD project Gp0000514 (Ruegeria pomeroyi DSS-3, queried 2026-04-21)
      object:
        type: nmdc:QuantityValue
        has_numeric_value: 64.0
        has_unit: '%'
    in_subset:
    - jgi_isolate
    from_schema: https://w3id.org/nmdc/nmdc
    structured_aliases:
    - literal_form: GC Content %
      predicate: EXACT_SYNONYM
      contexts:
      - https://jgi.doe.gov/isolate-submission-form/v19
    rank: 1000
    alias: gc_content
    owner: Organism
    domain_of:
    - Organism
    range: QuantityValue
  ref_biomaterial:
    name: ref_biomaterial
    description: Reference for the organism, preferentially a DOI when a primary publication
      or genome report exists; PMID and URL are also accepted per the MIxS ref_biomaterial
      pattern (`{PMID}|{DOI}|{URL}`). Reuses MIxS ref_biomaterial (MIXS:0000025).
    title: reference for biomaterial
    comments:
    - The MIxS pattern accepts DOI, PMID, or URL. DOI is preferred when available
      — it gives a stable reference to the publication or genome report. See the `associated_dois`
      pattern elsewhere in the NMDC schema for DOI-structured alternatives.
    - JGI "Reference Genome" submissions sometimes carry non-publication identifiers
      such as IMG or Phytozome IDs, which do not match the MIxS pattern. Those are
      out of scope for this slot and should be captured separately (see `gold_organism_identifiers`
      and `insdc_nucleotide_identifiers` for genome / assembly references).
    - The MIxS name ref_biomaterial may be renamed in a future MIxS release. See ongoing
      MIxS renaming work.
    examples:
    - description: DOI form (preferred when a primary publication exists)
      object:
        type: nmdc:TextValue
        has_raw_value: doi:10.1016/j.syapm.2018.01.009
    - description: PubMed ID form
      object:
        type: nmdc:TextValue
        has_raw_value: PMID:24296464
    - description: URL form (e.g. NCBI Genome record)
      object:
        type: nmdc:TextValue
        has_raw_value: https://www.ncbi.nlm.nih.gov/datasets/genome/GCA_000016065.1/
    from_schema: https://w3id.org/nmdc/nmdc
    structured_aliases:
    - literal_form: Reference Genome
      predicate: RELATED_SYNONYM
      contexts:
      - https://jgi.doe.gov/isolate-submission-form/v19
    rank: 1000
    slot_uri: MIXS:0000025
    alias: ref_biomaterial
    owner: Organism
    domain_of:
    - Organism
    range: TextValue
    structured_pattern:
      syntax: ^({PMID}|{DOI}|{URL})$
      interpolated: true
      partial_match: true
  id:
    name: id
    description: A unique identifier for a thing. Must be either a CURIE shorthand
      for a URI or a complete URI
    notes:
    - 'abstracted pattern: prefix:typecode-authshoulder-blade(.version)?(_seqsuffix)?'
    - a minimum length of 3 characters is suggested for typecodes, but 1 or 2 characters
      will be accepted
    - typecodes must correspond 1:1 to a class in the NMDC schema. this will be checked
      via per-class id slot usage assertions
    - minting authority shoulders should probably be enumerated and checked in the
      pattern
    examples:
    - value: nmdc:mgmag-00-x012.1_7_c1
      description: https://github.com/microbiomedata/nmdc-schema/pull/499#discussion_r1018499248
    from_schema: https://w3id.org/nmdc/nmdc
    structured_aliases:
    - literal_form: workflow_execution_id
      predicate: NARROW_SYNONYM
      contexts:
      - https://bitbucket.org/berkeleylab/jgi-jat/macros/nmdc_metadata.yaml
    - literal_form: data_object_id
      predicate: NARROW_SYNONYM
      contexts:
      - https://bitbucket.org/berkeleylab/jgi-jat/macros/nmdc_metadata.yaml
    rank: 1000
    identifier: true
    alias: id
    owner: Organism
    domain_of:
    - NamedThing
    range: uriorcurie
    required: true
    pattern: ^[a-zA-Z0-9][a-zA-Z0-9_\.]+:[a-zA-Z0-9_][a-zA-Z0-9_\-\/\.,]*$
    structured_pattern:
      syntax: '{id_nmdc_prefix}:orgn-{id_shoulder}-{id_blade}$'
      interpolated: true
  name:
    name: name
    description: A human readable label for an entity
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    alias: name
    owner: Organism
    domain_of:
    - PersonValue
    - NamedThing
    - Protocol
    range: string
  description:
    name: description
    description: a human-readable description of a thing
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    slot_uri: dcterms:description
    alias: description
    owner: Organism
    domain_of:
    - ImageValue
    - NamedThing
    - Protocol
    range: string
  alternative_identifiers:
    name: alternative_identifiers
    description: A list of alternative identifiers for the entity.
    from_schema: https://w3id.org/nmdc/nmdc
    rank: 1000
    alias: alternative_identifiers
    owner: Organism
    domain_of:
    - NamedThing
    - MetaboliteIdentification
    range: uriorcurie
    multivalued: true
    pattern: ^[a-zA-Z0-9][a-zA-Z0-9_\.]+:[a-zA-Z0-9_][a-zA-Z0-9_\-\/\.,\(\)\=\#]*$
  type:
    name: type
    description: the class_uri of the class that has been instantiated
    notes:
    - makes it easier to read example data files
    - required for polymorphic MongoDB collections
    examples:
    - value: nmdc:Biosample
    - value: nmdc:Study
    from_schema: https://w3id.org/nmdc/nmdc
    see_also:
    - https://github.com/microbiomedata/nmdc-schema/issues/1048
    - https://github.com/microbiomedata/nmdc-schema/issues/1233
    - https://github.com/microbiomedata/nmdc-schema/issues/248
    structured_aliases:
    - literal_form: workflow_execution_class
      predicate: NARROW_SYNONYM
      contexts:
      - https://bitbucket.org/berkeleylab/jgi-jat/macros/nmdc_metadata.yaml
    rank: 1000
    slot_uri: rdf:type
    designates_type: true
    alias: type
    owner: Organism
    domain_of:
    - EukEval
    - FunctionalAnnotationAggMember
    - PeptideQuantification
    - ProteinQuantification
    - GenomeFeature
    - FunctionalAnnotation
    - AttributeValue
    - NamedThing
    - OntologyRelation
    - FailureCategorization
    - Protocol
    - CreditAssociation
    - Doi
    - ProvenanceMetadata
    - MobilePhaseSegment
    - PortionOfSubstance
    - MagBin
    - MetaboliteIdentification
    range: uriorcurie
    required: true
class_uri: nmdc:Organism
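
The `strain_name` and `isolate_name` comments above note that MIxS `subspecf_gen_lin` packs several lineage types into one slot with a rank-prefix encoding (e.g. "strain:PV-4"), while NMDC uses separate slots. The Python sketch below shows one way such a value might be routed to the NMDC slots. The prefix-to-slot mapping, the `isolate` prefix spelling, and the function name are assumptions for illustration; ranks without a corresponding slot here (serovar, biotype, ecotype) are left unmapped.

```python
# Assumed mapping from a rank prefix to the NMDC slot that would hold the value.
RANK_TO_SLOT = {
    "strain": "strain_name",
    "cultivar": "strain_name",  # strain_name currently accepts cultivar names too
    "isolate": "isolate_name",
}


def split_rank_prefixed(value: str) -> dict:
    """Split a "rank:name" string and return {slot: name}; empty if unmapped."""
    rank, separator, name = value.partition(":")
    slot = RANK_TO_SLOT.get(rank.strip().lower())
    if not separator or slot is None or not name.strip():
        return {}
    return {slot: name.strip()}


print(split_rank_prefixed("strain:PV-4"))     # prints {'strain_name': 'PV-4'}
print(split_rank_prefixed("serovar:example"))  # prints {}
```

Returning an empty dict for unmapped ranks keeps the caller in control of what to do with lineage types that have no NMDC slot.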