sample_annotator package

Submodules

sample_annotator.report_model module

class sample_annotator.report_model.AnnotationMultiSampleReport(reports: List[AnnotationReport] | None = None)

Bases: object

Multi-report of a set of samples

all_outputs() → List[Dict[str, Any]]

as_dataframe()

reports: List[AnnotationReport] = None
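
As a rough illustration of how a multi-sample report might aggregate its members, the sketch below uses hypothetical stand-in classes (Report, MultiReport) rather than the package's own. It assumes that all_outputs() collects each report's output dictionary in order, which the reference above does not confirm.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class Report:
    """Hypothetical stand-in for AnnotationReport (two fields only)."""
    sample_id: str
    output: Optional[Dict[str, Any]] = None


@dataclass
class MultiReport:
    """Hypothetical stand-in for AnnotationMultiSampleReport."""
    reports: List[Report] = field(default_factory=list)

    def all_outputs(self) -> List[Dict[str, Any]]:
        # Assumption: one output dict per report, in input order;
        # reports that produced no output are skipped.
        return [r.output for r in self.reports if r.output is not None]


multi = MultiReport([
    Report("s1", {"id": "s1", "depth": "5 m"}),
    Report("s2", {"id": "s2", "depth": "10 m"}),
    Report("s3"),  # no output produced
])
outputs = multi.all_outputs()
```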

class sample_annotator.report_model.AnnotationReport

Bases: object

Annotation report for a single sample

add_message(*args, **kwargs)

annotation_sufficiency_score = 0.0

as_dataframe()

input: Dict[str, Any] = None

max_severity()

messages: List[Message] = None

messages_by_category() → Dict

output: Dict[str, Any] = None

package: PackageCombo = None

passes()

sample_id: str = None
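
The reference lists messages_by_category(), max_severity() and passes() without describing them. The sketch below shows one plausible reading, using hypothetical stand-ins (Msg, Report). The empty-report severity of 0 and the pass threshold of 2 are assumptions made for illustration, not documented behaviour.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Msg:
    """Hypothetical stand-in for Message."""
    description: str
    severity: int = 1
    category: str = "unclassified"


@dataclass
class Report:
    """Hypothetical stand-in for AnnotationReport."""
    sample_id: str
    messages: List[Msg] = field(default_factory=list)

    def add_message(self, *args, **kwargs) -> None:
        # Forward everything to the message constructor.
        self.messages.append(Msg(*args, **kwargs))

    def messages_by_category(self) -> Dict[str, List[Msg]]:
        grouped: Dict[str, List[Msg]] = defaultdict(list)
        for m in self.messages:
            grouped[m.category].append(m)
        return dict(grouped)

    def max_severity(self) -> int:
        # Assumption: a report with no messages has severity 0.
        return max((m.severity for m in self.messages), default=0)

    def passes(self, threshold: int = 2) -> bool:
        # Assumption: the threshold is invented for this sketch.
        return self.max_severity() < threshold


report = Report("s1")
report.add_message("missing unit", severity=2, category="units")
report.add_message("unknown key", category="unknown-field")
grouped = report.messages_by_category()
```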

class sample_annotator.report_model.Category(value)

Bases: Enum

An enumeration.

BadNull = 'bad-null'

ControlledVocabulary = 'controlled-vocabulary'

Core = 'core'

Geo = 'geo'

Identifier = 'identifier'

Inapplicable = 'inapplicable'

MeasurementSyntax = 'measurement-syntax'

MissingCore = 'missing-core'

Unclassified = 'unclassified'

Units = 'units'

UnknownField = 'unknown-field'

static list()
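
For readers unfamiliar with the pattern, here is a minimal sketch of an Enum with a static list() helper. Only three of the members above are reproduced, and the assumption that list() returns the string values (rather than the members) is not confirmed by the reference.

```python
from enum import Enum
from typing import List


class Category(Enum):
    # Subset of the documented members, for brevity.
    Core = "core"
    Geo = "geo"
    Units = "units"

    @staticmethod
    def list() -> List[str]:
        # Assumption: return the values in definition order.
        return [c.value for c in Category]


values = Category.list()
geo = Category("geo")  # lookup by value returns the member
```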

class sample_annotator.report_model.Message(description: str | None = None, severity: int = 1, was_repaired: bool | None = None, category: Category = Category.Unclassified, field: str | None = None)

Bases: object

Individual report message

as_dict() → Dict

category: Category = 'unclassified'

description: str = None

field: str = None

severity: int = 1

was_repaired: bool = None
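
The constructor signature above maps naturally onto a dataclass. The following is a sketch under that assumption, with a cut-down Category enum. How the real as_dict() serializes the category is not documented; flattening it to its string value is a guess.

```python
from dataclasses import asdict, dataclass
from enum import Enum
from typing import Any, Dict, Optional


class Category(Enum):
    # Two members only; the full list is documented above.
    Unclassified = "unclassified"
    Units = "units"


@dataclass
class Message:
    description: Optional[str] = None
    severity: int = 1
    was_repaired: Optional[bool] = None
    category: Category = Category.Unclassified
    field: Optional[str] = None

    def as_dict(self) -> Dict[str, Any]:
        # Assumption: flatten the enum to its string value.
        d = asdict(self)
        d["category"] = self.category.value
        return d


msg = Message(description="unit missing", severity=2,
              category=Category.Units, field="depth")
row = msg.as_dict()
```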

class sample_annotator.report_model.PackageCombo(environmental_package: str | None = None, checklist: str | None = None)

Bases: object

Tuple of environmental package and checklist

checklist: str = None

environmental_package: str = None

sample_annotator.sample_annotator module

class sample_annotator.sample_annotator.SampleAnnotator(target_class: ClassDefinition | None = None, geoengine: GeoEngine = GeoEngine(googlemaps_api_key=None), measurement_engine: MeasurementEngine = MeasurementEngine(), schema: SampleSchema = SampleSchema(object=None))

Bases: object

TODO

annotate(sample: Dict[str, Any], study: Dict[str, Any] | None = None) → AnnotationReport

Annotate a sample

Returns an AnnotationReport object that includes a transformed sample representation, plus reports of all errors and warnings found and any repairs made.

Performs a sequential series of tidy activities. Each report
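
The "sequential series of tidy activities" can be pictured as a list of step functions applied in order to a working copy of the sample. The sketch below is self-contained and does not use the package. The two steps (strip_whitespace, drop_empty), the plain list used as a log, and the step order are all hypothetical.

```python
from typing import Any, Callable, Dict, List, Tuple

Sample = Dict[str, Any]
Step = Callable[[Sample, List[str]], None]


def strip_whitespace(sample: Sample, log: List[str]) -> None:
    # Hypothetical step: trim string values in place.
    for key, value in sample.items():
        if isinstance(value, str) and value != value.strip():
            sample[key] = value.strip()
            log.append(f"stripped whitespace in {key}")


def drop_empty(sample: Sample, log: List[str]) -> None:
    # Hypothetical step: remove keys whose value is an empty string.
    for key in [k for k, v in sample.items() if v == ""]:
        del sample[key]
        log.append(f"dropped empty field {key}")


def run_steps(sample: Sample, steps: List[Step]) -> Tuple[Sample, List[str]]:
    tidied = dict(sample)  # work on a copy so the input stays untouched
    log: List[str] = []
    for step in steps:
        step(tidied, log)  # each step mutates the copy and records changes
    return tidied, log


raw = {"depth": " 5 m ", "notes": "  "}
tidied, log = run_steps(raw, [strip_whitespace, drop_empty])
```

Keeping every step to the same (sample, log) shape mirrors the (sample, report) signatures listed for the tidy_* methods, so steps can be reordered or tested in isolation.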

annotate_all(samples: List[Dict[str, Any]], study: Dict[str, Any] | None = None) → AnnotationMultiSampleReport

Annotate a list of samples

geoengine: GeoEngine = GeoEngine(googlemaps_api_key=None)

infer_package(sample: Dict[str, Any], report: AnnotationReport)

Infer the environmental package / checklist combo, either from directly asserted fields or by other means
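
A minimal sketch of the "directly asserted fields" case follows. The sample keys env_package and checklist are assumptions chosen for illustration, and the "other means" of inference mentioned above are not covered.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class PackageCombo:
    # Field names follow the documented PackageCombo class.
    environmental_package: Optional[str] = None
    checklist: Optional[str] = None


def infer_package(sample: Dict[str, Any]) -> PackageCombo:
    # Assumption: these two keys carry the asserted values, if present.
    return PackageCombo(
        environmental_package=sample.get("env_package"),
        checklist=sample.get("checklist"),
    )


combo = infer_package({"id": "s1", "env_package": "soil"})
```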

measurement_engine: MeasurementEngine = MeasurementEngine()

perform_geolocation_inference(sample: Dict[str, Any], report: AnnotationReport)

Performs inference using geolocation information

perform_inference(sample: Dict[str, Any], report: AnnotationReport)

Performs machine learning inference

perform_text_mining(sample: Dict[str, Any], report: AnnotationReport)

Performs text mining

schema: SampleSchema = SampleSchema(object=None)

target_class: ClassDefinition = None

tidy_enumerations(sample: Dict[str, Any], report: AnnotationReport)

Tidies enumeration fields

tidy_keys(sample: Dict[str, Any], report: AnnotationReport)

Performs tidying on all keys/fields/slots in the sample dictionary: applies key mappings (e.g. between MIxS 5 and MIxS 6) and performs case normalization.
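
The two operations described for tidy_keys can be sketched as follows. The KEY_MAP entries are invented placeholders, not real MIxS 5 to MIxS 6 mappings, and lower-casing is only one possible form of case normalization.

```python
from typing import Any, Dict, List, Tuple

# Hypothetical mapping from older key names to current ones.
KEY_MAP = {"old_depth": "depth", "latlon": "lat_lon"}


def tidy_keys(sample: Dict[str, Any]) -> Tuple[Dict[str, Any], List[str]]:
    tidied: Dict[str, Any] = {}
    changes: List[str] = []
    for key, value in sample.items():
        new_key = key.strip().lower()            # case normalization
        new_key = KEY_MAP.get(new_key, new_key)  # apply the mapping, if any
        if new_key != key:
            changes.append(f"{key} -> {new_key}")
        tidied[new_key] = value  # on a collision the later key wins
    return tidied, changes


tidied, changes = tidy_keys({"Depth": "5 m", "LatLon": "1.0 2.0", "id": "s1"})
```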

tidy_measurements(sample: Dict[str, Any], report: AnnotationReport)

Tidies measurement fields

tidy_nulls(sample: Dict[str, Any], report: AnnotationReport)

Normalizes null values to the EBI standard. See:

https://ena-docs.readthedocs.io/en/latest/submit/samples/missing-values.html
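
Null normalization of this kind is typically a lookup from free-text spellings to a small controlled vocabulary. In the sketch below the target terms ("not applicable", "not collected", "not provided") are taken from the missing-value vocabulary on the page linked above, while the synonym table and the handling of None are assumptions made for illustration.

```python
from typing import Any, Dict

# Hypothetical synonym table; keys are compared after strip() and lower().
NULL_SYNONYMS = {
    "": "not provided",
    "na": "not applicable",
    "n/a": "not applicable",
    "not collected": "not collected",
}


def tidy_nulls(sample: Dict[str, Any]) -> Dict[str, Any]:
    tidied: Dict[str, Any] = {}
    for key, value in sample.items():
        if value is None:
            # Assumption: a Python None is treated as "not provided".
            tidied[key] = "not provided"
        elif isinstance(value, str) and value.strip().lower() in NULL_SYNONYMS:
            tidied[key] = NULL_SYNONYMS[value.strip().lower()]
        else:
            tidied[key] = value  # real values pass through unchanged
    return tidied


tidied = tidy_nulls({"depth": "N/A", "ph": None, "temp": "12 C", "salinity": 35})
```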

validate_identifier(sample: Dict[str, Any], report: AnnotationReport)

sample_annotator.sample_utils module

sample_annotator.sample_utils.create_tests(samples: List[Dict[str, Any]])

Takes normalized samples and uses them to create tests
