sample_annotator package

Submodules

sample_annotator.report_model module

class sample_annotator.report_model.AnnotationMultiSampleReport(reports: Optional[List[sample_annotator.report_model.AnnotationReport]] = None, **kwargs)

Bases: object

Multi-report of a set of samples

all_outputs() -> List[Dict[str, Any]]
as_dataframe()
reports: List[sample_annotator.report_model.AnnotationReport] = None
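A minimal sketch of how all_outputs() might collect per-sample outputs. ReportStub and MultiReportStub are hypothetical stand-ins, not the package's classes, and returning each report's output in order is an assumption based only on the signature above.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class ReportStub:
    """Hypothetical stand-in for AnnotationReport (only two fields)."""
    sample_id: Optional[str] = None
    output: Optional[Dict[str, Any]] = None


@dataclass
class MultiReportStub:
    """Hypothetical stand-in for AnnotationMultiSampleReport."""
    reports: List[ReportStub] = field(default_factory=list)

    def all_outputs(self) -> List[Optional[Dict[str, Any]]]:
        # Assumption: one entry per report, in the order the reports are held.
        return [report.output for report in self.reports]


multi = MultiReportStub(
    reports=[
        ReportStub(sample_id="s1", output={"depth": "5 m"}),
        ReportStub(sample_id="s2", output={"depth": "10 m"}),
    ]
)
outputs = multi.all_outputs()
```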
class sample_annotator.report_model.AnnotationReport(messages: Optional[List[sample_annotator.report_model.Message]] = None, package: Optional[sample_annotator.report_model.PackageCombo] = None, input: Optional[Dict[str, Any]] = None, output: Optional[Dict[str, Any]] = None, sample_id: Optional[str] = None, **kwargs)

Bases: object

Annotation report for a single sample

add_message(*args, **kwargs)
annotation_sufficiency_score = 0.0
as_dataframe()
input: Dict[str, Any] = None
max_severity()
messages: List[sample_annotator.report_model.Message] = None
messages_by_category() -> Dict
output: Dict[str, Any] = None
package: sample_annotator.report_model.PackageCombo = None
passes()
sample_id: str = None
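The listing above gives no bodies for max_severity(), messages_by_category(), or passes(). The following is a sketch of one plausible reading, using hypothetical stand-in classes (MessageStub, ReportSketch). The grouping key, the severity of an empty report, and the pass threshold are all assumptions, not the package's documented behaviour.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class MessageStub:
    """Hypothetical, simplified stand-in for Message."""
    description: str
    severity: int = 1
    category: str = "unclassified"


@dataclass
class ReportSketch:
    """Hypothetical stand-in for AnnotationReport."""
    messages: List[MessageStub] = field(default_factory=list)

    def add_message(self, *args, **kwargs) -> None:
        # Forward all arguments to the message constructor.
        self.messages.append(MessageStub(*args, **kwargs))

    def max_severity(self) -> int:
        # Assumption: a report without messages has severity 0.
        return max((m.severity for m in self.messages), default=0)

    def messages_by_category(self) -> Dict[str, List[MessageStub]]:
        grouped: Dict[str, List[MessageStub]] = defaultdict(list)
        for message in self.messages:
            grouped[message.category].append(message)
        return dict(grouped)

    def passes(self, threshold: int = 2) -> bool:
        # Assumption: the report passes when nothing reaches the threshold.
        return self.max_severity() < threshold


report = ReportSketch()
report.add_message("key renamed", severity=1, category="unknown-field")
report.add_message("depth has no unit", severity=3, category="units")
```

With these two messages, the highest severity is 3, so passes() is False under the assumed threshold of 2.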
class sample_annotator.report_model.Category(value)

Bases: enum.Enum

Enumeration of report message categories.

BadNull = 'bad-null'
ControlledVocabulary = 'controlled-vocabulary'
Core = 'core'
Geo = 'geo'
Identifier = 'identifier'
Inapplicable = 'inapplicable'
MeasurementSyntax = 'measurement-syntax'
MissingCore = 'missing-core'
Unclassified = 'unclassified'
Units = 'units'
UnknownField = 'unknown-field'
static list()
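The static list() method has no description in the listing. The sketch below shows one common pattern for such a helper on an Enum, using a stand-in class with a subset of the documented members. Returning the string values in definition order is an assumption.

```python
from enum import Enum
from typing import List


class Category(Enum):
    """Stand-in holding three of the documented members."""
    Unclassified = "unclassified"
    MissingCore = "missing-core"
    Units = "units"

    @staticmethod
    def list() -> List[str]:
        # Assumption: list() returns the member values in definition order.
        return [member.value for member in Category]


values = Category.list()
# Members can also be looked up from their string value.
units = Category("units")
```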
class sample_annotator.report_model.Message(description: Optional[str] = None, severity: int = 1, was_repaired: Optional[bool] = None, category: sample_annotator.report_model.Category = Category.Unclassified, field: Optional[str] = None, **kwargs)

Bases: object

Individual report message

as_dict() -> Dict
category: sample_annotator.report_model.Category = 'unclassified'
description: str = None
field: str = None
severity: int = 1
was_repaired: bool = None
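As an illustration of the fields and defaults listed above, here is a self-contained sketch using a stand-in dataclass rather than the package's own Message. Flattening the category to its string value in as_dict() is an assumption; the actual method may differ.

```python
from dataclasses import asdict, dataclass
from enum import Enum
from typing import Any, Dict, Optional


class Category(Enum):
    """Stand-in holding two of the documented members."""
    Unclassified = "unclassified"
    Units = "units"


@dataclass
class Message:
    """Stand-in mirroring the documented fields and defaults."""
    description: Optional[str] = None
    severity: int = 1
    was_repaired: Optional[bool] = None
    category: Category = Category.Unclassified
    field: Optional[str] = None

    def as_dict(self) -> Dict[str, Any]:
        result = asdict(self)
        # Assumption: the enum member is replaced by its string value.
        result["category"] = self.category.value
        return result


message = Message(
    description="depth has no unit",
    severity=2,
    category=Category.Units,
    field="depth",
)
record = message.as_dict()
```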
class sample_annotator.report_model.PackageCombo(environmental_package: Optional[str] = None, checklist: Optional[str] = None, **kwargs)

Bases: object

Tuple of environmental package and checklist

checklist: str = None
environmental_package: str = None

sample_annotator.sample_annotator module

class sample_annotator.sample_annotator.SampleAnnotator(target_class: Optional[linkml_runtime.linkml_model.meta.ClassDefinition] = None, geoengine: Optional[sample_annotator.geolocation.geotools.GeoEngine] = None, measurement_engine: sample_annotator.measurements.measurements.MeasurementEngine = MeasurementEngine(), schema: sample_annotator.metadata.sample_schema.SampleSchema = SampleSchema(object=None), **kwargs)

Bases: object

TODO

annotate(sample: Dict[str, Any], study: Optional[Dict[str, Any]] = None) -> sample_annotator.report_model.AnnotationReport

Annotate a sample

Returns an AnnotationReport object that includes a transformed sample representation, plus reports of all errors and warnings found and any repairs made.

Performs a sequential series of tidying activities.
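The sequential structure described here can be sketched as a list of step functions that share one working copy of the sample and one message log. The step names (strip_values, flag_empty) and the run_pipeline helper are hypothetical. The order and contents of the real steps are not stated in this listing.

```python
import copy
from typing import Any, Callable, Dict, List, Tuple

Sample = Dict[str, Any]
Step = Callable[[Sample, List[str]], None]


def strip_values(sample: Sample, messages: List[str]) -> None:
    # Hypothetical step: trim surrounding whitespace from string values.
    for key, value in sample.items():
        if isinstance(value, str) and value != value.strip():
            sample[key] = value.strip()
            messages.append(f"stripped whitespace from {key}")


def flag_empty(sample: Sample, messages: List[str]) -> None:
    # Hypothetical step: record a message for each empty string value.
    for key, value in sample.items():
        if value == "":
            messages.append(f"empty value for {key}")


def run_pipeline(sample: Sample, steps: List[Step]) -> Tuple[Sample, List[str]]:
    output = copy.deepcopy(sample)  # leave the caller's input untouched
    messages: List[str] = []
    for step in steps:  # each step sees the result of the previous ones
        step(output, messages)
    return output, messages


raw = {"depth": " 5 m ", "ph": "  "}
tidied, notes = run_pipeline(raw, [strip_values, flag_empty])
```

Because the steps run in order, flag_empty sees the value of ph only after strip_values has reduced it to an empty string.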

annotate_all(samples: List[Dict[str, Any]], study: Optional[Dict[str, Any]] = None) -> sample_annotator.report_model.AnnotationMultiSampleReport

Annotate a list of samples

geoengine: sample_annotator.geolocation.geotools.GeoEngine = None
infer_package(sample: Dict[str, Any], report: sample_annotator.report_model.AnnotationReport)

Infers the environmental package/checklist combination, either from directly asserted fields or by other means

measurement_engine: sample_annotator.measurements.measurements.MeasurementEngine = MeasurementEngine()
perform_geolocation_inference(sample: Dict[str, Any], report: sample_annotator.report_model.AnnotationReport)

Performs inference using geolocation information

perform_inference(sample: Dict[str, Any], report: sample_annotator.report_model.AnnotationReport)

Performs machine learning inference

perform_text_mining(sample: Dict[str, Any], report: sample_annotator.report_model.AnnotationReport)

Performs text mining

schema: sample_annotator.metadata.sample_schema.SampleSchema = SampleSchema(object=None)
target_class: linkml_runtime.linkml_model.meta.ClassDefinition = None
tidy_enumerations(sample: Dict[str, Any], report: sample_annotator.report_model.AnnotationReport)

Tidies enumeration fields

tidy_keys(sample: Dict[str, Any], report: sample_annotator.report_model.AnnotationReport)

Performs tidying on all keys/fields/slots in the sample dictionary

  • uses mappings, e.g. between MIxS 5 and MIxS 6

  • performs case normalization
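The two bullet points above can be sketched as a single pass over the sample's keys. The mapping table below is purely illustrative (sample_depth is a made-up legacy name, not a real MIxS mapping), and the standalone tidy_keys function and its return shape are assumptions rather than the method's actual behaviour.

```python
from typing import Any, Dict, List, Tuple

# Illustrative only: a made-up legacy key mapped to a current key.
KEY_MAPPINGS: Dict[str, str] = {"sample_depth": "depth"}


def tidy_keys(
    sample: Dict[str, Any], mappings: Dict[str, str]
) -> Tuple[Dict[str, Any], List[str]]:
    tidied: Dict[str, Any] = {}
    notes: List[str] = []
    for key, value in sample.items():
        # Case normalization: lower-case and use underscores for spaces.
        normalized = key.strip().lower().replace(" ", "_")
        # Mapping: translate legacy names to their current equivalents.
        normalized = mappings.get(normalized, normalized)
        if normalized != key:
            notes.append(f"{key} -> {normalized}")
        tidied[normalized] = value
    return tidied, notes


tidied, notes = tidy_keys({"Sample Depth": "5 m", "lat_lon": "1 2"}, KEY_MAPPINGS)
```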

tidy_measurements(sample: Dict[str, Any], report: sample_annotator.report_model.AnnotationReport)

Tidies measurement fields
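What tidying a measurement involves is not spelled out here. One plausible first step is splitting a free-text value into a number and a unit, sketched below with a regular expression. The parse_measurement helper and its pattern are hypothetical and do not describe MeasurementEngine.

```python
import re
from typing import Optional, Tuple

# A signed decimal number, optional whitespace, then a unit that does not
# start with a digit or whitespace.
MEASUREMENT = re.compile(r"^\s*([-+]?\d+(?:\.\d+)?)\s*([^\d\s].*?)\s*$")


def parse_measurement(text: str) -> Optional[Tuple[float, str]]:
    """Return (value, unit), or None when the text does not fit the pattern."""
    match = MEASUREMENT.match(text)
    if match is None:
        return None
    return float(match.group(1)), match.group(2)


spaced = parse_measurement("10 m")
compact = parse_measurement("2.5mg/L")
unitless = parse_measurement("10")
```

A value without a unit, such as "10", yields None in this sketch, which a caller could turn into a report message.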

tidy_nulls(sample: Dict[str, Any], report: sample_annotator.report_model.AnnotationReport)

Normalizes null values to the EBI standard missing-value terms

https://ena-docs.readthedocs.io/en/latest/submit/samples/missing-values.html
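A minimal sketch of null normalization follows. It assumes the four long-standing ENA terms ("not applicable", "not collected", "not provided", "restricted access"); the linked page is the authoritative list. The variant table and the choice of target term for each variant are hypothetical, as is the tidy_null helper.

```python
from typing import Any

# Assumed standard terms; check the ENA page for the current list.
STANDARD_NULLS = {
    "not applicable",
    "not collected",
    "not provided",
    "restricted access",
}

# Hypothetical mapping from common ad hoc spellings to a standard term.
NULL_VARIANTS = {
    "": "not provided",
    "null": "not provided",
    "none": "not provided",
    "unknown": "not provided",
    "na": "not applicable",
    "n/a": "not applicable",
}


def tidy_null(value: Any) -> Any:
    if value is None:
        return "not provided"
    if not isinstance(value, str):
        return value  # numbers and other types are left alone
    key = value.strip().lower()
    if key in STANDARD_NULLS:
        return key  # already a standard term; normalize case and spacing
    return NULL_VARIANTS.get(key, value)


sample = {"depth": "N/A", "ph": " Not Collected ", "temp": "5 C", "salinity": None}
tidied = {key: tidy_null(value) for key, value in sample.items()}
```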

validate_identifier(sample: Dict[str, Any], report: sample_annotator.report_model.AnnotationReport)

sample_annotator.sample_utils module

sample_annotator.sample_utils.create_tests(samples: List[Dict[str, Any]])

Takes normalized samples and uses them to create tests

Module contents