sample_annotator package¶
Submodules¶
sample_annotator.report_model module¶
- class sample_annotator.report_model.AnnotationMultiSampleReport(reports: Optional[List[sample_annotator.report_model.AnnotationReport]] = None, **kwargs)¶
Bases:
object
Multi-report of a set of samples
- all_outputs() List[Dict[str, Any]] ¶
- as_dataframe()¶
- reports: List[sample_annotator.report_model.AnnotationReport] = None¶
- class sample_annotator.report_model.AnnotationReport(messages: Optional[List[sample_annotator.report_model.Message]] = None, package: Optional[sample_annotator.report_model.PackageCombo] = None, input: Optional[Dict[str, Any]] = None, output: Optional[Dict[str, Any]] = None, sample_id: Optional[str] = None, **kwargs)¶
Bases:
object
Annotation report for a single sample
- add_message(*args, **kwargs)¶
- annotation_sufficiency_score = 0.0¶
- as_dataframe()¶
- input: Dict[str, Any] = None¶
- max_severity()¶
- messages: List[sample_annotator.report_model.Message] = None¶
- messages_by_category() Dict ¶
- output: Dict[str, Any] = None¶
- package: sample_annotator.report_model.PackageCombo = None¶
- passes()¶
- sample_id: str = None¶
- class sample_annotator.report_model.Category(value)¶
Bases:
enum.Enum
Enumeration of the categories that can be assigned to a report message.
- BadNull = 'bad-null'¶
- ControlledVocabulary = 'controlled-vocabulary'¶
- Core = 'core'¶
- Geo = 'geo'¶
- Identifier = 'identifier'¶
- Inapplicable = 'inapplicable'¶
- MeasurementSyntax = 'measurement-syntax'¶
- MissingCore = 'missing-core'¶
- Unclassified = 'unclassified'¶
- Units = 'units'¶
- UnknownField = 'unknown-field'¶
- static list()¶
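As an illustration of how an enumeration with these values behaves, the sketch below defines a stand-in enum, `SketchCategory`, that mirrors the documented members. It is not the package's source. The return value of `list()` is not documented here, so the version shown (returning the member values) is an assumption.

```python
from enum import Enum
from typing import List


class SketchCategory(Enum):
    """Stand-in mirroring the documented Category members (hypothetical name)."""

    BadNull = "bad-null"
    ControlledVocabulary = "controlled-vocabulary"
    Core = "core"
    Geo = "geo"
    Identifier = "identifier"
    Inapplicable = "inapplicable"
    MeasurementSyntax = "measurement-syntax"
    MissingCore = "missing-core"
    Unclassified = "unclassified"
    Units = "units"
    UnknownField = "unknown-field"

    @staticmethod
    def list() -> List[str]:
        # Assumption: list() returns the string values in definition order.
        return [member.value for member in SketchCategory]


# Standard Enum lookups: by value (call syntax) and by name (index syntax).
by_value = SketchCategory("geo")
by_name = SketchCategory["Geo"]
all_values = SketchCategory.list()
```

Lookup by value is the usual way to turn a serialized category string back into an enum member.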
- class sample_annotator.report_model.Message(description: Optional[str] = None, severity: int = 1, was_repaired: Optional[bool] = None, category: sample_annotator.report_model.Category = Category.Unclassified, field: Optional[str] = None, **kwargs)¶
Bases:
object
Individual report message
- as_dict() Dict ¶
- category: sample_annotator.report_model.Category = 'unclassified'¶
- description: str = None¶
- field: str = None¶
- severity: int = 1¶
- was_repaired: bool = None¶
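To make the relationship between messages and reports more concrete, here is a minimal sketch using stand-in dataclasses. The names `SketchCategory`, `SketchMessage` and `SketchReport` are hypothetical, and the method bodies are assumptions about plausible behavior based only on the signatures listed above, not the package's actual implementation.

```python
import dataclasses
from dataclasses import dataclass
from enum import Enum
from typing import Any, Dict, List, Optional


class SketchCategory(Enum):
    # Only a subset of the documented categories, to keep the sketch short.
    Unclassified = "unclassified"
    Units = "units"
    MissingCore = "missing-core"


@dataclass
class SketchMessage:
    # Fields and defaults follow the documented Message signature.
    description: Optional[str] = None
    severity: int = 1
    was_repaired: Optional[bool] = None
    category: SketchCategory = SketchCategory.Unclassified
    field: Optional[str] = None

    def as_dict(self) -> Dict[str, Any]:
        # Assumption: the category is flattened to its string value.
        result = dataclasses.asdict(self)
        result["category"] = self.category.value
        return result


@dataclass
class SketchReport:
    messages: List[SketchMessage] = dataclasses.field(default_factory=list)

    def add_message(self, *args, **kwargs) -> None:
        # Assumption: arguments are forwarded to the message constructor.
        self.messages.append(SketchMessage(*args, **kwargs))

    def messages_by_category(self) -> Dict[SketchCategory, List[SketchMessage]]:
        grouped: Dict[SketchCategory, List[SketchMessage]] = {}
        for message in self.messages:
            grouped.setdefault(message.category, []).append(message)
        return grouped

    def max_severity(self) -> int:
        # Assumption: an empty report has severity 0.
        return max((m.severity for m in self.messages), default=0)


report = SketchReport()
report.add_message("depth has no unit", severity=2,
                   category=SketchCategory.Units, field="depth")
report.add_message("lat_lon is missing", severity=3,
                   category=SketchCategory.MissingCore, field="lat_lon")
report.add_message("unrecognized note")
grouped = report.messages_by_category()
```

Grouping by category and taking the maximum severity are the two aggregations the documented method names suggest; how the real `passes()` uses them is not stated here.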
sample_annotator.sample_annotator module¶
- class sample_annotator.sample_annotator.SampleAnnotator(target_class: Optional[linkml_runtime.linkml_model.meta.ClassDefinition] = None, geoengine: Optional[sample_annotator.geolocation.geotools.GeoEngine] = None, measurement_engine: sample_annotator.measurements.measurements.MeasurementEngine = MeasurementEngine(), schema: sample_annotator.metadata.sample_schema.SampleSchema = SampleSchema(object=None), **kwargs)¶
Bases:
object
TODO
- annotate(sample: Dict[str, Any], study: Optional[Dict[str, Any]] = None) sample_annotator.report_model.AnnotationReport ¶
Annotate a sample
Returns an AnnotationReport object that includes a transformed representation of the sample, plus reports of all errors and warnings found and any repairs made
Performs a sequential series of tidying activities.
- annotate_all(samples: List[Dict[str, Any]], study: Optional[Dict[str, Any]] = None) sample_annotator.report_model.AnnotationMultiSampleReport ¶
Annotate a list of samples
- geoengine: sample_annotator.geolocation.geotools.GeoEngine = None¶
- infer_package(sample: Dict[str, Any], report: sample_annotator.report_model.AnnotationReport)¶
Infers the environment package / checklist combination, either from directly asserted fields or by other means
- measurement_engine: sample_annotator.measurements.measurements.MeasurementEngine = MeasurementEngine()¶
- perform_geolocation_inference(sample: Dict[str, Any], report: sample_annotator.report_model.AnnotationReport)¶
Performs inference using geolocation information
- perform_inference(sample: Dict[str, Any], report: sample_annotator.report_model.AnnotationReport)¶
Performs machine-learning inference
- perform_text_mining(sample: Dict[str, Any], report: sample_annotator.report_model.AnnotationReport)¶
Performs text mining
- schema: sample_annotator.metadata.sample_schema.SampleSchema = SampleSchema(object=None)¶
- target_class: linkml_runtime.linkml_model.meta.ClassDefinition = None¶
- tidy_enumerations(sample: Dict[str, Any], report: sample_annotator.report_model.AnnotationReport)¶
Tidies enumeration fields
- tidy_keys(sample: Dict[str, Any], report: sample_annotator.report_model.AnnotationReport)¶
Performs tidying on all keys/fields/slots in the sample dictionary: applies key mappings (e.g. between MIxS 5 and MIxS 6) and normalizes case
- tidy_measurements(sample: Dict[str, Any], report: sample_annotator.report_model.AnnotationReport)¶
Tidies measurement fields
- tidy_nulls(sample: Dict[str, Any], report: sample_annotator.report_model.AnnotationReport)¶
Normalizes null values to the EBI standard missing-value terms; see
https://ena-docs.readthedocs.io/en/latest/submit/samples/missing-values.html
- validate_identifier(sample: Dict[str, Any], report: sample_annotator.report_model.AnnotationReport)¶
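The sequential tidying that `annotate` performs can be pictured as a chain of steps that each receive the sample and a report. The sketch below is a toy illustration under stated assumptions. The alias table, the set of null spellings, the choice of "not provided" as the replacement term, and the convention that each step returns a new dictionary are all hypothetical and not taken from the package.

```python
from typing import Any, Callable, Dict, List, Tuple

Sample = Dict[str, Any]
Report = List[str]  # hypothetical: plain message strings instead of Message objects

# Hypothetical lookup tables, for illustration only.
KEY_ALIASES = {"latitude_longitude": "lat_lon"}
NULL_SPELLINGS = {"", "n/a", "na", "none", "null"}


def tidy_keys(sample: Sample, report: Report) -> Sample:
    """Normalize case and spacing of keys, then apply alias mappings."""
    tidied: Sample = {}
    for key, value in sample.items():
        new_key = key.strip().lower().replace(" ", "_")
        new_key = KEY_ALIASES.get(new_key, new_key)
        if new_key != key:
            report.append(f"renamed {key!r} to {new_key!r}")
        tidied[new_key] = value
    return tidied


def tidy_nulls(sample: Sample, report: Report) -> Sample:
    """Replace ad hoc null spellings with a single standard term."""
    tidied: Sample = {}
    for key, value in sample.items():
        if isinstance(value, str) and value.strip().lower() in NULL_SPELLINGS:
            report.append(f"normalized null in {key!r}")
            value = "not provided"  # assumption: which standard term applies
        tidied[key] = value
    return tidied


# Steps run in order; each one sees the output of the previous step.
STEPS: List[Callable[[Sample, Report], Sample]] = [tidy_keys, tidy_nulls]


def annotate_sketch(sample: Sample) -> Tuple[Sample, Report]:
    report: Report = []
    for step in STEPS:
        sample = step(sample, report)
    return sample, report


output, messages = annotate_sketch({"Latitude Longitude": "N/A", "depth": "5 m"})
```

Running key tidying before null tidying means later steps can refer to fields by their normalized names, which is one reason an ordered pipeline is a natural fit here.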
sample_annotator.sample_utils module¶
- sample_annotator.sample_utils.create_tests(samples: List[Dict[str, Any]])¶
Takes normalized samples and uses them to create tests