ehrapy.preprocessing.qc_lab_measurements

ehrapy.preprocessing.qc_lab_measurements(adata, reference_table=None, measurements=None, unit=None, layer=None, threshold=20, age_col=None, age_range=None, sex_col=None, sex=None, ethnicity_col=None, ethnicity=None, copy=False, verbose=False)

Examines lab measurements for reference ranges and outliers.

Source:

The reference values used were obtained from https://accessmedicine.mhmedical.com/content.aspx?bookid=1069&sectionid=60775149 . That table was compiled from data in the following sources:

  • Tietz NW, ed. Clinical Guide to Laboratory Tests. 3rd ed. Philadelphia: WB Saunders Co; 1995;

  • Laposata M. SI Unit Conversion Guide. Boston: NEJM Books; 1992;

  • American Medical Association Manual of Style: A Guide for Authors and Editors. 9th ed. Chicago: AMA; 1998:486–503. Copyright 1998, American Medical Association;

  • Jacobs DS, DeMott WR, Oxley DK, eds. Jacobs & DeMott Laboratory Test Handbook With Key Word Index. 5th ed. Hudson, OH: Lexi-Comp Inc; 2001;

  • Henry JB, ed. Clinical Diagnosis and Management by Laboratory Methods. 20th ed. Philadelphia: WB Saunders Co; 2001;

  • Kratz A, et al. Laboratory reference values. N Engl J Med. 2006;351:1548–1563;

  • Burtis CA, ed. Tietz Textbook of Clinical Chemistry and Molecular Diagnostics. 5th ed. St. Louis: Elsevier; 2012.

This version of the table of reference ranges was reviewed and updated by Jessica Franco-Colon, PhD, and Kay Brooks.

Limitations:
  • Reference ranges differ between continents, countries and even laboratories (https://informatics.bmj.com/content/28/1/e100419). The default values used here are only one of many options.

  • Ensure that the values used as input are provided with the correct units. We recommend the usage of SI values.

  • The reference values pertain to adults. Many of the reference ranges need to be adapted for children.

  • By default, if no sex is provided and no unisex values are available, the male reference ranges are used.

  • The reference ranges used may be biased with respect to ethnicity. Please examine the primary sources if required.

  • We recommend a glance at https://www.nature.com/articles/s41591-021-01468-6 for the effect of such covariates.
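
The default described in the list above (male values when no sex is given and no unisex values exist) can be sketched in plain Python. This is only an illustration of the stated rule, not ehrapy's implementation: the function name `select_range`, the row layout, and the numeric ranges are hypothetical placeholders, not clinical data.

```python
from typing import List, Optional, Tuple

# Hypothetical rows for one measurement: (sex, low, high).
# The numbers are placeholders for illustration, not clinical reference values.
ROWS: List[Tuple[str, float, float]] = [("M", 1.0, 2.0), ("F", 0.8, 1.8)]


def select_range(
    rows: List[Tuple[str, float, float]], sex: Optional[str] = None
) -> Tuple[float, float]:
    """Pick a (low, high) range for the requested sex, with the fallback described above."""
    by_sex = {row_sex: (low, high) for row_sex, low, high in rows}
    # An explicitly requested sex wins if a matching row exists.
    if sex is not None and sex in by_sex:
        return by_sex[sex]
    # Otherwise prefer unisex ("U") values, then fall back to male ("M") values.
    for fallback in ("U", "M"):
        if fallback in by_sex:
            return by_sex[fallback]
    raise KeyError("no usable reference range")


print(select_range(ROWS))       # no sex given, no unisex row: male range is used
print(select_range(ROWS, "F"))  # explicit sex: female range is used
```

If the cohort is predominantly female, passing the sex explicitly avoids silently applying the male ranges.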

Additional values:

To supply your own reference table as a pandas DataFrame, first examine the existing default table at https://github.com/theislab/ehrapy/blob/main/ehrapy/preprocessing/laboratory_reference_tables/laposata.tsv and follow its layout. Ethnicity and age columns can be added.
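
A minimal sketch of reading a tab-separated reference table into a DataFrame and inspecting its header is shown here. The helper `read_reference_table` and the inline excerpt are hypothetical: the column names and values are placeholders, so check them against the real header of laposata.tsv before building your own table.

```python
import io

import pandas as pd

# Hypothetical two-row excerpt. Column names and values are placeholders;
# compare them with the real header of laposata.tsv before relying on them.
EXAMPLE_TSV = (
    "Measurement\tTraditional Reference Interval\tSI Reference Interval\n"
    "example_analyte_a\t1.0-2.0\t10-20\n"
    "example_analyte_b\t<5\t<50\n"
)


def read_reference_table(source) -> pd.DataFrame:
    """Read a tab-separated reference table from a path or file-like object."""
    return pd.read_csv(source, sep="\t")


table = read_reference_table(io.StringIO(EXAMPLE_TSV))
print(list(table.columns))  # inspect the header before adding age or ethnicity columns
```

A table built this way could then be passed as `reference_table=table` to `ep.pp.qc_lab_measurements`.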

Parameters:
  • adata (AnnData) – Annotated data matrix.

  • reference_table (DataFrame) – A custom DataFrame with reference values. Defaults to the laposata table if not specified.

  • measurements (list[str]) – A list of measurements to check.

  • unit (Literal['traditional', 'SI']) – The unit of the measurements. Defaults to ‘traditional’.

  • layer (str) – Layer containing the matrix to calculate the metrics for.

  • threshold (int) – Minimum required matching confidence score of the fuzzy search. 0 = no matches, 100 = all must match. Defaults to 20.

  • age_col (str) – Column containing age values.

  • age_range (str) – The inclusive age range to filter for, such as 5-99.

  • sex_col (str) – Column containing sex values. Column must contain ‘U’, ‘M’ or ‘F’.

  • sex (str) – Sex to filter the reference values for. Use ‘U’ for unisex, which falls back to male values when male and female values conflict. Defaults to ‘U|M’.

  • ethnicity_col (str) – Column containing ethnicity values.

  • ethnicity (str) – Ethnicity to filter for.

  • copy (bool) – Whether to return a copy. Defaults to False.

  • verbose (bool) – Whether to print verbose output to stdout, notifying the user of matched columns and value ranges.
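
To give a feel for how a 0 to 100 threshold gates fuzzy matches between column names and reference measurement names, here is a rough sketch using the standard library's `difflib` as a stand-in scorer. This is an assumption for illustration only: ehrapy's actual fuzzy search may use a different scorer and produce different scores, and the helpers `score` and `best_match` as well as the candidate names are hypothetical.

```python
from difflib import SequenceMatcher
from typing import Optional, Sequence


def score(a: str, b: str) -> float:
    """Case-insensitive similarity on a 0-100 scale (difflib stand-in scorer)."""
    return 100.0 * SequenceMatcher(None, a.lower(), b.lower()).ratio()


def best_match(
    column: str, names: Sequence[str], threshold: float = 20
) -> Optional[str]:
    """Return the best-scoring reference name, or None if it scores below the threshold."""
    best = max(names, key=lambda name: score(column, name), default=None)
    if best is None or score(column, best) < threshold:
        return None
    return best


# Hypothetical reference names; a permissive threshold accepts a partial match,
# while a threshold of 100 only accepts identical names.
print(best_match("potassium_first", ["Sodium", "Potassium"], threshold=20))
print(best_match("potassium_first", ["Sodium", "Potassium"], threshold=100))
```

A low threshold tolerates suffixes such as `_first` but also raises the risk of spurious matches, so running with `verbose=True` to review the matched columns is a sensible check.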

Return type:

AnnData

Returns:

A modified AnnData object (copy if specified).

Examples

>>> import ehrapy as ep
>>> adata = ep.dt.mimic_2(encoded=True)
>>> ep.pp.qc_lab_measurements(adata, measurements=["potassium_first"], verbose=True)