ehrapy.preprocessing.qc_lab_measurements
- ehrapy.preprocessing.qc_lab_measurements(adata, reference_table=None, measurements=None, unit=None, layer=None, threshold=20, age_col=None, age_range=None, sex_col=None, sex=None, ethnicity_col=None, ethnicity=None, copy=False, verbose=False)
Examines lab measurements for reference ranges and outliers.
- Source:
The reference values were obtained from https://accessmedicine.mhmedical.com/content.aspx?bookid=1069&sectionid=60775149 . That table is compiled from data in the following sources:
1) Tietz NW, ed. Clinical Guide to Laboratory Tests. 3rd ed. Philadelphia: WB Saunders Co; 1995;
2) Laposata M. SI Unit Conversion Guide. Boston: NEJM Books; 1992;
3) American Medical Association Manual of Style: A Guide for Authors and Editors. 9th ed. Chicago: AMA; 1998:486–503. Copyright 1998, American Medical Association;
4) Jacobs DS, DeMott WR, Oxley DK, eds. Jacobs & DeMott Laboratory Test Handbook With Key Word Index. 5th ed. Hudson, OH: Lexi-Comp Inc; 2001;
5) Henry JB, ed. Clinical Diagnosis and Management by Laboratory Methods. 20th ed. Philadelphia: WB Saunders Co; 2001;
6) Kratz A, et al. Laboratory reference values. N Engl J Med. 2006;351:1548–1563;
7) Burtis CA, ed. Tietz Textbook of Clinical Chemistry and Molecular Diagnostics. 5th ed. St. Louis: Elsevier; 2012.
This version of the table of reference ranges was reviewed and updated by Jessica Franco-Colon, PhD, and Kay Brooks.
- Limitations:
Reference ranges differ between continents, countries and even laboratories (https://informatics.bmj.com/content/28/1/e100419). The default values used here are only one of many options.
Ensure that the input values are provided in the correct units. We recommend using SI units.
The reference values pertain to adults. Many of the reference ranges need to be adapted for children.
By default, if no sex is provided and no unisex values are available, the male reference ranges are used.
The reference ranges used may be biased with respect to ethnicity. Please examine the primary sources if required.
We recommend a glance at https://www.nature.com/articles/s41591-021-01468-6 for the effect of such covariates.
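The male-default behaviour described in the limitations can be pictured with a small sketch. This is not ehrapy's implementation: the `REFERENCE` dictionary, its interval values and the helper names `select_interval` and `is_normal` are hypothetical and exist only to illustrate the fallback order (requested sex, then unisex, then male).

```python
from typing import Dict, Optional, Tuple

Interval = Tuple[float, float]

# Hypothetical reference intervals (lower, upper). These numbers are
# placeholders for illustration and are NOT taken from the default table.
REFERENCE: Dict[str, Dict[str, Interval]] = {
    "analyte_a": {"U": (3.5, 5.0)},                      # unisex range only
    "analyte_b": {"M": (10.0, 20.0), "F": (8.0, 18.0)},  # sex-specific ranges
}


def select_interval(measurement: str, sex: Optional[str] = None) -> Interval:
    """Pick an interval: requested sex first, then unisex, then male."""
    by_sex = REFERENCE[measurement]
    if sex in by_sex:
        return by_sex[sex]
    if "U" in by_sex:
        return by_sex["U"]
    return by_sex["M"]  # fallback assumed from the limitation described above


def is_normal(value: float, interval: Interval) -> bool:
    """Return True if the value lies inside the inclusive interval."""
    lower, upper = interval
    return lower <= value <= upper


print(select_interval("analyte_b"))        # no sex given, no unisex range
print(select_interval("analyte_b", "F"))   # sex-specific range
print(is_normal(4.2, select_interval("analyte_a", "M")))
```

Because the fallback silently applies male ranges, passing `sex` or `sex_col` explicitly is the safer choice whenever that information is available.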
- Additional values:
Interleukin-6 based on https://pubmed.ncbi.nlm.nih.gov/33155686/
To supply your own table as a pandas DataFrame, use the existing default table as a template: https://github.com/theislab/ehrapy/blob/main/ehrapy/preprocessing/laboratory_reference_tables/laposata.tsv . Ethnicity and age columns can be added.
- Parameters:
adata (AnnData) – Annotated data matrix.
reference_table (DataFrame, default: None) – A custom DataFrame with reference values. Defaults to the laposata table if not specified.
measurements (list[str], default: None) – A list of measurements to check.
unit (Literal['traditional', 'SI'], default: None) – The unit of the measurements.
layer (str, default: None) – Layer containing the matrix to calculate the metrics for.
threshold (int, default: 20) – Minimum required matching confidence score of the fuzzysearch. 0 = no matches, 100 = all must match.
age_col (str, default: None) – Column containing age values.
age_range (str, default: None) – The inclusive age-range to filter for such as 5-99.
sex_col (str, default: None) – Column containing sex values. Column must contain ‘U’, ‘M’ or ‘F’.
sex (str, default: None) – Sex to filter the reference values for. Use U for unisex which uses male values when male and female conflict.
ethnicity_col (str, default: None) – Column containing ethnicity values.
ethnicity (str, default: None) – Ethnicity to filter for.
copy (bool, default: False) – Whether to return a copy.
verbose (bool, default: False) – Whether to have verbose stdout. Notifies user of matched columns and value ranges.
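To give a feel for how a matching threshold on a 0-100 scale behaves, here is a minimal sketch that scores a column name against candidate measurement names. It uses `difflib` from the standard library as a stand-in scorer, so the scores will differ from whatever fuzzy matcher ehrapy uses internally; `similarity` and `best_match` are hypothetical helper names.

```python
from difflib import SequenceMatcher
from typing import Optional, Sequence


def similarity(a: str, b: str) -> float:
    """Return a 0-100 similarity score (difflib stand-in, not ehrapy's scorer)."""
    return 100.0 * SequenceMatcher(None, a.lower(), b.lower()).ratio()


def best_match(column: str, candidates: Sequence[str], threshold: float = 20) -> Optional[str]:
    """Return the best-scoring candidate, or None if it scores below threshold."""
    if not candidates:
        return None
    best = max(candidates, key=lambda name: similarity(column, name))
    return best if similarity(column, best) >= threshold else None


# A lenient threshold accepts a partial match; a threshold of 100 only
# accepts names that are identical after lowercasing.
print(best_match("potassium_first", ["Potassium", "Sodium"], threshold=20))
print(best_match("potassium_first", ["Potassium", "Sodium"], threshold=100))
```

In this sketch the lenient threshold matches the column to `Potassium`, while a threshold of 100 rejects it. With a low threshold, setting `verbose=True` in the real function is a reasonable way to check which columns were actually matched.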
- Return type:
- Returns:
A modified AnnData object (copy if specified).
Examples
>>> import ehrapy as ep
>>> adata = ep.dt.mimic_2(encoded=True)
>>> ep.pp.qc_lab_measurements(adata, measurements=["potassium_first"], verbose=True)