ehrapy.preprocessing.qc_metrics

ehrapy.preprocessing.qc_metrics(adata, qc_vars=(), layer=None)

Calculates various quality control metrics.

Uses the original values to calculate the metrics, not the encoded ones. See the Returns section for a more in-depth description of the calculated metrics.

Parameters:
  • adata (AnnData) – Annotated data matrix.

  • qc_vars (Collection[str]) – Optional list of variables for which to calculate additional metrics.

  • layer (str) – Layer to use to calculate the metrics.

Return type:

tuple[DataFrame, DataFrame] | None

Returns:

Two pandas DataFrames containing all calculated QC metrics, for obs and var respectively.

Observation level metrics include:

  • missing_values_abs: Absolute number of missing values.

  • missing_values_pct: Percentage of missing values.
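
The two observation-level metrics can be illustrated with a minimal plain-Python sketch. It does not call ehrapy and is not its implementation; the helper names `is_missing` and `missing_per_row` are hypothetical, the toy values are made up, and treating `None` and NaN as missing is an assumption of this sketch.

```python
import math


def is_missing(value):
    """Treat None and float NaN as missing (an assumption of this sketch)."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def missing_per_row(rows):
    """Return (missing_values_abs, missing_values_pct) for each observation."""
    result = []
    for row in rows:
        n_missing = sum(is_missing(v) for v in row)
        result.append((n_missing, 100.0 * n_missing / len(row)))
    return result


# Toy records: 3 observations x 4 features (hypothetical values).
rows = [
    [72.0, None, 1.2, "F"],
    [float("nan"), None, 0.9, "M"],
    [65.0, 140.0, 1.1, "F"],
]
obs_metrics = missing_per_row(rows)
```

Each tuple pairs the count of missing entries in a row with that count expressed as a percentage of the number of features.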

Feature level metrics include:

  • missing_values_abs: Absolute number of missing values.

  • missing_values_pct: Percentage of missing values.

  • mean: Mean value of the feature.

  • median: Median value of the feature.

  • std: Standard deviation of the feature.

  • min: Minimum value of the feature.

  • max: Maximum value of the feature.
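
As a rough illustration of the feature-level metrics listed above, the following plain-Python sketch summarizes a single numeric column. It is not ehrapy's implementation: the helper `feature_summary` is hypothetical, missing values are assumed to be skipped when computing the statistics, and the population standard deviation is used only because the documentation does not state which variant applies.

```python
import math
import statistics


def feature_summary(column):
    """Summarize one numeric feature, skipping None and NaN entries."""
    observed = [v for v in column if v is not None and not math.isnan(v)]
    n_missing = len(column) - len(observed)
    return {
        "missing_values_abs": n_missing,
        "missing_values_pct": 100.0 * n_missing / len(column),
        "mean": statistics.fmean(observed),
        "median": statistics.median(observed),
        # Population standard deviation; an assumption of this sketch.
        "std": statistics.pstdev(observed),
        "min": min(observed),
        "max": max(observed),
    }


# Hypothetical feature with two missing entries out of five.
heart_rate = [60.0, None, 80.0, 70.0, float("nan")]
summary = feature_summary(heart_rate)
```

A column with no observed values would raise an error in this sketch; a real implementation would need to decide how to report such features.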

Examples

>>> import ehrapy as ep
>>> adata = ep.dt.mimic_2(encoded=True)
>>> obs_qc, var_qc = ep.pp.qc_metrics(adata)
>>> obs_qc["missing_values_pct"].plot(kind="hist", bins=20)