ehrapy.preprocessing.explicit_impute

ehrapy.preprocessing.explicit_impute(adata, replacement, *, impute_empty_strings=True, warning_threshold=70, copy=False)

Replaces missing values with an explicitly passed replacement value, either in all columns or in a user-specified subset of columns.

There are two scenarios to cover:
  1. Replace all missing values with the specified value.
  2. Replace all missing values in a subset of columns with a specified value per column.
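
The two scenarios can be illustrated on a small pandas DataFrame. This is only a sketch of the described behavior, not ehrapy's implementation: the helper `explicit_impute_sketch` and the toy column names are hypothetical, and the real function operates on an AnnData object rather than a DataFrame.

```python
import numpy as np
import pandas as pd


def explicit_impute_sketch(df, replacement, impute_empty_strings=True):
    """Hypothetical pandas analogue of the two scenarios (not ehrapy code)."""
    out = df.copy()
    if isinstance(replacement, dict):
        # Scenario 2: one replacement value per listed column; other columns stay untouched
        per_column = replacement
    else:
        # Scenario 1: the same replacement value for every column
        per_column = {column: replacement for column in out.columns}
    for column, value in per_column.items():
        series = out[column]
        if impute_empty_strings:
            # Treat empty strings as missing values before filling
            series = series.replace("", np.nan)
        out[column] = series.fillna(value)
    return out


# Toy data: the string column is kept as object dtype so it can hold mixed values
df = pd.DataFrame(
    {
        "age": [63.0, np.nan],
        "unit": pd.Series(["", "MICU"], dtype=object),
        "bmi": [np.nan, 24.1],
    }
)

all_filled = explicit_impute_sketch(df, 0)
subset_filled = explicit_impute_sketch(df, {"age": 50, "unit": "unknown"})
```

In the second call only `age` and `unit` are imputed; `bmi` is not a key of the dictionary, so its missing value is left as it is.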

Parameters:
  • adata (AnnData) – AnnData object whose X contains the values to impute.

  • replacement (str | int | dict[str, str | int]) – The value to replace missing values with. If a dictionary is provided, the keys represent column names and the values represent replacement values for those columns.

  • impute_empty_strings (bool) – If True, empty strings are also replaced. Defaults to True.

  • warning_threshold (int) – Percentage of missing values that triggers a warning. Defaults to 70.

  • copy (bool) – If True, returns a modified copy of the original AnnData object. If False, modifies the object in place. Defaults to False.
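
As a rough illustration of what warning_threshold controls, the following standard-library sketch computes the percentage of missing entries per column and warns for columns above the threshold. The helper names are hypothetical, and both the per-column check and the strictly-greater comparison are assumptions; the documentation above does not specify either.

```python
import math
import warnings


def missing_percentage(values):
    """Percentage of missing entries (None or NaN) in one column."""
    if not values:
        return 0.0
    missing = sum(
        1 for v in values if v is None or (isinstance(v, float) and math.isnan(v))
    )
    return 100.0 * missing / len(values)


def columns_over_threshold(columns, warning_threshold=70):
    """Hypothetical helper: warn for each column whose missing share exceeds the threshold."""
    flagged = []
    for name, values in columns.items():
        pct = missing_percentage(values)
        if pct > warning_threshold:  # assumption: strictly greater than the threshold
            warnings.warn(f"Column '{name}' has {pct:.1f}% missing values")
            flagged.append(name)
    return flagged


columns = {
    "age": [63.0, 71.0, None, 58.0],          # 1 of 4 entries missing
    "bmi": [None, float("nan"), None, 24.1],  # 3 of 4 entries missing
}
flagged = columns_over_threshold(columns)
```

With the default threshold of 70, only `bmi` is flagged here; imputing a column that is mostly missing means most of its values will be the replacement constant, which is what the warning is meant to call attention to.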

Return type:

AnnData

Returns:

If copy is True, a modified copy of the original AnnData object with imputed X. If copy is False, the original AnnData object is modified in place.

Examples

Replace all missing values in adata with the value 0:

>>> import ehrapy as ep
>>> adata = ep.dt.mimic_2(encoded=True)
>>> ep.pp.explicit_impute(adata, replacement=0)