ehrapy.preprocessing.knn_impute
- ehrapy.preprocessing.knn_impute(adata, var_names=None, n_neighbours=5, copy=False, warning_threshold=30)
Imputes missing values in the input AnnData object using K-nearest neighbor imputation.
When KNN imputation is applied to mixed data (numerical and non-numerical), ordinal encoding is required, since KNN imputation works only on numerical data. The encoding is only a utility step and is undone once imputation has completed successfully.
Return type: AnnData
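The encode, impute, decode round trip described above can be illustrated with a small standard-library sketch. It is not ehrapy's implementation: the helpers `ordinal_encode` and `knn_impute_rows` are hypothetical names, the distance is a simplified Euclidean distance over the columns both rows have observed, and rounding the imputed code back to the nearest category is an assumption made here for illustration.

```python
import math


def ordinal_encode(column):
    """Map each non-missing category to a float code; None stays missing."""
    categories = sorted({v for v in column if v is not None})
    to_code = {cat: float(i) for i, cat in enumerate(categories)}
    codes = [to_code[v] if v is not None else None for v in column]
    return codes, categories


def knn_impute_rows(rows, n_neighbours=5):
    """Fill None entries with the mean of that column over the nearest rows."""

    def distance(a, b):
        # Simplified distance: only columns observed in both rows contribute.
        shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
        if not shared:
            return math.inf
        return math.sqrt(sum((x - y) ** 2 for x, y in shared) / len(shared))

    imputed = [list(r) for r in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            # Candidate donors: other rows that have column j observed.
            donors = [
                (distance(row, other), other[j])
                for k, other in enumerate(rows)
                if k != i and other[j] is not None
            ]
            donors.sort(key=lambda d: d[0])
            nearest = [v for _, v in donors[:n_neighbours]]
            if nearest:  # leave the gap if no donor exists
                imputed[i][j] = sum(nearest) / len(nearest)
    return imputed


# Toy mixed data: one numerical column and one categorical column with a gap.
ages = [30.0, 32.0, 70.0, 71.0]
sex = ["F", None, "M", "M"]

codes, categories = ordinal_encode(sex)          # temporary numeric encoding
rows = [list(r) for r in zip(ages, codes)]
filled = knn_impute_rows(rows, n_neighbours=1)   # impute on numeric data only

# Undo the encoding (assumption: round to the nearest category code).
decoded_sex = [categories[int(round(r[1]))] for r in filled]
```

In this toy data the row with the missing category is closest in age to the first row, so it receives that row's category. With more than one neighbour the averaged code can be fractional, which is why the decoding step needs some rounding rule.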
- Args:
adata: The annotated data matrix (AnnData object) whose missing values are imputed.
var_names: A list of variable names indicating which columns to impute. If None, all columns are imputed. Defaults to None.
n_neighbours: Number of neighbors to use when performing the imputation. Defaults to 5.
copy: Whether to perform the imputation on a copy of the original AnnData object. If True, the original object remains unmodified. Defaults to False.
warning_threshold: Percentage of missing values above which a warning is issued. Defaults to 30.
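To make the role of `warning_threshold` concrete, here is a minimal sketch of a percentage-based check. The page does not say whether the percentage is computed per variable or over the whole matrix; this sketch assumes a per-column check, and the function `columns_over_threshold` is a hypothetical helper, not part of ehrapy.

```python
import warnings


def columns_over_threshold(columns, warning_threshold=30):
    """Return {column name: % missing} for columns above the threshold."""
    flagged = {}
    for name, values in columns.items():
        if not values:
            continue
        # Share of missing entries in this column, as a percentage.
        pct = 100.0 * sum(v is None for v in values) / len(values)
        if pct > warning_threshold:
            warnings.warn(f"Column '{name}' has {pct:.1f}% missing values.")
            flagged[name] = pct
    return flagged


columns = {
    "age": [30.0, None, 70.0, 71.0],      # 25% missing, below the threshold
    "lactate": [None, None, 1.2, None],   # 75% missing, above the threshold
}

# Capture the warnings so the example can inspect them.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    flagged = columns_over_threshold(columns, warning_threshold=30)
```

A heavily missing column is still imputed in this sketch; the warning only signals that the imputed values rest on few observed entries.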
- Returns:
An updated AnnData object with imputed values.
- Raises:
ValueError: If the input data matrix contains only categorical (non-numeric) values.
- Examples:
>>> import ehrapy as ep
>>> adata = ep.dt.mimic_2(encoded=True)
>>> ep.pp.knn_impute(adata)