ehrapy.preprocessing.mice_forest_impute
- ehrapy.preprocessing.mice_forest_impute(adata, var_names=None, *, warning_threshold=70, save_all_iterations=True, random_state=None, inplace=False, iterations=5, variable_parameters=None, verbose=False, copy=False)
Impute missing data using miceforest.
See https://github.com/AnotherSamWilson/miceforest for details. miceforest provides fast, memory-efficient Multiple Imputation by Chained Equations (MICE) with LightGBM.
- Parameters:
adata (AnnData) – The AnnData object containing the data to impute.
var_names (Iterable[str] | None, default: None) – A list of variable names to impute. If None, impute all variables.
warning_threshold (int, default: 70) – Percentage of missing values in a variable above which a warning is displayed.
save_all_iterations (bool, default: True) – Whether to save the imputed values from all iterations or only the latest. Saving all iterations allows for additional plotting, but may use more memory.
random_state (int | None, default: None) – Seed for the random number generator. Set it to make results reproducible.
inplace (bool, default: False) – If True, modify the input AnnData object in-place and return None. If False, return a copy of the modified AnnData object.
iterations (int, default: 5) – The number of iterations to run.
variable_parameters (dict | None, default: None) – Model parameters specified per variable. Keys should be variable names or indices, and values should be a dict of parameters that apply to that variable only.
verbose (bool, default: False) – Whether to print information about the imputation process.
copy (bool, default: False) – Whether to return a copy of the AnnData object or modify it in-place.
- Return type:
- Returns:
The imputed AnnData object.
Examples
>>> import ehrapy as ep
>>> adata = ep.dt.mimic_2(encoded=True)
>>> ep.ad.infer_feature_types(adata)
>>> ep.pp.mice_forest_impute(adata)
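The `warning_threshold` parameter is described as a percentage of missing values. As a rough illustration of that kind of check, the sketch below computes the missing percentage per variable and reports those above the threshold. The helper `columns_over_threshold` is hypothetical and is not part of ehrapy. Treating both `None` and NaN as missing, and using a strict `>` comparison, are assumptions.

```python
import math


def columns_over_threshold(columns, warning_threshold=70):
    """Return {name: percent_missing} for columns whose share of missing
    values exceeds `warning_threshold` (a percentage between 0 and 100)."""
    flagged = {}
    for name, values in columns.items():
        if not values:
            continue  # an empty column has no meaningful percentage
        n_missing = sum(
            1
            for v in values
            if v is None or (isinstance(v, float) and math.isnan(v))
        )
        percent = 100.0 * n_missing / len(values)
        # Assumption: strict comparison; the real function may differ.
        if percent > warning_threshold:
            flagged[name] = percent
    return flagged


example = {
    "mostly_missing": [1.0, None, None, None],      # 3 of 4 missing
    "mostly_present": [1.0, 2.0, float("nan"), 4.0],  # 1 of 4 missing
}
for name, percent in columns_over_threshold(example).items():
    print(f"Warning: {name} has {percent:.0f}% missing values")
```

Running a check like this before imputation can help decide whether a heavily incomplete variable should be imputed at all or left out of `var_names`.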