ehrapy.preprocessing.combat


ehrapy.preprocessing.combat(edata, *, key='batch', covariates=None, layer=None, inplace=True)

ComBat function for batch effect correction [JLR06], [LJP+17], [Ped12].

Corrects for batch effects by fitting linear models and gains statistical power through an empirical Bayes (EB) framework in which information is borrowed across features. This function uses the combat.py implementation [Ped12].
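To make the idea of a per-batch location and scale adjustment concrete, the following is a deliberately simplified sketch. It is not the ComBat algorithm and not the ehrapy implementation: it omits the empirical Bayes shrinkage and the covariates, and the helper name `naive_location_scale_adjust` is hypothetical. It only assumes NumPy and simulated data.

```python
import numpy as np


def naive_location_scale_adjust(X, batch):
    """Standardise each feature within each batch, then rescale to the overall mean and std.

    Illustration only: no empirical Bayes shrinkage and no covariates,
    so this is NOT equivalent to ComBat.
    """
    X = np.asarray(X, dtype=float)
    batch = np.asarray(batch)
    overall_mean = X.mean(axis=0)
    overall_std = X.std(axis=0, ddof=1)
    out = np.empty_like(X)
    for b in np.unique(batch):
        rows = batch == b
        mu = X[rows].mean(axis=0)          # per-batch location
        sd = X[rows].std(axis=0, ddof=1)   # per-batch scale
        out[rows] = (X[rows] - mu) / sd * overall_std + overall_mean
    return out


# Simulated data: batch "b" is shifted and scaled relative to batch "a"
rng = np.random.default_rng(0)
batch = np.repeat(["a", "b"], 50)
X = rng.normal(size=(100, 3))
X[batch == "b"] = X[batch == "b"] * 2.0 + 5.0

corrected = naive_location_scale_adjust(X, batch)
```

After this adjustment both batches share the same per-feature mean and standard deviation. ComBat differs in that the per-batch location and scale estimates are shrunk toward values pooled across features, which stabilises them when batches are small.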

Parameters:
  • edata (EHRData) – Central data object.

  • key (str, default: 'batch') – Key to a categorical annotation from .obs that will be used for batch effect removal.

  • covariates (Collection[str] | None, default: None) – Additional covariates besides the batch variable, such as adjustment variables or biological conditions. This parameter corresponds to the design matrix X in Equation 2.1 of [JLR06] and to the mod argument of the original ComBat function in the sva R package. Note that omitting covariates may introduce bias or remove signal in unbalanced designs.

  • layer (str | None, default: None) – The layer to operate on.

  • inplace (bool, default: True) – Whether to replace edata.X with the corrected data or to return the corrected data.
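The warning about omitting covariates in unbalanced designs can be illustrated with a small simulation. This is a sketch under stated assumptions, not ehrapy code: it uses ordinary least squares on a single simulated feature, and the helpers `remove_batch` and `condition_gap` as well as the simulated effect sizes are hypothetical choices made for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200
batch = np.repeat([0, 1], n // 2)

# Unbalanced design: the condition is far more frequent in batch 1 than in batch 0
condition = np.where(batch == 0, rng.random(n) < 0.2, rng.random(n) < 0.8).astype(float)

true_effect, batch_shift = 3.0, 1.0  # simulated values, chosen for illustration
y = true_effect * condition + batch_shift * batch + rng.normal(scale=0.5, size=n)


def remove_batch(y, batch, covariate=None):
    """Fit a linear model and subtract only the estimated batch term."""
    cols = [np.ones_like(y), batch.astype(float)]
    if covariate is not None:
        cols.append(covariate)  # covariate enters the design matrix
    design = np.column_stack(cols)
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return y - coef[1] * batch


def condition_gap(values):
    """Difference in means between the two condition groups."""
    return values[condition == 1].mean() - values[condition == 0].mean()


without_cov = remove_batch(y, batch)
with_cov = remove_batch(y, batch, condition)
```

Because condition and batch are correlated here, the model without the covariate attributes part of the condition effect to the batch and removes it, so `condition_gap(without_cov)` ends up further from the simulated effect than `condition_gap(with_cov)`.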

Return type:

EHRData | ndarray | None

Returns:

Depending on the value of inplace, either returns the corrected matrix or modifies edata.X.