ehrapy.preprocessing.log_norm

ehrapy.preprocessing.log_norm(edata, vars=None, base=None, offset=1, layer=None, copy=False)

Apply log normalization.

Computes \(x = \log(x + \text{offset})\), where \(\log\) denotes the natural logarithm unless a different base is given, and \(\text{offset}\) defaults to \(1\).
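The formula above can be sketched in plain NumPy. This is only an illustration, not ehrapy's implementation: the helper name `log_norm_sketch` is hypothetical, and handling a custom base through the change-of-base identity is an assumption about how `base` is applied.

```python
import numpy as np


def log_norm_sketch(x, base=None, offset=1):
    """Hypothetical helper illustrating log(x + offset); not part of ehrapy."""
    shifted = np.asarray(x, dtype=float) + offset
    if base is None:
        # Natural logarithm, matching the documented default.
        return np.log(shifted)
    # Assumed change of base: log_b(y) = ln(y) / ln(b).
    return np.log(shifted) / np.log(base)


values = np.array([0.0, 9.0, 99.0])
natural = log_norm_sketch(values)        # same as np.log1p(values)
base10 = log_norm_sketch(values, base=10)  # 1, 10, 100 after the offset
```

With the default offset of 1, a value of 0 maps to 0, so zero entries stay zero after the transform.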

Supports both 2D and 3D data:

  • 2D data: Standard normalization across observations

  • 3D data: Applied to all elements across samples and timestamps
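Because the logarithm is an element-wise operation, the same transform applies to 2D and 3D arrays without reshaping. The sketch below uses plain NumPy arrays rather than an EHRData object, and the axis order (observations × variables × timestamps) is an assumption made for illustration.

```python
import numpy as np

# 2D: observations x variables, with one missing value.
x2d = np.array([[0.0, 9.0], [99.0, np.nan]])

# 3D: observations x variables x timestamps (assumed axis order),
# built here by stacking two time points.
x3d = np.stack([x2d, x2d * 2], axis=-1)

offset = 1
out2d = np.log(x2d + offset)
out3d = np.log(x3d + offset)

# Shapes are preserved, and missing values stay missing.
print(out2d.shape, out3d.shape)
```

In this sketch, the first time slice of the 3D result equals the 2D result, since each element is transformed independently of its position.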

Parameters:
  • edata (EHRData) – Central data object.

  • vars (str | Sequence[str] | None, default: None) – Name or list of names of the numeric variables to normalize. If None, all numeric variables are normalized.

  • base (int | float | None, default: None) – Numeric base for the logarithm. If None, the natural logarithm is used.

  • offset (int | float, default: 1) – Offset added to values before computing the logarithm.

  • layer (str | None, default: None) – The layer to normalize.

  • copy (bool, default: False) – Whether to return a copy or act in place.

Return type:

EHRData | None

Returns:

If copy=False, modifies the passed edata in place and returns None; otherwise returns an updated copy. A record of the applied normalizations is also stored as a dictionary in edata.uns["normalization"].

Examples

>>> import ehrdata as ed
>>> import ehrapy as ep
>>> import numpy as np
>>> edata = ed.dt.physionet2012(layer="tem_data")
>>> ep.pp.offset_negative_values(edata, layer="tem_data")
>>> np.nanmax(edata.layers["tem_data"])
36400.0
>>> ep.pp.log_norm(edata, layer="tem_data")
>>> np.nanmax(edata.layers["tem_data"])
10.502379
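The example above offsets negative values before applying the log transform. The plain-NumPy sketch below shows why that step matters for the underlying math: the logarithm is undefined for negative arguments and diverges at zero. The shift used here (subtracting the minimum) is an assumption for illustration only; ep.pp.offset_negative_values may work differently, and how log_norm itself reacts to negative input is not shown here.

```python
import numpy as np

values = np.array([-3.0, -1.0, 0.0, 5.0])
offset = 1

# Suppress NumPy's warnings so the problematic results can be inspected.
with np.errstate(divide="ignore", invalid="ignore"):
    raw = np.log(values + offset)  # arguments are -2, 0, 1, 6

# Hypothetical shift so that the smallest value becomes 0.
shifted = values - values.min()
safe = np.log(shifted + offset)  # arguments are 1, 3, 4, 9
```

Here `raw` contains NaN for the negative argument and -inf for the zero argument, while every entry of `safe` is finite.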