ehrapy.tools.kaplan_meier

ehrapy.tools.kaplan_meier(adata, duration_col, event_col=None, *, uns_key='kaplan_meier', timeline=None, entry=None, label=None, alpha=None, ci_labels=None, weights=None, fit_options=None, censoring='right')

Fit the Kaplan-Meier estimate for the survival function.

The Kaplan–Meier estimator, also known as the product limit estimator, is a non-parametric statistic used to estimate the survival function from lifetime data. In medical research, it is often used to measure the fraction of patients living for a certain amount of time after treatment. The results will be stored in the .uns slot of the AnnData object under the key ‘kaplan_meier’ unless specified otherwise in the uns_key parameter.
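The product-limit computation described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the code that ehrapy or lifelines runs: the function name `product_limit` and the toy data are hypothetical, and only right-censoring is handled (no confidence intervals, weights, or late entry).

```python
def product_limit(durations, observed):
    """Return [(time, survival)] at each distinct event time (right-censoring only)."""
    # Hypothetical helper for illustration; not part of ehrapy or lifelines.
    # Only times at which at least one event was observed change the estimate.
    event_times = sorted({t for t, e in zip(durations, observed) if e})
    survival = 1.0
    curve = []
    for t in event_times:
        # Subjects still under observation just before time t.
        at_risk = sum(1 for d in durations if d >= t)
        # Observed events (not censorings) occurring exactly at time t.
        deaths = sum(1 for d, e in zip(durations, observed) if d == t and e)
        # Multiply by the fraction of at-risk subjects that survive time t.
        survival *= 1.0 - deaths / at_risk
        curve.append((t, survival))
    return curve


# Toy data: four subjects, one of them censored at time 2.
curve = product_limit([1, 2, 2, 3], [True, False, True, True])
```

Each factor is the fraction of at-risk subjects surviving one event time. A censored subject leaves the risk set without contributing an event, which is why the subject censored at time 2 still counts as at risk at that time.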

See https://en.wikipedia.org/wiki/Kaplan%E2%80%93Meier_estimator and https://lifelines.readthedocs.io/en/latest/fitters/univariate/KaplanMeierFitter.html#module-lifelines.fitters.kaplan_meier_fitter

Parameters:

adata (AnnData) – AnnData object.
duration_col (str) – The name of the column in the AnnData object that contains the subjects’ lifetimes.
event_col (str | None, default: None) – The name of the column in the AnnData object that specifies whether the event has been observed, or censored. Column values are True if the event was observed, False if the event was lost (right-censored). If left None, all individuals are assumed to be uncensored.
uns_key (str, default: 'kaplan_meier') – The key to use for the .uns slot in the AnnData object.
timeline (list[float] | None, default: None) – Return the best estimate at the values in timeline (positively increasing).
entry (str | None, default: None) – Relative time when a subject entered the study. This is useful for left-truncated (not left-censored) observations. If None, all members of the population entered the study when they were “born”.
label (str | None, default: None) – A string to name the column of the estimate.
alpha (float | None, default: None) – The alpha value in the confidence intervals. Overrides the initializing alpha for this call to fit only.
ci_labels (list[str] | None, default: None) – Add custom column names to the generated confidence intervals as a length-2 list: [<lower-bound name>, <upper-bound name>] (default: <label>_lower_<1-alpha/2>).
weights (list[float] | None, default: None) – Per-subject weights, for use with a weighted dataset. For example, instead of providing every subject as a single element of durations and event_observed, one could weigh subjects differently.
fit_options (dict | None, default: None) – Additional keyword arguments to pass into the estimator.
censoring (Literal['right', 'left'], default: 'right') – ‘right’ fits the model to a right-censored dataset (default, calls fit); ‘left’ fits the model to a left-censored dataset (calls fit_left_censoring).
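One way to read the weights parameter above is as frequency counts: identical (duration, event) rows are collapsed into a single row that carries the number of subjects it represents. The sketch below assumes that interpretation and integer counts; the helper name `collapse_to_weights` is hypothetical and not part of ehrapy or lifelines.

```python
from collections import Counter


def collapse_to_weights(durations, observed):
    """Collapse repeated (duration, event) pairs into unique rows plus counts."""
    # Hypothetical helper for illustration; not part of ehrapy or lifelines.
    counts = Counter(zip(durations, observed))
    rows = sorted(counts)  # deterministic order: by duration, then event flag
    unique_durations = [d for d, _ in rows]
    unique_observed = [e for _, e in rows]
    weights = [counts[row] for row in rows]
    return unique_durations, unique_observed, weights


# Five subjects collapse into three weighted rows.
d, e, w = collapse_to_weights([5, 5, 5, 8, 8], [True, True, False, True, True])
```

The counts sum to the original number of subjects, so a weighted fit on the collapsed rows is meant to describe the same population as an unweighted fit on the full data.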

Return type:

KaplanMeierFitter

Returns:

Fitted KaplanMeierFitter.

Examples

>>> import numpy as np
>>> import ehrapy as ep
>>> adata = ep.dt.mimic_2(encoded=False)
>>> # Flip 'censor_flg' because 0 = death and 1 = censored
>>> adata[:, ["censor_flg"]].X = np.where(adata[:, ["censor_flg"]].X == 0, 1, 0)
>>> kmf = ep.tl.kaplan_meier(adata, "mort_day_censored", "censor_flg", label="Mortality")