ehrapy.tools.iptw

ehrapy.tools.iptw(edata, treatment, outcome, *, covariates, propensity_model='logistic', stabilized=True, clip=(0.01, 0.99), n_bootstrap=200, random_state=None, layer=None)

Estimate the average treatment effect by inverse probability of treatment weighting (IPTW).

Fits a propensity model e(X) = P(T=1 | X) and forms the weights w_i = T_i / e_i + (1 - T_i) / (1 - e_i). With stabilized=True, each weight is multiplied by the marginal probability of the treatment actually received: P(T=1) for treated units and P(T=0) for untreated units. Stabilization typically reduces the variance of the weights while introducing negligible bias. The ATE is estimated as the weighted mean of Y in the treated group minus the weighted mean of Y in the untreated group.
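The weighting scheme described above can be sketched in a few lines of NumPy. This is a minimal illustration and not the ehrapy implementation: the helper names iptw_weights and weighted_ate are hypothetical, the propensity scores are supplied by hand rather than fitted, and the weighted means are assumed to be normalized within each treatment group.

```python
import numpy as np


def iptw_weights(t, e, stabilized=True):
    """Inverse-probability weights for binary treatment t and propensity scores e.

    Hypothetical helper for illustration; not part of ehrapy.
    """
    t = np.asarray(t, dtype=float)
    e = np.asarray(e, dtype=float)
    # Basic weights: 1 / e for treated units, 1 / (1 - e) for untreated units.
    w = t / e + (1.0 - t) / (1.0 - e)
    if stabilized:
        # Multiply by the marginal probability of the treatment actually received.
        p = t.mean()
        w = w * (t * p + (1.0 - t) * (1.0 - p))
    return w


def weighted_ate(y, t, w):
    """Difference of group-wise normalized weighted means of y (assumed estimator form)."""
    y = np.asarray(y, dtype=float)
    treated = np.asarray(t).astype(bool)
    mu1 = np.average(y[treated], weights=w[treated])
    mu0 = np.average(y[~treated], weights=w[~treated])
    return mu1 - mu0


# Tiny hand-checkable example.
t = np.array([1, 1, 0, 0])
e = np.array([0.5, 0.25, 0.5, 0.2])
y = np.array([1.0, 0.0, 1.0, 0.0])

w_basic = iptw_weights(t, e, stabilized=False)  # [2.0, 4.0, 2.0, 1.25]
w_stab = iptw_weights(t, e, stabilized=True)    # [1.0, 2.0, 1.0, 0.625]
ate_basic = weighted_ate(y, t, w_basic)
ate_stab = weighted_ate(y, t, w_stab)
```

In this sketch the two point estimates coincide, because stabilization rescales all weights within a treatment group by the same constant and the normalized weighted means are unaffected by such rescaling. The stabilized weights differ in scale and spread, which matters when they are passed to downstream weighted models.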

Parameters:
  • edata (EHRData) – Central data object.

  • treatment (str) – Column name of the binary (0/1) treatment variable.

  • outcome (str) – Column name of the outcome variable.

  • covariates (Sequence[str]) – Adjustment set used to fit the propensity model. Each entry must refer to a name in edata.var_names or edata.obs.columns.

  • propensity_model (str | BaseEstimator, default: 'logistic') – Specification of the propensity model. Either one of the strings 'logistic', 'gradient_boosting', 'random_forest', or any scikit-learn-compatible classifier that implements predict_proba.

  • stabilized (bool, default: True) – Whether to use stabilized weights instead of the basic inverse-probability weights.

  • clip (tuple[float, float] | None, default: (0.01, 0.99)) – (lo, hi) propensity-score clipping range applied before forming weights. Use None to disable clipping.

  • n_bootstrap (int, default: 200) – Number of bootstrap resamples used to estimate the standard error (SE) and the 95% percentile confidence interval. Set to 0 to skip uncertainty estimation.

  • random_state (int | None, default: None) – Seed for the bootstrap resampler.

  • layer (str | None, default: None) – Layer of edata from which to read the variables listed in edata.var_names. If None, edata.X is used.
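To make the clip, n_bootstrap, and random_state parameters concrete, here is a rough NumPy sketch of propensity clipping followed by a percentile bootstrap. It runs on synthetic data and is not the ehrapy implementation. The helper names clip_scores, ate_from_scores, and bootstrap_ate are hypothetical, and the propensity scores are held fixed across resamples for brevity (a fuller procedure might refit the propensity model on every resample; this page does not say which approach the function takes).

```python
import numpy as np


def clip_scores(e, clip=(0.01, 0.99)):
    """Clip propensity scores to (lo, hi); clip=None leaves them unchanged."""
    e = np.asarray(e, dtype=float)
    if clip is None:
        return e
    lo, hi = clip
    return np.clip(e, lo, hi)


def ate_from_scores(y, t, e):
    """Difference of normalized weighted means using basic IPTW weights."""
    w = t / e + (1.0 - t) / (1.0 - e)
    treated = t == 1
    mu1 = np.average(y[treated], weights=w[treated])
    mu0 = np.average(y[~treated], weights=w[~treated])
    return mu1 - mu0


def bootstrap_ate(y, t, e, n_bootstrap=200, random_state=None):
    """Bootstrap SE and 95% percentile interval (propensity scores held fixed)."""
    rng = np.random.default_rng(random_state)
    y, t, e = (np.asarray(a, dtype=float) for a in (y, t, e))
    n = len(y)
    draws = []
    for _ in range(n_bootstrap):
        idx = rng.integers(0, n, size=n)  # resample rows with replacement
        if t[idx].min() == t[idx].max():
            continue  # skip resamples that contain only one treatment group
        draws.append(ate_from_scores(y[idx], t[idx], e[idx]))
    se = float(np.std(draws, ddof=1))
    lo, hi = np.percentile(draws, [2.5, 97.5])
    return se, (float(lo), float(hi))


# Synthetic data: one confounder x drives both treatment and outcome.
rng = np.random.default_rng(0)
n = 400
x = rng.normal(size=n)
e_true = 1.0 / (1.0 + np.exp(-x))
t = (rng.random(n) < e_true).astype(float)
y = 0.5 * t + x + rng.normal(size=n)

e_clipped = clip_scores(e_true, clip=(0.01, 0.99))
se, (lo, hi) = bootstrap_ate(y, t, e_clipped, n_bootstrap=200, random_state=0)
```

Fixing random_state makes the resampling, and therefore the SE and interval, reproducible across runs. Clipping bounds the largest possible basic weight: with (0.01, 0.99) no unit can receive a weight above 100.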

Return type:

CausalEstimate

Returns:

A CausalEstimate whose params dict contains the fitted propensity_scores and the IPTW weights.

Examples

>>> import ehrapy as ep
>>> import ehrdata as ed
>>> edata = ed.dt.mimic_2_preprocessed()
>>> est = ep.tl.iptw(
...     edata,
...     "aline_flg",
...     "day_28_flg",
...     covariates=["age", "sofa_first", "sapsi_first"],
...     random_state=0,
... )
>>> print(est.summary())
Causal effect of 'aline_flg' on 'day_28_flg'
  method: iptw_stabilized
  ATE:    -0.0644
  SE:     0.0332
  95% CI: [-0.1313, -0.0089]
  n:      1776