ehrapy.tools.positivity_check


ehrapy.tools.positivity_check(edata, treatment, *, covariates, propensity_model='logistic', eps=0.05, layer=None)

Diagnose the positivity assumption by inspecting the propensity score distribution.

Returns summary statistics of the fitted propensity scores by treatment arm, together with the fraction of observations whose propensity lies inside [eps, 1 - eps] (the “common support” region). Severe positivity violations show up as bimodal propensity distributions or small support fractions.
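The common-support fraction described above can be illustrated with plain NumPy. This is a minimal sketch rather than ehrapy's implementation: the helper name support_fraction and the example scores are hypothetical, and the interval is assumed to be closed at both ends.

```python
import numpy as np


def support_fraction(propensity, eps=0.05):
    """Return (fraction inside [eps, 1 - eps], count outside).

    Hypothetical helper for illustration; not part of ehrapy.
    """
    p = np.asarray(propensity, dtype=float)
    # Assumption: both boundaries count as inside the common-support region.
    inside = (p >= eps) & (p <= 1.0 - eps)
    return float(inside.mean()), int((~inside).sum())


# Made-up propensity scores: four are extreme, four are well inside the interval.
scores = np.array([0.01, 0.04, 0.20, 0.50, 0.70, 0.93, 0.97, 0.99])
frac, n_out = support_fraction(scores, eps=0.05)
print(f"support_fraction={frac:.3f}  n_outside_support={n_out}")
```

A small fraction here means that many observations were almost certain to receive (or not receive) treatment given their covariates, which is the situation the diagnostic is meant to flag.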

Parameters:
  • edata (EHRData) – Central data object.

  • treatment (str) – Column name of the binary (0/1) treatment variable.

  • covariates (Sequence[str]) – Adjustment set used to fit the propensity model. Each entry must refer to a name in edata.var_names or edata.obs.columns.

  • propensity_model (str | BaseEstimator, default: 'logistic') – Propensity model specification (see iptw() for the accepted values).

  • eps (float, default: 0.05) – Lower boundary of the common-support interval; the upper boundary is 1 - eps.

  • layer (str | None, default: None) – Layer of edata to draw the var-side variables from. If None, edata.X is used.

Return type:

dict

Returns:

A dict with keys propensity_scores, treatment, index, eps, support_fraction, n_outside_support, summary_treated, and summary_untreated.
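One possible use of the result is to restrict an analysis to observations inside the common-support region. The sketch below works on a hand-built stand-in dict that uses a subset of the keys listed above. The value types (NumPy arrays aligned element by element) and the helper inside_support are assumptions made for illustration, not documented behavior.

```python
import numpy as np

# Hand-built stand-in for the returned dict. The array-valued entries and
# their alignment are assumptions; the values are made up.
info = {
    "propensity_scores": np.array([0.02, 0.30, 0.55, 0.98, 0.60]),
    "treatment": np.array([0, 0, 1, 1, 1]),
    "index": np.array(["p1", "p2", "p3", "p4", "p5"]),
    "eps": 0.05,
}


def inside_support(info):
    """Boolean mask of observations with propensity in [eps, 1 - eps].

    Hypothetical helper for illustration; not part of ehrapy.
    """
    p = np.asarray(info["propensity_scores"], dtype=float)
    eps = info["eps"]
    return (p >= eps) & (p <= 1.0 - eps)


mask = inside_support(info)
kept = info["index"][mask]                         # observation labels to keep
n_treated_kept = int(info["treatment"][mask].sum())  # treated units remaining
print(kept.tolist(), n_treated_kept)
```

Trimming in this way changes the population the estimate refers to, so the number of dropped observations is worth reporting alongside any downstream result.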

Examples

>>> import ehrapy as ep
>>> import ehrdata as ed
>>> edata = ed.dt.mimic_2_preprocessed()
>>> info = ep.tl.positivity_check(
...     edata,
...     "aline_flg",
...     covariates=["age", "sofa_first", "sapsi_first"],
... )
>>> print(f"support_fraction={info['support_fraction']:.3f}  n_outside_support={info['n_outside_support']}")
support_fraction=0.981  n_outside_support=34