MIMIC-II Treatment Effect Estimation with ehrapy

MIMIC-II Treatment Effect Estimation with ehrapy#

Here, we estimate treatment effects using survival analysis techniques.

Case Study Data#

This tutorial explores the MIMIC-II IAC dataset. It was created for the purpose of a case study in the book: Secondary Analysis of Electronic Health Records, published by Springer in 2016. In particular, the dataset was used throughout Chapter 16 (Data Analysis) by Raffa J. et al. to investigate the effectiveness of indwelling arterial catheters in hemodynamically stable patients with respiratory failure for mortality outcomes. The dataset is derived from MIMIC-II, the publicly accessible critical care database. It contains summary clinical data and outcomes for 1,776 patients.

References:

[1] Critical Data, M.I.T., 2016. Secondary analysis of electronic health records (p. 244). Springer Nature. (https://link.springer.com/book/10.1007/978-3-319-43742-2)

[2] https://github.com/MIT-LCP/critical-data-book/tree/master/part_ii/chapter_16/jupyter

import ehrapy as ep
import ehrdata as ed
import pandas as pd

edata = ed.dt.mimic_2()

! File ehrapy_data/ehrapy_mimic2.csv already exists! Using already downloaded dataset...

ed.infer_feature_types(edata)
ed.replace_feature_types(edata, ["day_icu_intime_num", "hour_icu_intime"], "numeric")

! Features 'aline_flg', 'gender_num', 'service_num', 'day_icu_intime_num', 'hour_icu_intime', 'hosp_exp_flg', 'icu_exp_flg', 'day_28_flg', 'censor_flg', 'sepsis_flg', 'chf_flg', 'afib_flg', 'renal_flg', 'liver_flg', 'copd_flg', 'cad_flg', 'stroke_flg', 'mal_flg', 'resp_flg' were detected as categorical features stored numerically.Please verify and adjust if necessary using `ed.replace_feature_types`.

 Detected feature types for EHRData object with 1776 obs and 46 vars
╠══ 📅 Date features
╠══ 📐 Numerical features
║   ╠══ abg_count
║   ╠══ age
║   ╠══ bmi
║   ╠══ bun_first
║   ╠══ chloride_first
║   ╠══ creatinine_first
║   ╠══ hgb_first
║   ╠══ hospital_los_day
║   ╠══ hr_1st
║   ╠══ icu_los_day
║   ╠══ iv_day_1
║   ╠══ map_1st
║   ╠══ mort_day_censored
║   ╠══ pco2_first
║   ╠══ platelet_first
║   ╠══ po2_first
║   ╠══ potassium_first
║   ╠══ sapsi_first
║   ╠══ sodium_first
║   ╠══ sofa_first
║   ╠══ spo2_1st
║   ╠══ tco2_first
║   ╠══ temp_1st
║   ╠══ wbc_first
║   ╚══ weight_first
╚══ 🗂️ Categorical features
    ╠══ afib_flg (2 categories)
    ╠══ aline_flg (2 categories)
    ╠══ cad_flg (2 categories)
    ╠══ censor_flg (2 categories)
    ╠══ chf_flg (2 categories)
    ╠══ copd_flg (2 categories)
    ╠══ day_28_flg (2 categories)
    ╠══ day_icu_intime (7 categories)
    ╠══ day_icu_intime_num (7 categories)
    ╠══ gender_num (2 categories)
    ╠══ hosp_exp_flg (2 categories)
    ╠══ hour_icu_intime (24 categories)
    ╠══ icu_exp_flg (2 categories)
    ╠══ liver_flg (2 categories)
    ╠══ mal_flg (2 categories)
    ╠══ renal_flg (2 categories)
    ╠══ resp_flg (2 categories)
    ╠══ sepsis_flg (1 categories)
    ╠══ service_num (2 categories)
    ╠══ service_unit (3 categories)
    ╚══ stroke_flg (2 categories)

continuous_vars = edata.var.index[edata.var["feature_type"] == "numeric"]
categorical_vars = edata.var.index[edata.var["feature_type"] == "categorical"]

continuous_vars = continuous_vars.tolist()
categorical_vars = categorical_vars.tolist()
all_vars = continuous_vars + categorical_vars

Case Study and Summary#

Our Objective for this case study is as follows:

To estimate the effect that administration of IAC (Indwelling Arterial Catheter) during an ICU (Intensive Care Unit) admission has on 28 day mortality in patients without sepsis who received mechanical ventilation within MIMIC II, while adjusting for age, gender, severity of illness and comorbidities.

We will also not want to include the sepsis_flg variable as a covariate in any of our models, as there are no patients with sepsis within this study to estimate the effect of sepsis.

Usually, we will want to see how the population differs for different values of the covariate of interest. In our case study, if the treated group (IAC) differed substantially from the untreated group (no IAC), then this may account for any effect we demonstrate. We can do this by summarizing the two groups using the tableone package.

import tableone

table = tableone.TableOne(edata.to_df(), all_vars, categorical=categorical_vars, groupby="aline_flg")
table.tabulate(tablefmt="html")

		Missing	Overall	0	1
n			1776	792	984
icu_los_day, mean (SD)		0	3.3 (3.4)	2.1 (1.9)	4.3 (3.9)
hospital_los_day, mean (SD)		0	8.1 (8.2)	5.4 (5.4)	10.3 (9.3)
age, mean (SD)		0	54.4 (21.1)	53.0 (21.7)	55.5 (20.5)
weight_first, mean (SD)		110	80.1 (22.5)	79.2 (22.6)	80.7 (22.4)
bmi, mean (SD)		466	27.8 (8.2)	28.0 (9.1)	27.7 (7.5)
sapsi_first, mean (SD)		85	14.1 (4.1)	12.7 (3.8)	15.2 (4.0)
sofa_first, mean (SD)		6	5.8 (2.3)	4.8 (2.1)	6.6 (2.2)
day_icu_intime_num, mean (SD)		0	4.1 (2.0)	4.0 (2.0)	4.1 (2.0)
hour_icu_intime, mean (SD)		0	10.6 (7.9)	9.9 (7.7)	11.2 (8.1)
mort_day_censored, mean (SD)		0	614.3 (403.1)	619.1 (388.3)	610.5 (414.8)
map_1st, mean (SD)		0	88.2 (17.6)	87.5 (15.9)	88.9 (18.8)
hr_1st, mean (SD)		0	87.9 (18.8)	88.4 (18.8)	87.5 (18.7)
temp_1st, mean (SD)		3	97.8 (4.5)	97.9 (3.8)	97.7 (5.1)
spo2_1st, mean (SD)		0	98.4 (5.5)	98.4 (5.7)	98.5 (5.4)
abg_count, mean (SD)		0	6.0 (8.7)	1.4 (1.6)	9.7 (10.2)
wbc_first, mean (SD)		8	12.3 (6.6)	11.7 (6.5)	12.8 (6.6)
hgb_first, mean (SD)		8	12.6 (2.2)	12.7 (2.2)	12.4 (2.2)
platelet_first, mean (SD)		8	246.1 (99.9)	254.3 (104.5)	239.5 (95.6)
sodium_first, mean (SD)		5	139.6 (4.7)	139.8 (4.8)	139.4 (4.7)
potassium_first, mean (SD)		5	4.1 (0.8)	4.1 (0.8)	4.1 (0.8)
tco2_first, mean (SD)		5	24.4 (5.0)	24.7 (4.9)	24.2 (5.1)
chloride_first, mean (SD)		5	103.8 (5.7)	103.3 (5.4)	104.3 (5.9)
bun_first, mean (SD)		5	19.3 (14.4)	18.9 (14.5)	19.6 (14.3)
creatinine_first, mean (SD)		6	1.1 (1.1)	1.1 (1.2)	1.1 (1.0)
po2_first, mean (SD)		186	227.6 (144.9)	223.8 (152.9)	230.1 (139.6)
pco2_first, mean (SD)		186	43.4 (14.0)	44.9 (15.9)	42.5 (12.5)
iv_day_1, mean (SD)		143	1622.9 (1677.1)	1364.2 (1406.8)	1808.4 (1825.0)
aline_flg, n (%)	0		792 (44.6)	792 (100.0)
	1		984 (55.4)		984 (100.0)
gender_num, n (%)	0.0		750 (42.2)	344 (43.4)	406 (41.3)
	1.0		1025 (57.7)	447 (56.4)	578 (58.7)
	None		1 (0.1)	1 (0.1)
service_unit, n (%)	FICU		62 (3.5)	24 (3.0)	38 (3.9)
	MICU		732 (41.2)	480 (60.6)	252 (25.6)
	SICU		982 (55.3)	288 (36.4)	694 (70.5)
service_num, n (%)	0		794 (44.7)	504 (63.6)	290 (29.5)
	1		982 (55.3)	288 (36.4)	694 (70.5)
day_icu_intime, n (%)	Friday		234 (13.2)	87 (11.0)	147 (14.9)
	Monday		260 (14.6)	116 (14.6)	144 (14.6)
	Saturday		278 (15.7)	138 (17.4)	140 (14.2)
	Sunday		230 (13.0)	114 (14.4)	116 (11.8)
	Thursday		261 (14.7)	110 (13.9)	151 (15.3)
	Tuesday		257 (14.5)	118 (14.9)	139 (14.1)
	Wednesday		256 (14.4)	109 (13.8)	147 (14.9)
hosp_exp_flg, n (%)	0		1532 (86.3)	702 (88.6)	830 (84.3)
	1		244 (13.7)	90 (11.4)	154 (15.7)
icu_exp_flg, n (%)	0		1606 (90.4)	734 (92.7)	872 (88.6)
	1		170 (9.6)	58 (7.3)	112 (11.4)
day_28_flg, n (%)	0		1493 (84.1)	679 (85.7)	814 (82.7)
	1		283 (15.9)	113 (14.3)	170 (17.3)
censor_flg, n (%)	0		497 (28.0)	213 (26.9)	284 (28.9)
	1		1279 (72.0)	579 (73.1)	700 (71.1)
sepsis_flg, n (%)	0		1776 (100.0)	792 (100.0)	984 (100.0)
chf_flg, n (%)	0		1563 (88.0)	695 (87.8)	868 (88.2)
	1		213 (12.0)	97 (12.2)	116 (11.8)
afib_flg, n (%)	0		1569 (88.3)	710 (89.6)	859 (87.3)
	1		207 (11.7)	82 (10.4)	125 (12.7)
renal_flg, n (%)	0		1716 (96.6)	764 (96.5)	952 (96.7)
	1		60 (3.4)	28 (3.5)	32 (3.3)
liver_flg, n (%)	0		1677 (94.4)	754 (95.2)	923 (93.8)
	1		99 (5.6)	38 (4.8)	61 (6.2)
copd_flg, n (%)	0		1619 (91.2)	711 (89.8)	908 (92.3)
	1		157 (8.8)	81 (10.2)	76 (7.7)
cad_flg, n (%)	0		1653 (93.1)	741 (93.6)	912 (92.7)
	1		123 (6.9)	51 (6.4)	72 (7.3)
stroke_flg, n (%)	0		1554 (87.5)	722 (91.2)	832 (84.6)
	1		222 (12.5)	70 (8.8)	152 (15.4)
mal_flg, n (%)	0		1520 (85.6)	700 (88.4)	820 (83.3)
	1		256 (14.4)	92 (11.6)	164 (16.7)
resp_flg, n (%)	0		1211 (68.2)	514 (64.9)	697 (70.8)
	1		565 (31.8)	278 (35.1)	287 (29.2)

The IAC group differs in many respects to the non-IAC group. Patients who were given IAC tended to have higher severity of illness at baseline (sapsi_first and sofa_first), slightly older, less likely to be from the MICU, and have slightly different comorbidity profiles when compared to the non-IAC group.

Next, we can see how the covariates are distributed among the different outcomes (death within 28 days versus alive at 28 days). This will give us an idea of which covariates may be important for affecting the outcome.

table = tableone.TableOne(edata.to_df(), all_vars, categorical=categorical_vars, groupby="day_28_flg")
table.tabulate(tablefmt="html")

		Missing	Overall	0	1
n			1776	1493	283
icu_los_day, mean (SD)		0	3.3 (3.4)	3.2 (3.2)	4.0 (4.0)
hospital_los_day, mean (SD)		0	8.1 (8.2)	8.4 (8.4)	6.4 (6.4)
age, mean (SD)		0	54.4 (21.1)	50.8 (20.1)	73.3 (15.3)
weight_first, mean (SD)		110	80.1 (22.5)	81.4 (22.7)	72.4 (19.9)
bmi, mean (SD)		466	27.8 (8.2)	28.2 (8.3)	26.0 (7.2)
sapsi_first, mean (SD)		85	14.1 (4.1)	13.6 (3.9)	17.3 (3.8)
sofa_first, mean (SD)		6	5.8 (2.3)	5.7 (2.3)	6.6 (2.4)
day_icu_intime_num, mean (SD)		0	4.1 (2.0)	4.0 (2.0)	4.1 (2.0)
hour_icu_intime, mean (SD)		0	10.6 (7.9)	10.5 (7.9)	11.0 (8.0)
mort_day_censored, mean (SD)		0	614.3 (403.1)	729.6 (331.4)	6.1 (6.4)
map_1st, mean (SD)		0	88.2 (17.6)	88.2 (17.5)	88.3 (17.9)
hr_1st, mean (SD)		0	87.9 (18.8)	88.3 (18.4)	85.8 (20.6)
temp_1st, mean (SD)		3	97.8 (4.5)	97.8 (4.6)	97.7 (4.5)
spo2_1st, mean (SD)		0	98.4 (5.5)	98.6 (5.0)	97.8 (7.6)
abg_count, mean (SD)		0	6.0 (8.7)	5.7 (7.7)	7.5 (12.5)
wbc_first, mean (SD)		8	12.3 (6.6)	12.2 (6.4)	12.7 (7.5)
hgb_first, mean (SD)		8	12.6 (2.2)	12.7 (2.2)	11.9 (2.1)
platelet_first, mean (SD)		8	246.1 (99.9)	246.8 (97.3)	242.1 (112.6)
sodium_first, mean (SD)		5	139.6 (4.7)	139.6 (4.6)	139.1 (5.4)
potassium_first, mean (SD)		5	4.1 (0.8)	4.1 (0.8)	4.2 (0.9)
tco2_first, mean (SD)		5	24.4 (5.0)	24.3 (4.8)	25.0 (5.8)
chloride_first, mean (SD)		5	103.8 (5.7)	104.1 (5.6)	102.6 (6.4)
bun_first, mean (SD)		5	19.3 (14.4)	18.0 (12.9)	26.2 (19.0)
creatinine_first, mean (SD)		6	1.1 (1.1)	1.1 (1.1)	1.2 (0.9)
po2_first, mean (SD)		186	227.6 (144.9)	231.3 (146.3)	207.9 (135.8)
pco2_first, mean (SD)		186	43.4 (14.0)	43.3 (12.9)	43.8 (18.6)
iv_day_1, mean (SD)		143	1622.9 (1677.1)	1694.2 (1709.5)	1258.0 (1449.4)
aline_flg, n (%)	0		792 (44.6)	679 (45.5)	113 (39.9)
	1		984 (55.4)	814 (54.5)	170 (60.1)
gender_num, n (%)	0.0		750 (42.2)	606 (40.6)	144 (50.9)
	1.0		1025 (57.7)	886 (59.3)	139 (49.1)
	None		1 (0.1)	1 (0.1)
service_unit, n (%)	FICU		62 (3.5)	59 (4.0)	3 (1.1)
	MICU		732 (41.2)	605 (40.5)	127 (44.9)
	SICU		982 (55.3)	829 (55.5)	153 (54.1)
service_num, n (%)	0		794 (44.7)	664 (44.5)	130 (45.9)
	1		982 (55.3)	829 (55.5)	153 (54.1)
day_icu_intime, n (%)	Friday		234 (13.2)	192 (12.9)	42 (14.8)
	Monday		260 (14.6)	221 (14.8)	39 (13.8)
	Saturday		278 (15.7)	235 (15.7)	43 (15.2)
	Sunday		230 (13.0)	198 (13.3)	32 (11.3)
	Thursday		261 (14.7)	223 (14.9)	38 (13.4)
	Tuesday		257 (14.5)	211 (14.1)	46 (16.3)
	Wednesday		256 (14.4)	213 (14.3)	43 (15.2)
hosp_exp_flg, n (%)	0		1532 (86.3)	1490 (99.8)	42 (14.8)
	1		244 (13.7)	3 (0.2)	241 (85.2)
icu_exp_flg, n (%)	0		1606 (90.4)	1493 (100.0)	113 (39.9)
	1		170 (9.6)		170 (60.1)
day_28_flg, n (%)	0		1493 (84.1)	1493 (100.0)
	1		283 (15.9)		283 (100.0)
censor_flg, n (%)	0		497 (28.0)	214 (14.3)	283 (100.0)
	1		1279 (72.0)	1279 (85.7)
sepsis_flg, n (%)	0		1776 (100.0)	1493 (100.0)	283 (100.0)
chf_flg, n (%)	0		1563 (88.0)	1348 (90.3)	215 (76.0)
	1		213 (12.0)	145 (9.7)	68 (24.0)
afib_flg, n (%)	0		1569 (88.3)	1372 (91.9)	197 (69.6)
	1		207 (11.7)	121 (8.1)	86 (30.4)
renal_flg, n (%)	0		1716 (96.6)	1447 (96.9)	269 (95.1)
	1		60 (3.4)	46 (3.1)	14 (4.9)
liver_flg, n (%)	0		1677 (94.4)	1413 (94.6)	264 (93.3)
	1		99 (5.6)	80 (5.4)	19 (6.7)
copd_flg, n (%)	0		1619 (91.2)	1377 (92.2)	242 (85.5)
	1		157 (8.8)	116 (7.8)	41 (14.5)
cad_flg, n (%)	0		1653 (93.1)	1403 (94.0)	250 (88.3)
	1		123 (6.9)	90 (6.0)	33 (11.7)
stroke_flg, n (%)	0		1554 (87.5)	1386 (92.8)	168 (59.4)
	1		222 (12.5)	107 (7.2)	115 (40.6)
mal_flg, n (%)	0		1520 (85.6)	1294 (86.7)	226 (79.9)
	1		256 (14.4)	199 (13.3)	57 (20.1)
resp_flg, n (%)	0		1211 (68.2)	1056 (70.7)	155 (54.8)
	1		565 (31.8)	437 (29.3)	128 (45.2)

Of the 984 subjects receiving IAC, 170 (17.2 %) died within 28 days, whereas 113 of 792 (14.2 %) died in the no-IAC group. In a univariate analysis we can assess if the lower rate of mortality is statistically significant, by fitting a single covariate aline_flg logistic regression.

To do this, we have to pay attention to our feature types. The target of the GLM model has to be numerical (https://en.wikipedia.org/wiki/Generalized_linear_model), so we have to convert the day_28_flg to a numerical value.

ep.ad.replace_feature_types(edata, ["day_28_flg"], "numeric")
glm_model = ep.tl.glm(
    edata,
    formula="day_28_flg ~ aline_flg",
    var_names=["day_28_flg", "aline_flg"],
    family="Binomial",
    use_feature_types=True,
)
res = glm_model.fit()
res.summary()

Generalized Linear Model Regression Results
Dep. Variable:	day_28_flg	No. Observations:	1776
Model:	GLM	Df Residuals:	1774
Model Family:	Binomial	Df Model:	1
Link Function:	Logit	Scale:	1.0000
Method:	IRLS	Log-Likelihood:	-777.43
Date:	Fri, 05 Sep 2025	Deviance:	1554.9
Time:	17:16:58	Pearson chi2:	1.78e+03
No. Iterations:	4	Pseudo R-squ. (CS):	0.001680
Covariance Type:	nonrobust

	coef	std err	z	P>\|z\|	[0.025	0.975]
Intercept	-1.7932	0.102	-17.650	0.000	-1.992	-1.594
aline_flg[T.1]	0.2271	0.132	1.720	0.085	-0.032	0.486

Indeed, the p-value for aline_flg is about 0.09. There are likely several important covariates that differed among those who received IAC and those who did not. These may serve as confounders, and the possible association we observed in the univariate analysis may be stronger, nonexistent or in the opposite direction (i.e., IAC having lower rates of mortality) depending on the situation. Our next step would be to adjust for these confounders. A common approach is to fit all univariate models (one covariate at a time, as we did with aline_flg, but separately for each covariate and without aline_flg), and perform a hypothesis test on each model. Any variables which had statistical significance under the univariate models would then be included in a multivariable model. Another approach begins with the model we just fit (which only has aline_flg as a covariate), and then sequentially adds variables one at a time. This approach is often called stepwise forward selection.

In our stepwise backwards elimination procedure, we are going to fit a full model and eliminate one variable at a time, until we are left with a model with only statistically significant covariates.

Because aline_flg is the covariate of interest, it will remain in the model regardless of its statistical significance. At each step, we need to define a criterion for eliminating variables. One approach is to perform a hypothesis test for each covariate and remove the covariate with the largest p-value, unless all p-values are < 0.05 or the largest p-value corresponds to aline_flg.

Most covariates are binary or categorical, and we’ve already converted them to factors. The disease severity scores (SAPS and SOFA) are continuous. Adding them directly assumes a linear trend in mortality odds as scores change, which may not be appropriate. To address this, we can discretize the scores into quantiles using cut2.

We will apply this function to discretize the sapsi_first and sofa_first scores into quantiles. These discretized scores can then be added to the survival model to better capture nonlinear relationships with mortality risk.

def cut2(edata: ed.EHRData, column_name: str, bins: int = 5):
    """Cut a continuous variable into bins and add it as a new categorical variable to the EHRData object."""
    created_col_name = f"{column_name}_{bins}_bins"
    if created_col_name in edata.obs.columns:
        return edata
    df = ed.io.to_pandas(edata)
    x = df[column_name]

    # Cut the data into bins
    bin_labels = [f"{column_name}_bin_{i + 1}" for i in range(bins)]
    binned = pd.qcut(x, q=bins, labels=bin_labels)
    binned.name = created_col_name

    df = pd.concat([df, binned], axis=1)

    new_obs = edata.obs.copy()
    new_obs = pd.concat([new_obs, binned], axis=1)

    new_var = edata.var.copy()
    new_var.loc[created_col_name] = "categorical"

    new_edata = ed.EHRData(
        X=df.values,
        obs=new_obs,
        var=new_var,
    )

    return new_edata

cut2_edata = cut2(edata, "sapsi_first")
cut2_edata = cut2(cut2_edata, "sofa_first")
cut2_edata.var.index

Index(['aline_flg', 'icu_los_day', 'hospital_los_day', 'age', 'gender_num',
       'weight_first', 'bmi', 'sapsi_first', 'sofa_first', 'service_unit',
       'service_num', 'day_icu_intime', 'day_icu_intime_num',
       'hour_icu_intime', 'hosp_exp_flg', 'icu_exp_flg', 'day_28_flg',
       'mort_day_censored', 'censor_flg', 'sepsis_flg', 'chf_flg', 'afib_flg',
       'renal_flg', 'liver_flg', 'copd_flg', 'cad_flg', 'stroke_flg',
       'mal_flg', 'resp_flg', 'map_1st', 'hr_1st', 'temp_1st', 'spo2_1st',
       'abg_count', 'wbc_first', 'hgb_first', 'platelet_first', 'sodium_first',
       'potassium_first', 'tco2_first', 'chloride_first', 'bun_first',
       'creatinine_first', 'po2_first', 'pco2_first', 'iv_day_1',
       'sapsi_first_5_bins', 'sofa_first_5_bins'],
      dtype='object')

We added a new column in the EHRData object for each of the sapsi_first and sofa_first bins. Now every observation has a value for sapsi_first_bin and sofa_first_bin which is the quartile that the observation falls into. We can now add these to the model, instead of the continuous sapsi_first and sofa_first scores.

dependent_var = "day_28_flg"
not_dependent_vars = [
    "aline_flg",
    "age",
    "gender_num",
    "sapsi_first_5_bins",
    "sofa_first_5_bins",
    "service_unit",
    "chf_flg",
    "afib_flg",
    "renal_flg",
    "liver_flg",
    "copd_flg",
    "cad_flg",
    "stroke_flg",
    "mal_flg",
    "resp_flg",
]
formula = f"{dependent_var} ~ {' + '.join(not_dependent_vars)}"
var_names = not_dependent_vars + [dependent_var]
full_model = ep.tl.glm(
    cut2_edata,
    var_names,
    formula,
    missing="drop",
    family="Binomial",
    use_feature_types=True,
)
full_model_result = full_model.fit()
full_model_result.summary()

Generalized Linear Model Regression Results
Dep. Variable:	day_28_flg	No. Observations:	1684
Model:	GLM	Df Residuals:	1661
Model Family:	Binomial	Df Model:	22
Link Function:	Logit	Scale:	1.0000
Method:	IRLS	Log-Likelihood:	-479.06
Date:	Fri, 05 Sep 2025	Deviance:	958.12
Time:	17:16:58	Pearson chi2:	1.35e+03
No. Iterations:	7	Pseudo R-squ. (CS):	0.2311
Covariance Type:	nonrobust

	coef	std err	z	P>\|z\|	[0.025	0.975]
Intercept	-7.4832	0.849	-8.809	0.000	-9.148	-5.818
aline_flg[T.1]	0.0327	0.204	0.160	0.873	-0.367	0.433
gender_num[T.1.0]	0.1422	0.172	0.824	0.410	-0.196	0.480
sapsi_first_5_bins[T.sapsi_first_bin_2]	0.0390	0.342	0.114	0.909	-0.631	0.709
sapsi_first_5_bins[T.sapsi_first_bin_3]	0.6417	0.293	2.190	0.028	0.067	1.216
sapsi_first_5_bins[T.sapsi_first_bin_4]	0.5174	0.293	1.769	0.077	-0.056	1.091
sapsi_first_5_bins[T.sapsi_first_bin_5]	1.3695	0.295	4.637	0.000	0.791	1.948
sofa_first_5_bins[T.sofa_first_bin_2]	0.5223	0.302	1.729	0.084	-0.070	1.114
sofa_first_5_bins[T.sofa_first_bin_3]	0.6462	0.301	2.144	0.032	0.055	1.237
sofa_first_5_bins[T.sofa_first_bin_4]	0.7943	0.290	2.735	0.006	0.225	1.363
sofa_first_5_bins[T.sofa_first_bin_5]	0.8867	0.336	2.639	0.008	0.228	1.545
service_unit[T.MICU]	1.0609	0.681	1.559	0.119	-0.273	2.395
service_unit[T.SICU]	0.6666	0.674	0.990	0.322	-0.654	1.987
chf_flg[T.1]	0.2347	0.234	1.005	0.315	-0.223	0.693
afib_flg[T.1]	0.5544	0.211	2.628	0.009	0.141	0.968
renal_flg[T.1]	-0.7834	0.409	-1.917	0.055	-1.584	0.018
liver_flg[T.1]	0.4643	0.341	1.360	0.174	-0.205	1.133
copd_flg[T.1]	0.2533	0.244	1.037	0.300	-0.225	0.732
cad_flg[T.1]	-0.2444	0.288	-0.849	0.396	-0.808	0.319
stroke_flg[T.1]	2.0563	0.220	9.348	0.000	1.625	2.487
mal_flg[T.1]	0.4879	0.209	2.334	0.020	0.078	0.898
resp_flg[T.1]	0.7172	0.192	3.735	0.000	0.341	1.094
age	0.0425	0.006	6.841	0.000	0.030	0.055

Some of the covariates are statistically, while others may be expendable. Ideally, we would like as simple of a model as possible that can explain as much of the variation in the outcome as possible. We will attempt to remove our first covariate by the procedure we outlined above. For each of the variables we consider removing, we could fit a logistic regression model without that covariate, and then test it against the current model.

The drop1 function will take the data, the dependent variable and a list of independent variables and return the p-value for each variable. We can then choose to remove the variable with the largest p-value, this process would then be repeated until all p-values are less than 0.05. As this is an example, we will only do one iteration.

from scipy.stats import chi2


def drop1(
    edata: ed.EHRData,
    dependent_var: str,
    not_dependent_vars: list[str],
    missing: str = "drop",
):
    """Python implementation of R's drop1 function using ehrapy.tl.glm.

    Args:
        edata: The EHRData object for the OLS model.
        dependent_var: The name of the dependent variable.
        independent_vars: A list of independent variables.
        missing: How to handle missing data. Default is "drop".

    Returns:
        pd.DataFrame: A DataFrame containing AIC, BIC, and LRT p-values for each model with one term dropped.
    """
    # Fit the full model
    formula = f"{dependent_var} ~ {' + '.join(not_dependent_vars)}"
    var_names = not_dependent_vars + [dependent_var]
    full_model = ep.tl.glm(
        edata,
        var_names=var_names,
        formula=formula,
        missing=missing,
        family="Binomial",
        use_feature_types=True,
    )
    full_model_result = full_model.fit()

    results = []

    # Drop one term at a time and compare models
    for term in not_dependent_vars:
        reduced_formula = (
            f"{formula.split('~')[0].strip()} ~ {' + '.join([t for t in not_dependent_vars if t != term])}"
        )
        reduced_model = ep.tl.glm(
            edata,
            var_names=var_names,
            formula=reduced_formula,
            missing=missing,
            family="Binomial",
            use_feature_types=True,
        )
        reduced_model_result = reduced_model.fit()

        df = full_model.df_model - reduced_model.df_model
        aic = reduced_model_result.aic
        deviance = reduced_model_result.deviance

        lrt_stat = -2 * (reduced_model_result.llf - full_model_result.llf)
        lrt_pvalue = chi2.sf(lrt_stat, df=df)  # p-value using chi-squared distribution

        results.append(
            {
                "Dropped Term": term,
                "Deviance": deviance,
                "AIC": aic,
                "LRT Statistic": lrt_stat,
                "LRT p-value": lrt_pvalue,
            }
        )

    results_df = pd.DataFrame(results)
    return results_df

drop1(cut2_edata, dependent_var, not_dependent_vars, missing="drop")

	Dropped Term	Deviance	AIC	LRT Statistic	LRT p-value
0	aline_flg	958.146491	1002.146491	0.025627	8.728142e-01
1	age	1010.700508	1054.700508	52.579644	4.131493e-13
2	gender_num	958.832055	1002.832055	0.711191	3.990487e-01
3	sapsi_first_5_bins	1069.795683	1107.795683	111.674819	3.197094e-23
4	sofa_first_5_bins	978.050947	1016.050947	19.930083	5.155226e-04
5	service_unit	963.069661	1005.069661	4.948797	8.421362e-02
6	chf_flg	959.122961	1003.122961	1.002098	3.168034e-01
7	afib_flg	964.939191	1008.939191	6.818327	9.022704e-03
8	renal_flg	962.106642	1006.106642	3.985778	4.588591e-02
9	liver_flg	959.886794	1003.886794	1.765931	1.838865e-01
10	copd_flg	959.180901	1003.180901	1.060038	3.032066e-01
11	cad_flg	958.857519	1002.857519	0.736655	3.907345e-01
12	stroke_flg	1049.584017	1093.584017	91.463154	1.136895e-21
13	mal_flg	963.410697	1007.410697	5.289834	2.145027e-02
14	resp_flg	972.184704	1016.184704	14.063840	1.767086e-04

aline_flg has the largest p-value, but we stipulated in our model selection plan that we would retain this covariate as it’s our covariate of interest. We will then go to the next largest p-value which is the gender_num variable. From here we can update our model, and repeat the backwards elimination step on the updated model.

reduced_vars = [ind_var for ind_var in not_dependent_vars if ind_var != "gender_num"]
drop1(cut2_edata, dependent_var, reduced_vars, missing="drop")

	Dropped Term	Deviance	AIC	LRT Statistic	LRT p-value
0	aline_flg	958.853545	1000.853545	0.021491	8.834499e-01
1	age	1010.759880	1052.759880	51.927825	5.757865e-13
2	sapsi_first_5_bins	1070.187660	1106.187660	111.355606	3.739818e-23
3	sofa_first_5_bins	979.692198	1015.692198	20.860144	3.375408e-04
4	service_unit	963.599162	1003.599162	4.767108	9.222224e-02
5	chf_flg	959.775853	1001.775853	0.943799	3.313029e-01
6	afib_flg	965.669262	1007.669262	6.837207	8.927832e-03
7	renal_flg	962.629266	1004.629266	3.797212	5.133801e-02
8	liver_flg	960.558143	1002.558143	1.726089	1.889112e-01
9	copd_flg	959.933978	1001.933978	1.101923	2.938444e-01
10	cad_flg	959.490514	1001.490514	0.658459	4.171044e-01
11	stroke_flg	1049.620719	1091.620719	90.788664	1.598666e-21
12	mal_flg	964.027821	1006.027821	5.195767	2.264197e-02
13	resp_flg	972.926783	1014.926783	14.094729	1.738302e-04

Conclusion#

We see that aline_flg still has the largest p-value, but cad_flg has the second largest, so we would choose to remove it next.

MIMIC-II Treatment Effect Estimation with ehrapy

Contents

MIMIC-II Treatment Effect Estimation with ehrapy#

Case Study Data#

Case Study and Summary#

Conclusion#