ehrapy.tools.stratified_table_one

ehrapy.tools.stratified_table_one(edata, *, groupby, columns=None, categorical=None, nonnormal=None, pval_adjust=None, htest=None, missing=True, key_added='stratified_table_one', copy=False, **tableone_kwargs)

Build a stratified “Table 1” comparing baseline characteristics across groups.

Produces a publication-ready table stratified by groupby with appropriate per-variable hypothesis tests (chi-square / Fisher’s exact for categorical variables, t-test / ANOVA for normally distributed continuous variables, Mann-Whitney U / Kruskal-Wallis for variables listed in nonnormal). Wraps the tableone package [1].
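
The test families named above can be read as a small decision table. The sketch below only illustrates that reading. The function name choose_test and its arguments are hypothetical. The split between two groups and more than two groups is the conventional statistical one and has not been confirmed against the ehrapy or tableone source. The rule for choosing between chi-square and Fisher's exact is left open because the description does not state it.

```python
def choose_test(kind: str, n_groups: int, nonnormal: bool = False) -> str:
    """Return the test family for one variable (illustrative, not library code).

    kind      -- "categorical" or "continuous" (hypothetical labels)
    n_groups  -- number of levels in the groupby column
    nonnormal -- True if the variable is listed in ``nonnormal``
    """
    if n_groups < 2:
        raise ValueError("a stratified comparison needs at least two groups")
    if kind == "categorical":
        # The description does not say when each of the two is used.
        return "chi-square or Fisher's exact"
    if kind != "continuous":
        raise ValueError(f"unknown variable kind: {kind!r}")
    if nonnormal:
        # Rank-based tests for variables listed in ``nonnormal``.
        return "Mann-Whitney U" if n_groups == 2 else "Kruskal-Wallis"
    # Parametric tests for continuous variables treated as normal.
    return "t-test" if n_groups == 2 else "ANOVA"


print(choose_test("continuous", 2, nonnormal=True))  # Mann-Whitney U
print(choose_test("continuous", 3))                  # ANOVA
```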

The rendered table and the intermediate data needed for plotting are stored in edata.uns[key_added]. Access the table via edata.uns[key_added]["table"]. Use ehrapy.plot.stratified_table_one() to visualize.

Parameters:
  • edata (EHRData) – Central data object.

  • groupby (str) – Column in edata.obs to stratify by.

  • columns (Sequence | None, default: None) – Columns to include in the table. If None, all columns of edata.obs except groupby are used.

  • categorical (Sequence | None, default: None) – Columns that contain categorical variables. If None, types are inferred.

  • nonnormal (Sequence | None, default: None) – Continuous columns that should use a non-parametric test (Mann-Whitney U / Kruskal-Wallis) and report median [Q1, Q3] instead of mean (SD).

  • pval_adjust (str | None, default: None) – Multiple-testing correction (e.g. "bonferroni", "holm", "fdr_bh").

  • htest (dict | None, default: None) – Mapping of column name to a custom hypothesis-test function (advanced).

  • missing (bool, default: True) – If True, include a Missing column in the rendered table.

  • key_added (str, default: 'stratified_table_one') – Key under which results are stored in edata.uns.

  • copy (bool, default: False) – If True, return a modified copy of edata; otherwise modify in place and return None.

  • **tableone_kwargs – Extra keyword arguments forwarded to tableone.TableOne.
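
To make the effect of pval_adjust concrete, here is a minimal pure-Python sketch of two of the corrections named above, Bonferroni and Holm. The functions bonferroni and holm are written for illustration only and are not the code that ehrapy or tableone runs. The sample p-values are made up.

```python
def bonferroni(pvals):
    """Multiply each p-value by the number of tests, capped at 1."""
    m = len(pvals)
    return [min(1.0, p * m) for p in pvals]


def holm(pvals):
    """Holm step-down adjustment, returned in the original order."""
    m = len(pvals)
    # Indices of the p-values from smallest to largest.
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The smallest p-value is scaled by m, the next by m - 1, and so on.
        candidate = min(1.0, (m - rank) * pvals[idx])
        # Adjusted values must not decrease as the raw p-values increase.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


raw = [0.01, 0.04, 0.03]  # made-up p-values, one per table row
print([round(p, 4) for p in bonferroni(raw)])  # [0.03, 0.12, 0.09]
print([round(p, 4) for p in holm(raw)])        # [0.03, 0.06, 0.06]
```

Holm never produces a larger adjusted p-value than Bonferroni for the same input, which is why it is often preferred when many variables are compared in one table.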

Return type:

EHRData | None

Returns:

None by default, in which case edata is modified in place. When copy=True, a copy of edata with the results stored in .uns[key_added].

References

[1] Tom Pollard, Alistair E.W. Johnson, Jesse D. Raffa, Roger G. Mark; tableone: An open source Python package for producing summary statistics for research papers, Journal of the American Medical Informatics Association, Volume 24, Issue 2, 1 March 2017, Pages 267-271, https://doi.org/10.1093/jamia/ocw117

Examples

>>> import ehrdata as ed
>>> import ehrapy as ep
>>> edata = ed.dt.diabetes_130_fairlearn(
...     columns_obs_only=["gender", "race", "age", "readmit_binary", "num_procedures"]
... )
>>> ep.tl.stratified_table_one(
...     edata,
...     groupby="readmit_binary",
...     columns=["gender", "race", "age", "num_procedures"],
...     nonnormal=["num_procedures"],
... )
>>> edata.uns["stratified_table_one"]["table"]  # the rendered Table 1
>>> ep.pl.stratified_table_one(edata)