ehrapy.tools.glm
- ehrapy.tools.glm(edata, var_names=None, formula=None, *, family='Gaussian', use_feature_types=False, missing='none', as_continuous=None, layer=None)
Create a Generalized Linear Model (GLM) from a formula, a distribution, and the data object.
- Parameters:
edata (EHRData) – Central data object.
var_names (Iterable[str] | None, default: None) – A list of variable names indicating which columns to use for the GLM.
formula (str | None, default: None) – The formula specifying the model.
family (Literal['Gaussian', 'Binomial', 'Gamma', 'InverseGaussian'], default: 'Gaussian') – The distribution family. Available options are 'Gaussian', 'Binomial', 'Gamma', and 'InverseGaussian'.
use_feature_types (bool, default: False) – If True, the feature types stored in the data object's .var are used.
missing (Literal['none', 'drop', 'raise'], default: 'none') – How to handle NaN values. Available options are 'none', 'drop', and 'raise'. If 'none', no NaN checking is done. If 'drop', any observations with NaNs are dropped. If 'raise', an error is raised.
as_continuous (Iterable[str] | None, default: None) – A list of variable names indicating which columns are continuous rather than categorical. The corresponding columns are cast to float.
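The three `missing` options can be pictured with a small stand-alone sketch. The helper `handle_missing` below is hypothetical and is not part of ehrapy; it only mirrors the behaviour described above, applied to a plain list of row dictionaries with made-up values.

```python
import math


def _has_nan(row):
    """Return True if any float value in the row is NaN."""
    return any(isinstance(v, float) and math.isnan(v) for v in row.values())


def handle_missing(rows, missing="none"):
    """Hypothetical illustration of the 'none' / 'drop' / 'raise' options."""
    if missing == "none":
        # No NaN checking: rows are passed through unchanged.
        return list(rows)
    if missing == "drop":
        # Observations containing any NaN are removed.
        return [r for r in rows if not _has_nan(r)]
    if missing == "raise":
        # Any NaN in the data is treated as an error.
        if any(_has_nan(r) for r in rows):
            raise ValueError("NaN values found in the data")
        return list(rows)
    raise ValueError(f"unknown option: {missing!r}")


# Made-up observations; the second one has a missing age.
rows = [
    {"day_28_flg": 0.0, "age": 72.4},
    {"day_28_flg": 1.0, "age": float("nan")},
]
kept = handle_missing(rows, missing="drop")
```

Here `kept` holds only the first observation, while `missing="raise"` would raise a `ValueError` on the same input.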
- Return type:
- Returns:
The GLM model instance.
Examples
>>> import ehrdata as ed
>>> import ehrapy as ep
>>> edata = ed.dt.mimic_2()
>>> formula = "day_28_flg ~ age"
>>> var_names = ["day_28_flg", "age"]
>>> family = "Binomial"
>>> # family is keyword-only in the signature, so it must be passed by name
>>> glm = ep.tl.glm(edata, var_names, formula, family=family, missing="drop", as_continuous=["age"])
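To see what a formula such as "day_28_flg ~ age" means under the 'Binomial' family, the sketch below maps a linear predictor to a probability. It assumes the canonical logit link, which is not stated on this page, and the coefficient values are hypothetical, not results fitted on MIMIC-II.

```python
import math


def predicted_probability(intercept, slope, age):
    """Inverse logit of the linear predictor: intercept + slope * age."""
    eta = intercept + slope * age
    return 1.0 / (1.0 + math.exp(-eta))


# Hypothetical coefficients chosen only for illustration.
p = predicted_probability(intercept=-4.0, slope=0.05, age=80.0)
```

With these illustrative numbers the linear predictor is zero at age 80, so `p` is 0.5. A positive slope means the predicted probability of `day_28_flg` being 1 increases with age.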