Tools#
Any transformation of the data matrix that is not preprocessing. In contrast to a preprocessing function, a tool usually adds an easily interpretable annotation to the data matrix, which can then be visualized with a corresponding plotting function.
Embeddings#
Clustering and trajectory inference#
Feature Ranking#
Rank features for characterizing groups.
Filter out features based on fold change and the fraction of observations containing the feature within and outside the groupby categories.
Calculate feature importances for predicting a specified feature in adata.var.
Dataset integration#
Map labels and embeddings from reference data to new data.
Survival Analysis#
Create an Ordinary Least Squares (OLS) model from a formula and the data object.
Create a Generalized Linear Model (GLM) from a formula, a distribution, and the data object.
Fit the Kaplan-Meier estimate for the survival function.
Calculate the p-value for the log-rank test comparing the survival functions of two groups.
Calculate the p-value indicating whether a larger GLM, encompassing a smaller GLM's parameters, adds explanatory power.
Fit the Cox proportional hazards model for the survival function.
Compute CoxPH-adjusted survival curves stratified by a grouping variable.
Fit the Weibull accelerated failure time regression for the survival function.
Fit the log-logistic accelerated failure time regression for the survival function.
Use the Nelson-Aalen estimator to estimate the cumulative hazard function from censored survival data.
Use the Weibull model in univariate survival analysis to understand event occurrence dynamics.
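
To make the Kaplan-Meier estimate listed in this section concrete, here is a minimal standard-library sketch of the product-limit formula, S(t) = Π (1 − d_i / n_i) over event times t_i ≤ t. The function name `kaplan_meier` and the toy cohort are hypothetical and exist only for illustration; this is not ehrapy's implementation or API.

```python
from collections import Counter


def kaplan_meier(durations, events):
    """Return [(time, survival_probability)] at each distinct event time.

    durations: follow-up time per subject.
    events: 1 if the event was observed, 0 if the subject was censored.
    """
    deaths = Counter(t for t, e in zip(durations, events) if e)
    exits = Counter(durations)  # every subject leaves the risk set at their own time
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d:
            # Product-limit step: multiply by the fraction surviving time t.
            survival *= 1 - d / at_risk
            curve.append((t, survival))
        # Subjects with an event or censoring at t are at risk at t, but not after.
        at_risk -= exits[t]
    return curve


# Hypothetical toy cohort: six subjects, two of them censored (event = 0).
durations = [2, 3, 3, 5, 8, 8]
events = [1, 1, 0, 1, 0, 1]
km_curve = kaplan_meier(durations, events)
for time, prob in km_curve:
    print(f"t={time}: S(t)={prob:.4f}")
```

Censored subjects never trigger a drop in the curve; they only shrink the risk set for later event times, which is why the estimate remains usable with incomplete follow-up.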
Causal Inference#
ehrapy ships a small, dependency-light set of causal inference estimators built directly on top of scikit-learn. Average treatment effect (ATE) estimators handle binary treatments via inverse probability of treatment weighting (IPTW), parametric g-computation, doubly robust augmented IPW (AIPW), and propensity score matching. Conditional average treatment effects (CATE), which capture heterogeneity across subjects, are available via the T-, S-, and X-learner meta-learners. Two diagnostics, covariate balance and positivity, round out the toolkit.
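
As a rough illustration of the IPTW idea mentioned above, the sketch below reweights each subject by the inverse of the probability of the treatment they actually received. It makes several simplifying assumptions that are not taken from ehrapy: the propensity score is estimated as the treated fraction within each level of a single discrete covariate (instead of a fitted scikit-learn classifier), the weights are normalised (Hájek form), and the name `iptw_ate` and the toy data are hypothetical.

```python
from collections import defaultdict


def iptw_ate(covariate, treatment, outcome):
    """Normalised IPTW estimate of the ATE for a binary treatment.

    Assumption for this sketch: P(T=1 | X) is estimated as the treated
    fraction within each level of one discrete covariate.
    """
    n_total = defaultdict(int)
    n_treated = defaultdict(int)
    for x, t in zip(covariate, treatment):
        n_total[x] += 1
        n_treated[x] += t
    propensity = {x: n_treated[x] / n_total[x] for x in n_total}

    sum_w1 = sum_wy1 = sum_w0 = sum_wy0 = 0.0
    for x, t, y in zip(covariate, treatment, outcome):
        e = propensity[x]
        if not 0.0 < e < 1.0:
            # Weights are undefined when a stratum is all-treated or all-control.
            raise ValueError(f"positivity violated in stratum {x!r}")
        if t == 1:
            w = 1.0 / e
            sum_w1 += w
            sum_wy1 += w * y
        else:
            w = 1.0 / (1.0 - e)
            sum_w0 += w
            sum_wy0 += w * y
    return sum_wy1 / sum_w1 - sum_wy0 / sum_w0


# Hypothetical confounded data generated as outcome = 1 * treatment + 2 * covariate,
# where covariate level 1 is treated far more often than level 0.
covariate = [0, 0, 0, 0, 1, 1, 1, 1]
treatment = [1, 0, 0, 0, 1, 1, 1, 0]
outcome = [1, 0, 0, 0, 3, 3, 3, 2]

treated = [y for t, y in zip(treatment, outcome) if t == 1]
control = [y for t, y in zip(treatment, outcome) if t == 0]
naive = sum(treated) / len(treated) - sum(control) / len(control)
ate = iptw_ate(covariate, treatment, outcome)
print(f"naive difference: {naive:.2f}, IPTW estimate: {ate:.2f}")
```

In this toy example the unadjusted difference in means is inflated by the confounder, while the weighted estimate matches the effect of 1 that was built into the data. The explicit check on the propensity score is the simplest form of the positivity diagnostic.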
Estimate the average treatment effect by inverse probability of treatment weighting (IPTW).
Estimate the ATE by parametric g-computation.
Estimate the ATE by the augmented inverse-probability-weighted (AIPW) doubly robust estimator.
Estimate the treatment effect by 1-to-\(k\) propensity score matching on the logit scale.
Two-model (T-learner) CATE estimator.
Single-model (S-learner) CATE estimator.
X-learner CATE estimator of Künzel et al. (2019).
Report standardised mean differences (SMD) for each covariate, before and after weighting.
Diagnose the positivity assumption by inspecting the propensity score distribution.
Result of a causal effect estimation.
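
The covariate balance diagnostic listed in this section can be illustrated with a short standard-library sketch of the standardised mean difference. It assumes one common convention, which may differ from ehrapy's: the denominator is the pooled standard deviation of the unweighted groups, used both before and after weighting. The helper names `weighted_mean` and `smd`, the covariate values, and the weights are all hypothetical.

```python
from math import sqrt
from statistics import variance


def weighted_mean(values, weights):
    return sum(w * v for v, w in zip(values, weights)) / sum(weights)


def smd(x_treated, x_control, w_treated=None, w_control=None):
    """Standardised mean difference of one covariate between two groups.

    Assumption for this sketch: the pooled SD comes from the unweighted
    samples, so the before/after values share the same scale.
    """
    w_treated = w_treated or [1.0] * len(x_treated)
    w_control = w_control or [1.0] * len(x_control)
    pooled_sd = sqrt((variance(x_treated) + variance(x_control)) / 2)
    diff = weighted_mean(x_treated, w_treated) - weighted_mean(x_control, w_control)
    return diff / pooled_sd


# Hypothetical binary covariate that is more common among treated subjects.
x_treated = [0.0, 1.0, 1.0, 1.0]
x_control = [0.0, 0.0, 0.0, 1.0]

# Hypothetical inverse-propensity weights: 1/e for treated, 1/(1-e) for controls,
# with e = 0.25 when the covariate is 0 and e = 0.75 when it is 1.
w_treated = [4.0, 4 / 3, 4 / 3, 4 / 3]
w_control = [4 / 3, 4 / 3, 4 / 3, 4.0]

smd_before = smd(x_treated, x_control)
smd_after = smd(x_treated, x_control, w_treated, w_control)
print(f"SMD before weighting: {smd_before:.2f}, after weighting: {smd_after:.2f}")
```

A commonly cited rule of thumb treats an absolute SMD below roughly 0.1 as adequate balance; here the weights remove the imbalance entirely because they were built from the same covariate.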
Normalized Complexity Profile#
Non-negative CP (PARAFAC) decomposition of a 3D temporal EHR layer.
Cohort Tracking & summaries#
Track cohort changes over multiple filtering or processing steps.
Build a stratified "Table 1" comparing baseline characteristics across groups.