Tools#

A tool is any transformation of the data matrix that is not preprocessing. In contrast to a preprocessing function, a tool usually adds an easily interpretable annotation to the data matrix, which can then be visualized with a corresponding plotting function.

Embeddings#

`tools.tsne`	Calculates t-SNE [vdMH08], [ADT+13], and [PVG+11].
`tools.umap`	Embed the neighborhood graph using UMAP [MHM18].
`tools.draw_graph`	Force-directed graph drawing [IKM+11], [JVHB14], and [Chi18].
`tools.diffmap`	Diffusion Maps [CLL+05], [HBT15], [WHP+19].
`tools.embedding_density`	Calculate the density of observations in an embedding (per condition).

Clustering and trajectory inference#

`tools.leiden`	Cluster observations into subgroups [TWvE19].
`tools.dendrogram`	Computes a hierarchical clustering for the given groupby categories.
`tools.dpt`	Infer progression of observations through geodesic distance along the graph [HBW+16], [WHP+19].
`tools.paga`	Mapping out the coarse-grained connectivity structures of complex manifolds [WHP+19].

Feature Ranking#

`tools.rank_features_groups`	Rank features for characterizing groups.
`tools.filter_rank_features_groups`	Filter out features based on fold change and the fraction of observations containing the feature within and outside the groupby categories.
`tools.rank_features_supervised`	Calculate feature importances for predicting a specified feature in adata.var.

Dataset integration#

`tools.ingest`	Map labels and embeddings from reference data to new data.

Survival Analysis#

`tools.ols`	Create an Ordinary Least Squares (OLS) Model from a formula and the data object.
`tools.glm`	Create a Generalized Linear Model (GLM) from a formula, a distribution, and the data object.
`tools.kaplan_meier`	Fit the Kaplan-Meier estimate for the survival function.
`tools.test_kmf_logrank`	Calculates the p-value for the logrank test comparing the survival functions of two groups.
`tools.test_nested_f_statistic`	Calculate the p-value indicating whether a larger GLM, encompassing a smaller GLM's parameters, adds explanatory power.
`tools.cox_ph`	Fit the Cox proportional hazards model for the survival function.
`tools.cox_ph_adjusted_curves`	Compute CoxPH adjusted survival curves stratified by a grouping variable.
`tools.weibull_aft`	Fit the Weibull accelerated failure time regression for the survival function.
`tools.log_logistic_aft`	Fit the log-logistic accelerated failure time regression for the survival function.
`tools.nelson_aalen`	Employ the Nelson-Aalen estimator to estimate the cumulative hazard function from censored survival data.
`tools.weibull`	Employ the Weibull model in univariate survival analysis to understand event occurrence dynamics.
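
As a rough illustration of what a Kaplan-Meier estimate computes, the sketch below implements the product-limit formula in plain Python. It is not ehrapy's implementation and says nothing about the signature of `tools.kaplan_meier`. The function name `kaplan_meier` and the toy data are assumptions made for this example only.

```python
from collections import Counter


def kaplan_meier(durations, events):
    """Return [(t, S(t))] at each distinct event time (product-limit estimate).

    durations: follow-up time per subject
    events:    1 if the event was observed at that time, 0 if censored
    """
    deaths = Counter(t for t, e in zip(durations, events) if e)
    exits = Counter(durations)  # everyone leaving the risk set at t
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d:
            # Subjects censored at t still count as at risk at t.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= exits[t]
    return curve


# Hypothetical toy cohort: six subjects, two of them censored.
durations = [5, 6, 6, 2, 4, 4]
events = [1, 0, 1, 1, 1, 0]
for t, s in kaplan_meier(durations, events):
    print(f"t={t}: S(t)={s:.3f}")
```

The curve only drops at times where an event was observed. Censored subjects shrink the risk set without changing the estimate at their censoring time.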

Causal Inference#

ehrapy ships a small, dependency-light set of causal inference estimators built directly on top of scikit-learn. The average treatment effect (ATE) estimators handle binary treatments via inverse probability of treatment weighting (IPTW), parametric g-computation, the doubly robust augmented IPW (AIPW), and propensity score matching. Conditional average treatment effects (CATE), which capture heterogeneous treatment effects, are available via the T-, S-, and X-learner meta-learners. Two diagnostics, covariate balance and positivity, round out the toolkit.
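
To make the idea behind IPTW concrete, here is a minimal standard-library sketch on a toy cohort with one binary confounder. It does not use ehrapy or scikit-learn. The helper names (`stratum_propensity`, `iptw_ate`, `naive_difference`) are hypothetical, and estimating the propensity score as the treated fraction within each covariate level is a simplifying assumption that stands in for a fitted propensity model.

```python
from collections import defaultdict


def stratum_propensity(x, t):
    """Estimate P(T=1 | X=x) as the treated fraction within each level of x."""
    treated = defaultdict(int)
    total = defaultdict(int)
    for xi, ti in zip(x, t):
        treated[xi] += ti
        total[xi] += 1
    return [treated[xi] / total[xi] for xi in x]


def iptw_ate(y, t, e):
    """Horvitz-Thompson IPTW estimate of E[Y(1)] - E[Y(0)]."""
    n = len(y)
    mu1 = sum(ti * yi / ei for yi, ti, ei in zip(y, t, e)) / n
    mu0 = sum((1 - ti) * yi / (1 - ei) for yi, ti, ei in zip(y, t, e)) / n
    return mu1 - mu0


def naive_difference(y, t):
    """Unadjusted difference in mean outcome, treated minus control."""
    treated = [yi for yi, ti in zip(y, t) if ti]
    control = [yi for yi, ti in zip(y, t) if not ti]
    return sum(treated) / len(treated) - sum(control) / len(control)


# Toy cohort (assumed data): x confounds treatment t and outcome y.
# The outcome is constructed so that the true treatment effect is 2.
x = [1, 1, 1, 1, 0, 0, 0, 0]
t = [1, 1, 1, 0, 1, 0, 0, 0]
y = [2 * ti + 3 * xi for ti, xi in zip(t, x)]

e = stratum_propensity(x, t)
print(f"naive: {naive_difference(y, t):.2f}")
print(f"IPTW:  {iptw_ate(y, t, e):.2f}")
```

On this cohort the naive comparison gives 3.50 because treated subjects more often have `x = 1`, while the weighted estimate recovers the constructed effect of 2.00. The weights are only defined when every propensity score lies strictly between 0 and 1, which is the condition a positivity diagnostic inspects.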

`tools.iptw`	Estimate the average treatment effect by inverse probability of treatment weighting (IPTW).
`tools.g_computation`	Estimate the ATE by parametric g-computation (a.k.a.
`tools.aipw`	Estimate the ATE by the augmented inverse-probability-weighted (AIPW) doubly robust estimator.
`tools.propensity_score_matching`	Estimate the treatment effect by 1-to-\(k\) propensity score matching on the logit scale.
`tools.t_learner`	Two-model (T-learner) CATE estimator.
`tools.s_learner`	Single-model (S-learner) CATE estimator.
`tools.x_learner`	X-learner CATE estimator of Künzel et al. (2019).
`tools.covariate_balance`	Report standardised mean differences (SMD) for each covariate, before and after weighting.
`tools.positivity_check`	Diagnose the positivity assumption by inspecting the propensity score distribution.
`tools.CausalEstimate`	Result of a causal effect estimation.
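
The covariate balance diagnostic can be pictured as comparing a standardised mean difference (SMD) before and after weighting. The sketch below is a plain-Python illustration, not the code behind `tools.covariate_balance`. The function names, the toy data, and the choice to keep the unweighted pooled standard deviation as the denominator in both cases are assumptions for this example.

```python
from statistics import variance


def weighted_mean(values, weights):
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def smd(x, t, weights=None):
    """Standardised mean difference of covariate x between treatment groups.

    The denominator is the unweighted pooled standard deviation, so the
    weighted and unweighted SMDs are on the same scale.
    """
    if weights is None:
        weights = [1.0] * len(x)
    xt = [xi for xi, ti in zip(x, t) if ti]
    xc = [xi for xi, ti in zip(x, t) if not ti]
    wt = [wi for wi, ti in zip(weights, t) if ti]
    wc = [wi for wi, ti in zip(weights, t) if not ti]
    pooled_sd = ((variance(xt) + variance(xc)) / 2) ** 0.5
    return (weighted_mean(xt, wt) - weighted_mean(xc, wc)) / pooled_sd


# Toy cohort (assumed data) with propensity scores taken as known.
x = [1, 1, 1, 1, 0, 0, 0, 0]
t = [1, 1, 1, 0, 1, 0, 0, 0]
e = [0.75, 0.75, 0.75, 0.75, 0.25, 0.25, 0.25, 0.25]

# Inverse probability of treatment weights.
weights = [1 / ei if ti else 1 / (1 - ei) for ti, ei in zip(t, e)]

smd_before = smd(x, t)
smd_after = smd(x, t, weights)
print(f"|SMD| before weighting: {abs(smd_before):.2f}")
print(f"|SMD| after weighting:  {abs(smd_after):.2f}")
```

A large SMD before weighting that shrinks towards zero afterwards suggests the weights have balanced that covariate across treatment groups.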

Normalized Complexity Profile#

`tools.ncp`	Non-negative CP (PARAFAC) decomposition of a 3D temporal EHR layer.

Cohort Tracking & summaries#

`tools.CohortTracker`	Track cohort changes over multiple filtering or processing steps.
`tools.stratified_table_one`	Build a stratified "Table 1" comparing baseline characteristics across groups.