ehrapy.tools.filter_rank_features_groups

ehrapy.tools.filter_rank_features_groups(edata, *, key='rank_features_groups', groupby=None, key_added='rank_features_groups_filtered', min_in_group_fraction=0.25, min_fold_change=1, max_out_group_fraction=0.5)

Filters out features based on fold change and on the fraction of observations containing the feature within and outside the groupby categories.

See rank_features_groups().

Results are stored in edata.uns[key_added] (default: 'rank_features_groups_filtered').

To preserve the original structure of edata.uns['rank_features_groups'], filtered features are set to NaN.

Parameters:
  • edata (EHRData | AnnData) – Central data object.

  • key (str, default: 'rank_features_groups') – Key in edata.uns previously added by rank_features_groups().

  • groupby (str | None, default: None) – The key of the observations grouping to consider.

  • key_added (str, default: 'rank_features_groups_filtered') – The key in edata.uns under which the filtered results are saved.

  • min_in_group_fraction (float, default: 0.25) – Minimum in group fraction (default: 0.25).

  • min_fold_change (int, default: 1) – Minimum fold change (default: 1).

  • max_out_group_fraction (float, default: 0.5) – Maximum out group fraction (default: 0.5).

Return type:

None

Returns:

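Nothing is returned. The results are written to edata.uns[key_added] and have the same structure as the output of ehrapy.tools.rank_features_groups(), but with filtered feature names set to NaN.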

Examples

>>> import ehrapy as ep
>>> import ehrdata as ed
>>> edata = ed.dt.mimic_2()
>>> edata = ep.ad.move_to_obs(edata, to_obs=["service_unit"])
>>> ep.tl.rank_features_groups(edata, "service_unit")
>>> ep.tl.filter_rank_features_groups(edata)
>>> ep.pl.rank_features_groups(edata)
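
The three thresholds can be easier to reason about with a small, library-free sketch. The helper below is hypothetical and is not ehrapy's implementation: `mask_filtered_features`, the placeholder feature names, and the numbers are all made up for illustration. It assumes a feature is kept only when its in-group fraction exceeds `min_in_group_fraction`, its out-group fraction is below `max_out_group_fraction`, and its fold change exceeds `min_fold_change`; whether the library uses strict or non-strict comparisons is an assumption here.

```python
import math


def mask_filtered_features(
    names,
    in_group_fraction,
    out_group_fraction,
    fold_change,
    min_in_group_fraction=0.25,
    min_fold_change=1,
    max_out_group_fraction=0.5,
):
    """Hypothetical helper: replace names that fail any threshold with NaN.

    The output has the same length and order as ``names``, mirroring the
    idea that filtered entries are set to NaN to preserve the structure.
    Strict comparisons are an assumption, not confirmed library behavior.
    """
    masked = []
    for name, frac_in, frac_out, fc in zip(
        names, in_group_fraction, out_group_fraction, fold_change
    ):
        keep = (
            frac_in > min_in_group_fraction
            and frac_out < max_out_group_fraction
            and fc > min_fold_change
        )
        masked.append(name if keep else math.nan)
    return masked


# Made-up statistics for four placeholder features in a single group.
names = ["feature_a", "feature_b", "feature_c", "feature_d"]
in_frac = [0.80, 0.10, 0.60, 0.90]   # fraction within the group
out_frac = [0.20, 0.05, 0.70, 0.30]  # fraction outside the group
fold = [2.5, 3.0, 1.8, 0.9]          # fold change, group vs. rest

masked = mask_filtered_features(names, in_frac, out_frac, fold)
print(masked)
```

In this sketch only `feature_a` passes all three thresholds. `feature_b` fails the in-group fraction, `feature_c` fails the out-group fraction, and `feature_d` fails the fold change, so each of those positions holds NaN while the list keeps its original length.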