, groupby, n_pcs=None, use_rep=None, var_names=None, cor_method='pearson', linkage_method='complete', optimal_ordering=False, key_added=None, inplace=True)

Computes a hierarchical clustering for the given groupby categories.

By default, the PCA representation is used unless .X has fewer than 50 variables. Alternatively, a list of var_names (e.g. genes) can be given. The average values of either the var_names or the components within each group are used to compute a correlation matrix.

The hierarchical clustering can be visualized as a dendrogram plot or with other visualizations that can include a dendrogram: matrixplot(), heatmap(), dotplot(), and stacked_violin().


The hierarchical clustering is computed on the predefined groups, not per observation. The correlation matrix is computed with Pearson correlation by default, but other methods are available (see cor_method).
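The steps described above (average per group, correlate the groups, cluster on the result) can be sketched with pandas and SciPy. This is a minimal illustration under stated assumptions, not necessarily the library's own implementation: the toy data, the names `values` and `groups`, and the choice of `1 - correlation` as the distance are all assumptions made for the example.

```python
import numpy as np
import pandas as pd
from scipy.cluster import hierarchy
from scipy.spatial import distance

rng = np.random.default_rng(0)

# Hypothetical toy data: 12 observations, 4 features, 3 predefined groups.
values = pd.DataFrame(rng.normal(size=(12, 4)), columns=["f1", "f2", "f3", "f4"])
groups = pd.Series(["A"] * 4 + ["B"] * 4 + ["C"] * 4, name="group")

# 1. Average the features within each group (one row per group).
group_means = values.groupby(groups).mean()

# 2. Correlate the groups with each other (groups become columns after .T).
corr = group_means.T.corr(method="pearson")

# 3. Convert correlation to a distance (assumed here: 1 - correlation).
dist = 1.0 - corr.to_numpy()
dist = (dist + dist.T) / 2.0   # enforce exact symmetry
np.fill_diagonal(dist, 0.0)    # guard against floating-point noise
condensed = distance.squareform(dist, checks=False)

# 4. Cluster the groups, mirroring the documented defaults.
linkage = hierarchy.linkage(condensed, method="complete", optimal_ordering=False)

# 5. Derive the leaf order without drawing anything.
info = hierarchy.dendrogram(linkage, labels=list(corr.index), no_plot=True)
ordered_groups = info["ivl"]
print(ordered_groups)
```

Because the clustering runs on one averaged row per group, the linkage matrix has one row fewer than the number of groups, regardless of how many observations there are.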

  • adata (AnnData) – AnnData object containing all observations.

  • groupby (str) – Key to group by.

  • n_pcs (int | None, default: None) – Use this many PCs. If n_pcs == 0 and use_rep is None, .X is used.

  • use_rep (str | None, default: None) – Use the indicated representation. ‘X’ or any key for .obsm is valid. If None, the representation is chosen automatically: for .n_vars < 50, .X is used; otherwise ‘X_pca’ is used. If ‘X_pca’ is not present, it is computed with default parameters.

  • var_names (Sequence[str] | None, default: None) – List of var_names to use for computing the hierarchical clustering. If var_names is given, use_rep and n_pcs are ignored.

  • cor_method (str, default: 'pearson') – Correlation method to use. Options are ‘pearson’, ‘kendall’, and ‘spearman’.

  • linkage_method (str, default: 'complete') – Linkage method to use. See scipy.cluster.hierarchy.linkage() for more information.

  • optimal_ordering (bool, default: False) – Same as the optimal_ordering argument of scipy.cluster.hierarchy.linkage(), which reorders the linkage matrix so that the distance between successive leaves is minimal.

  • key_added (str | None, default: None) – Key in .uns under which the dendrogram information is stored. By default, the information is added to .uns[f'dendrogram_{groupby}']. Note that the groupby information is stored along with the dendrogram.

  • inplace (bool, default: True) – If True, adds the dendrogram information to adata.uns[key_added]; otherwise, the function returns the information.
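The interaction between var_names, use_rep, n_pcs, and key_added described in the parameter list can be summarized in a few lines. The sketch below only paraphrases the documented rules; `choose_representation` and `default_key` are hypothetical helper names invented for illustration and are not part of the library, and the precedence of use_rep over n_pcs is inferred from the wording above.

```python
from typing import Optional, Sequence


def choose_representation(
    n_vars: int,
    use_rep: Optional[str] = None,
    n_pcs: Optional[int] = None,
    var_names: Optional[Sequence[str]] = None,
) -> str:
    """Hypothetical helper: which data the clustering would be based on."""
    if var_names is not None:
        # Documented rule: var_names makes use_rep and n_pcs irrelevant.
        return "var_names"
    if use_rep is not None:
        # An explicit representation ('X' or a key of .obsm) is used as given.
        return use_rep
    if n_pcs == 0:
        # Documented rule: n_pcs == 0 with use_rep=None falls back to .X.
        return "X"
    # Automatic choice: small data uses .X, otherwise the PCA representation.
    return "X" if n_vars < 50 else "X_pca"


def default_key(groupby: str, key_added: Optional[str] = None) -> str:
    """Hypothetical helper: the .uns key the result would be stored under."""
    return key_added if key_added is not None else f"dendrogram_{groupby}"


print(choose_representation(n_vars=30))
print(choose_representation(n_vars=2000))
print(choose_representation(n_vars=2000, var_names=["a", "b"]))
print(default_key("service_unit"))
```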

Return type:

dict[str, Any] | None


If inplace=False, returns the dendrogram information; otherwise, returns None and updates adata.uns[key_added] with it.


>>> import ehrapy as ep
>>> adata =
>>>, groupby="service_unit")