The easiest way to get familiar with ehrapy is to follow along with our tutorials. Many are also designed to work seamlessly in Binder, a free cloud computing platform.


For questions about using ehrapy, use GitHub Discussions.

Quick start


AnnData is short for Annotated Data and is the primary data structure that ehrapy uses. It is based on the principle of a single NumPy matrix X embraced by two pandas DataFrames: one annotating the rows (obs) and one annotating the columns (var). Rows are called observations (in our case patients, patient visits, or similar) and columns are called variables (any feature, such as age or B12 level). For a more in-depth introduction, please read the AnnData paper.

The implementation of ehrapy is based on scanpy, a framework for analyzing single-cell sequencing data. ehrapy reuses the algorithms implemented in scanpy and wraps them for simple access. For a more in-depth introduction, please read the Scanpy paper.