The easiest way to get familiar with ehrapy is to follow along with our tutorials. Many are also designed to work seamlessly in Binder, a free cloud computing platform.


For questions about using ehrapy, please use GitHub Discussions.

Quick start


AnnData is short for Annotated Data and is the primary data structure that ehrapy uses. It is built around a single NumPy matrix X flanked by two pandas DataFrames, one annotating the rows and one annotating the columns. The rows are called observations (in our case patients, patient visits or similar) and the columns are called variables (any feature, such as age or B12 level). For a more in-depth introduction please read the AnnData paper.

The implementation of ehrapy is based on scanpy, a framework for analyzing single-cell sequencing data. ehrapy reuses the algorithms implemented in scanpy and wraps them for simple access. For a more in-depth introduction please read the Scanpy paper.