ehrapy.tools.annotate_text

ehrapy.tools.annotate_text(adata, cat, text_column, key_added='medcat_annotations', n_proc=2, batch_size_chars=500000, copy=False)

Annotate the original free-text data. Only rows whose text is non-null are annotated.

The result is a pandas DataFrame stored in adata.uns[key_added]. This DataFrame serves as the base for all further analyses, for example coloring UMAPs by specific diseases.

Parameters:
  • adata (AnnData) – AnnData object that holds the data to annotate.

  • cat (CAT) – MedCAT object.

  • text_column (str) – Name of the column that should be annotated.

  • key_added (str) – Key to add to adata.uns for the annotated results.

  • n_proc (int) – Number of processors to use.

  • batch_size_chars (int) – Batch size, in characters, to use for CAT’s multiprocessing method.

  • copy (bool) – Whether to copy adata or not.

Return type:

AnnData | None

Returns:

Returns None if copy=False, else returns an AnnData object. Sets the following fields:

adata.uns[key_added] : pandas.DataFrame

DataFrame with the annotated results.
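
Example:

A minimal usage sketch of the call documented above, not a definitive recipe. The wrapper name annotate_notes and the model_pack_path argument are hypothetical, and loading the model with CAT.load_model_pack assumes a MedCAT release that provides that class method. The imports are done inside the function so the sketch can be defined without ehrapy or MedCAT installed.

```python
from typing import Any


def annotate_notes(
    adata: Any,
    model_pack_path: str,
    text_column: str,
    key_added: str = "medcat_annotations",
) -> Any:
    """Hypothetical helper: annotate free text in place and return the results."""
    # Lazy imports: both packages are only needed when the helper is called.
    import ehrapy
    from medcat.cat import CAT

    # Assumption: the MedCAT model is distributed as a model pack on disk.
    cat = CAT.load_model_pack(model_pack_path)

    # With copy=False the call returns None and writes into adata.uns[key_added].
    ehrapy.tools.annotate_text(
        adata,
        cat,
        text_column,
        key_added=key_added,
        n_proc=2,
        batch_size_chars=500000,
        copy=False,
    )

    # The annotated results are documented to be a pandas.DataFrame.
    return adata.uns[key_added]
```

Passing copy=True instead would leave the input untouched and return a new AnnData object, in which case the results would be read from the returned object's uns rather than from adata.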