ehrapy.tools.annotate_text
- ehrapy.tools.annotate_text(adata, cat, text_column, key_added='medcat_annotations', n_proc=2, batch_size_chars=500000, copy=False)
Annotate the original free text data. Only non-null rows are annotated.
The result is a DataFrame that serves as the base for all further analyses, for example coloring UMAPs by specific diseases.
- Parameters:
adata (AnnData) – AnnData object that holds the data to annotate.
cat (CAT) – MedCAT object.
text_column (str) – Name of the column that should be annotated.
key_added (str, default: 'medcat_annotations') – Key to add to adata.uns for the annotated results.
n_proc (int, default: 2) – Number of processors to use.
batch_size_chars (int, default: 500000) – Batch size to use for CAT’s multiprocessing method.
copy (bool, default: False) – Whether to copy adata or not.
- Return type:
AnnData | None
- Returns:
Returns None if copy=False, else returns an AnnData object. Sets the following fields:
- adata.uns[key_added]
pandas.DataFrame
DataFrame with the annotated results.
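The behaviour described above (null rows are skipped, results are stored under key_added, and the return value depends on copy) can be illustrated with a small standard-library sketch. This is not ehrapy's implementation: annotate_non_null, toy_annotator, and the "row"/"entity" record fields are hypothetical names chosen for illustration, a plain dict stands in for adata.uns, and a list of records stands in for the pandas DataFrame.

```python
import copy as copy_module
from typing import Callable, Optional


def annotate_non_null(
    uns: dict,
    texts: list,
    annotate: Callable[[str], list],
    key_added: str = "medcat_annotations",
    copy: bool = False,
) -> Optional[dict]:
    """Toy stand-in for the documented semantics (hypothetical helper)."""
    # With copy=True, work on a deep copy and leave the input untouched.
    target = copy_module.deepcopy(uns) if copy else uns
    records = []
    for row_idx, text in enumerate(texts):
        if text is None:
            # Null rows are skipped, so they produce no annotation records.
            continue
        for entity in annotate(text):
            records.append({"row": row_idx, "entity": entity})
    # Store the results under the requested key.
    target[key_added] = records
    # Mirror the documented convention: None when modifying in place.
    return target if copy else None


def toy_annotator(text: str) -> list:
    """Hypothetical annotator: matches two hard-coded terms, nothing more."""
    return [word for word in text.lower().split() if word in {"diabetes", "asthma"}]


notes = ["Patient has diabetes", None, "asthma and diabetes"]

# In-place call: returns None and writes into the dict that was passed in.
uns = {}
result = annotate_non_null(uns, notes, toy_annotator)

# Copying call: returns a new dict and leaves the original empty.
original = {}
copied = annotate_non_null(original, notes, toy_annotator, copy=True)
```

In the real function, cat would be a loaded MedCAT model and the stored object a pandas.DataFrame; the sketch only mirrors the control flow stated in the parameter and return descriptions.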