ehrapy.tools.annotate_text
- ehrapy.tools.annotate_text(adata, cat, text_column, key_added='medcat_annotations', n_proc=2, batch_size_chars=500000, copy=False)
Annotate the original free-text data. Only rows with non-null text are annotated.
The result is a pandas DataFrame stored in adata.uns[key_added]. It serves as the basis for all further analyses, for example coloring UMAPs by specific diseases.
- Parameters:
adata (AnnData) – AnnData object that holds the data to annotate.
cat (CAT) – MedCAT object.
text_column (str) – Name of the column that should be annotated.
key_added (str, default: 'medcat_annotations') – Key to add to adata.uns for the annotated results.
n_proc (int, default: 2) – Number of processors to use.
batch_size_chars (int, default: 500000) – Batch size to use for CAT’s multiprocessing method.
copy (bool, default: False) – Whether to copy adata or not.
- Return type:
AnnData | None
- Returns:
Returns None if copy=False; otherwise returns an AnnData object. Sets the following field:
- adata.uns[key_added]
pandas.DataFrame DataFrame with the annotated results.
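The return convention and the null-row behavior described above can be illustrated with a small standard-library sketch. It is not ehrapy's implementation: FakeAData, annotate_text_sketch, toy_annotator, and the row_nr and entity keys are hypothetical stand-ins, a plain list of dicts takes the place of the pandas DataFrame, and no MedCAT model is involved.

```python
from copy import deepcopy
from dataclasses import dataclass, field


@dataclass
class FakeAData:
    """Hypothetical stand-in for AnnData: rows of observations plus an `uns` dict."""
    obs: list
    uns: dict = field(default_factory=dict)


def annotate_text_sketch(adata, annotate, text_column,
                         key_added="medcat_annotations", copy=False):
    """Illustrative only: mimics the documented contract, not ehrapy's code."""
    # With copy=True the input is left untouched and a modified copy is returned.
    target = deepcopy(adata) if copy else adata
    results = []
    for row_nr, row in enumerate(target.obs):
        text = row.get(text_column)
        if text is None:  # null rows are skipped, as the description states
            continue
        for entity in annotate(text):
            results.append({"row_nr": row_nr, "entity": entity})
    # The results are stored under `key_added` in `uns`.
    target.uns[key_added] = results
    return target if copy else None


def toy_annotator(text):
    """Hypothetical annotator that flags terms from a fixed vocabulary."""
    return [term for term in ("diabetes", "asthma") if term in text.lower()]


adata = FakeAData(obs=[
    {"notes": "History of diabetes."},
    {"notes": None},
    {"notes": "Asthma and diabetes."},
])

# copy=False: the object is modified in place and None is returned.
in_place = annotate_text_sketch(adata, toy_annotator, "notes")

# copy=True: a new object carries the results; the original gains no new key.
copied = annotate_text_sketch(adata, toy_annotator, "notes",
                              key_added="second_run", copy=True)
```

With copy=False the caller reads the results from the object it passed in; with copy=True they are found only on the returned object.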