ehrapy.tools.mc.annotate_text

static mc.annotate_text(medcat_obj, text_column, n_proc=2, batch_size_chars=500000)

Annotate the original free-text data. Only non-null rows are annotated. The result is a DataFrame, which is stored in the annotated_results attribute of the passed MedCAT object. This DataFrame is the basis for all further analyses, for example coloring UMAPs by specific diseases.

Parameters:
  • medcat_obj (MedCAT) – Ehrapy’s custom MedCAT object. The annotated_results attribute will be set here.

  • text_column (str) – Name of the column that should be annotated.

  • n_proc (int) – Number of processors to use.

  • batch_size_chars (int) – Batch size, in characters, used to control for the variability between document sizes.

Return type:

None
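
Example:

The standalone sketch below illustrates two behaviours described above: null rows are skipped, and batch_size_chars groups documents by character count rather than by document count. It is not ehrapy's or MedCAT's actual implementation. The helper batch_by_chars and the sample notes are hypothetical, and the greedy grouping strategy is an assumption made purely for illustration.

```python
from typing import Iterable, Iterator, List, Optional, Tuple


def batch_by_chars(
    docs: Iterable[Tuple[int, Optional[str]]],
    batch_size_chars: int,
) -> Iterator[List[Tuple[int, str]]]:
    """Hypothetical helper: greedily group (row_id, text) pairs by character count."""
    batch: List[Tuple[int, str]] = []
    chars_in_batch = 0
    for row_id, text in docs:
        if text is None:
            # Null rows are skipped, mirroring "only non-null rows are annotated".
            continue
        # Close the current batch before it would exceed the character budget.
        if batch and chars_in_batch + len(text) > batch_size_chars:
            yield batch
            batch, chars_in_batch = [], 0
        batch.append((row_id, text))
        chars_in_batch += len(text)
    if batch:
        # Emit whatever is left over after the last document.
        yield batch


# Made-up notes of very different lengths; row 1 has no text.
notes = [(0, "a" * 40), (1, None), (2, "b" * 70), (3, "c" * 20), (4, "d" * 300)]

batches = list(batch_by_chars(notes, batch_size_chars=100))
batch_row_ids = [[row_id for row_id, _ in batch] for batch in batches]
print(batch_row_ids)  # [[0], [2, 3], [4]]
```

In this sketch, short documents share a batch, while a document longer than the budget (row 4) ends up in a batch of its own, so the amount of text per batch stays roughly comparable even when document sizes vary widely.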