ehrapy.tools.MedCAT
- class ehrapy.tools.MedCAT(anndata, vocabulary=None, concept_db=None, model_pack_path=None)
Wrapper class for MedCAT. It holds references to the current AnnData object, which stores the data, and to the current model (with its vocabulary and concept database). Pass it to every function of the ehrapy NLP API that requires it.
Methods table
| create_concept_db | Creates a MedCAT concept database and sets it for the MedCAT object. |
| create_vocabulary | Creates a MedCAT Vocab and sets it for the MedCAT object. |
| load_concept_db | Loads the concept database. |
| load_vocabulary | Loads a vocabulary. |
| save_concept_db | Saves a concept database. |
| save_model_pack | Saves a MedCAT model pack. |
| save_vocabulary | Saves a vocabulary. |
| set_filter_by_tui | Restricts the results of the annotation step to certain TUIs (type unique identifiers). |
| | Updates the current MedCAT instance with new vocabularies and concept databases. |
| | Updates the MedCAT configuration. |
Methods
create_concept_db
- static MedCAT.create_concept_db(csv_path, config=None)
Creates a MedCAT concept database and sets it for the MedCAT object.
- Parameters:
csv_path – List of paths to one or more CSV files containing all concepts. The concept CSVs must look like:
cui,name
1,kidney failure
7,coronavirus
config (Config) – Optional MedCAT concept database configuration. If not provided, a default configuration with config.general['spacy_model'] = 'en_core_sci_md' is created.
- Return type:
CDB
- Returns:
Instance of a MedCAT CDB concept database
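The concept CSV layout described above can be produced with the standard library. The following is a minimal sketch, not part of ehrapy: write_concept_csv and build_concept_db are hypothetical helper names, and the file name concepts.csv is an assumption. The call inside build_concept_db follows the signature documented above and only works in an environment where ehrapy.tools.MedCAT, medcat and the required spaCy model are installed.

```python
import csv
import tempfile
from pathlib import Path


def write_concept_csv(path, concepts):
    """Hypothetical helper: write (cui, name) pairs in the two-column layout."""
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle, lineterminator="\n")
        writer.writerow(["cui", "name"])  # header row, as in the documented example
        writer.writerows(concepts)
    return Path(path)


def build_concept_db(csv_paths):
    """Hypothetical helper: hand the CSV paths to the documented static method.

    Assumes an installation that provides ehrapy.tools.MedCAT; the import is
    done lazily so the CSV part of this sketch runs without ehrapy.
    """
    from ehrapy.tools import MedCAT

    return MedCAT.create_concept_db([str(p) for p in csv_paths])


# Reproduce the example concepts from the documentation in a temporary folder.
concepts = [("1", "kidney failure"), ("7", "coronavirus")]
csv_file = write_concept_csv(Path(tempfile.mkdtemp()) / "concepts.csv", concepts)
print(csv_file.read_text(encoding="utf-8"))
```

With the dependencies in place, build_concept_db([csv_file]) would be expected to return the CDB instance described under "Returns".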
create_vocabulary
load_concept_db
load_vocabulary
save_concept_db
save_model_pack
save_vocabulary
set_filter_by_tui
- MedCAT.set_filter_by_tui(tuis)
Restricts the results of the annotation step to certain TUIs (type unique identifiers).
Note that this changes the MedCAT object by updating the concept database config. In every annotation run that follows, entities are shown only if their type matches one of the given TUIs. A full list of TUIs can be found at: https://lhncbc.nlm.nih.gov/ii/tools/MetaMap/Docs/SemanticTypes_2018AB.txt
For example, setting tuis=["T047", "T048"] will only annotate UMLS concepts (each identified by a CUI, a concept unique identifier) that are either diseases or syndromes (T047) or mental/behavioural dysfunctions (T048).
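The effect of such a filter can be illustrated in plain Python. This is only a sketch of the filtering semantics, not how MedCAT or ehrapy implement it: filter_by_tui and the entity dictionaries are hypothetical, and the toy TUI assignments are for illustration only.

```python
def filter_by_tui(entities, tuis):
    """Hypothetical helper: keep only entities whose TUI is in the allowed set."""
    allowed = set(tuis)
    return [entity for entity in entities if entity["tui"] in allowed]


# Toy annotation results; names and TUI assignments are illustrative only.
entities = [
    {"name": "kidney failure", "tui": "T047"},       # disease or syndrome
    {"name": "depressive disorder", "tui": "T048"},  # mental/behavioural dysfunction
    {"name": "coronavirus", "tui": "T005"},          # any other type is dropped
]

kept = filter_by_tui(entities, ["T047", "T048"])
print([entity["name"] for entity in kept])
```

Only the first two entities pass the filter, which mirrors the tuis=["T047", "T048"] example: everything outside the two listed semantic types is no longer reported.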