ehrapy.io.read_fhir
- ehrapy.io.read_fhir(dataset_path, format='json', columns_obs_only=None, columns_x_only=None, return_df=False, cache=False, backup_url=None, index_column=None, download_dataset_name=None, archive_format=None)
Reads one or multiple FHIR files using fhiry.
Uses https://github.com/dermatologist/fhiry to read the FHIR file into a Pandas DataFrame which is subsequently transformed into an AnnData object.
- Parameters:
dataset_path (str) – Path to one or multiple FHIR files.
format (Literal['json','ndjson']) – The file format of the FHIR data. One of ‘json’ or ‘ndjson’. Defaults to ‘json’.
columns_obs_only (Optional[list[str]]) – These columns will be added to obs only and not X.
columns_x_only (Optional[list[str]]) – These columns will be added to X only and all remaining columns to obs. Note that datetime columns will always be added to .obs though.
return_df (bool) – Whether to return one or several Pandas DataFrames.
cache (bool) – Whether to write to cache when reading or not. Defaults to False.
download_dataset_name (Optional[str]) – Name of the file or directory in case the dataset is downloaded.
index_column (Union[str,int,None]) – The index column for the generated object. Usually the patient or visit ID.
backup_url (Optional[str]) – URL to download the data file(s) from if not yet existing.
- Return type:
- Returns:
A Pandas DataFrame or an AnnData object containing the data read from the FHIR file(s).
Examples
>>> import ehrapy as ep
>>> adata = ep.io.read_fhir("/path/to/fhir/resources")
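The two values accepted by `format` correspond to different on-disk layouts. The following is a minimal sketch of that difference using only the Python standard library; it does not call ehrapy or fhiry. The two Patient resources, the file names `bundle.json` and `patients.ndjson`, and the temporary directory are all made up for illustration.

```python
import json
import tempfile
from pathlib import Path

# Two made-up FHIR Patient resources, used only for illustration.
patients = [
    {"resourceType": "Patient", "id": "p1", "gender": "female"},
    {"resourceType": "Patient", "id": "p2", "gender": "male"},
]

tmp_dir = Path(tempfile.mkdtemp())

# 'json' layout: a single JSON document, here a FHIR Bundle whose
# "entry" list wraps each resource.
bundle = {
    "resourceType": "Bundle",
    "type": "collection",
    "entry": [{"resource": p} for p in patients],
}
bundle_path = tmp_dir / "bundle.json"
bundle_path.write_text(json.dumps(bundle, indent=2))

# 'ndjson' layout: one complete JSON resource per line, no enclosing array.
ndjson_path = tmp_dir / "patients.ndjson"
ndjson_path.write_text("\n".join(json.dumps(p) for p in patients) + "\n")

# Read both files back with the standard library to compare their contents.
from_bundle = [e["resource"] for e in json.loads(bundle_path.read_text())["entry"]]
from_ndjson = [json.loads(line) for line in ndjson_path.read_text().splitlines() if line]

# Both layouts carry the same resources; only the framing differs.
print(from_bundle == from_ndjson)
```

With ehrapy installed, data laid out like `patients.ndjson` would presumably be read by passing `format="ndjson"` to `ep.io.read_fhir`, and data like `bundle.json` with the default `format="json"`. Whether `dataset_path` should point at a single file or at the containing directory is not specified above, so check that against your installed version.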