ehrapy.io.read_fhir
- ehrapy.io.read_fhir(dataset_path, format='json', columns_obs_only=None, columns_x_only=None, return_df=False, cache=False, backup_url=None, index_column=None, download_dataset_name=None, archive_format=None)
Reads one or multiple FHIR files using fhiry.
Uses https://github.com/dermatologist/fhiry to read the FHIR file into a Pandas DataFrame which is subsequently transformed into an AnnData object.
- Parameters:
dataset_path (str) – Path to one or multiple FHIR files.
format (Literal['json', 'ndjson']) – The file format of the FHIR data. One of ‘json’ or ‘ndjson’. Defaults to ‘json’.
columns_obs_only (Optional[list[str]]) – These columns will be added to obs only and not X.
columns_x_only (Optional[list[str]]) – These columns will be added to X only and all remaining columns to obs. Note that datetime columns will always be added to .obs though.
return_df (bool) – Whether to return one or several Pandas DataFrames.
cache (bool) – Whether to write to cache when reading or not. Defaults to False.
download_dataset_name (Optional[str]) – Name of the file or directory in case the dataset is downloaded.
index_column (Union[str, int, None]) – The index column for the generated object. Usually the patient or visit ID.
backup_url (Optional[str]) – URL to download the data file(s) from if not yet existing.
- Return type:
- Returns:
A Pandas DataFrame or an AnnData object containing the data read from the FHIR file(s).
Examples
>>> import ehrapy as ep
>>> adata = ep.io.read_fhir("/path/to/fhir/resources")
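The two values accepted by format correspond to two common on-disk layouts for FHIR data. The following is a minimal sketch using only the Python standard library. It does not call ehrapy or fhiry. The patient resources, the file names, and the helper functions (write_bundle_json, write_ndjson, read_ndjson, demo) are hypothetical and exist only to illustrate how a single JSON Bundle differs from newline-delimited JSON.

```python
import json
import tempfile
from pathlib import Path

# Two hypothetical, minimal FHIR Patient resources used only for illustration.
PATIENTS = [
    {"resourceType": "Patient", "id": "p1", "gender": "female"},
    {"resourceType": "Patient", "id": "p2", "gender": "male"},
]


def write_bundle_json(directory: Path) -> Path:
    """Write the resources as one FHIR Bundle in a single JSON document."""
    bundle = {
        "resourceType": "Bundle",
        "type": "collection",
        "entry": [{"resource": patient} for patient in PATIENTS],
    }
    path = directory / "bundle.json"
    path.write_text(json.dumps(bundle, indent=2), encoding="utf-8")
    return path


def write_ndjson(directory: Path) -> Path:
    """Write the resources as NDJSON: one JSON object per line."""
    path = directory / "patients.ndjson"
    lines = [json.dumps(patient) for patient in PATIENTS]
    path.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return path


def read_ndjson(path: Path) -> list[dict]:
    """Parse an NDJSON file back into a list of resources, skipping blank lines."""
    with path.open(encoding="utf-8") as handle:
        return [json.loads(line) for line in handle if line.strip()]


def demo() -> tuple[list[str], list[str]]:
    """Return the patient IDs recovered from each layout."""
    with tempfile.TemporaryDirectory() as tmp:
        tmp_dir = Path(tmp)
        bundle = json.loads(write_bundle_json(tmp_dir).read_text(encoding="utf-8"))
        bundle_ids = [entry["resource"]["id"] for entry in bundle["entry"]]
        ndjson_ids = [resource["id"] for resource in read_ndjson(write_ndjson(tmp_dir))]
    return bundle_ids, ndjson_ids
```

Data laid out like bundle.json would presumably be read with format='json', and data laid out like patients.ndjson with format='ndjson'. Which resource types and directory layouts fhiry actually accepts is not stated on this page and should be checked against the fhiry documentation.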