ehrapy.io.read_csv
- ehrapy.io.read_csv(dataset_path, sep=',', index_column=None, columns_obs_only=None, columns_x_only=None, return_dfs=False, cache=False, download_dataset_name=None, backup_url=None, archive_format=None, **kwargs)
Reads a single csv/tsv file or a directory of csv/tsv files, downloading the data first if it is not yet on disk.
- Parameters:
dataset_path (Path | str) – Path to the file or directory to read.
sep (str, default: ',') – Separator in the file. Delegates to pandas.read_csv().
index_column (dict[str, str | int] | str | int | None, default: None) – The index column of obs. Usually the patient visit ID or the patient ID.
columns_obs_only (dict[str, list[str]] | list[str] | None, default: None) – These columns will be added to obs only and not X.
columns_x_only (dict[str, list[str]] | list[str] | None, default: None) – These columns will be added to X only and all remaining columns to obs. Note that datetime columns will always be added to .obs though.
return_dfs (bool, default: False) – Whether to return one or several Pandas DataFrames.
cache (bool, default: False) – Whether to write to cache when reading or not.
download_dataset_name (str | None, default: None) – Name of the file or directory after download.
backup_url (str | None, default: None) – URL to download the data file(s) from, if the dataset is not yet on disk.
archive_format (Literal['zip', 'tar', 'tar.gz', 'tgz'], default: None) – Whether the downloaded file is an archive.
**kwargs – Passed to pandas.read_csv().
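The routing rules described for columns_obs_only and columns_x_only can be illustrated with a small plain-Python sketch. This is not ehrapy's implementation: the helper route_columns and its datetime_columns argument are hypothetical, only the list form of the options is covered, and two behaviours are assumptions made for the illustration (columns not listed in columns_obs_only go to X, and passing both options raises an error).

```python
from __future__ import annotations


def route_columns(
    columns: list[str],
    columns_obs_only: list[str] | None = None,
    columns_x_only: list[str] | None = None,
    datetime_columns: list[str] | None = None,
) -> tuple[list[str], list[str]]:
    """Return (x_columns, obs_columns) following the documented rules.

    Hypothetical helper for illustration; not part of ehrapy.
    """
    if columns_obs_only is not None and columns_x_only is not None:
        # Assumption: the two options are treated as mutually exclusive.
        raise ValueError("Pass only one of columns_obs_only or columns_x_only.")

    datetimes = set(datetime_columns or [])
    if columns_x_only is not None:
        # Listed columns go to X; all remaining columns go to obs.
        wanted_x = set(columns_x_only)
    else:
        # Listed columns go to obs; assumption: the remaining columns go to X.
        wanted_x = set(columns) - set(columns_obs_only or [])

    # Datetime columns always end up in obs, whatever was requested.
    x_columns = [c for c in columns if c in wanted_x and c not in datetimes]
    obs_columns = [c for c in columns if c not in x_columns]
    return x_columns, obs_columns


columns = ["patient_id", "age", "glucose", "admission_date"]
x_cols, obs_cols = route_columns(
    columns,
    columns_obs_only=["patient_id"],
    datetime_columns=["admission_date"],
)
print(x_cols)    # ['age', 'glucose']
print(obs_cols)  # ['patient_id', 'admission_date']
```

Here admission_date lands in obs even though it was not listed, mirroring the note that datetime columns are always added to obs.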
- Return type:
AnnData | dict[str, AnnData]
- Returns:
An AnnData object or a dict with an identifier (the filename, without extension) for each AnnData object in the dict.
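As a rough illustration of the dict keys described above, the sketch below derives an identifier from each file name by dropping its extension. It assumes the identifier matches what pathlib's Path.stem returns; the helper identifiers_for and the file names are hypothetical, and how ehrapy treats names containing several dots is not covered here.

```python
from pathlib import Path


def identifiers_for(paths: list[str]) -> dict[str, str]:
    """Map an identifier (file name without extension) to each csv/tsv path.

    Hypothetical helper for illustration; not part of ehrapy.
    """
    return {
        Path(p).stem: p
        for p in paths
        if Path(p).suffix in {".csv", ".tsv"}  # skip files that are not csv/tsv
    }


files = ["cohort/patients.csv", "cohort/visits.tsv", "cohort/README.md"]
ids = identifiers_for(files)
print(sorted(ids))  # ['patients', 'visits']
```

When a directory is read, each AnnData object in the returned dict would then be looked up by a key of this kind, for example "patients".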
Examples
>>> import ehrapy as ep
>>> adata = ep.io.read_csv("myfile.csv")