Changelog
This project adheres to Semantic Versioning.
v0.13.0
🚀 Features
Transitioning from AnnData to EHRData
EHRData replaces AnnData as ehrapy’s core data structure to better support time-series electronic health record data. The key enhancement is native support for 3D tensors (observations × variables × timesteps) alongside the existing 2D matrices, enabling efficient storage of longitudinal patient data. A new .tem DataFrame provides time-point annotations, complementing the existing .obs and .var annotations for comprehensive temporal data description. While EHRData maintains full backward compatibility with AnnData’s API, users can now seamlessly work with time-series data and leverage specialized methods for temporal analysis. Existing code using AnnData objects will continue to work, but migration to EHRData is strongly recommended to access enhanced time-series functionality.
The preferred central data object is now EHRData (#908) @eroell
The layers argument is now available for all functions operating on X or layers (#908) @eroell
Update expected behaviour of io.read_fhir (#922) @eroell
Move mimic_2, mimic_2_preprocessed, diabetes_130_raw, diabetes_130_fairlearn to ehrdata.dt (#908)
Deprecate all ep.dt.*, refer to datasets in ehrdata (#908) @eroell
Support Python 3.14 (#996) @Zethson
Move kaplan_meier & cox_ph plots to holoviews (#995) @Zethson
Longitudinal normalization (#958) @agerardy
Add interactive ols plot (#992) @Zethson
Longitudinal and new qc_metrics (#967) @sueoglu
Simple Impute for timeseries (#975) @eroell
Simple implementation of balanced sampling (#937) @sueoglu
Add Sankey diagram visualization functions (#989) @sueoglu
Add ep.pl.timeseries() to visualize variables over time (#994) @sueoglu
Add GPU CI & skeleton (#998) @Zethson
Add FAMD (#976) @Zethson
3D enabled implementation of ep.pp.filter_observations, ep.pp.filter_features (#953) @sueoglu
Add time series distances (#954) @Zethson
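The 3D layout described in the EHRData transition note above (observations × variables × timesteps, annotated by .obs, .var and .tem) can be pictured with plain NumPy and pandas. This is only an illustrative sketch: it deliberately does not call the ehrdata or ehrapy API, and the variable names, patient identifiers and values are made-up assumptions.

```python
import numpy as np
import pandas as pd

# Hypothetical annotations, one DataFrame per axis (names and values are invented).
obs = pd.DataFrame({"sex": ["f", "m", "f"]},
                   index=["patient_0", "patient_1", "patient_2"])
var = pd.DataFrame({"unit": ["bpm", "mmHg"]},
                   index=["heart_rate", "systolic_bp"])
tem = pd.DataFrame({"hours_since_admission": [0, 6, 12, 18]},
                   index=["t0", "t1", "t2", "t3"])

# 3D tensor with shape (n_obs, n_vars, n_timesteps), filled with dummy values.
values = np.arange(len(obs) * len(var) * len(tem), dtype=float)
values = values.reshape(len(obs), len(var), len(tem))

# Trajectory of one variable for one patient across all time points.
hr_patient_1 = values[obs.index.get_loc("patient_1"),
                      var.index.get_loc("heart_rate"), :]

# A classic 2D (observations x variables) matrix is one slice along the time axis.
snapshot_t2 = values[:, :, tem.index.get_loc("t2")]

# Collapsing the time axis gives one summary value per observation and variable.
mean_over_time = values.mean(axis=2)
```

Each axis of the array lines up with the index of exactly one annotation table, which is why a time-point table is needed in addition to the observation and variable tables.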
🐛 Bug Fixes
🧰 Maintenance
Update actions (#977) @Zethson
Cleanup simple_impute tests (#974) @eroell
Move to ehrdata 0.0.10 (#971) @eroell
Improved notebook CI (#959) @Zethson
Switch to template (#960) @Zethson
Tests for more plots (#919) @sueoglu
Lowerbound cvxpy (#935) @Zethson
Optimize var_metrics (#927) @Zethson
Refactor Dask usage pattern (#926) @Zethson
Add cover to README & remove some tokens (#923) @Zethson
Update test coverage reporting (#918) @eroell
Fix changelog links (#915) @Zethson
Fixed structure of Returns in _rank_features_groups.py documentation (#911) @agerardy
Add EHRData transition code (#897) @Zethson @eroell
Make test that downloads dermatology dataset more robust (#906) @Zethson
Update image source in README.md (#986) @eroell
Fix plot docs formatting (#952) @Zethson
Typo in the documentation of ehrapy.data.mimic_2_preprocessed (#917) @sueoglu
v0.12.1
🚀 Features
v0.12.0
🚀 Features
Improved KM plot data depth and functionality (#853) @aGuyLearning
New Feature: Forestplot for CoxPH model (#838) @aGuyLearning
Datatype Support in Quality Control and Impute (#865) @aGuyLearning
Revamp survival analysis interface (#842) @aGuyLearning
Improve submodule documentation (#859) @Zethson
Update Kaplan Meier plots in survival analysis notebook (#864) @aGuyLearning
🐛 Bug Fixes
🧰 Maintenance
Fix a typo in pl.paga_compare: pos -> pos, (#846) @VladimirShitov
v0.11.0
✨ Features
Add array type handling for normalization (#835) @eroell @Zethson
🐛 Bug Fixes
v0.9.0 & 0.10.0
🚀 Features
Make all imputation methods consistent in regard to encoding requirements (#827) @nicolassidoux
Add approximate KNN backend (#791) @nicolassidoux
Improve survival analysis interface (#825) @aGuyLearning
Python 3.12 support (#794) @Lilly-May
Python 3.10+ & use uv for docs & fix RTD & support numpy 2 (#830) @Zethson
🐛 Bug Fixes
v0.8.0
🚀 Features
remove pyyaml & explicit scikit-learn (#729) @Zethson
Remove fancyimpute (#728) @Zethson
Unify feature type detection (#724) @Lilly-May
catplot (#721) @eroell
Simplify ehrapy (#719) @Zethson
Use all (#715) @Zethson
Add bias detection to preprocessing (#690) @Lilly-May
Use lamin logger (#707) @Zethson
Add faiss backend for KNN imputation (#704) @Zethson
Build RTD docs with uv (#700) @Zethson
Refactor feature importance ranking (#698) @Zethson
Simplify CI (#694) @Zethson
Refactor outliers and IQR (#692) @Zethson
Calculation of feature importances in a supervised setting (#677) @Lilly-May
Speed up winsorize (#681) @Zethson
Remove notebook prefix in tutorial URLs (#679) @Zethson
Add cohort tracking notebook (#678) @Zethson
Switch to uv (#674) @Zethson
Style: typing of _scale_func_group (#727) @eroell
Improved support of encoded features in detect_bias (#725) @Lilly-May
Enable Synchronous dataloader write (#722) @wxicu
Feature scaling on training set when computing feature importances (#716) @Lilly-May
add batch-wise normalization argument (#711) @eroell
add functools.wraps to type check (#705) @eroell
add bias notebook to list of notebooks (#696) @eroell
basic sampling (#686) @eroell
add options for subtitles in legend of cohorttrackers barplot (#688) @eroell
doc fix imputation: 70 instead of 30 (#683) @eroell
🐛 Bug Fixes
🧰 Maintenance
v0.7.0
🚀 Features
Cohort Tracker (#658) @eroell
change diabetes-130 datasets which are provided (#672) @eroell
More sa functions (#664) @fatisati
Coxphfitter (#643) @fatisati
Implement Little’s test (#667) @Zethson
Improve test design (#651) @Zethson
Improve QC docstring (#639) @Zethson
Refactor _missing_values calculation (#638) @Zethson
🐛 Bug Fixes
Fix one-hot encoding tests (#644) @Zethson
v0.6.0
🚀 Features
Breaking changes
Move information on numerical/non_numerical/encoded_non_numerical from .uns to .var (#630) @eroell
Make older AnnData objects compatible using:
def move_type_info_from_uns_to_var(adata, copy=False):
    """Move type information from adata.uns to adata.var['ehrapy_column_type'].

    The latter is the current, updated flavor used by ehrapy.
    """
    if copy:
        adata = adata.copy()
    adata.var['ehrapy_column_type'] = 'unknown'
    if 'numerical_columns' in adata.uns.keys():
        for key in adata.uns['numerical_columns']:
            adata.var.loc[key, 'ehrapy_column_type'] = 'numeric'
    if 'non_numerical_columns' in adata.uns.keys():
        for key in adata.uns['non_numerical_columns']:
            adata.var.loc[key, 'ehrapy_column_type'] = 'non_numeric'
    if 'encoded_non_numerical_columns' in adata.uns.keys():
        for key in adata.uns['encoded_non_numerical_columns']:
            adata.var.loc[key, 'ehrapy_column_type'] = 'non_numeric_encoded'
    if copy:
        return adata
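A quick way to see what the helper above does is to run it on a small stand-in object. The sketch below repeats the function so that it runs on its own, and uses a SimpleNamespace with a .uns dict and a .var DataFrame in place of a real AnnData object. That stand-in, the column names, and the in-place call with copy=False are assumptions made for illustration only.

```python
from types import SimpleNamespace

import pandas as pd


def move_type_info_from_uns_to_var(adata, copy=False):
    """Same logic as the helper above, repeated so this sketch is self-contained."""
    if copy:
        adata = adata.copy()
    adata.var['ehrapy_column_type'] = 'unknown'
    if 'numerical_columns' in adata.uns.keys():
        for key in adata.uns['numerical_columns']:
            adata.var.loc[key, 'ehrapy_column_type'] = 'numeric'
    if 'non_numerical_columns' in adata.uns.keys():
        for key in adata.uns['non_numerical_columns']:
            adata.var.loc[key, 'ehrapy_column_type'] = 'non_numeric'
    if 'encoded_non_numerical_columns' in adata.uns.keys():
        for key in adata.uns['encoded_non_numerical_columns']:
            adata.var.loc[key, 'ehrapy_column_type'] = 'non_numeric_encoded'
    if copy:
        return adata


# Stand-in for an old-style object: type information still lives in .uns.
# (Hypothetical column names; a real AnnData object would be used in practice.)
old = SimpleNamespace(
    uns={
        "numerical_columns": ["age", "bmi"],
        "non_numerical_columns": ["sex"],
    },
    var=pd.DataFrame(index=["age", "bmi", "sex", "visit_id"]),
)

# copy=False modifies the object in place and returns None.
move_type_info_from_uns_to_var(old)

column_types = old.var["ehrapy_column_type"].to_dict()
numeric_vars = old.var.index[old.var["ehrapy_column_type"] == "numeric"].tolist()
```

Variables that are not listed under any of the three .uns keys keep the 'unknown' label, so they are easy to spot after the conversion.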
New features
🐛 Bug Fixes
Use fixtures for preprocessing tests (#577) @Zethson
🧰 Maintenance
v0.5.0
🚀 Features
🐛 Bug Fixes
v0.4.0
🚀 Features
Add Synthea dataset (#510) @namsaraeva
Added tiny examples to every function (#498) @namsaraeva
add a title parameter (#494) @xinyuejohn
Changed the hue of grey (#493) @namsaraeva
Logger info message when writing to .h5ad files (#458) @namsaraeva
Modified docstrings (#533) @namsaraeva
Added examples to missing modules (#531) @namsaraeva
Allow Python 3.11 (#523) @Zethson
Add test_kmf_logrank (#516) @Zethson
Add scget functions (#484) @Zethson
Add FHIR parsing support (#463) @Zethson
Add new tutorial & switch to python 3.10 (#454) @Zethson
Add docs group (#437) @Zethson
Add thefuzz (#434) @Zethson
🐛 Bug Fixes
🧰 Maintenance
v0.3.0
🚀 Features
Add winsorize, clip quantiles and filter quantiles (#418) @Zethson
Remove PDF support (#430) @Zethson
Logging instance, issue #246 (#426) @namsaraeva
Negative values offset (#420) @Zethson
Missing values visualization, ref issue #271 (#419) @namsaraeva
Add copy_obs parameter to move_to_obs (#404) @namsaraeva
add anova_glm function (#400) @xinyuejohn
issue #397 “check for neighbors run before UMAP” fixed (#401) @namsaraeva
add more tutorials to CI (#382) @Zethson
add support for reading multiple files into Pandas DFs & adapted MIMIC-III Demo (#386) @Zethson
#321: Add X_only option for reading (#380) @Imipenem
🐛 Bug Fixes
🧰 Maintenance
v0.2.0
🚀 Features
Important cookietemple template update 2.1.0 released! (#343) @Zethson
add chronic kidney disease dataloader (#301) @xinyuejohn
dataloader for diabetes dataset (#292) @HorlavaNastassya
Add X_only option for reading (#380) @Imipenem
MedCAT API improvements & function renaming (#381) @Zethson
minor changes (#379) @xinyuejohn
add functions related to survival analysis (#371) @xinyuejohn
MedCat [#101]: extract biomedical concepts/entities from (free) text (#367) @Imipenem
Add heart dataset to docs (#377) @xinyuejohn
add heart disease data set to ehrapy (#376) @xinyuejohn
add highly_variable_features (#364) @xinyuejohn
add SoftImpute and IterativeSVD to imputation (#353) @xinyuejohn
(#307) Improve KNN with n_neighbours parameter (#365) @Imipenem
add furo theme & switch to markdown (#359) @Zethson
add several datasets and change Docstring examples (#355) @xinyuejohn
Add ability to compare laboratory measurements to reference values (#352) @Zethson
(Feature) New read API #263 (#351) @Imipenem
(Feature) Set index column #305 (#350) @Imipenem
Add encoded parameter to all new datasets and fix import (#336) @xinyuejohn
(FEATURE) #314: Autodetect binary (0,1) columns (#327) @Imipenem
(FEATURE) Display QC metrics of var #239 (#323) @Imipenem
Add several dataset loaders (#322) @xinyuejohn
(FEATURE) Improve type_overview #306 (#308) @Imipenem
Feature/deep translator integration (#303) @MxMstrmn
remove CLI module (#298) @Zethson
Improve missforest interface (#284) @Zethson
Add example calls and preview images to all plotting functions (#289) @xinyuejohn
add heart failure dataloader (#291) @Zethson
🐛 Bug Fixes
(FIX) #255: Encode mutates input adata object (#348) @Imipenem
(FIX) Write .h5ad files (#347) @Imipenem
Fix #331: Improved autodetect docs (#344) @Imipenem
Add encoded parameter to all new datasets and fix import (#336) @xinyuejohn
(FIX) Autodetect encode + specify encode mode for autodetect (#310) @Imipenem
🧰 Maintenance
v0.1.0
🚀 Features
Input and output of CSVs, PDFs, h5ad files
Several encoding modes (one-hot, label, …)
Several imputation methods (simple, KNN, MissForest, …)
Several normalization methods (log, scale, …)
Full Scanpy API support
Initial MedCAT integration
DeepL & Google Translator support