Changelog#

This project adheres to Semantic Versioning.

v0.12.1#

🚀 Features#

Make dowhy optional & remove medcat #903 @Zethson
Add about page & improve citations #902 @Zethson
Overhaul doc structure #895 @Zethson
Move to biome & improve CI & reenable CR #890 @Zethson
Clean up Round - cut down anndata extension functionality #880 @eroell

v0.12.0#

🚀 Features#

Improved KM plot data depth and functionality #853 @aGuyLearning
New Feature: Forestplot for CoxPH model #838 @aGuyLearning
Datatype Support in Quality Control and Impute #865 @aGuyLearning
Revamp survival analysis interface #842 @aGuyLearning
Improve submodule documentation #859 @Zethson
Update Kaplan Meier plots in survival analysis notebook #864 @aGuyLearning

🐛 Bug Fixes#

Pass all non-nan features along desired var_names to impute (KNN) #867 @nicolassidoux
Remove Syntax warnings #869 @Zethson
Fix test_norm_power_group #862 @Zethson

🧰 Maintenance#

Fix a typo in pl.paga_compare: pos -> pos, #846 @VladimirShitov

v0.11.0#

✨ Features#

Add array type handling for normalization #835 @eroell @Zethson

🐛 Bug Fixes#

Fix scipy array support #844 @Zethson
Fix casting to float when assigning numeric values; fixes normalization of integer arrays #837 @eroell

v0.9.0 & v0.10.0#

🚀 Features#

Make all imputation methods consistent in regard to encoding requirements #827 @nicolassidoux
Add approximate KNN backend #791 @nicolassidoux
Improve survival analysis interface #825 @aGuyLearning
Python 3.12 support #794 @Lilly-May
Python 3.10+ & use uv for docs & fix RTD & support numpy 2 #830 @Zethson

🐛 Bug Fixes#

move_to_x: Fix name of non-implemented argument “copy” to “copy_x”, implement & test #832 @eroell
Contributing typo fix #821 @aGuyLearning
Fix miceforest #800 @Zethson
style: == to is for type comparison #774 @eroell

v0.8.0#

🚀 Features#

remove pyyaml & explicit scikit-learn #729 @Zethson
Remove fancyimpute #728 @Zethson
Unify feature type detection #724 @Lilly-May
catplot #721 @eroell
Simplify ehrapy #719 @Zethson
Use __all__ #715 @Zethson
Add bias detection to preprocessing #690 @Lilly-May
Use lamin logger #707 @Zethson
Add faiss backend for KNN imputation #704 @Zethson
Build RTD docs with uv #700 @Zethson
Refactor feature importance ranking #698 @Zethson
Simplify CI #694 @Zethson
Refactor outliers and IQR #692 @Zethson
Calculation of feature importances in a supervised setting #677 @Lilly-May
Speed up winsorize #681 @Zethson
Remove notebook prefix in tutorial URLs #679 @Zethson
Add cohort tracking notebook #678 @Zethson
Switch to uv #674 @Zethson
Style: typing of _scale_func_group #727 @eroell
Improved support of encoded features in detect_bias #725 @Lilly-May
Enable Synchronous dataloader write #722 @wxicu
Feature scaling on training set when computing feature importances #716 @Lilly-May
add batch-wise normalization argument #711 @eroell
add functools.wraps to type check #705 @eroell
add bias notebook to list of notebooks #696 @eroell
basic sampling #686 @eroell
add options for subtitles in legend of cohorttrackers barplot #688 @eroell
doc fix imputation: 70 instead of 30 #683 @eroell

🐛 Bug Fixes#

Encoded dtype to float32 instead of np.number #714 @Zethson
Fix feature importance warnings #708 @Zethson
Remove notebook prefix in tutorial URLs #679 @Zethson
fix name of log_rogistic_aft to log_logistic_aft #676 @eroell

🧰 Maintenance#

Remove notebook prefix in tutorial URLs #679 @Zethson
Add cohort tracking notebook #678 @Zethson
knni amendments #706 @eroell

v0.7.0#

🚀 Features#

Cohort Tracker #658 @eroell
change diabetes-130 datasets which are provided #672 @eroell
More sa functions #664 @fatisati
Coxphfitter #643 @fatisati
Implement Little’s test #667 @Zethson
Improve test design #651 @Zethson
Improve QC docstring #639 @Zethson
Refactor _missing_values calculation #638 @Zethson

🐛 Bug Fixes#

Fix one-hot encoding tests #644 @Zethson

v0.6.0#

🚀 Features#

Breaking changes#

Move information on numerical/non_numerical/encoded_non_numerical from .uns to .var #630 @eroell

Older AnnData objects can be made compatible with the following function:

def move_type_info_from_uns_to_var(adata, copy=False):
    """
    Move type information from adata.uns to adata.var['ehrapy_column_type'].

    The latter is the current, updated flavor used by ehrapy.
    With copy=True a modified copy is returned; otherwise adata is
    modified in place and nothing is returned.
    """
    if copy:
        adata = adata.copy()

    # Columns not listed under any legacy key keep the label 'unknown'.
    adata.var['ehrapy_column_type'] = 'unknown'

    # Translate each legacy .uns list into the corresponding .var label.
    if 'numerical_columns' in adata.uns:
        for key in adata.uns['numerical_columns']:
            adata.var.loc[key, 'ehrapy_column_type'] = 'numeric'
    if 'non_numerical_columns' in adata.uns:
        for key in adata.uns['non_numerical_columns']:
            adata.var.loc[key, 'ehrapy_column_type'] = 'non_numeric'
    if 'encoded_non_numerical_columns' in adata.uns:
        for key in adata.uns['encoded_non_numerical_columns']:
            adata.var.loc[key, 'ehrapy_column_type'] = 'non_numeric_encoded'

    if copy:
        return adata
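
To preview which label each column would receive before touching an AnnData object, the same mapping can be sketched in plain Python. This is only an illustration: `column_types_from_uns` and `LEGACY_KEY_TO_LABEL` are hypothetical names that are not part of ehrapy, and a plain dict stands in for `adata.uns`.

```python
from collections.abc import Iterable, Mapping

# Legacy .uns keys and the labels written to 'ehrapy_column_type',
# listed in the same order in which the function above applies them.
LEGACY_KEY_TO_LABEL = {
    "numerical_columns": "numeric",
    "non_numerical_columns": "non_numeric",
    "encoded_non_numerical_columns": "non_numeric_encoded",
}


def column_types_from_uns(uns: Mapping, var_names: Iterable[str]) -> dict[str, str]:
    """Hypothetical helper: compute per-column labels from a dict-like .uns."""
    # Every column starts as 'unknown', as in the function above.
    types = {name: "unknown" for name in var_names}
    for legacy_key, label in LEGACY_KEY_TO_LABEL.items():
        # A missing legacy key simply contributes no columns.
        for column in uns.get(legacy_key, []):
            types[column] = label
    return types


# Made-up example data; the column names are illustrative only.
legacy_uns = {
    "numerical_columns": ["age"],
    "non_numerical_columns": ["sex"],
}
print(column_types_from_uns(legacy_uns, ["age", "sex", "visit_id"]))
# {'age': 'numeric', 'sex': 'non_numeric', 'visit_id': 'unknown'}
```

Columns that appear under none of the legacy keys stay `'unknown'`, which makes it easy to spot variables that may need manual annotation after the move.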

New features#

Medcat refresh #623 @eroell
Rank features groups obs #622 @eroell
Add FHIR tutorial and simplify code #626 @Zethson
Add input checks for imputers #625 @Zethson
Removed unused dependencies #615 @Zethson
Refactor encoding #588 @Zethson

🐛 Bug Fixes#

Use fixtures for preprocessing tests #577 @Zethson

🧰 Maintenance#

Refactoring #627 @Zethson
Add FHIR tutorial and simplify code #626 @Zethson
pre-commit #587 @Zethson
Small edits #599 @eroell

v0.5.0#

🚀 Features#

Add g-tests for rank features group #546 @VladimirShitov
Causal Inference with dowhy #502 @timtreis
Remove MuData support #545 @Zethson

🐛 Bug Fixes#

Fixed reading format warnings #569 @namsaraeva
Fixed inability to normalize AnnData that does not require encoding #568 @namsaraeva
Fixed adata.uns["non_numerical_columns"] being empty in mimic_2 dataset #567 @namsaraeva

v0.4.0#

🚀 Features#

Add Synthea dataset #510 @namsaraeva
Added tiny examples to every function #498 @namsaraeva
add a title parameter #494 @xinyuejohn
Changed the hue of grey #493 @namsaraeva
Logger info message when writing to .h5ad files #458 @namsaraeva
Modified docstrings #533 @namsaraeva
Added examples to missing modules #531 @namsaraeva
Allow Python 3.11 #523 @Zethson
Add test_kmf_logrank #516 @Zethson
Add scget functions #484 @Zethson
Add FHIR parsing support #463 @Zethson
Add new tutorial & switch to python 3.10 #454 @Zethson
Add docs group #437 @Zethson
Add thefuzz #434 @Zethson

🐛 Bug Fixes#

Fix CI #524 @Zethson
Error message and minor fixes, issue #447 #504 @namsaraeva
fix quality control #495 @xinyuejohn
Fix MacOS CI #435 @Zethson

🧰 Maintenance#

Add test_kmf_logrank #516 @Zethson
Add scget functions #484 @Zethson
Add new tutorial & switch to python 3.10 #454 @Zethson

v0.3.0#

🚀 Features#

Add winsorize, clip quantiles and filter quantiles #418 @Zethson
Remove PDF support #430 @Zethson
Logging instance, issue #246 #426 @namsaraeva
Negative values offset #420 @Zethson
Missing values visualization, ref issue #271 #419 @namsaraeva
Add copy_obs parameter to move_to_obs #404 @namsaraeva
add anova_glm function #400 @xinyuejohn
issue #397 “check for neighbors run before UMAP” fixed #401 @namsaraeva
add more tutorials to CI #382 @Zethson
add support for reading multiple files into Pandas DFs & adapted MIMIC-III Demo #386 @Zethson
#321: Add X_only option for reading #380 @Imipenem

🐛 Bug Fixes#

KeyError fix issue #423 #428 @namsaraeva
fix qc_metrics bug #425 @xinyuejohn
df_to_anndata logical XOR to OR, issue #422 #429 @namsaraeva
Fix docs CI #392 @Zethson
small fix in the qc_metrics() example #407 @namsaraeva

🧰 Maintenance#

Add winsorize, clip quantiles and filter quantiles #418 @Zethson
Remove PDF support #430 @Zethson
Negative values offset #420 @Zethson
Missing values visualization, ref issue #271 #419 @namsaraeva
Fix docs CI #392 @Zethson

v0.2.0#

🚀 Features#

Important cookietemple template update 2.1.0 released! #343 @Zethson
add chronic kidney disease dataloader #301 @xinyuejohn
dataloader for diabetes dataset #292 @HorlavaNastassya
Add X_only option for reading #380 @Imipenem
MedCAT API improvements & function renaming #381 @Zethson
minor changes #379 @xinyuejohn
add functions related to survival analysis #371 @xinyuejohn
MedCat [#101]: extract biomedical concepts/entities from (free) text #367 @Imipenem
Add heart dataset to docs #377 @xinyuejohn
add heart disease data set to ehrapy #376 @xinyuejohn
add highly_variable_features #364 @xinyuejohn
add SoftImpute and IterativeSVD to imputation #353 @xinyuejohn
[#307] Improve KNN with n_neighbours parameter #365 @Imipenem
add furo theme & switch to markdown #359 @Zethson
add several datasets and change Docstring examples #355 @xinyuejohn
Add ability to compare laboratory measurements to reference values #352 @Zethson
[Feature] New read API #263 #351 @Imipenem
[Feature] Set index column #305 #350 @Imipenem
Add encoded parameter to all new datasets and fix import #336 @xinyuejohn
[FEATURE] #314: Autodetect binary (0,1) columns #327 @Imipenem
[FEATURE] Display QC metrics of var #239 #323 @Imipenem
Add several dataset loaders #322 @xinyuejohn
[FEATURE] Improve type_overview #306 #308 @Imipenem
Feature/deep translator integration #303 @MxMstrmn
remove CLI module #298 @Zethson
Improve missforest interface #284 @Zethson
Add example calls and preview images to all plotting functions #289 @xinyuejohn
add heart failure dataloader #291 @Zethson

🐛 Bug Fixes#

[FIX] #255: Encode mutates input adata object #348 @Imipenem
[FIX] Write .h5ad files #347 @Imipenem
Fix #331: Improved autodetect docs #344 @Imipenem
Add encoded parameter to all new datasets and fix import #336 @xinyuejohn
[FIX] Autodetect encode + specify encode mode for autodetect #310 @Imipenem

🧰 Maintenance#

MedCAT API improvements & function renaming #381 @Zethson
add functions related to survival analysis #371 @xinyuejohn
MedCat [#101]: extract biomedical concepts/entities from (free) text #367 @Imipenem
Add heart dataset to docs #377 @xinyuejohn
remove CLI module #298 @Zethson

v0.1.0#

🚀 Features#

Input and output of CSVs, PDFs, h5ad files
Several encoding modes (one-hot, label, …)
Several imputation methods (simple, KNN, MissForest, …)
Several normalization methods (log, scale, …)
Full Scanpy API support
Initial MedCAT integration
DeepL & Google Translator support
