ehrapy.tools.ncp
- ehrapy.tools.ncp(edata, *, layer, rank=4, n_iter_max=300, init='random', sigmoid_transform=False, key_added='ncp', random_state=0, copy=False)
Non-negative CP (PARAFAC) decomposition of a 3D temporal layer.
Decomposes the stored 3D data into three factor matrices (all factors non-negative).
Uses tensorly.decomposition.non_negative_parafac().
- Parameters:
  - layer (str) – Key of the 3D layer to decompose (shape n_obs × n_vars × n_time).
  - rank (int, default: 4) – Number of components (rank of the decomposition).
  - n_iter_max (int, default: 300) – Maximum number of ALS iterations.
  - init (str, default: 'random') – Initialisation strategy passed to non_negative_parafac() ("random" or "svd").
  - sigmoid_transform (bool, default: False) – If True, apply a sigmoid transformation to the layer before decomposition. Useful when the layer contains raw logits.
  - key_added (str, default: 'ncp') – Key prefix for storing results. Results are stored as edata.obsm["X_{key_added}"] (sample factors, shape n_obs × rank), edata.varm["{key_added}_loadings"] (variable factors, shape n_vars × rank), and edata.uns["{key_added}"] (temporal factors + metadata).
  - random_state (int, default: 0) – Random seed for reproducibility.
  - copy (bool, default: False) – Whether to return a copy rather than modifying in place.
- Return type:
- Returns:
None if copy=False, else a modified copy of edata.
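The sigmoid_transform option described under Parameters can be pictured with plain NumPy. The sketch below assumes the transformation is the standard logistic function applied element-wise. The source does not show how ehrapy implements it, so the helper name sigmoid and the random logits are illustrative only. It shows why the option helps with raw logits: the output lies strictly between 0 and 1, so the tensor meets the non-negativity requirement of the decomposition.

```python
import numpy as np


def sigmoid(x):
    """Standard logistic function, applied element-wise (illustrative helper)."""
    return 1.0 / (1.0 + np.exp(-x))


rng = np.random.default_rng(0)

# Hypothetical raw logits with shape patients × vars × time; these contain negative values.
logits = rng.normal(size=(30, 8, 12))

# After the transformation every entry lies in the open interval (0, 1).
transformed = sigmoid(logits)

print(bool(logits.min() < 0))
print(bool(transformed.min() > 0), bool(transformed.max() < 1))
```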
Examples
>>> import numpy as np, pandas as pd
>>> import ehrdata as ed, ehrapy as ep
>>> np.random.seed(0)
>>> tensor = np.abs(np.random.randn(30, 8, 12))  # patients × vars × time
>>> edata = ed.EHRData(
...     shape=(30, 8),
...     layers={"data": tensor},
...     var=pd.DataFrame(index=[f"var_{i}" for i in range(8)]),
... )
>>> ep.tl.ncp(edata, layer="data", rank=3)
>>> edata.obsm["X_ncp"].shape  # (30, 3) – sample factors
>>> edata.varm["ncp_loadings"].shape  # (8, 3) – variable factors
>>> edata.uns["ncp"]["temporal_factors"].shape  # (12, 3) – time factors
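To see how the three factor matrices relate to the original layer, it can help to rebuild a tensor from them. The sketch below uses only NumPy, with randomly generated stand-ins for the sample, variable, and temporal factors rather than output from ep.tl.ncp. It assumes any per-component weights are already absorbed into the factors, which may not match how ehrapy stores them. In a CP model each entry of the reconstruction is the sum over components of the product of the three matching factor entries.

```python
import numpy as np

rng = np.random.default_rng(0)
n_obs, n_vars, n_time, rank = 30, 8, 12, 3

# Hypothetical non-negative factors with the shapes documented above.
sample_factors = rng.random((n_obs, rank))     # stand-in for edata.obsm["X_ncp"]
variable_factors = rng.random((n_vars, rank))  # stand-in for edata.varm["ncp_loadings"]
temporal_factors = rng.random((n_time, rank))  # stand-in for edata.uns["ncp"]["temporal_factors"]

# CP reconstruction: X_hat[i, j, k] = sum_r A[i, r] * B[j, r] * C[k, r]
reconstruction = np.einsum(
    "ir,jr,kr->ijk", sample_factors, variable_factors, temporal_factors
)

# The same tensor written as an explicit sum of rank-one outer products.
explicit = sum(
    np.multiply.outer(
        np.multiply.outer(sample_factors[:, r], variable_factors[:, r]),
        temporal_factors[:, r],
    )
    for r in range(rank)
)

print(reconstruction.shape)
print(np.allclose(reconstruction, explicit))
```

With real results, comparing such a reconstruction against the input layer (for example via a relative Frobenius-norm error) is one way to judge whether the chosen rank is adequate.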