celldisect.CellDISECT.setup_anndata
- classmethod CellDISECT.setup_anndata(adata: anndata.AnnData, layer: str | None = None, batch_key: str | None = None, labels_key: str | None = None, size_factor_key: str | None = None, categorical_covariate_keys: List[str] | None = None, continuous_covariate_keys: List[str] | None = None, add_cluster_covariate: bool = False, clustering_normalize_counts: bool = True, perturbation_key: str | None = None, perturbation_embedding_key: str | None = None, perturbation_combination_delimiter: str = '+', **kwargs)
Set up the AnnData object for the CellDISECT model.
This method configures the AnnData object by registering the necessary fields and optionally adding a cluster covariate. When perturbation_key is provided, the corresponding column in adata.obs is treated as a perturbation covariate whose embeddings come from adata.uns[perturbation_embedding_key] rather than being learned during training.
- Parameters:
adata (AnnData) – AnnData object to be set up.
layer (Optional[str], optional) – Layer in adata to use as the count data, by default None.
batch_key (Optional[str], optional) – Key in adata.obs for batch information, by default None.
labels_key (Optional[str], optional) – Key in adata.obs for labels, by default None.
size_factor_key (Optional[str], optional) – Key in adata.obs for size factors, by default None.
categorical_covariate_keys (Optional[List[str]], optional) – List of keys in adata.obs for categorical covariates, by default None.
continuous_covariate_keys (Optional[List[str]], optional) – List of keys in adata.obs for continuous covariates, by default None.
add_cluster_covariate (bool, optional) – Whether to add a cluster covariate to adata.obs, by default False.
clustering_normalize_counts (bool, optional) – Whether to normalize counts before clustering, by default True.
perturbation_key (Optional[str], optional) – Column in adata.obs that contains perturbation labels (e.g. "GeneA", "GeneA+GeneB"). When set, the perturbation covariate uses predefined embeddings instead of learned ones.
perturbation_embedding_key (Optional[str], optional) – Key in adata.uns whose value is a dict[str, np.ndarray] mapping atomic perturbation names to their vector representations (e.g. ESM or GenePT embeddings). Required when perturbation_key is set.
perturbation_combination_delimiter (str, optional) – Delimiter for combinatorial perturbation labels, by default "+".
**kwargs – Additional keyword arguments.
- Return type:
None
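Because perturbation_embedding_key must map every atomic perturbation name to a vector, it can help to check coverage before calling setup_anndata. The following is a minimal sketch of such a check, not part of the CellDISECT API: the helper name missing_perturbation_embeddings is hypothetical, plain lists stand in for np.ndarray values so the snippet has no dependencies, and the obs column name "perturbation", the uns key "pert_emb", and the layer name "counts" in the commented call are assumptions chosen for illustration.

```python
from typing import Dict, Iterable, List, Sequence, Set


def missing_perturbation_embeddings(
    labels: Iterable[str],
    embeddings: Dict[str, Sequence[float]],
    delimiter: str = "+",
) -> List[str]:
    """Return atomic perturbation names that have no entry in `embeddings`.

    Hypothetical helper: each label is split on `delimiter`, mirroring the
    documented perturbation_combination_delimiter parameter.
    """
    missing: Set[str] = set()
    for label in labels:
        for name in label.split(delimiter):
            if name and name not in embeddings:
                missing.add(name)
    return sorted(missing)


# Toy stand-ins for adata.obs["perturbation"] and adata.uns["pert_emb"].
labels = ["GeneA", "GeneA+GeneB", "GeneC"]
embeddings = {"GeneA": [0.1, 0.2], "GeneB": [0.3, 0.4]}

missing = missing_perturbation_embeddings(labels, embeddings)
print(missing)

# Once nothing is missing, the call would follow the documented signature
# (the column, key, and layer names here are assumptions):
# CellDISECT.setup_anndata(
#     adata,
#     layer="counts",
#     perturbation_key="perturbation",
#     perturbation_embedding_key="pert_emb",
#     perturbation_combination_delimiter="+",
# )
```

In this toy data, "GeneC" appears in the labels but has no embedding, so it is the only name reported. With a real AnnData object, the same check could be run on adata.obs[perturbation_key].unique() and adata.uns[perturbation_embedding_key].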