schist.inference._flat_model

Module Contents

Functions

flat_model(→ Optional[anndata.AnnData])

Cluster cells into subgroups [Peixoto14].

schist.inference._flat_model.flat_model(adata: anndata.AnnData, n_sweep: int = 10, beta: float = np.inf, tolerance: float = 1e-06, collect_marginals: bool = True, deg_corr: bool = True, n_init: int = 100, n_jobs: int = -1, refine_model: bool = False, refine_iter: int = 100, max_iter: int = 100000, *, restrict_to: Tuple[str, Sequence[str]] | None = None, random_seed: int | None = None, key_added: str = 'sbm', adjacency: scipy.sparse.spmatrix | None = None, neighbors_key: str | None = 'neighbors', directed: bool = False, use_weights: bool = False, save_model: str | None = None, copy: bool = False, dispatch_backend: str | None = 'threads') → anndata.AnnData | None

Cluster cells into subgroups [Peixoto14].

Cluster cells using the Stochastic Block Model [Peixoto14], performing Bayesian inference on node groups.

This requires having run neighbors() or bbknn() first.
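Because the function relies on a precomputed neighbor graph, it can help to check where that graph lives before clustering. The sketch below is illustrative only: `find_connectivities` is a hypothetical helper, not part of schist, and it assumes that newer scanpy versions store the name of the `adata.obsp` entry under `adata.uns[neighbors_key]['connectivities_key']`, while older versions store the matrix directly in `adata.uns[neighbors_key]['connectivities']`. A stand-in object is used instead of a real AnnData so the example runs without scanpy installed.

```python
from types import SimpleNamespace


def find_connectivities(adata, neighbors_key="neighbors"):
    """Return the neighbor graph stored on an AnnData-like object.

    Hypothetical helper: it only looks at `adata.uns` and `adata.obsp`,
    and raises a readable error if no neighbor graph has been computed.
    """
    if neighbors_key not in adata.uns:
        raise KeyError(
            f"No '{neighbors_key}' entry in adata.uns; "
            "run neighbors() or bbknn() first."
        )
    neighbors = adata.uns[neighbors_key]
    # Older layout (assumed): the matrix sits directly in adata.uns.
    if "connectivities" in neighbors:
        return neighbors["connectivities"]
    # Newer layout (assumed): adata.uns holds the name of the obsp entry.
    return adata.obsp[neighbors["connectivities_key"]]


# Stand-ins for AnnData objects; the strings take the place of sparse matrices.
old_style = SimpleNamespace(
    uns={"neighbors": {"connectivities": "old-matrix"}}, obsp={}
)
new_style = SimpleNamespace(
    uns={"neighbors": {"connectivities_key": "connectivities"}},
    obsp={"connectivities": "new-matrix"},
)

graph_old = find_connectivities(old_style)
graph_new = find_connectivities(new_style)
```

Failing early with a clear message is usually preferable to letting the clustering step fail on a missing key.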

Parameters

adata

The annotated data matrix.

n_sweep

Number of MCMC sweeps used to obtain the initial guess.

beta

Inverse temperature for the initial MCMC sweep

tolerance

Difference in description length to stop MCMC sweep iterations

collect_marginals

Whether to collect the probability of each node belonging to a specific partition.

deg_corr

Whether to use degree correction in the minimization step. Degree correction is appropriate for many real-world networks, although it does not seem to be needed for the KNN graphs used in scanpy.

n_init

Number of initial minimizations to be performed. This also influences the precision of the marginals.

refine_model

Whether to perform a further MCMC step to refine the model.

refine_iter

Number of refinement iterations.

max_iter

Maximum number of iterations during minimization. Set to infinity to stop the minimization on tolerance only.

key_added

adata.obs key under which to add the cluster labels.

adjacency

Sparse adjacency matrix of the graph. Defaults to adata.uns['neighbors']['connectivities'] for scanpy<=1.4.6, or adata.obsp[neighbors_key][connectivity_key] for scanpy>1.4.6.

neighbors_key

The key passed to sc.pp.neighbors

directed

Whether to treat the graph as directed or undirected.

use_weights

If True, edge weights from the graph are used in the computation (placing more emphasis on stronger edges). Note that this increases computation time.

save_model

If provided, this will be the filename for the PartitionModeState to be saved

copy

Whether to copy adata or modify it in place.

random_seed

Integer used as the random seed for graph-tool.

n_jobs

Number of parallel computations used during model initialization
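The roles of `beta`, `tolerance`, and `max_iter` described above can be illustrated with a toy loop. This is a simplified sketch and not the code used by schist or graph-tool: `accept_move`, `run_until_converged`, and the `sweep` callable are hypothetical names, and the acceptance rule is the plain Metropolis criterion without any proposal correction.

```python
import math
import random
from typing import Callable, Tuple


def accept_move(delta_dl: float, beta: float, rng: random.Random) -> bool:
    """Plain Metropolis rule for a move that changes the description
    length by `delta_dl` at inverse temperature `beta`."""
    if delta_dl <= 0:
        return True  # moves that do not worsen the fit are always accepted
    if math.isinf(beta):
        return False  # beta = inf: worsening moves are never accepted
    return rng.random() < math.exp(-beta * delta_dl)


def run_until_converged(
    sweep: Callable[[], float],
    tolerance: float = 1e-6,
    max_iter: float = 100000,
) -> Tuple[float, int]:
    """Call `sweep()` (assumed to return the current description length)
    until two consecutive values differ by less than `tolerance`, or
    until `max_iter` sweeps have been performed."""
    previous = sweep()
    n_done = 1
    while n_done < max_iter:
        current = sweep()
        n_done += 1
        if abs(previous - current) < tolerance:
            return current, n_done
        previous = current
    return previous, n_done


# Toy description-length trace standing in for real MCMC sweeps.
trace = iter([10.0, 5.0, 4.0, 4.0, 3.0])
final_dl, n_sweeps = run_until_converged(lambda: next(trace), max_iter=100)
```

In this sketch, passing `max_iter=math.inf` leaves `tolerance` as the only stopping condition, which mirrors the behavior described for `max_iter`.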

Returns

adata.obs[key_added]

Array of dim (number of cells) that stores the subgroup id ('0', '1', …) for each cell.

adata.uns['schist']['params']

A dict with the values for the parameters resolution, random_state, and n_iterations.

adata.uns['schist']['stats']

A dict with the values returned by mcmc_sweep

adata.obsm['CM_sbm']

An np.ndarray with the probability of each cell belonging to a specific group.

adata.uns['schist']['state']

The BlockModel state object
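As an illustration of how the marginals in `adata.obsm['CM_sbm']` might be inspected, the sketch below picks the most probable group for each cell. It is an assumption-laden example: `most_likely_groups` is a hypothetical helper, column `j` is assumed to correspond to group id `'j'`, and the result is not guaranteed to match the labels stored in `adata.obs[key_added]`. Plain nested lists stand in for the array so the example runs without NumPy; rows of an `np.ndarray` can be iterated the same way.

```python
def most_likely_groups(marginals):
    """Return, for each row of probabilities, the index of the largest
    value as a string label ('0', '1', ...). Ties go to the lowest index."""
    labels = []
    for row in marginals:
        probs = list(row)
        labels.append(str(probs.index(max(probs))))
    return labels


# Toy marginals for three cells and two groups (made-up numbers).
toy_marginals = [
    [0.9, 0.1],
    [0.2, 0.8],
    [0.5, 0.5],
]
toy_labels = most_likely_groups(toy_marginals)
```

Rows with near-equal probabilities, such as the last one here, flag cells whose assignment is uncertain.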