`schist.inference._flat_model`

Module Contents

Functions

flat_model(→ Optional[anndata.AnnData])

Cluster cells into subgroups [Peixoto14].

schist.inference._flat_model.flat_model(adata: anndata.AnnData, n_sweep: int = 10, beta: float = np.inf, tolerance: float = 1e-06, collect_marginals: bool = True, deg_corr: bool = True, n_init: int = 100, n_jobs: int = -1, refine_model: bool = False, refine_iter: int = 100, max_iter: int = 100000, *, restrict_to: Tuple[str, Sequence[str]] | None = None, random_seed: int | None = None, key_added: str = 'sbm', adjacency: scipy.sparse.spmatrix | None = None, neighbors_key: str | None = 'neighbors', directed: bool = False, use_weights: bool = False, save_model: str | None = None, copy: bool = False, dispatch_backend: str | None = 'threads') → anndata.AnnData | None

Cluster cells into subgroups [Peixoto14].

Cluster cells using the Stochastic Block Model [Peixoto14], performing Bayesian inference on node groups.

This requires having run neighbors() or bbknn() first.
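A typical call might look like the following minimal sketch. It assumes scanpy and schist (with graph-tool) are installed. The helper name `cluster_flat_sbm`, the `FLAT_MODEL_KWARGS` dict and the chosen values (`n_neighbors=15`, `n_init=10`, `random_seed=0`) are illustrative assumptions, not recommendations from the schist authors. Imports are deferred so the helper can be defined without those packages.

```python
# Keyword arguments forwarded to flat_model. The names come from the
# signature documented above; the values are illustrative assumptions.
FLAT_MODEL_KWARGS = {
    "n_init": 10,
    "deg_corr": True,
    "collect_marginals": True,
    "key_added": "sbm",
    "random_seed": 0,
}


def cluster_flat_sbm(adata, n_neighbors=15, **overrides):
    """Build the neighbor graph, then cluster cells with flat_model (sketch)."""
    # Imported lazily: both packages are heavy dependencies.
    import scanpy as sc
    from schist.inference._flat_model import flat_model

    # flat_model needs a neighbor graph, so compute it first.
    sc.pp.neighbors(adata, n_neighbors=n_neighbors)

    kwargs = {**FLAT_MODEL_KWARGS, **overrides}
    # This sketch assumes copy is left at False, so adata is modified in place.
    flat_model(adata, **kwargs)
    return adata.obs[kwargs["key_added"]]
```

Keeping the keyword arguments in one dict makes it easy to record exactly which settings produced a given clustering.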

Parameters

adata: The annotated data matrix.
n_sweep: Number of MCMC sweeps to get the initial guess
beta: Inverse temperature for the initial MCMC sweep
tolerance: Difference in description length to stop MCMC sweep iterations
collect_marginals: Whether to collect the probability of each node belonging to a specific partition.
deg_corr: Whether to use degree correction in the minimization step. Degree correction suits many real-world networks, although it does not seem to be needed for the KNN graphs used in scanpy.
n_init: Number of initial minimizations to be performed. This also influences the precision of the marginals.
refine_model: Whether to perform a further MCMC step to refine the model.
refine_iter: Number of refinement iterations.
max_iter: Maximum number of iterations during minimization. Set to infinity to stop minimization only on tolerance.
key_added: adata.obs key under which to add the cluster labels.
adjacency: Sparse adjacency matrix of the graph. Defaults to adata.uns[‘neighbors’][‘connectivities’] for scanpy<=1.4.6, or adata.obsp[neighbors_key][connectivity_key] for scanpy>1.4.6.
neighbors_key: The key passed to sc.pp.neighbors
directed: Whether to treat the graph as directed or undirected.
use_weights: If True, edge weights from the graph are used in the computation (placing more emphasis on stronger edges). Note that this increases computation time.
save_model: If provided, this will be the filename for the PartitionModeState to be saved
copy: Whether to copy adata or modify it in place.
random_seed: Random number to be used as seed for graph-tool
n_jobs: Number of parallel computations used during model initialization
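The interplay between tolerance and max_iter can be pictured with a toy loop. This is only an illustration of a generic "stop on small change or on an iteration cap" rule, not schist's actual code. `run_until_converged`, `make_toy_sweep` and the halving behavior of the toy sweep are all hypothetical.

```python
import math


def run_until_converged(sweep, tolerance=1e-6, max_iter=100000):
    """Call sweep() until |delta| <= tolerance or max_iter calls were made.

    sweep is a hypothetical callable returning the change in description
    length produced by one batch of MCMC sweeps.
    """
    n = 0
    delta = math.inf
    while abs(delta) > tolerance and n < max_iter:
        delta = sweep()
        n += 1
    return n, delta


def make_toy_sweep(start=-1.0):
    """Toy stand-in for an MCMC sweep: the change halves at every call."""
    state = {"delta": start}

    def sweep():
        current = state["delta"]
        state["delta"] = current / 2
        return current

    return sweep


# With an infinite cap, only the tolerance stops the loop.
n_calls, last_delta = run_until_converged(make_toy_sweep(), max_iter=math.inf)
# With a small cap, the loop stops before the tolerance is reached.
n_capped, capped_delta = run_until_converged(make_toy_sweep(), max_iter=5)
```

In this toy setting a lower tolerance or a higher cap simply means more sweeps; how the real minimization behaves depends on the data and on graph-tool.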

Returns

adata.obs[key_added]: Array of dim (number of cells) that stores the subgroup id (‘0’, ‘1’, …) for each cell.
adata.uns[‘schist’][‘params’]: A dict with the values for the parameters resolution, random_state, and n_iterations.
adata.uns[‘schist’][‘stats’]: A dict with the values returned by mcmc_sweep
adata.obsm[‘CM_sbm’]: A np.ndarray with cell probability of belonging to a specific group
adata.uns[‘schist’][‘state’]: The BlockModel state object
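Once marginals have been collected, they can be used to flag cells whose assignment is uncertain. The sketch below assumes the matrix has one row per cell and one column per group. The helper `most_likely_groups`, the toy values and the 0.6 threshold are hypothetical, and the mapping from column index to the labels stored in adata.obs[key_added] is assumed rather than confirmed.

```python
def most_likely_groups(cell_marginals):
    """Return (group_index, probability) of the top group for every cell.

    cell_marginals: 2-D array-like, one row per cell and one column per
    group (the assumed layout of adata.obsm['CM_sbm']).
    """
    result = []
    for row in cell_marginals:
        # Index of the largest probability; ties resolve to the first index.
        best = max(range(len(row)), key=lambda j: row[j])
        result.append((best, float(row[best])))
    return result


# Hypothetical marginals for three cells and two groups.
toy_marginals = [
    [0.9, 0.1],
    [0.5, 0.5],
    [0.2, 0.8],
]
top = most_likely_groups(toy_marginals)
# Cells whose best group has probability below 0.6 are flagged as ambiguous.
ambiguous = [i for i, (_, p) in enumerate(top) if p < 0.6]
```

With real data one would pass `adata.obsm['CM_sbm']` in place of `toy_marginals`.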
