scib_metrics.kbet_per_label

scib_metrics.kbet_per_label(X, batches, labels, alpha=0.05, diffusion_n_comps=100, return_df=False)

Compute kBET score per cell type label as in [Luecken et al., 2022].

This approximates the method used in the original scib package. Notably, the underlying kBET implementation may not match the R implementation exactly. Furthermore, to equalize the neighbor graphs across cell type subsets, we use diffusion distance approximated with diffusion maps. Increasing diffusion_n_comps improves the accuracy of the approximation.
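To illustrate why more diffusion components give a better approximation, here is a minimal NumPy sketch of a generic diffusion-map construction. It is not the scib-metrics implementation: the function names, the Gaussian affinity matrix, and the diffusion time t are assumptions made for this example only. It compares diffusion distances computed from a truncated eigendecomposition against the exact distances computed from the full transition matrix.

```python
import numpy as np


def truncated_diffusion_distances(W, n_comps, t=1):
    """Approximate pairwise diffusion distances from the leading n_comps eigenpairs."""
    d = W.sum(axis=1)                              # node degrees
    S = W / np.sqrt(np.outer(d, d))                # symmetric normalization D^-1/2 W D^-1/2
    evals, evecs = np.linalg.eigh(S)
    keep = np.argsort(-np.abs(evals))[:n_comps]    # largest-magnitude eigenvalues first
    # Diffusion-map coordinates: lambda_k^t * psi_k, with psi_k = v_k / sqrt(d)
    coords = (evecs[:, keep] / np.sqrt(d)[:, None]) * evals[keep] ** t
    diff = coords[:, None, :] - coords[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def exact_diffusion_distances(W, t=1):
    """Exact diffusion distances from the t-step transition probabilities."""
    d = W.sum(axis=1)
    P = np.linalg.matrix_power(W / d[:, None], t)  # row-stochastic transition matrix, t steps
    diff = P[:, None, :] - P[None, :, :]
    return np.sqrt((diff ** 2 / d).sum(axis=-1))   # each target node weighted by 1 / degree


# Toy data (hypothetical): Gaussian affinities between 60 random points.
rng = np.random.default_rng(0)
points = rng.normal(size=(60, 5))
sq_dists = ((points[:, None, :] - points[None, :, :]) ** 2).sum(axis=-1)
W = np.exp(-sq_dists / (2 * np.median(sq_dists)))

exact = exact_diffusion_distances(W)
errors = {
    n: float(np.abs(exact - truncated_diffusion_distances(W, n)).max())
    for n in (5, 20, 60)
}
print(errors)  # maximum absolute error for each number of components
```

Each retained component adds a non-negative term to the squared distance, so the error cannot grow as components are added, and it vanishes (up to floating-point error) once all components are kept.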

Parameters:
  • X (NeighborsResults) – A NeighborsResults object.

  • batches (ndarray) – Array of shape (n_cells,) representing batch values for each cell.

  • labels (ndarray) – Array of shape (n_cells,) representing label values for each cell.

  • alpha (float (default: 0.05)) – Significance level for the statistical test.

  • diffusion_n_comps (int (default: 100)) – Number of diffusion components to use for diffusion distance approximation.

  • return_df (bool (default: False)) – Return dataframe of results in addition to score.

Return type:

Union[float, tuple[float, DataFrame]]

Returns:

kbet_score

kBET score over all cells. Higher means more integrated, as in the kBET acceptance rate.

df

Dataframe with kBET score per cell type label. Returned only if return_df is True.
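As a rough illustration of how alpha relates to an acceptance rate, the sketch below runs a Pearson chi-squared test on each cell's neighborhood and reports the fraction of cells for which the batch-mixing hypothesis is not rejected. This is a simplified stand-in, not the library's code: the function name kbet_acceptance_rate, the restriction to exactly two batches (which allows a closed-form p-value), and the toy ring-shaped neighbor indices are all assumptions made for this example.

```python
import math

import numpy as np


def kbet_acceptance_rate(neighbor_idx, batches, alpha=0.05):
    """Fraction of cells whose neighborhood batch counts match the global batch mix.

    Simplified sketch: handles exactly two batches only.
    """
    batches = np.asarray(batches)
    categories = np.unique(batches)
    if categories.size != 2:
        raise ValueError("this sketch handles exactly two batches")
    k = neighbor_idx.shape[1]
    # Expected neighborhood counts follow the global batch frequencies.
    expected = np.array([(batches == c).mean() for c in categories]) * k
    # Observed counts of each batch among every cell's k neighbors.
    neighbor_batches = batches[neighbor_idx]
    observed = np.stack(
        [(neighbor_batches == c).sum(axis=1) for c in categories], axis=1
    )
    stat = ((observed - expected) ** 2 / expected).sum(axis=1)
    # Chi-squared survival function for 1 degree of freedom: erfc(sqrt(x / 2)).
    pvals = np.array([math.erfc(math.sqrt(s / 2.0)) for s in stat])
    return float((pvals >= alpha).mean())


n_cells, k = 200, 20
cells = np.arange(n_cells)
offsets = np.arange(1, k + 1)

# Well mixed: batches alternate along a ring, so each neighborhood holds 10 cells per batch.
mixed_batches = cells % 2
mixed_idx = (cells[:, None] + offsets) % n_cells

# Separated: each batch forms its own ring, so each neighborhood holds a single batch.
half = n_cells // 2
separated_batches = cells // half
separated_idx = (cells // half * half)[:, None] + (cells[:, None] % half + offsets) % half

print(kbet_acceptance_rate(mixed_idx, mixed_batches))          # 1.0
print(kbet_acceptance_rate(separated_idx, separated_batches))  # 0.0
```

In this toy setting, perfectly mixed neighborhoods are never rejected and fully separated ones always are, which matches the reading that a higher score indicates better integration.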

Notes

This function requires X to be cell-cell connectivities, not distances.