scib_metrics.benchmark.Benchmarker
- class scib_metrics.benchmark.Benchmarker(adata, batch_key, label_key, embedding_obsm_keys, bio_conservation_metrics=None, batch_correction_metrics=None, pre_integrated_embedding_obsm_key=None, n_jobs=1)
Benchmarking pipeline for the single-cell integration task.
- Parameters:
  - adata (AnnData) – AnnData object containing the raw count data and integrated embeddings as obsm keys.
  - batch_key (str) – Key in adata.obs that contains the batch information.
  - label_key (str) – Key in adata.obs that contains the cell type labels.
  - embedding_obsm_keys (list[str]) – List of obsm keys that contain the embeddings to be benchmarked.
  - bio_conservation_metrics (Optional[BioConservation] (default: None)) – Specification of which bio conservation metrics to run in the pipeline.
  - batch_correction_metrics (Optional[BatchCorrection] (default: None)) – Specification of which batch correction metrics to run in the pipeline.
  - pre_integrated_embedding_obsm_key (Optional[str] (default: None)) – Obsm key containing a non-integrated embedding of the data. If None, the embedding will be computed in the prepare step. See the notes below for more information.
  - n_jobs (int (default: 1)) – Number of jobs to use for parallelization of neighbor search.
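The parameters above can be wired together roughly as follows. This is a minimal sketch, not the library's recommended workflow: the helper name run_benchmarker and the default keys "batch" and "cell_type" are hypothetical, and the call to benchmark() assumes that is the method that runs the pipeline.

```python
def run_benchmarker(adata, embedding_obsm_keys, batch_key="batch",
                    label_key="cell_type", n_jobs=1):
    """Check that the required keys exist, then run the pipeline.

    The default key names are placeholders; use the columns of your own data.
    """
    # Fail early with a readable message if a required key is absent.
    for key in (batch_key, label_key):
        if key not in adata.obs:
            raise KeyError(f"{key!r} not found in adata.obs")
    missing = [k for k in embedding_obsm_keys if k not in adata.obsm]
    if missing:
        raise KeyError(f"embeddings not found in adata.obsm: {missing}")

    # Imported lazily so the checks above work even without the package.
    from scib_metrics.benchmark import Benchmarker

    bm = Benchmarker(
        adata,
        batch_key=batch_key,
        label_key=label_key,
        embedding_obsm_keys=embedding_obsm_keys,
        n_jobs=n_jobs,
    )
    bm.prepare()    # neighbor search (and PCA if no pre-integrated key is given)
    bm.benchmark()  # assumed name of the "Run the pipeline" method
    return bm.get_results(min_max_scale=False)
```

A call such as run_benchmarker(adata, ["X_scvi", "X_harmony"]) would then compare two embeddings stored in adata.obsm; both key names are illustrative only.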
Notes
adata.X should contain a form of the data that is not integrated, but is normalized. The prepare method will use adata.X for PCA via pca(), which also only uses features masked via adata.var['highly_variable'].
See further usage examples in the following tutorial:
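Methods table
benchmark() | Run the pipeline.
get_results([min_max_scale, clean_names]) | Return the benchmarking results.
plot_results_table([min_max_scale, show, save_dir]) | Plot the benchmarking results.
prepare([neighbor_computer]) | Prepare the data for benchmarking.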
Methods
- Benchmarker.get_results(min_max_scale=True, clean_names=True)
Return the benchmarking results.
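The entry above does not spell out what min_max_scale does. As an illustration only, the sketch below shows generic min-max scaling of a score table, assuming each column is one metric and each row is one embedding; the function name and the example numbers are hypothetical, and the library's own scaling may differ in detail.

```python
import numpy as np


def min_max_scale_columns(scores):
    """Rescale each column (metric) to [0, 1] across rows (embeddings)."""
    scores = np.asarray(scores, dtype=float)
    lo = scores.min(axis=0)
    span = scores.max(axis=0) - lo
    # A constant column has zero span; divide by 1 so it maps to 0, not NaN.
    safe_span = np.where(span == 0, 1.0, span)
    return (scores - lo) / safe_span


# Three hypothetical embeddings scored on two hypothetical metrics.
raw = np.array([[0.2, 0.9],
                [0.6, 0.5],
                [0.4, 0.7]])
scaled = min_max_scale_columns(raw)
```

After scaling, the best embedding on each metric sits at 1 and the worst at 0, so scaled values describe relative rank within the compared set rather than absolute quality.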
- Benchmarker.plot_results_table(min_max_scale=True, show=True, save_dir=None)
Plot the benchmarking results.
- Benchmarker.prepare(neighbor_computer=None)
Prepare the data for benchmarking.
- Parameters:
  - neighbor_computer (Optional[Callable[[ndarray, int], NeighborsResults]] (default: None)) – Function that computes the neighbors of the data. If None, the neighbors will be computed with pynndescent(). The function should take as input the data and the number of neighbors to compute and return a NeighborsResults object.
- Return type: None
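A custom neighbor_computer only has to match the Callable[[ndarray, int], NeighborsResults] signature described above. The sketch below is one possible exact, brute-force version, suitable only for small datasets. It assumes that NeighborsResults can be imported from scib_metrics.nearest_neighbors and accepts indices and distances arrays, and that each cell should be listed as its own first neighbor; the fallback dataclass is a stand-in so the example runs without the package installed.

```python
from dataclasses import dataclass

import numpy as np

try:
    # Assumed import path and fields; check against your installed version.
    from scib_metrics.nearest_neighbors import NeighborsResults
except ImportError:
    @dataclass
    class NeighborsResults:  # hypothetical stand-in with the same two fields
        indices: np.ndarray
        distances: np.ndarray


def exact_neighbors(X, n_neighbors):
    """Brute-force Euclidean k-nearest neighbors (O(n^2) memory)."""
    X = np.asarray(X, dtype=np.float64)
    sq_norms = np.sum(X**2, axis=1)
    # Squared distances via ||a - b||^2 = ||a||^2 + ||b||^2 - 2 a.b
    d2 = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (X @ X.T)
    np.maximum(d2, 0.0, out=d2)  # clamp small negatives from rounding
    np.fill_diagonal(d2, 0.0)    # each point is at distance 0 from itself
    indices = np.argsort(d2, axis=1, kind="stable")[:, :n_neighbors]
    distances = np.sqrt(np.take_along_axis(d2, indices, axis=1))
    return NeighborsResults(indices=indices, distances=distances)
```

It would then be passed as bm.prepare(neighbor_computer=exact_neighbors), where bm is an existing Benchmarker instance. An exact search trades speed for reproducibility, which can be useful when comparing results across runs on small data.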