deconverse: Deconvolution using scRNA-seq references

What is deconvolution?

The goal of deconvolution is to predict the makeup of a mixture in terms of its components and their fractions. For example, if:

Mixture: an RNA-seq profile of many cells (a bulk sample, or a spot on a 10x Visium slide)
Components: cell type profiles (C)

then \(\text{mixture} = \sum_{i=1}^{n} C_i w_i\), subject to \(\sum_{i=1}^{n} w_i = 1\) and \(w_i \geq 0\), where \(w_i\) is the fraction of cell type \(i\) and \(n\) is the number of cell types
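
As a rough illustration of this model, the sketch below simulates a mixture from three made-up cell type profiles and recovers the fractions by least squares under the non-negativity constraint, rescaling afterwards so they sum to 1. All names and numbers are hypothetical, and the base R optimizer stands in for whatever solver a real method would use.

```r
# Simulated example: every value below is made up for illustration.
set.seed(1)
n_genes <- 50
n_types <- 3

# C: genes x cell types matrix of cell type profiles
C <- matrix(rexp(n_genes * n_types, rate = 0.1), nrow = n_genes,
            dimnames = list(NULL, c("T_cell", "B_cell", "Myeloid")))

# True fractions w (non-negative, summing to 1) and the noisy mixture
w_true <- c(T_cell = 0.6, B_cell = 0.3, Myeloid = 0.1)
mixture <- as.vector(C %*% w_true) + rnorm(n_genes, sd = 0.5)

# Squared reconstruction error and its gradient for a candidate w
sq_error <- function(w) sum((mixture - C %*% w)^2)
sq_error_grad <- function(w) as.vector(-2 * crossprod(C, mixture - C %*% w))

# Minimize under w_i >= 0 (box constraint), then rescale to sum to 1
fit <- optim(par = rep(1 / n_types, n_types), fn = sq_error,
             gr = sq_error_grad, method = "L-BFGS-B", lower = 0)
w_hat <- setNames(fit$par / sum(fit$par), colnames(C))
round(w_hat, 2)
```

Rescaling after the fit is a simplification: it enforces the sum-to-one constraint only approximately with respect to the least-squares objective.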

Deconvolution graphical summary

Why is it useful?

Goal: define the cell types and their proportions present in a sample

Identify and understand tissue heterogeneity
Associate cell types with clinical variables (e.g. survival, response to therapy)
Apply to downstream analysis: e.g. cell-to-cell communication

Challenges of deconvolution

Recall the model: \(\text{mixture} = \sum_{i=1}^{n} C_i w_i\), subject to \(\sum_{i=1}^{n} w_i = 1\) and \(w_i \geq 0\)

For the user:
- What are the cell types present and how many are there?
- What is a good reference for my mixture?
For the method:
- Assumption: all cell types that could be present are represented in the reference
- How do I identify the cell type profiles (C)? In what space?
- How do I measure if I have a good fit?

TME deconvolution: first generation

First-generation deconvolution methods generally used gene expression from FACS-sorted cells as cell type profiles, often from PBMCs. References are almost always pre-computed.

TME deconvolution: examples of first-generation methods

CIBERSORT (support vector regression)
MCPcounter (marker mean expression)
xCell (corrected ssGSEA)
EPIC and quanTIseq (constrained least-squares minimization)

Some methods don’t perform deconvolution per se: they return scores rather than proportions, so they support inter-sample comparisons only

These methods don’t assume a complete reference: they only deconvolute cell types of the tumor microenvironment (TME)
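
The marker-mean idea behind score-based methods can be sketched in a few lines of base R. The expression matrix and marker lists below are simulated and hypothetical; this is not MCPcounter's code or its marker sets. The resulting scores can be compared across samples for a given cell type, but they are not proportions.

```r
# Simulated log-expression matrix (genes x samples); values are made up.
set.seed(2)
genes <- paste0("gene", 1:20)
samples <- paste0("sample", 1:4)
expr <- matrix(rnorm(length(genes) * length(samples), mean = 8, sd = 1),
               nrow = length(genes), dimnames = list(genes, samples))

# Hypothetical marker genes per cell type
markers <- list(T_cell = c("gene1", "gene2", "gene3"),
                B_cell = c("gene4", "gene5"))

# Make sample1 T-cell rich by raising its T-cell marker expression
expr[markers$T_cell, "sample1"] <- expr[markers$T_cell, "sample1"] + 5

# Score = mean expression of each cell type's markers in each sample
# (result: samples in rows, cell types in columns)
scores <- sapply(markers, function(g) colMeans(expr[g, , drop = FALSE]))
round(scores, 2)
```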

Cell type deconvolution: second-generation

Second-generation methods use a (user-provided) single-cell reference from the same biological context as the sample to be deconvoluted

The single cell reference: atlases

Tabula Muris
Tabula Sapiens
Human Lung Atlas
Multiple published single-cell atlases of different tissues or pathologies

Major problem: cell-type annotation

Example: Colon Atlas (Pelka et al., 2021)

Coarse-grained annotation

Fine-grained annotation of compartments

Cell type resolution: can we separate them?

Deconvolution methods are often robust when using coarse-grained annotation
Deconvolution often fails at separating cell types defined by ‘state’ (e.g. CD4+ from CD8+ T cells, naive from mature B cells)
What is the appropriate “level” of annotation that allows for deconvolution?

`deconverse`: a meta-method package with benchmarking built-in

Deconvolution methods in `deconverse`

deconvolution_methods()

         OLS         DWLS          SVR   CIBERSORTx        MuSiC   BayesPrism 
       "ols"       "dwls"        "svr" "cibersortx"      "music" "bayesprism" 
      Bisque    AutoGeneS       scaden         CARD         RCTD    SPOTlight 
    "bisque"  "autogenes"     "scaden"       "card"       "rctd"  "spotlight"

spatial_only_methods()

       CARD        RCTD   SPOTlight 
     "card"      "rctd" "spotlight"

`deconverse` ideas

Scientific

Support for multiple levels of annotation at the same time
Correction of finer-grained annotation by coarser-grained annotation
Aid users in detecting what level of annotation is appropriate through benchmarking

Technical

Any method: same syntax
Run multiple methods with one command
A general framework: adding new methods is easy

`deconverse` syntax: screference

Single-cell (hierarchical) reference

pbmc_ref <- new_hscreference(pbmc_train,
                             annot_ids = c("Cell_major_identities", "Cell_minor_identities"),
                             project_name = "pbmc_example",
                             batch_id = "orig.ident")

pbmc_ref <- pbmc_ref |>
    compute_reference("dwls") |>
    compute_reference("autogenes")

`deconverse` syntax: deconvolute

deconv_res <- deconvolute_all(gexp, pbmc_ref,
                              methods = c("dwls", "ols", "svr"))

`deconverse` syntax: scbench

pbmc_bench <- new_scbench(pbmc_test, 
                         annot_ids = c("Cell_major_identities",
                                       "Cell_minor_identities"),
                         project_name = "pbmc_example",
                         batch_id = "orig.ident")

Generate “mixtures” for each benchmarking test (bounds can be given)

pbmc_bench <- pbmc_bench |>
    mixtures_population(nsamp = 500) |>
    mixtures_lod() |>
    mixtures_spillover()

`deconverse` syntax: scbench

Create pseudobulk samples from the single-cell profiles in `pbmc_test`

pbmc_bench <- pseudobulks(pbmc_bench, ncells = 1000)
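
To make the pseudobulk idea concrete, here is a minimal base R sketch: sample cells according to target proportions and sum their counts. The count matrix, labels and the helper `make_pseudobulk()` are hypothetical and only illustrate the principle; they are not how `pseudobulks()` is implemented.

```r
# Simulated single-cell counts (genes x cells) with cell type labels
set.seed(3)
n_genes <- 30
cell_types <- rep(c("T_cell", "B_cell", "Myeloid"), times = c(400, 300, 300))
counts <- matrix(rpois(n_genes * length(cell_types), lambda = 5),
                 nrow = n_genes,
                 dimnames = list(paste0("gene", seq_len(n_genes)), NULL))

# Hypothetical helper: draw cells per type and sum their counts
make_pseudobulk <- function(counts, cell_types, props, ncells = 100) {
  n_per_type <- round(props * ncells)  # cells to draw for each type
  picked <- unlist(lapply(names(props), function(ct) {
    sample(which(cell_types == ct), n_per_type[[ct]], replace = TRUE)
  }))
  list(
    profile = rowSums(counts[, picked, drop = FALSE]),
    # proportions actually realized after rounding
    true_props = prop.table(table(factor(cell_types[picked],
                                         levels = names(props))))
  )
}

props <- c(T_cell = 0.5, B_cell = 0.3, Myeloid = 0.2)
pb <- make_pseudobulk(counts, cell_types, props, ncells = 100)
pb$true_props
```

Because the true proportions of each pseudobulk are known by construction, they can serve as ground truth when evaluating a deconvolution method.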

pbmc_bench <- deconvolute_all(pbmc_bench, pbmc_ref,
                              methods = c("dwls", "svr", "ols", 
                                          "autogenes", "bisque"))

`deconverse` benchmarking results: population

plt_cors_scatter(pbmc_bench, method = "dwls")
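
A population benchmark of this kind boils down to comparing known and estimated proportions per cell type. The sketch below does this on simulated data with Pearson correlation and RMSE; the matrices are made up, and the metrics `deconverse` actually reports may differ.

```r
# Simulated true and estimated proportions (samples x cell types)
set.seed(4)
n_samp <- 50
types <- c("T_cell", "B_cell", "Myeloid")
raw <- matrix(rexp(n_samp * length(types)), ncol = length(types),
              dimnames = list(NULL, types))
true_props <- raw / rowSums(raw)

# "Estimates": true proportions plus noise, clipped at 0 and renormalized
noisy <- pmax(true_props + rnorm(length(true_props), sd = 0.05), 0)
est_props <- noisy / rowSums(noisy)

# Per-cell-type agreement between truth and estimate
metrics <- data.frame(
  cell_type = types,
  pearson = sapply(types, function(ct) cor(true_props[, ct], est_props[, ct])),
  rmse = sapply(types, function(ct) {
    sqrt(mean((true_props[, ct] - est_props[, ct])^2))
  }),
  row.names = NULL
)
metrics
```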

`deconverse` benchmarking results: compare between populations

plt_cor_heatmap(pbmc_bench, level = "l2")$heatmap

`deconverse` benchmarking results: spillover

plt_spillover_heatmap(pbmc_bench)$heatmap
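
One common way to think about spillover, assumed here since the exact definition used by `mixtures_spillover()` is not shown, is to deconvolute samples made of a single cell type and check how much of the signal is attributed to other types. The sketch below uses simulated profiles in which two T cell subtypes share most of their expression, and a simple non-negative least-squares fit as a stand-in for a real method.

```r
# Simulated profiles: CD4_T and CD8_T are similar, B_cell is distinct
set.seed(5)
n_genes <- 40
types <- c("CD4_T", "CD8_T", "B_cell")
shared_t <- rexp(n_genes, rate = 0.1)
C <- cbind(CD4_T = shared_t + rexp(n_genes, rate = 1),
           CD8_T = shared_t + rexp(n_genes, rate = 1),
           B_cell = rexp(n_genes, rate = 0.1))

# Stand-in deconvolution: least squares with w >= 0, rescaled to sum to 1
deconv <- function(mixture, C) {
  fit <- optim(rep(1 / ncol(C), ncol(C)),
               fn = function(w) sum((mixture - C %*% w)^2),
               gr = function(w) as.vector(-2 * crossprod(C, mixture - C %*% w)),
               method = "L-BFGS-B", lower = 0)
  setNames(fit$par / sum(fit$par), colnames(C))
}

# One noisy "pure" sample per cell type
# rows = true cell type, columns = estimated proportions
spill <- t(sapply(types, function(ct) {
  deconv(C[, ct] + rnorm(n_genes, sd = 2), C)
}))
round(spill, 2)
```

Off-diagonal entries of `spill` are the spillover: signal from one pure population that the fit assigns to another.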

`deconverse` benchmarking results: limit of detection

plt_lod_heatmap(pbmc_bench)$heatmap

Some details: Deconvolution methods in `deconverse`

Ordinary Least Squares (OLS), Support Vector Regression (SVR) and Dampened Weighted Least Squares (DWLS) use the same reference cell type marker matrix from Seurat::FindMarkers
CIBERSORTx runs in a Docker container
MuSiC and DWLS were reimplemented for performance
AutoGeneS and scaden are Python methods, run through reticulate

Spatial deconvolution methods to be added to `deconverse 0.3`

Same syntax, any method:

scref <- new_screference(kidney_so,
                         annot_id = c("compartment"),
                         project_name = "kidney",
                         batch_id = "donor")
scref <- compute_reference(scref, method = "card")

spatial_obj <- deconvolute(spatial_obj, scref, method = "rctd")

Example of spatial deconvolution results

SpatialFeaturePlot(spatial_obj,
                   features = deconverse_results(spatial_obj, method = "rctd")[[1]],
                   pt.size.factor = 1.3)

Example of spatial deconvolution results

SpatialDimPlot(spatial_obj, 
               group.by = deconverse_results(spatial_obj, method = "rctd", major_population = TRUE)[1], 
               pt.size.factor = 1.3)

New methodology for spatial deconvolution?

Not all current “spatial-specific” methods use spatial information during deconvolution:

Use spatial information: CARD, cell2location
Don’t use: SPOTlight, RCTD, DestVI

Graph-based? e.g. graph-NMF followed by NNLS

Thanks!

Email 📧 clarice.groeneveld@inserm.fr

Github 😺 csgroen

BlueSky (bye-bye X) 🟦 csgroen

Try Deconverse: github.com/csgroen/deconverse

Presentation available at: csgroen.github.io/posts/deconverse_bioinfoclub
