ggheatmapper.Rmd
ggheatmapper
is built using ggplot2
and the
patchwork
framework for aligning plots. It supports adding
into the heatmap and all the powerful tweaks enabled by
theme
s in ggplot, and helps you align more information
about your data to the original heatmap to make “complex heatmaps” with
a more modern and tweakable framework.
The simplest, yet less powerful, way to use ggheatmapper
is with a matrix as-is. Here, we demonstrate with a subset of a gene
expression matrix for bladder cancer samples:
library(tidyverse)
#> ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
#> ✔ dplyr 1.1.3 ✔ readr 2.1.4
#> ✔ forcats 1.0.0 ✔ stringr 1.5.0
#> ✔ ggplot2 3.4.3 ✔ tibble 3.2.1
#> ✔ lubridate 1.9.2 ✔ tidyr 1.3.0
#> ✔ purrr 1.0.2
#> ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
#> ✖ dplyr::filter() masks stats::filter()
#> ✖ dplyr::lag() masks stats::lag()
#> ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(patchwork)
library(ggheatmapper)
#> Registered S3 methods overwritten by 'treeio':
#> method from
#> MRCA.phylo tidytree
#> MRCA.treedata tidytree
#> Nnode.treedata tidytree
#> Ntip.treedata tidytree
#> ancestor.phylo tidytree
#> ancestor.treedata tidytree
#> child.phylo tidytree
#> child.treedata tidytree
#> full_join.phylo tidytree
#> full_join.treedata tidytree
#> groupClade.phylo tidytree
#> groupClade.treedata tidytree
#> groupOTU.phylo tidytree
#> groupOTU.treedata tidytree
#> inner_join.phylo tidytree
#> inner_join.treedata tidytree
#> is.rooted.treedata tidytree
#> nodeid.phylo tidytree
#> nodeid.treedata tidytree
#> nodelab.phylo tidytree
#> nodelab.treedata tidytree
#> offspring.phylo tidytree
#> offspring.treedata tidytree
#> parent.phylo tidytree
#> parent.treedata tidytree
#> root.treedata tidytree
#> rootnode.phylo tidytree
#> sibling.phylo tidytree
data(tcgaBLCA_ex)
gexp <- tcgaBLCA_ex$gexp
gghm <- ggheatmap(gexp,
hm_colors = 'RdBu',
hm_color_values = scales::rescale(c(-4,-2,-1,-0.5,-0.25,0,0.25,0.5,1,2,4,6)),
scale = TRUE,
center = TRUE,
show_dend_row = TRUE,
colors_title = "Scaled expression (log2 UQ)",
show_colnames = FALSE)
#> Running `ggheatmap` in matrix mode. If that's not intentional, provide a `colv`.
gghm
This method still will support extending with
align_to_hm
but would not support add_tracks
.
You can still modify this by using &
and adding
theme
, for example:
gghm &
theme(axis.text = element_text(size = 6))
If transpose our original table and extended it with new variables,
we can unlock more features using ggheatmap
.
sample_annot <- tcgaBLCA_ex$sample_annot %>% as_tibble()
genes <- rownames(gexp)
tcgaBLCA_tb <- gexp %>%
t() %>%
as.data.frame() %>%
rownames_to_column("sample") %>%
left_join(sample_annot, by = "sample") %>%
tibble() %>%
group_by(consensusClass)
head(tcgaBLCA_tb)
sample | FBP1 | ACER2 | PKHD1 | CAPN5 | S100P | TMEM51 | DHRS2 | CYP4F22 | SPINK1 | ACSL5 | ST3GAL5 | TBX3 | HPGD | TGFBR3 | FAM3B | ATP8B1 | RNF128 | SNCG | SLC44A3 | GATA3 | PPARG | ICA1 | GGT6 | RAB11A | TRAK1 | VSIG2 | BCAS1 | RAB15 | FAM174B | SLC29A3 | FOXA1 | GOLT1A | PPFIBP2 | DENND2D | ACAA1 | DNAJA4 | HMGCS2 | CYP2J2 | VSNL1 | KRT14 | TGM1 | SERPINB4 | GSDMC | KRT6A | LGALS7 | SFN | SPRR2A | C12orf54 | SPRR2D | HOXD11 | KRT6C | KRT5 | DSG3 | KRT6B | HOXD10 | IL20RB | RHCG | AHNAK2 | SPRR2F | FGFBP1 | sex | age | stage | node | metastasis | papillary | squamous | neuroendocrine | plasmacytoid | consensusClass | cor_pval | separationLevel | LumP | LumNS | LumU | Stroma.rich | Ba.Sq | NE.like |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
TCGA-XF-A9SW-01A | 3.051428 | 0.7600492 | 1.0344884 | 3.190462 | 7.252360 | 4.695958 | 3.553598 | 0.0458037 | 8.070943 | 5.222600 | 2.7737241 | 4.846704 | 4.195551 | 5.317267 | 3.1750867 | 4.278225 | 3.175087 | 4.8426545 | 2.011588 | 5.840220 | 4.209453 | 2.770317 | 3.728798 | 5.853159 | 4.839407 | 3.2770249 | 0.9406215 | 4.783051 | 3.517479 | 3.205675 | 2.9853829 | 3.279423 | 2.467868 | 3.198088 | 3.772022 | 3.682428 | 0.3495844 | 0.718229 | 0.6896599 | 0.8271634 | 0.5215371 | 0.0230836 | 1.3219281 | 0.2937312 | 0.0000000 | 6.262246 | 1.2842084 | 0.3856537 | 0.2552571 | 0.1118929 | 0.0230836 | 1.853159 | 0.0000000 | 0.0000000 | 0.3677318 | 0.5052353 | 1.8786937 | 5.6623525 | 0.0000000 | 0.0901978 | M | 85 | T3 | N2 | NA | 0 | 0 | 0 | 0 | Stroma-rich | 0 | 0.8871657 | 0.3931646 | 0.4925222 | 0.4609638 | 0.6421217 | 0.4954034 | 0.2989414 |
TCGA-BL-A13J-01B | 2.538652 | 4.4439186 | 0.2504068 | 2.982211 | 4.283709 | 2.692439 | 8.245226 | 1.7312968 | 3.901222 | 4.771899 | 3.2720427 | 5.745076 | 5.665011 | 4.000000 | 1.6374299 | 5.595922 | 5.496664 | 1.4043903 | 3.260282 | 5.052941 | 5.196140 | 3.543512 | 2.441317 | 5.449971 | 5.443269 | 3.0979633 | 1.8087013 | 3.592799 | 3.937369 | 1.390071 | 2.4585741 | 1.371969 | 3.780845 | 3.206136 | 3.224412 | 6.407171 | 2.1814040 | 2.950846 | 0.0554951 | 3.5974805 | 0.9714308 | 0.4360991 | 0.6614754 | 2.7732793 | 0.0000000 | 3.995869 | 1.0093987 | 0.0187366 | 0.7026141 | 0.5499671 | 0.0372329 | 4.684393 | 1.3756074 | 0.2264279 | 1.1477536 | 0.9952776 | 2.3462385 | 4.5877101 | 0.0093987 | 0.6909794 | M | 65 | T4 | N2 | M0 | 1 | 0 | 0 | 0 | LumP | 0 | 0.8960881 | 0.5418487 | 0.4977683 | 0.5031495 | 0.4995555 | 0.4950211 | 0.1822938 |
TCGA-C4-A0F1-01A | 1.532221 | 2.0437214 | 3.0887679 | 2.473487 | 4.091374 | 3.335184 | 4.318595 | 0.2255597 | 1.804181 | 1.606989 | 0.8845228 | 2.225560 | 1.054448 | 1.599684 | 0.1475572 | 4.381709 | 3.059781 | 0.9888594 | 2.791413 | 4.214842 | 1.739183 | 2.461448 | 1.022026 | 6.647846 | 5.364572 | 0.1273793 | 0.2255597 | 1.562595 | 1.524527 | 2.191951 | 0.2630344 | 0.000000 | 2.592342 | 3.752419 | 4.852614 | 5.689729 | 0.0000000 | 2.566347 | 5.9933957 | 14.1342548 | 1.5700892 | 8.6758485 | 4.1235643 | 13.8688908 | 0.3526716 | 10.493024 | 4.8144715 | 0.5923420 | 5.0302002 | 3.5341382 | 10.4124884 | 13.898507 | 10.7761166 | 11.8531476 | 4.1056265 | 8.1470875 | 6.8639003 | 10.1725237 | 0.4533656 | 6.4604401 | M | 71 | T3 | N0 | M0 | 0 | 1 | 0 | 0 | Ba/Sq | 0 | 0.8163011 | 0.2997577 | 0.2315946 | 0.2387890 | 0.3410051 | 0.6597589 | 0.1615142 |
TCGA-BT-A20W-01A | 5.978460 | 5.0488941 | 1.2202661 | 4.915787 | 7.032628 | 4.972314 | 8.033026 | 5.0979235 | 8.851008 | 6.831062 | 5.1470213 | 7.443114 | 6.536437 | 4.238503 | 4.1879897 | 5.445847 | 5.888681 | 8.1717314 | 4.009266 | 7.773360 | 6.569229 | 4.644451 | 6.227231 | 6.769685 | 6.237912 | 5.8543865 | 5.1486993 | 5.892191 | 3.999070 | 4.605880 | 5.4792386 | 2.973734 | 5.580307 | 4.923159 | 5.437634 | 5.724268 | 6.6118234 | 2.794550 | 0.1280076 | 2.5370783 | 0.9314686 | 0.0000000 | 0.2949049 | 0.9157870 | 0.0000000 | 5.650089 | 0.0294438 | 0.1143327 | 0.1143327 | 0.5599585 | 0.0147970 | 2.134797 | 0.0725125 | 0.0865877 | 0.8009666 | 0.7399372 | 0.4335102 | 1.3009540 | 0.0000000 | 0.1814469 | M | 71 | T2 | N0 | M0 | 0 | 0 | 0 | 0 | LumU | 0 | 0.0257908 | 0.7403699 | 0.7359217 | 0.7420607 | 0.6170841 | 0.4691149 | 0.2982330 |
TCGA-GV-A40G-01A | 6.027991 | 3.9534802 | 2.8433655 | 7.060945 | 6.679805 | 5.813682 | 4.730013 | 5.1503722 | 9.804690 | 3.986061 | 3.8487674 | 7.189749 | 6.635844 | 4.667193 | 4.3252536 | 5.299278 | 6.732415 | 8.4981895 | 3.684128 | 5.840269 | 5.940347 | 4.537487 | 5.706403 | 7.319151 | 6.190736 | 5.8231222 | 5.0257784 | 6.155207 | 4.876234 | 4.512908 | 4.8017954 | 2.762267 | 5.332157 | 4.636290 | 5.964363 | 5.688439 | 5.1330829 | 2.457413 | 0.1171835 | 2.3351842 | 1.6322682 | 0.3699496 | 0.2444187 | 0.2160369 | 0.0110552 | 6.640857 | 0.1674567 | 0.1475572 | 0.0756643 | 0.7458165 | 0.0220263 | 2.471488 | 0.0000000 | 0.0756643 | 1.1968007 | 0.6849913 | 1.3996970 | 0.1968007 | 0.0000000 | 0.7785321 | M | 77 | T2 | N0 | NA | 0 | 0 | 0 | 0 | LumU | 0 | 0.0748451 | 0.6734381 | 0.6157098 | 0.6832107 | 0.4895690 | 0.4171794 | 0.2647129 |
TCGA-HQ-A2OE-01A | 5.562190 | 5.3558523 | 3.5948780 | 5.413716 | 7.054534 | 5.541404 | 6.580083 | 1.1238717 | 7.464341 | 6.386977 | 5.2531547 | 9.160480 | 8.112341 | 5.058496 | 6.3492774 | 7.093522 | 5.697183 | 7.7692337 | 4.639871 | 7.434341 | 6.044997 | 3.523129 | 4.547161 | 7.047030 | 6.513904 | 4.4078684 | 5.1383513 | 6.474951 | 4.267174 | 4.653376 | 7.2894147 | 5.716423 | 6.040620 | 5.575813 | 5.173789 | 5.508778 | 9.2317177 | 7.126793 | 0.0099155 | 0.6842771 | 0.7449034 | 0.0774788 | 0.3436529 | 0.5682838 | 0.0000000 | 8.060994 | 0.0962153 | 0.2037952 | 0.9338331 | 0.0099155 | 0.0099155 | 4.418143 | 0.6529809 | 0.1599409 | 0.2630344 | 0.5548005 | 1.1820347 | 0.3975197 | 0.0000000 | 0.1329739 | M | 69 | T2 | N2 | NA | 1 | 0 | 0 | 0 | LumP | 0 | 0.3120613 | 0.6270255 | 0.5555841 | 0.5852272 | 0.4305816 | 0.4076061 | 0.2394613 |
One of the additional features we unlock is using
grouping
with group_by
to make semi-supervised
heatmaps. In this case, we need to say which column contains the IDs
that will be the columns of the heatmap (in this case,
colv = 'sample'
), and what are the columns we want to plot
as rows (rowv = genes
). Other parameters here are
graphical, for demonstration:
gr_gghm <- ggheatmap(tcgaBLCA_tb,
colv = "sample",
rowv = genes,
hm_colors = 'RdBu',
hm_color_values = scales::rescale(c(-4,-2,-1,-0.5,-0.25,0,0.25,0.5,1,2,4,6)),
scale = TRUE,
center = TRUE,
show_dend_row = FALSE,
show_colnames = FALSE,
show_rownames = FALSE,
group_colors = c(`Ba/Sq` = "#fe4a49", LumNS = "#32837d", LumP = "#06d6a0", LumU = "#009fb7",
`Stroma-rich` = "#f9c80e", `NE-like` = "#7d5ba6"),
colors_title = "Scaled expression (log2 UQ)")
gr_gghm +
plot_layout(guides = 'collect')
Here, we can see that the heatmap is clustered in a semi-supervised
manner (by consensusClass
). It also enables
add_tracks
:
You can add_tracks
for variables that were in the
original table fed to ggheatmap
. You can see what variables
are available using get_data
:
get_data(gr_gghm) %>% colnames()
#> [1] "observations" "FBP1" "ACER2" "PKHD1"
#> [5] "CAPN5" "S100P" "TMEM51" "DHRS2"
#> [9] "CYP4F22" "SPINK1" "ACSL5" "ST3GAL5"
#> [13] "TBX3" "HPGD" "TGFBR3" "FAM3B"
#> [17] "ATP8B1" "RNF128" "SNCG" "SLC44A3"
#> [21] "GATA3" "PPARG" "ICA1" "GGT6"
#> [25] "RAB11A" "TRAK1" "VSIG2" "BCAS1"
#> [29] "RAB15" "FAM174B" "SLC29A3" "FOXA1"
#> [33] "GOLT1A" "PPFIBP2" "DENND2D" "ACAA1"
#> [37] "DNAJA4" "HMGCS2" "CYP2J2" "VSNL1"
#> [41] "KRT14" "TGM1" "SERPINB4" "GSDMC"
#> [45] "KRT6A" "LGALS7" "SFN" "SPRR2A"
#> [49] "C12orf54" "SPRR2D" "HOXD11" "KRT6C"
#> [53] "KRT5" "DSG3" "KRT6B" "HOXD10"
#> [57] "IL20RB" "RHCG" "AHNAK2" "SPRR2F"
#> [61] "FGFBP1" "sex" "age" "stage"
#> [65] "node" "metastasis" "papillary" "squamous"
#> [69] "neuroendocrine" "plasmacytoid" "consensusClass" "cor_pval"
#> [73] "separationLevel" "LumP" "LumNS" "LumU"
#> [77] "Stroma.rich" "Ba.Sq" "NE.like"
observations
will always be your ID variable, which is a
factor ordered in the same way as the heatmap, to ease making new plots
that will be perfectly aligned. Here, we’ll add some clinical
tracks:
gr_gghm <- add_tracks(gr_gghm,
track_columns = c("stage", "node", "metastasis"),
track_colors = list(stage = 'Greys', node = 'Oranges', metastasis = 'Reds'),
track_prop = 0.2)
#> Adding missing grouping variables: `consensusClass`
gr_gghm +
plot_layout(guides = 'collect')
Here we’ll make two different plots: one to align with the samples
(columns) and another to align with the genes (rows). We can get the
data from get_data, which is a good idea to facilitate further plotting
because it will ensure your observations will be in the correct order.
Then, here, we make a line-plot with the correlations of each sample to
the centroid of the 6 consensus bladder cancer subtypes. Note the use of
theme_quant
, one of our suggested themes that look nice
with ggheatmap
s, and that we switch the y-axis to the right
side to make the end-product look better (though everything will work
without this):
tcgaBLCA_tb2 <- get_data(gr_gghm)
plt_corlines <- tcgaBLCA_tb2 %>%
ungroup() %>%
select(observations, LumP:NE.like) %>%
pivot_longer(cols = -observations, names_to = "subtype", values_to = "cor") %>%
ggplot(aes(observations, cor, color = subtype, group = subtype)) +
geom_line() +
scale_y_continuous(position = "right") +
scale_color_manual(values = c(`Ba.Sq` = "#fe4a49", LumNS = "#32837d", LumP = "#06d6a0", LumU = "#009fb7",
`Stroma.rich` = "#f9c80e", `NE.like` = "#7d5ba6")) +
guides(color = FALSE) +
labs(y = "Correlation\n to centroid") +
theme_quant()
#> Warning: The `<scale>` argument of `guides()` cannot be `FALSE`. Use "none" instead as
#> of ggplot2 3.3.4.
#> This warning is displayed once every 8 hours.
#> Call `lifecycle::last_lifecycle_warnings()` to see where this warning was
#> generated.
plt_corlines
The second plot should align to the rows. We can’t just use the
get_data
because it doesn’t contain the information to
order our rows, but we can get this from get_rowLevels
.
Here, we plot which signature each gene in our example data belongs
to:
plt_row_annot <- tcgaBLCA_ex$gene_annot %>%
mutate(gene_symbol = factor(gene_symbol, levels = get_rowLevels(gr_gghm)),
group = 'signature') %>%
ggplot(aes(gene_symbol, group, fill = signature)) +
geom_tile() +
labs(y = "") +
coord_flip() +
theme_sparse2()
plt_row_annot
Finally, we can use align_to_hm
to add these plots to
the original hm with all panels properly aligned. Note the use of
legend_action = 'collect'
in the final call, that will
unite all the legends in a nice way:
gghm_complete <- gr_gghm %>%
align_to_hm(plt_corlines, newplt_size_prop = 0.3) %>%
align_to_hm(plt_row_annot, pos = "left", newplt_size_prop = 0.08,
legend_action = "collect", tag_level = 'keep')
gghm_complete
Note that you can still use the patchwork
&
to make global changes to all plots:
gghm_complete <- gghm_complete &
theme(legend.text = element_text(size = 7),
legend.title = element_text(size = 8))
gghm_complete
You can now make panels with your complete heatmap with
patchwork
, cowplot
or other aligning packages.
For example:
plt_subtype_count <- ggplot(sample_annot, aes(consensusClass, fill = consensusClass)) +
geom_bar() +
scale_fill_manual(values = c(`Ba/Sq` = "#fe4a49", LumNS = "#32837d",
LumP = "#06d6a0", LumU = "#009fb7",
`Stroma-rich` = "#f9c80e",
`NE-like` = "#7d5ba6")) +
labs(y = 'Number of samples') +
guides(fill = FALSE) +
theme_quant() +
theme(axis.ticks.x = element_line(color = "black"),
axis.text.x = element_text(color = "black", angle = 45, hjust = 1, vjust = 1))
plt_subtype_count
Then, we can just use standard patchwork
to align the
ggheatmap
and our new plots (as they won’t be aligned with
the heatmap part of the plot, but with the entire plot):
library(patchwork)
new_col <- (plt_subtype_count + plot_spacer()) +
plot_layout(heights = c(0.3,0.7))
(new_col | gghm_complete) +
plot_layout(widths = c(0.4,0.6))
As of ggheatmapper 0.1.2, we’ve added row facetting options using the
rowv
parameter. ggheatmapper will render parts of the
heatmap in different facets if the rowv
argument for the
ggheatmap
call is a named list:
sig_list <- split(tcgaBLCA_ex$gene_annot$gene_symbol, tcgaBLCA_ex$gene_annot$signature)
gr_gghm <- ggheatmap(tcgaBLCA_tb,
colv = "sample",
rowv = sig_list,
hm_colors = 'RdBu',
hm_color_values = scales::rescale(c(-4,-2,-1,-0.5,-0.25,0,0.25,0.5,1,2,4,6)),
scale = TRUE,
center = TRUE,
show_dend_row = FALSE,
show_colnames = FALSE,
show_rownames = FALSE,
group_colors = c(`Ba/Sq` = "#fe4a49", LumNS = "#32837d", LumP = "#06d6a0", LumU = "#009fb7",
`Stroma-rich` = "#f9c80e", `NE-like` = "#7d5ba6"),
colors_title = "Scaled expression (log2 UQ)")
gr_gghm +
plot_layout(guides = 'collect')