Data Visualization

Author: Clarence Mah | Last Updated: May 15, 2023

This tutorial demonstrates spatial visualization in bento-tools for exploring subcellular biology, using the seqFISH+ 3T3 cells dataset that is included in the package.

A Brief Overview

bento-tools provides a high-level, matplotlib-based interface for plotting spatial transcriptomics data stored as an AnnData object; see the data structure documentation for details. Data is represented as points and shapes, corresponding to molecules and segmentation masks respectively. The interface closely mirrors the seaborn package for mapping data semantics and replicates some geopandas plotting functionality, with styles better suited to visualizing subcellular data. For spatial visualization at the tissue level (i.e. plotting cell coordinates instead of cell boundaries), we recommend using squidpy and scanpy instead.

Note

In general, plotting in bento-tools assumes that a dataset contains multiple fields of view (fov), which must be encoded in adata.obs["batch"]. Each plotting function draws a single fov at a time, selected with the batch parameter; if unspecified, the batch of the first cell in adata is used.

If available, cell and nuclear shapes are plotted by default. Plot more shape layers by passing their names in a list to the shapes parameter.
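
As a rough illustration of the field-of-view convention described in this note, the sketch below uses a small pandas DataFrame as a stand-in for adata.obs. The helper name select_batch and the cell and batch values are hypothetical; this is not bento-tools' internal code, only the behaviour the note describes (an explicit batch wins, otherwise the first cell's batch is used).

```python
import pandas as pd

# Hypothetical stand-in for adata.obs: one row per cell, with a "batch" column.
obs = pd.DataFrame(
    {"batch": ["0", "0", "1", "1", "2"]},
    index=["cell-1", "cell-2", "cell-3", "cell-4", "cell-5"],
)


def select_batch(obs, batch=None):
    """Return the names of the cells that belong to one field of view.

    If batch is None, fall back to the batch of the first cell,
    mirroring the default described in the note.
    """
    if batch is None:
        batch = obs["batch"].iloc[0]
    return obs.index[obs["batch"] == batch].tolist()


default_cells = select_batch(obs)  # uses the first cell's batch ("0")
explicit_cells = select_batch(obs, batch="1")
print(default_cells)
print(explicit_cells)
```

With this toy table, the default call returns the two cells of batch "0", and passing batch="1" returns the two cells of that fov.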

Load Libraries and Data

import bento as bt
import matplotlib as mpl
import matplotlib.pyplot as plt
adata = bt.ds.load_dataset("seqfish")
adata
AnnData object with n_obs × n_vars = 211 × 9506
    obs: 'cell_shape', 'nucleus_shape', 'batch'
    uns: 'points'
    layers: 'spliced', 'unspliced'

Plotting points

Let’s plot the points (RNA) as a scatterplot in 2D. This is a lightweight wrapper around sns.scatterplot. Refer to the seaborn documentation for more details.

bt.pl.points(adata)

You can use hue to color transcripts by gene identity. This dataset has more than 9,000 genes, so the coloring is not very informative here; you can also hide the legend with legend=False.

bt.pl.points(adata, hue="gene", legend=False)

If you have certain genes of interest, you can slice the adata object for that subset.

genes = ["Tln1", "Col1a1", "Dynll2"]
bt.pl.points(adata[:, genes], hue="gene")
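
To make the subsetting step concrete, the sketch below mimics it on a synthetic points table using pandas and matplotlib only. The column names (x, y, gene), the random coordinates, and the placeholder gene "OtherGene" are assumptions for illustration; bento-tools itself works through the AnnData slice and sns.scatterplot.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# Synthetic molecule table; column names are assumed for illustration.
all_genes = ["Tln1", "Col1a1", "Dynll2", "OtherGene"]
points = pd.DataFrame(
    {
        "x": rng.uniform(0, 100, size=400),
        "y": rng.uniform(0, 100, size=400),
        "gene": rng.choice(all_genes, size=400),
    }
)

# Keep only the genes of interest, analogous to adata[:, genes].
genes = ["Tln1", "Col1a1", "Dynll2"]
subset = points[points["gene"].isin(genes)]

# One scatter call per gene gives each gene its own color and legend entry.
fig, ax = plt.subplots()
for gene, group in subset.groupby("gene"):
    ax.scatter(group["x"], group["y"], s=4, label=gene)
ax.set_aspect("equal")
ax.legend(title="gene")
```

Filtering before plotting keeps the legend short and the colors distinguishable, which is the point of slicing to a handful of genes.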

Plotting distributions

It is often more useful to look at how molecules are distributed than at individual points. The density() function wraps sns.histplot and sns.kdeplot, which are selected with kind='hist' and kind='kde' respectively.

Plot 2D histogram of points:

bt.pl.density(adata)
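
For intuition about what a 2D histogram of points computes, the sketch below bins synthetic coordinates with numpy.histogram2d and draws the counts with pcolormesh. The coordinates and the bin count are made up; bento-tools delegates this step to sns.histplot, whose defaults may differ.

```python
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)

# Synthetic molecule coordinates clustered around one location.
x = rng.normal(loc=50, scale=10, size=5000)
y = rng.normal(loc=50, scale=10, size=5000)

# Count molecules per bin on a 40 x 40 grid spanning the data range.
counts, xedges, yedges = np.histogram2d(x, y, bins=40)

fig, ax = plt.subplots()
# histogram2d indexes counts as [x_bin, y_bin]; pcolormesh expects [y, x],
# so the array is transposed before drawing.
mesh = ax.pcolormesh(xedges, yedges, counts.T, cmap="viridis")
ax.set_aspect("equal")
fig.colorbar(mesh, ax=ax, label="molecules per bin")
```

Because the work is a single pass that assigns each molecule to a bin, this kind of plot stays cheap even for many points.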

Plot 2D kernel density estimate of points:

Note

Density plots are not recommended for a large number of points; plotting will be extremely slow.

bt.pl.density(adata, kind="kde")
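
One way to see why kernel density estimates become slow is to write out a naive version. The sketch below is not how seaborn or bento-tools implement it; it assumes synthetic points, a fixed isotropic Gaussian kernel, and an arbitrary bandwidth, and exists only to show that the work grows with (number of grid locations) x (number of points).

```python
import numpy as np

rng = np.random.default_rng(0)
points = rng.normal(loc=0.0, scale=1.0, size=(1000, 2))  # synthetic (x, y)

# Evaluation grid: 40 x 40 = 1600 locations.
grid_1d = np.linspace(-4, 4, 40)
gx, gy = np.meshgrid(grid_1d, grid_1d)
grid = np.column_stack([gx.ravel(), gy.ravel()])

bandwidth = 0.3  # assumed fixed bandwidth for this sketch

# Squared distance between every grid location and every point.
# This (n_grid, n_points) matrix is where the cost comes from.
diff = grid[:, None, :] - points[None, :, :]
sq_dist = (diff ** 2).sum(axis=-1)

# Average of isotropic 2D Gaussian kernels centered on each point.
kernel = np.exp(-sq_dist / (2 * bandwidth ** 2)) / (2 * np.pi * bandwidth ** 2)
density = kernel.mean(axis=1).reshape(gx.shape)

print(sq_dist.shape)  # one kernel evaluation per (grid location, point) pair
```

Doubling the number of molecules doubles the size of that matrix, whereas a histogram only needs one bin assignment per molecule.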

Plotting shapes

For finer control over plotting shapes, you can use bt.pl.shapes(). Similar to above, cells and nuclei are shown by default. This function wraps the geopandas function GeoDataFrame.plot().

bt.pl.shapes(adata)

For convenience, shapes() provides two coloring styles, color_style='outline' (default) and color_style='fill'.

bt.pl.shapes(adata, color_style="fill")

You can use the hue parameter to color shapes by group, e.g. cell, cell type, or phenotype.

bt.pl.shapes(adata, hue="cell", color_style="fill")
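
The idea behind coloring by group can be sketched in plain matplotlib: build one color per distinct label, then look the color up for each shape. The labels, circle positions, and the choice of the tab10 colormap below are all hypothetical; bento-tools handles this mapping internally through hue.

```python
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.patches import Circle

# Hypothetical per-shape group labels and positions.
labels = ["type A", "type A", "type B", "type A", "type B"]
centers = np.array([[0, 0], [3, 0], [6, 0], [1.5, 3], [4.5, 3]])

# Assign one color per distinct label from a qualitative colormap.
categories = sorted(set(labels))
cmap = plt.get_cmap("tab10")
palette = {cat: cmap(i) for i, cat in enumerate(categories)}

fig, ax = plt.subplots()
for (cx, cy), label in zip(centers, labels):
    ax.add_patch(
        Circle((cx, cy), radius=1.2, facecolor=palette[label], edgecolor="black")
    )

# Patches do not trigger autoscaling, so set the view limits explicitly.
ax.set_xlim(-2, 8)
ax.set_ylim(-2, 5)
ax.set_aspect("equal")
```

Shapes that share a label share a color, so groups can be read off the plot at a glance.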

You can also layer shapes on top of each other in the same plot by passing the same Axes to successive calls. This lets you style each layer differently; for example, we can highlight the nucleus with a fill color and draw the cell boundary with a dashed line.

fig, ax = plt.subplots()
bt.pl.shapes(adata, shapes="cell", linestyle="--", ax=ax)
bt.pl.shapes(
    adata,
    shapes="nucleus",
    edgecolor="black",
    facecolor="lightseagreen",
    ax=ax,
)
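
The same layering pattern can be reproduced without any dataset by drawing two matplotlib patches onto one shared Axes. The ellipse and circle coordinates below are synthetic stand-ins for a cell and a nucleus; this is a sketch of the general technique, not of GeoDataFrame.plot() or bento-tools internals.

```python
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.patches import Polygon

# Synthetic outlines: a large ellipse as the "cell", a small circle as the "nucleus".
theta = np.linspace(0, 2 * np.pi, 60, endpoint=False)
cell_xy = np.column_stack([10 * np.cos(theta), 8 * np.sin(theta)])
nucleus_xy = np.column_stack([3 * np.cos(theta) + 1, 3 * np.sin(theta)])

fig, ax = plt.subplots()

# First layer: dashed, unfilled cell boundary.
cell = Polygon(cell_xy, closed=True, fill=False, edgecolor="black", linestyle="--")
ax.add_patch(cell)

# Second layer on the same Axes: filled nucleus drawn on top.
nucleus = Polygon(
    nucleus_xy, closed=True, facecolor="lightseagreen", edgecolor="black"
)
ax.add_patch(nucleus)

ax.set_aspect("equal")
ax.autoscale_view()  # add_patch does not rescale the view, so request it explicitly
```

Because both calls receive the same ax, later layers are drawn over earlier ones, which is why the outline is added first and the filled shape second.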

Figure aesthetics

To remove unnecessary plot elements and control the layout, you can use these convenience parameters:

  • axis_visible: show/hide axis labels and ticks

  • frame_visible: show/hide spines

  • square: makes the plot square, useful for lining up multiple subplots

  • title: defaults to the batch name, override with your own title

fig, axes = plt.subplots(1, 2, figsize=(8, 4))

bt.pl.density(adata, ax=axes[0], title="default styling")

bt.pl.density(
    adata,
    ax=axes[1],
    axis_visible=True,
    frame_visible=True,
    square=True,
    title="square plot + axis",
)
plt.tight_layout()
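
For readers more familiar with raw matplotlib, the sketch below shows calls that achieve comparable effects to the four parameters. The helper name style_axes and its defaults are hypothetical, and the mapping is an assumption rather than a description of how bento-tools implements these options.

```python
import matplotlib.pyplot as plt


def style_axes(ax, axis_visible=False, frame_visible=False, square=False, title=None):
    """Apply declutter options to a matplotlib Axes (hypothetical helper)."""
    # Axis labels and ticks.
    ax.xaxis.set_visible(axis_visible)
    ax.yaxis.set_visible(axis_visible)
    # Spines are the four border lines around the plotting area.
    for spine in ax.spines.values():
        spine.set_visible(frame_visible)
    # A box aspect of 1 makes the plotting area square regardless of data limits.
    if square:
        ax.set_box_aspect(1)
    if title is not None:
        ax.set_title(title)


fig, axes = plt.subplots(1, 2, figsize=(8, 4))
for ax in axes:
    ax.plot([0, 1, 2], [0, 1, 4])

style_axes(axes[0], title="decluttered")
style_axes(
    axes[1],
    axis_visible=True,
    frame_visible=True,
    square=True,
    title="square + axis",
)
fig.tight_layout()
```

Note that set_box_aspect requires matplotlib 3.3 or newer.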

The plotting functions can also be called inside a matplotlib style context, for example with a dark background:

with mpl.style.context("dark_background"):
    fig, ax = plt.subplots()
    bt.pl.shapes(adata, shapes="cell", linestyle="--", ax=ax)
    bt.pl.shapes(
        adata,
        shapes="nucleus",
        edgecolor="black",
        facecolor="lightseagreen",
        ax=ax,
    )

Building subplots

Since all plotting functions operate on matplotlib.axes.Axes objects, you can not only build plots layer by layer but also arrange them in multiple subplots.

You can tile across individual cells:

cells = adata.obs_names[:8]  # get some cells
ncells = len(cells)

ncols = 4
nrows = 2
ax_height = 1.5
fig, axes = plt.subplots(
    nrows, ncols, figsize=(ncols * ax_height, nrows * ax_height)
)  # instantiate

for c, ax in zip(cells, axes.flat):
    bt.pl.density(
        adata[c],
        ax=ax,
        square=True,
        title="",
    )

plt.subplots_adjust(wspace=0, hspace=0, bottom=0, top=1, left=0, right=1)

Or tile across each batch:

batches = adata.obs["batch"].unique()[:6]  # get 6 batches
nbatches = len(batches)

ncols = 3
nrows = 2
ax_height = 3
fig, axes = plt.subplots(
    nrows, ncols, figsize=(ncols * ax_height, nrows * ax_height)
)  # instantiate

for b, ax in zip(batches, axes.flat):
    bt.pl.density(
        adata,
        batch=b,
        ax=ax,
        square=True,
        title="",
    )

# remove empty axes
for ax in axes.flat[nbatches:]:
    ax.remove()

plt.subplots_adjust(wspace=0, hspace=0, bottom=0, top=1, left=0, right=1)
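
Both tiling examples hard-code the number of rows. A small generalization, sketched below with placeholder labels instead of real data, derives the row count from the number of items and removes whatever axes are left over. The item names and panel size are made up for illustration.

```python
import math

import matplotlib.pyplot as plt

items = ["fov-0", "fov-1", "fov-2", "fov-3", "fov-4"]  # placeholder labels
ncols = 3
nrows = math.ceil(len(items) / ncols)  # enough rows to hold every item
ax_size = 2  # inches per panel

# squeeze=False keeps axes two-dimensional even for a single row or column.
fig, axes = plt.subplots(
    nrows, ncols, figsize=(ncols * ax_size, nrows * ax_size), squeeze=False
)

# zip stops at the shorter sequence, so surplus axes are left untouched here.
for item, ax in zip(items, axes.flat):
    ax.text(0.5, 0.5, item, ha="center", va="center")
    ax.set_xticks([])
    ax.set_yticks([])

# Remove the axes that received no item.
for ax in axes.flat[len(items):]:
    ax.remove()

fig.subplots_adjust(wspace=0, hspace=0, bottom=0, top=1, left=0, right=1)
```

In a real figure, the ax.text call would be replaced by a plotting call that accepts ax, such as the density plots used in the tiling examples.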