Evolving Landscape of Spatial Omics Ecosystem: Data Exploration, Interactivity and Visualization

Our ability to measure different dimensions of biological systems has grown steadily, from analyzing a handful of genes with methods like RT-PCR to measuring tens of thousands of genes with RNA-Seq across many samples. This technological leap has dramatically advanced our understanding of the molecular underpinnings of biological states and phenomena. Similarly, the transition from bulk omics (average omics profiles for biological samples) to single-cell analyses has helped uncover the intricacies of cellular ecosystems, revealing how different cells drive biological states and diseases with unprecedented granularity. Advancements in spatial omics, particularly spatial transcriptomics, have enabled direct linking of gene expression to tissue architecture in ways that were previously inconceivable.

Spatial transcriptomics offers a powerful means of identifying cell populations that hold unique functions, as revealed by their specific locations within tissue structures and distinct expression profiles.

However, as the exploration of biological samples becomes more multifaceted, the computational ecosystems required to manage the data also grow more complex. This complexity arises not only from the sheer number of measurements but also from the varied types and unique nature of each modality. Fortunately, researchers and developers keep building toolkits and computational infrastructure capable of handling these many dimensions. As a result, different ecosystems have emerged, each offering specific capabilities for data storage, processing, visualization, and analysis.

This blog examines the ecosystems available for spatial omics exploration.

It is crucial to focus on ecosystems rather than individual tools because researchers often select ecosystems based on the compatibility of tools, their efficiency, and the ease with which data can be transferred from one tool to another. As ecosystems evolve, researchers and developers across the globe design tools that integrate seamlessly within these frameworks. While components from different ecosystems can often interact, working within a single ecosystem streamlines transitions from one step or tool to the next. Accordingly, this blog concentrates on the broader ecosystems available for spatial omics research.

SpatialExperiment-Based Ecosystem

SpatialExperiment is a data infrastructure for storing and accessing spatially resolved transcriptomics data. It is a Bioconductor package that introduces an S4 class, extending the SingleCellExperiment class to incorporate spatial data such as spatial coordinates and image files. The class is versatile and accommodates datasets from spot-based sequencing platforms like 10x Genomics Visium as well as from molecule-based platforms that resolve data at the cellular level.

A defining feature of the SpatialExperiment ecosystem is its modular approach. Unlike other ecosystems where a single toolkit handles multiple functions, the SpatialExperiment ecosystem distributes core functionalities like on-disk data storage, visualization, and pre-processing across specialized tools. The SpatialExperiment class itself serves as the foundational data structure and works well with the other tools in the ecosystem.

The SpatialExperiment object includes the following key components: (i) assays for expression counts, (ii) rowData for feature information like genes, (iii) colData for metadata related to spots or cells, including spatial and non-spatial data, (iv) spatialCoords for spatial coordinates, and (v) imgData for image data. For spot-based data like 10x Genomics Visium, expression data is typically stored in a single assay named counts.
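The way these five components line up can be pictured with a small sketch. The Python dataclass below is purely illustrative and is not the Bioconductor API: the class name SpatialExperimentSketch, its field names, and the toy values are assumptions made for this example. It only shows that assay rows correspond to features and assay columns correspond to spots, with one coordinate pair per spot.

```python
from dataclasses import dataclass, field


@dataclass
class SpatialExperimentSketch:
    """Illustrative stand-in for the five components; not the real R class."""

    assays: dict          # assay name -> matrix as a list of rows (features x spots)
    row_data: list        # one dict of feature information per gene
    col_data: list        # one dict of metadata per spot or cell
    spatial_coords: list  # one (x, y) pair per spot or cell
    img_data: list = field(default_factory=list)  # image records, e.g. file paths

    def __post_init__(self):
        # Every component must agree on the number of features and spots.
        n_features, n_spots = len(self.row_data), len(self.col_data)
        if len(self.spatial_coords) != n_spots:
            raise ValueError("spatial_coords needs one entry per spot")
        for name, matrix in self.assays.items():
            if len(matrix) != n_features or any(len(r) != n_spots for r in matrix):
                raise ValueError(f"assay {name!r} must be features x spots")


# Toy object: 2 genes measured on 3 spots (all values are made up).
spe = SpatialExperimentSketch(
    assays={"counts": [[5, 0, 2], [1, 3, 0]]},
    row_data=[{"gene": "GeneA"}, {"gene": "GeneB"}],
    col_data=[{"in_tissue": True}, {"in_tissue": True}, {"in_tissue": False}],
    spatial_coords=[(0, 0), (0, 1), (1, 0)],
    img_data=[{"sample_id": "sample01", "path": "tissue_lowres.png"}],
)
print(len(spe.row_data), "features x", len(spe.col_data), "spots")
```

Keeping the dimension checks in one place mirrors the idea that every downstream tool can rely on the same aligned structure.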

Several tools are routinely used within the SpatialExperiment ecosystem, including:

  • scran: Used primarily for single-cell data analysis, scran is compatible with SingleCellExperiment and, by extension, SpatialExperiment. This makes it well suited to normalization, transformation, and data exploration.
  • alabaster.spatial: While R natively supports binary data storage in RDS format, alabaster.spatial offers a specialized method to store SpatialExperiment objects on disk using a combination of PNG, JSON, and H5 files.
  • ggspavis: A dedicated library for visualizing spatial transcriptomics data, ggspavis requires data in SingleCellExperiment or SpatialExperiment format. It contains visualization functions for plotting H&E images, along with spatial data in the form of spots, features, and QC plots.
  • spatialLIBD: An interactive Shiny application for exploring SpatialExperiment objects, spatialLIBD offers a wide range of visualizations for feature-level and spot-level exploration. It is under active development and also ships with its own datasets.

The SpatialExperiment ecosystem is flexible and provides a robust base for managing spatial data, which can be further enhanced with other tools for efficient data management, processing, visualization, and exploration.

Seurat Ecosystem

Originally developed as a single-cell toolkit, Seurat has since expanded its capabilities to include spatial transcriptomics. It now supports spatial data through spatially aware functions that build on its existing framework.

The Seurat-class S4 object stores multi-modal data within assays, with slots for raw counts, normalized data, and scaled data. For spatial data, it also includes an images slot that stores tissue images and maps spots to their physical locations on the tissue.

Beyond spatial data management, Seurat has a dedicated set of functions for spatial data exploration. These functions are designed to mirror their non-spatial counterparts. For instance, just as FindVariableFeatures finds highly variable features, FindSpatiallyVariableFeatures finds spatially variable features.

In addition, Seurat supports working with multiple tissue slices (images) within the same dataset and computes spatial statistics internally, using methods such as Moran’s I and the mark variogram (markvariogram).
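To make the idea behind Moran’s I concrete, here is a minimal sketch in plain Python. It is not Seurat's implementation: the helper names rook_weights and morans_i, the 4 x 4 toy grid, and the binary neighbor weights are all assumptions chosen for illustration. The statistic is computed as I = (N / W) * sum_ij w_ij (x_i - mean)(x_j - mean) / sum_i (x_i - mean)^2, where N is the number of spots and W is the sum of all weights.

```python
from itertools import product


def rook_weights(coords):
    """Binary weights: w[i][j] = 1 when spots i and j are one grid step apart."""
    n = len(coords)
    weights = [[0.0] * n for _ in range(n)]
    for i, (xi, yi) in enumerate(coords):
        for j, (xj, yj) in enumerate(coords):
            if abs(xi - xj) + abs(yi - yj) == 1:
                weights[i][j] = 1.0
    return weights


def morans_i(values, weights):
    """Global Moran's I for one feature measured at every spot."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    w_total = sum(sum(row) for row in weights)
    denom = sum(d * d for d in dev)
    if w_total == 0 or denom == 0:
        raise ValueError("need at least one neighbor pair and non-constant values")
    num = sum(weights[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n))
    return (n / w_total) * (num / denom)


# Toy 4 x 4 grid of spots with two made-up expression patterns.
coords = list(product(range(4), range(4)))
weights = rook_weights(coords)
clustered = [1.0 if x < 2 else 0.0 for x, _ in coords]     # left half "on"
checkerboard = [float((x + y) % 2) for x, y in coords]     # alternating spots

i_clustered = morans_i(clustered, weights)
i_checker = morans_i(checkerboard, weights)
print(f"clustered: {i_clustered:.3f}, checkerboard: {i_checker:.3f}")
```

A spatially clustered pattern yields a positive value, while an alternating pattern yields a negative one, which is why the statistic is useful for ranking spatially variable features.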

The Seurat ecosystem relies primarily on Seurat itself as the main toolkit. Nonetheless, it also works with tools like RCTD (Robust Cell Type Decomposition) and SPOTlight (also available for SpatialExperiment), which can use the Seurat data structure as a base.

In its capacity as the main toolkit, Seurat provides convenient options to explore spatial data, without the hassle of dealing with advanced image analysis.

Squidpy/Scanpy-AnnData Ecosystem

Scanpy, initially developed for single-cell RNA sequencing (scRNA-Seq), has been extended through Squidpy to accommodate spatial transcriptomics data. Squidpy integrates seamlessly with Scanpy by introducing spatially aware functions built on the existing framework.

AnnData, the core data structure in Scanpy, efficiently handles spatial data by storing spatial coordinates, associated metadata, and various data types in dedicated slots, including obs for cell metadata, var for feature metadata, obsm for per-observation arrays such as dimensionality reduction embeddings and spatial coordinates, and X for the data matrix. The flexible layers system enables the addition of multiple versions of normalized or scaled data, supporting a range of applications.

In addition to AnnData's capabilities, Squidpy extends the ecosystem with the ImageContainer class, which efficiently stores image data using an on-disk/in-memory switch powered by xarray and Dask. This setup supports multi-channel images and keeps processing and analysis smooth even for large spatial datasets.

Squidpy's modular design facilitates integration with other tools, creating a comprehensive framework for spatial transcriptomics analysis within Python. It is compatible with various methods across different domains such as deconvolution (Tangram), segmentation (Cellpose, StarDist), and visualization (napari). This integration equips researchers with a powerful and flexible approach to exploring the spatial organization of gene expression.

Comparative Study of Ecosystems 

Let’s evaluate these ecosystems in more technical detail and assess their utility for different use cases:

Language
  • SpatialExperiment: R
  • Seurat: R
  • Squidpy/Scanpy: Python

On-disk storage
  • SpatialExperiment: Supported by alabaster.spatial, which handles on-disk storage through a combination of .h5, .json, and .png files. Additionally, the native .RDS format in R supports binary storage of any data type.
  • Seurat: .h5Seurat, enabled by SeuratDisk, efficiently stores high-dimensional multi-modal data in H5 files. Additionally, the native .RDS format in R supports binary storage of any data type.
  • Squidpy/Scanpy: Squidpy builds on the Scanpy-AnnData ecosystem, using the .h5ad format for data storage. For images, it supports on-disk storage using a zarr store.

Interactive support
  • SpatialExperiment: spatialLIBD offers interactive visualization through a Shiny-based app for data processed within the ecosystem.
  • Seurat: Provides minimal interactivity by setting interactive = TRUE in plotting functions.
  • Squidpy/Scanpy: Squidpy integrates with napari for interactive data exploration and visualization.

Spatial data management
  • SpatialExperiment: Maintains data in two slots: spatialCoords, containing spatial coordinates, and imgData, containing image data.
  • Seurat: Stores images in a dedicated images slot, along with the data that links spots to their physical locations on the image.
  • Squidpy/Scanpy: Provides ImageContainer for managing image data, supporting multiple layers for different channels or versions of the same image for information overlay. Built with xarray and Dask, it keeps operations on image data efficient even for large images.

Visualization
  • SpatialExperiment: Provided by separate packages like ggspavis.
  • Seurat: Has its own dedicated visualization functions, like SpatialDimPlot and SpatialFeaturePlot.
  • Squidpy/Scanpy: Dedicated spatial plotting functions like spatial_segment and spatial_scatter are available, alongside spatial_neighbors for building the spatial neighborhood graph.

Spatial specific features
  • SpatialExperiment: To our understanding, not natively supported by the tools currently within the ecosystem.
  • Seurat: Wrapper functions for using Moran’s I and other statistics in steps such as spatially variable feature calculations.
  • Squidpy/Scanpy: Provides dedicated implementations for spatial statistics, including spatial autocorrelation measures like Moran’s I and Geary’s C, as well as spatial organization scores like Ripley’s K.

Unique selling proposition
  • SpatialExperiment: Flexible, as the ecosystem is built on top of a spatial data management class around which multiple tool options are available.
  • Seurat: Perfect for R users who want a comprehensive tool for multimodal integration, focusing on spatial and single-cell data.
  • Squidpy/Scanpy: Suitable for users requiring advanced image processing and spatial architecture analysis in Python. Additionally, several tools built by independent labs, researchers, and developers are available, making it a flexible and extended ecosystem.

At first glance, all the ecosystems discussed appear equally capable, but they differ in terms of supported functionalities and user experience.

One must consider the technical capabilities, flexibility, and ease of integration offered by each ecosystem while choosing the right fit for one’s work and research.

Let’s look at the value proposition of each ecosystem to make the decision easier:

Value Proposition

  • SpatialExperiment Ecosystem: This ecosystem is the most flexible, with a strong data structure foundation, and is ideal for researchers and developers who want to build custom tools. While currently limited in advanced analysis tools, it is evolving rapidly.
  • Seurat Ecosystem: A favorite among R users for its ease of use, Seurat provides comprehensive tools for spatial and single-cell data. It is perfect for users seeking straightforward data exploration without delving into complex image processing.
  • Squidpy/Scanpy-AnnData: A powerful option, especially for Python users, as it offers advanced image processing capabilities. It is suitable for both beginners and advanced users, who can benefit from Python’s extensive image analysis tools and seamless integration with other tools.

Elucidata provides custom solutions for spatial omics, regardless of downstream requirements.
To learn more, reach out to us at info@elucidata.io.
