
3 Common Misconceptions in Spatial Transcriptome Data Analysis

  • Mar 25
  • 5 min read

Updated: Apr 8

If you’re familiar with single-cell data, you already know the leap that spatial transcriptomics offers: by preserving spatial context, tissue architecture, and niche environments, spatial analysis tells us where cells are and how they interact. Yet bridging that gap isn't always straightforward. In this post, we’ll break down 3 common misconceptions we have seen in spatial transcriptome data analysis, along with a few tips to optimize your pipeline.


Misconception #1: All Spatial Transcriptome Data Provides Single-Cell Resolution


First and foremost, not all spatial transcriptome data comes at single-cell resolution. Platforms like 10x Genomics Xenium and NanoString CosMx are known for achieving single-cell resolution, but earlier technologies such as 10x Visium don’t offer it. The difference comes from their underlying approaches: Visium is sequencing-based, whereas Xenium and CosMx are imaging-based (they use fluorescence microscopy to detect transcripts in place). In Visium, the tissue section sits on an array of predefined, barcoded spots; the transcripts captured at each spot are tagged with that spot’s positional barcode and then sequenced. However, these spots are relatively large (tens of micrometers across; 55 μm in diameter on standard Visium), so each one may contain multiple cells.


Unlike Xenium (left), which is known for achieving single-cell resolution, Visium (right) captures tissue on predefined spots; the transcripts at each spot are tagged with the spot’s positional barcode and sequenced. Because these spots are relatively large (tens of micrometers across; 55 μm in diameter on standard Visium), each one may contain multiple cells.

Hence, before doing any downstream analysis, it is critical to verify the native resolution of your platform. If you are working with sequencing-based methods like Visium, avoid the common pitfall of treating each data point as a single cell; instead, use deconvolution algorithms (such as RCTD, Stereoscope, or Seurat’s Label Transfer) to "unmix" the multicellular signals within each spot.[1] Conversely, if you are using in situ platforms like Xenium or CosMx, your priority should shift toward robust cell segmentation, since every downstream count depends on assigning transcripts to the right cell.
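To make the "unmixing" idea concrete, here is a minimal sketch of spot deconvolution as a non-negative least-squares problem. It is not how RCTD or Stereoscope work internally (those fit probabilistic models); it only illustrates the principle that a spot's expression can be read as a weighted sum of cell-type signatures. The signature values, the chosen marker genes, and the function names `mix` and `deconvolve` are all made up for illustration.

```python
# Toy spot deconvolution: recover cell-type proportions from a mixed spot.
# Signature values are invented for illustration, not taken from real data.

GENES = ["CD3E", "EPCAM", "COL1A1", "ACTB"]
SIGNATURES = {
    "T cell":     [10.0, 0.0, 1.0, 1.0],
    "Tumor":      [0.0, 8.0, 1.0, 2.0],
    "Fibroblast": [1.0, 1.0, 9.0, 0.0],
}


def mix(proportions):
    """Simulate a spot as a weighted sum of cell-type signatures."""
    return [
        sum(proportions[ct] * SIGNATURES[ct][g] for ct in SIGNATURES)
        for g in range(len(GENES))
    ]


def deconvolve(spot, n_iter=2000):
    """Estimate proportions by projected gradient descent (weights >= 0)."""
    types = list(SIGNATURES)
    # Step size 1 / trace(S^T S): a conservative bound that keeps updates stable.
    step = 1.0 / sum(v * v for sig in SIGNATURES.values() for v in sig)
    weights = {ct: 1.0 / len(types) for ct in types}
    for _ in range(n_iter):
        # Residual between the spot predicted by current weights and the data.
        residual = [m - s for m, s in zip(mix(weights), spot)]
        for ct in types:
            grad = sum(r * v for r, v in zip(residual, SIGNATURES[ct]))
            weights[ct] = max(0.0, weights[ct] - step * grad)  # project onto >= 0
    total = sum(weights.values())
    return {ct: w / total for ct, w in weights.items()}


# A hypothetical spot containing 60% T cells, 30% tumor, 10% fibroblasts.
true_mix = {"T cell": 0.6, "Tumor": 0.3, "Fibroblast": 0.1}
estimated = deconvolve(mix(true_mix))
for cell_type, fraction in estimated.items():
    print(f"{cell_type}: {fraction:.2f}")
```

Real spots are noisy and the reference signatures come from a matched scRNA-seq atlas, which is why dedicated tools are preferable in practice; the sketch only shows why a spot should be read as a mixture and not as a cell.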


Misconception #2: Standard Single-Cell RNA-seq (scRNA-seq) Tools are Sufficient


Another common trap is assuming that pipelines designed for dissociated single cells are enough for spatial transcriptome analysis. These tools are excellent for clustering, but they treat each data point as an independent "bag of genes" with no spatial information. The defining features of a spatial transcriptome are its spatial coordinates and spatial autocorrelation: the biological reality that a cell’s identity and function are heavily influenced by its immediate neighbors, so nearby cells tend to resemble each other. When you strip away the coordinates, you lose the ability to see how tissue architecture drives biology.
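Spatial autocorrelation can be quantified. One common statistic is Moran's I, which is positive when neighboring locations have similar values and negative when they alternate. The sketch below computes it on a small grid with simple edge-sharing (rook) neighbors; the grids and the helper names `rook_neighbors` and `morans_i` are illustrative assumptions, not part of any particular package.

```python
def rook_neighbors(n_rows, n_cols):
    """Map each grid position to the positions that share an edge with it."""
    neighbors = {}
    for r in range(n_rows):
        for c in range(n_cols):
            candidates = [(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]
            neighbors[(r, c)] = [
                (i, j) for i, j in candidates
                if 0 <= i < n_rows and 0 <= j < n_cols
            ]
    return neighbors


def morans_i(grid):
    """Moran's I for a 2D grid of values, using binary rook-adjacency weights."""
    n_rows, n_cols = len(grid), len(grid[0])
    neighbors = rook_neighbors(n_rows, n_cols)
    n = n_rows * n_cols
    mean = sum(sum(row) for row in grid) / n
    dev = {(r, c): grid[r][c] - mean
           for r in range(n_rows) for c in range(n_cols)}
    total_weight = sum(len(nbrs) for nbrs in neighbors.values())
    # Sum of deviation products over every ordered pair of neighbors.
    cross = sum(dev[p] * dev[q] for p, nbrs in neighbors.items() for q in nbrs)
    variance_sum = sum(d * d for d in dev.values())
    return (n / total_weight) * (cross / variance_sum)


# A gene expressed in the left half of the tissue (spatially structured).
clustered = [[1, 1, 0, 0] for _ in range(4)]
# The same number of expressing spots, arranged as a checkerboard.
checkerboard = [[(r + c) % 2 for c in range(4)] for r in range(4)]

print("clustered:   ", round(morans_i(clustered), 3))
print("checkerboard:", round(morans_i(checkerboard), 3))
```

Both grids contain exactly the same values, so a coordinate-free analysis cannot tell them apart; only a statistic that uses the coordinates separates the two patterns.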


Hence, it often helps to use tools that incorporate spatial coordinates directly into their mathematical models. One example is stLearn, a flexible toolkit that uses spatial information to reconstruct spatio-temporal trajectories, analyze cell-cell interactions, and improve data quality through imputation.[2] Here, we will look at how it improves cell-cell communication analysis with an algorithm named Spatially-Constrained Two-level Permutation (SCTP).


Overview of SCTP, which identifies spatial neighborhoods of ligand–receptor co-expression and computes LR scores, then applies a two-level permutation test on genes and spatial spots to detect significant interactions. This approach reduces bias and false discoveries and can further test whether specific cell type pairs are over-represented in those spatial regions. Source: [2], Fig 4.

In a standard scRNA-seq dataset, you can predict that Cell A might talk to Cell B because one expresses a ligand (L) and the other expresses the matching receptor (R). However, if those cells sat on opposite sides of a tumor in the actual tissue, that conversation most likely never happened. SCTP addresses this by checking for physical colocalization. It identifies spatial neighbourhoods of ligand-receptor co-expression, computes LR scores, then applies a constrained, two-level permutation test over both genes and spots/cells to robustly identify spatial locations where a given LR pair scores significantly higher than expected at random.[2] In the authors’ benchmark against eight other methods, stLearn was the only one that recovered the ground-truth interactions without introducing false positives.[2]
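The idea of testing ligand-receptor pairs against a spatial null can be sketched in a few lines. The example below is a deliberately simplified, single-level version (it shuffles only spots, not genes) and is not stLearn's implementation or API. The 6 x 6 grid, the neighbourhood radius, the scoring formula, and the names `lr_score` and `permutation_p` are all assumptions made for illustration.

```python
import math
import random

SIZE = 6  # toy tissue: a 6 x 6 grid of spots
SPOTS = [(r, c) for r in range(SIZE) for c in range(SIZE)]
# Neighbourhood of a spot: the spot itself plus spots within 1.5 grid units.
NEIGHBORS = {s: [t for t in SPOTS if math.dist(s, t) <= 1.5] for s in SPOTS}


def block(r0, c0):
    """Binary expression: 1 inside the 3 x 3 block whose corner is (r0, c0)."""
    return {
        s: 1.0 if r0 <= s[0] < r0 + 3 and c0 <= s[1] < c0 + 3 else 0.0
        for s in SPOTS
    }


def lr_score(ligand, receptor):
    """Sum over spots of ligand level times mean receptor level nearby."""
    return sum(
        ligand[s] * sum(receptor[t] for t in NEIGHBORS[s]) / len(NEIGHBORS[s])
        for s in SPOTS
    )


def permutation_p(ligand, receptor, n_perm=999, seed=0):
    """One-sided p-value: how often does shuffled receptor score as high?"""
    rng = random.Random(seed)
    observed = lr_score(ligand, receptor)
    values = [receptor[s] for s in SPOTS]
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(values)  # break the link between receptor and location
        if lr_score(ligand, dict(zip(SPOTS, values))) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


ligand = block(0, 0)
p_colocalized = permutation_p(ligand, block(0, 0))  # receptor in same region
p_separated = permutation_p(ligand, block(3, 3))    # receptor in far corner

print("co-localized p-value:", p_colocalized)
print("separated p-value:   ", p_separated)
```

In both scenarios the ligand and receptor are expressed in the same number of spots, so a coordinate-free method would score them identically; only the spatial test distinguishes the co-localized pair from the separated one.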


Summary of information utilised by stLearn SCTP and eight other methods (used for benchmarking) to predict cell–cell interaction (CCI) events. Source: [2], Fig 5.
Predicted CCIs by stLearn, Squidpy, CellPhoneDB, CellChat, NATMI, SingleCellSignalR, NCEM, SpaTalk and spaOTsc. Only stLearn predicts the ground-truth without false positive interactions. Source: [2], Fig 5.

Therefore, to get the most out of your data, we recommend using dedicated spatial analysis tools rather than relying solely on scRNA-seq pipelines. Treating spatial transcriptome data as 'dissociated' single-cell data is a missed opportunity, as it discards the spatial coordinates that provide biological context.


Misconception #3: All Spatial Technologies Can Follow the Same Analysis Workflow


Another critical oversight is the assumption that all spatial transcriptomes can be funneled through a single, standardized pipeline. In reality, the "best" workflow is dictated by the research objective and how the data was captured, either via Next-Generation Sequencing (NGS) or in situ imaging.


NGS-based methods, such as 10x Visium, provide an unbiased, whole-transcriptome view (20,000+ genes). These datasets are therefore "discovery engines", well suited to exploratory analyses and hypothesis generation, such as finding novel markers and differentially expressed genes (DEGs), running pathway enrichment, and characterizing rare or novel cell populations.


In contrast, in situ technologies like Xenium or CosMx SMI use targeted panels of only hundreds or a few thousand genes. While they "see" fewer genes, they provide sub-cellular resolution and cleaner signals with less background noise. Because of this high precision, these platforms are better suited for confirmatory analyses and hypothesis testing, such as validating cell–cell interactions, mapping the localization of known cell types, identifying spatial cell neighborhoods or niches, and quantifying tissue architecture (cell density, distances, boundaries).
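Once cells are segmented, tissue-architecture metrics such as density and distances reduce to simple geometry on cell centroids. The following sketch assumes a hypothetical list of segmented cells with centroid coordinates in micrometers and a cell-type label; the data and function names are illustrative only.

```python
import math

# Hypothetical segmented cells: (x in um, y in um, cell type).
cells = [
    (0.0, 0.0, "tumor"),
    (10.0, 0.0, "tumor"),
    (3.0, 4.0, "T cell"),
    (10.0, 6.0, "T cell"),
]


def nearest_distance(cell, targets):
    """Distance from one cell centroid to the closest target centroid."""
    return min(math.dist(cell[:2], target[:2]) for target in targets)


def mean_nearest_distance(cells, source_type, target_type):
    """Average distance from each source cell to its nearest target cell."""
    sources = [c for c in cells if c[2] == source_type]
    targets = [c for c in cells if c[2] == target_type]
    return sum(nearest_distance(s, targets) for s in sources) / len(sources)


def density_per_mm2(cells, width_um, height_um):
    """Cells per square millimeter in a rectangular field of view."""
    area_mm2 = (width_um / 1000) * (height_um / 1000)
    return len(cells) / area_mm2


t_to_tumor = mean_nearest_distance(cells, "T cell", "tumor")
density = density_per_mm2(cells, width_um=100, height_um=100)

print(f"Mean T cell to nearest tumor cell: {t_to_tumor:.1f} um")
print(f"Cell density: {density:.0f} cells per mm^2")
```

On real datasets with many thousands of cells, a spatial index (for example a k-d tree) would replace the brute-force search, but the quantities being measured are the same.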


Applying a discovery-style workflow to a targeted panel, or a high-resolution workflow to low-resolution spots, ignores the unique strengths of each technology and can lead to misleading biological conclusions.
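One lightweight way to keep these distinctions explicit in a pipeline is to look up a platform profile before choosing any analysis step. The sketch below encodes only the platform properties described in this post; the `PlatformProfile` class and `profile_for` helper are hypothetical names, and platforms not discussed above are intentionally left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PlatformProfile:
    capture: str        # "sequencing" or "imaging"
    unit: str           # what one data point represents
    gene_coverage: str  # breadth of the measured transcriptome
    first_step: str     # what to get right before downstream analysis
    typical_use: str    # the kind of question the data suits best


# Profiles summarizing the platforms as characterized in this post.
PROFILES = {
    "visium": PlatformProfile(
        "sequencing", "multi-cell spot", "whole transcriptome",
        "spot deconvolution", "exploratory analysis and hypothesis generation",
    ),
    "xenium": PlatformProfile(
        "imaging", "single cell", "targeted panel",
        "cell segmentation", "confirmatory analysis and hypothesis testing",
    ),
    "cosmx": PlatformProfile(
        "imaging", "single cell", "targeted panel",
        "cell segmentation", "confirmatory analysis and hypothesis testing",
    ),
}


def profile_for(platform: str) -> PlatformProfile:
    """Return the profile for a platform name, ignoring case and whitespace."""
    key = platform.strip().lower()
    if key not in PROFILES:
        raise ValueError(
            f"No profile for {platform!r}; verify its native resolution first."
        )
    return PROFILES[key]


for name in ("Visium", "Xenium"):
    p = profile_for(name)
    print(f"{name}: start with {p.first_step}; best for {p.typical_use}")
```

Raising an error for unknown platforms is a deliberate choice here: it forces a check of the platform's resolution and gene coverage instead of silently falling back to a default workflow.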


Streamlining spatial transcriptome data analysis with C-DIAM


By supporting multiple spatial technologies (Visium, Visium HD, Xenium, CosMx, Stereo-seq), our C-DIAM Multi-Omics Studio platform is designed to make it easier to evaluate and work across diverse spatial transcriptome datasets without being locked into a single workflow. Researchers can import and share data in standard formats, such as count matrices coupled with spatial images and Scanpy objects, enabling smoother collaboration between pathologists, bioinformaticians, and bench scientists. With an interactive, user-friendly GUI, teams can quickly run and compare different analytical approaches without extensive coding.



Discover and try out C-DIAM for your next projects:



References

[1] Zhang, Y., Lin, X., Yao, Z. et al. Deconvolution algorithms for inference of the cell-type composition of the spatial transcriptome. Comput Struct Biotechnol J 21, 176–184 (2023). https://doi.org/10.1016/j.csbj.2022.12.001.

[2] Pham, D., Tan, X., Balderson, B. et al. Robust mapping of spatiotemporal trajectories and cell–cell interactions in healthy and diseased tissues. Nat Commun 14, 7739 (2023). https://doi.org/10.1038/s41467-023-43120-6.
