Integration of spatially resolved transcriptomics into pathological research: Opportunities and Challenges

DOI: https://doi.org/10.47184/tp.2024.01.07

The development of spatially resolved transcriptomics technologies has revolutionised research in recent years. By enabling the analysis of the state and position of cell types within a tissue section, these technologies have the potential to transform our understanding of pathological processes and translate this knowledge into improved treatments for patients. This review provides an overview of available technologies and discusses the potential challenges of integrating them into pathological research, with a particular focus on the computational analysis of such data.

Keywords: Spatial transcriptomics, computational analysis, Digital Pathology, In situ sequencing, RNA sequencing

Advances in Characterizing Biological Tissues: A Focus on Spatially Resolved Transcriptomics

Since Rudolf Virchow described the cell as the central unit of life and disease in his Cellularpathologie of 1858 [1], researchers have been striving to develop new methods for characterizing the cellular composition of biological tissues. For centuries, cells have been identified by their morphology and function. In recent years, the discovery of biomolecules as the building blocks of cells has revolutionized the understanding of cell types [2], which are now understood as manifestations of the cells’ molecular composition, adapted to their specific functions [3]. By allowing researchers to quantify different classes of biomolecules in their entirety, the emergence of omics technologies has revolutionized our understanding of the molecular composition of tissues. In particular, the investigation of RNA molecules, known as transcriptomics, has led to major breakthroughs in the identification of cell types. Technological advances have facilitated the study of the whole transcriptome of single cells and initiated consortia such as the Human Cell Atlas, which set itself the task of defining all human cell types in terms of distinct molecular profiles and correlating this information with the spatial location of the cells, their developmental time point, the disease state, the environmental exposure, and the lifestyle of the donor [4]. The success of such goals crucially depends on the development of technologies that enable multimodal and multiconditional measurements. At the forefront of these developments stands the field of spatially resolved transcriptomics (SRT), which has been widely recognized as one of the most promising biological technologies [5]. This review provides an overview of SRT technologies and discusses the technological and computational challenges of applying SRT in pathology.

Sequencing-based vs. Imaging-based Approaches

The SRT field can be largely divided into two methodological principles: Sequencing-based SRT and imaging-based SRT (Figure 1).

**Figure 1:** Technological classes of spatially resolved transcriptomics methods and examples for each class.

In sequencing-based SRT, different strategies are used to encode the position of an RNA molecule within a tissue section prior to extraction and quantification using next generation sequencing (NGS). One strategy is the extraction of regions of interest (ROIs) using methods such as laser-capture microdissection [6], Tomo-seq [7] or Digital Spatial Profiling (DSP) [8]. DSP has been commercialized in NanoString’s GeoMx system and is now one of the most widely used SRT platforms. While the physical extraction of the ROIs allows flexibility in the downstream readouts, the resolution is limited, and the ROIs must be known in advance. In 2016, Ståhl et al. introduced the method of Spatial Transcriptomics in a seminal paper [9]. This method, whose name is now used for the entire scientific field, uses unique surface-bound DNA barcodes arrayed on a glass surface. After attaching a tissue section to the glass, molecular biological methods are used to label RNA molecules within the tissue section with these barcodes. NGS and subsequent computational analysis facilitate the mapping of sequencing reads to spatial locations. The Ståhl method has triggered the development of a series of novel technologies using different strategies to generate surface-bound DNA barcodes, including spots [10], beads [11, 12], clusters [13, 14], nanoballs [15], or microfluidic devices [16, 17]. Although improved manufacturing methods have achieved ever smaller barcode features, diffusion effects set a natural limit to the resolution of such methods and prevent them from achieving true single-cell resolution. Methods such as XYZeq or sci-Space combine spatial barcoding with single-cell extraction methods to potentially overcome these limitations, but have so far been lacking in throughput and sensitivity [18, 19].
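The mapping of sequencing reads to spatial locations can be illustrated with a minimal sketch. It assumes that each read has already been reduced to a pair of spatial barcode and gene, and that a whitelist links every barcode to the position of its feature on the array. The barcode sequences, coordinates, and function name are hypothetical, and steps used in real pipelines, such as barcode error correction or collapsing of duplicate molecules, are omitted.

```python
from collections import Counter, defaultdict

# Hypothetical whitelist: spatial barcode -> (x, y) position of the feature on the array.
BARCODE_POSITIONS = {
    "AACGT": (0, 0),
    "CCTGA": (0, 1),
    "GGATC": (1, 0),
}


def reads_to_spot_counts(reads, barcode_positions):
    """Aggregate (barcode, gene) reads into gene counts per array position.

    Reads whose barcode is not on the whitelist are discarded.
    """
    counts = defaultdict(Counter)
    for barcode, gene in reads:
        position = barcode_positions.get(barcode)
        if position is None:
            continue  # unknown barcode, for example due to a sequencing error
        counts[position][gene] += 1
    return {position: dict(genes) for position, genes in counts.items()}


# Toy reads; the last one carries a barcode that is not on the whitelist.
reads = [
    ("AACGT", "EPCAM"),
    ("AACGT", "EPCAM"),
    ("CCTGA", "PTPRC"),
    ("GGATC", "EPCAM"),
    ("TTTTT", "PTPRC"),
]
spot_counts = reads_to_spot_counts(reads, BARCODE_POSITIONS)
```

The result is a count table indexed by array position, which is the starting point for the downstream analyses discussed later in this review.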

Unlike sequencing-based SRT technologies, imaging-based methods are not affected by diffusion effects during the barcoding step and are therefore able to reach subcellular resolution. A fundamental principle of all these methods is fluorescence in situ hybridization (FISH), in which fluorescently labelled oligonucleotides bind to their complementary target sequences, making them visible under a fluorescence microscope. However, the width of the fluorescence spectra allows the measurement of only 3-4 fluorochromes in parallel, which drastically limits the number of genes that can be characterized simultaneously. The introduction of different combinatorial strategies enabled the detection of many more genes in parallel, and this development culminated in highly multiplexed methods such as seqFISH+ [20] or MERFISH [21], the latter of which is marketed as the MERSCOPE platform by Vizgen. As the signal strength depends on the number of hybridized fluorescent probes, the application of FISH-based methods is limited in the case of short target sequences. One solution to this is rolling circle amplification (RCA), which allows the isothermal amplification of the target nucleic acids using so-called padlock probes and thus yields stronger signals and higher selectivity than common FISH methods. In in situ sequencing (ISS), RCA has been combined with sequential, image-based readouts of fluorescently labelled detection probes to identify target RNA molecules [22]. In recent years, a variety of technologies such as FISSEQ [23], STARmap [24] or BOLORAMIS [25] have been developed that use the general principle of in situ sequencing, allowing the quantification of hundreds to thousands of genes in tissue sections. Combining the strengths of both the original ISS method and FISSEQ, 10x Genomics commercialized a novel method called Xenium In Situ in 2022 [26, 27], which, among other things, was used to map the breast cancer tumor microenvironment [28].
Comparisons of the currently most widely used SRT platforms GeoMx, MERSCOPE, and Xenium In Situ in recent benchmarking articles have revealed both the strengths and weaknesses of these methods [29–31].
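The gain from combinatorial strategies can be made concrete with a short worked example. With 4 fluorochromes, a single imaging round distinguishes at most 4 genes, whereas reading one colour per gene over N rounds allows up to 4^N colour sequences. The sketch below additionally illustrates binary barcodes whose codewords differ in several positions, so that a single missed or spurious signal can still be decoded, which is the general idea behind error-robust schemes such as MERFISH [21]. The codebook, gene names, and function names are hypothetical and not taken from any published panel.

```python
from itertools import combinations
from math import comb


def hamming(a, b):
    """Number of positions in which two equally long codewords differ."""
    return sum(x != y for x, y in zip(a, b))


def constant_weight_codes(n_bits, weight):
    """All binary codewords of length n_bits with exactly `weight` ones."""
    codes = []
    for on_bits in combinations(range(n_bits), weight):
        codes.append(tuple(1 if i in on_bits else 0 for i in range(n_bits)))
    return codes


def decode(observed, codebook, max_distance=1):
    """Return the gene whose codeword is uniquely closest to `observed`, else None."""
    ranked = sorted((hamming(observed, code), gene) for gene, code in codebook.items())
    best_distance, best_gene = ranked[0]
    if best_distance > max_distance:
        return None  # too many errors to trust the call
    if len(ranked) > 1 and ranked[1][0] == best_distance:
        return None  # ambiguous between two genes
    return best_gene


# Capacity of simple sequential colour codes: 4 colours read over 8 rounds.
sequential_capacity = 4 ** 8

# Number of 14-bit codewords that contain exactly four "on" rounds.
n_constant_weight = len(constant_weight_codes(14, 4))

# Hypothetical 8-bit codebook; all pairs of codewords differ in 4 positions.
codebook = {
    "GENE_A": (1, 1, 1, 1, 0, 0, 0, 0),
    "GENE_B": (1, 1, 0, 0, 1, 1, 0, 0),
    "GENE_C": (1, 0, 1, 0, 1, 0, 1, 0),
}

# One signal of GENE_A was missed in the fourth round; decoding still succeeds.
call_one_error = decode((1, 1, 1, 0, 0, 0, 0, 0), codebook)

# Two missed signals leave the spot equally close to two genes; it is rejected.
call_two_errors = decode((1, 1, 0, 0, 0, 0, 0, 0), codebook)
```

The trade-off shown here is that requiring a minimum distance between codewords reduces the number of usable barcodes but makes the readout tolerant of individual imaging errors.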

While early methods were only applicable to fresh frozen samples, protocols have since been developed to apply them to FFPE samples as well, making the large pathological archives accessible for analysis. While this promises to revolutionize pathological research in the future, the computational analysis of such datasets in particular poses a number of challenges. These challenges and the tools available to overcome them are discussed below (Figure 2).

**Figure 2:** Computational challenges in the analysis of data from sequencing-based and imaging-based spatially resolved transcriptomics methodologies.

Deconvolution of Spatial Transcriptomics Data

Sequencing-based methods such as Visium, Slide-seq or the recently published Visium HD technology have resolutions from 50 µm down to 2 µm. Due to the technological design, each barcoded spot can contain transcripts from multiple cells, with the number of cells depending on the resolution of the method, the location of the spot relative to the tissue section, and the cell density within the tissue section. To infer cell type proportions as well as single-cell gene expression levels, computational deconvolution approaches have been developed. Most of these algorithms, such as Stereoscope [32], cell2location [33], SPOTlight [34], DestVI [35], Tangram [36] or TACCO [37], use single-cell transcriptomics (scT) data as references to map single-cell information onto the SRT data. Other approaches, like STdeconvolve, are reference-free and do not rely on scT datasets [38]. However, all these methods provide only computational approximations and are therefore not free from biases.
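The principle of reference-based deconvolution can be sketched as a non-negative least squares problem: the expression vector of a spot is modelled as a non-negative mixture of cell type signatures derived from scT data. The example below is a deliberately simplified illustration using multiplicative updates and is not the algorithm of any of the tools cited above, most of which rely on probabilistic models. The signature values, spot counts, and function name are hypothetical.

```python
def deconvolve_spot(spot_counts, signatures, n_iter=500):
    """Estimate non-negative cell type weights for one spot.

    signatures[g][k] is the reference expression of gene g in cell type k.
    The weights w minimise the squared error between spot_counts and
    signatures @ w under the constraint w >= 0, using multiplicative updates.
    Returns (weights, proportions), where proportions sum to 1.
    """
    n_genes = len(signatures)
    n_types = len(signatures[0])
    weights = [1.0] * n_types

    # Precompute S^T y and S^T S, which do not change between iterations.
    numerator = [
        sum(signatures[g][k] * spot_counts[g] for g in range(n_genes))
        for k in range(n_types)
    ]
    gram = [
        [sum(signatures[g][k] * signatures[g][j] for g in range(n_genes))
         for j in range(n_types)]
        for k in range(n_types)
    ]

    for _ in range(n_iter):
        denominator = [
            sum(gram[k][j] * weights[j] for j in range(n_types))
            for k in range(n_types)
        ]
        weights = [
            w * num / den if den > 0 else 0.0
            for w, num, den in zip(weights, numerator, denominator)
        ]

    total = sum(weights)
    proportions = [w / total for w in weights] if total > 0 else list(weights)
    return weights, proportions


# Hypothetical signatures for 4 genes (rows) and 2 cell types (columns).
signatures = [
    [10.0, 0.0],
    [8.0, 1.0],
    [1.0, 9.0],
    [0.0, 12.0],
]
# Toy spot composed of three cells of the first type and one of the second.
spot = [30.0, 25.0, 12.0, 12.0]

weights, proportions = deconvolve_spot(spot, signatures)
```

In this noise-free toy case the estimated proportions recover the 3:1 mixture. With real data, differences between the scT reference and the SRT measurement, such as platform-specific capture efficiencies, are among the sources of bias mentioned above.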

Cell Segmentation for Accurate Transcript Assignment

In contrast, imaging-based methods generate data consisting of the location of each measured transcript at a subcellular resolution, eliminating the need for data deconvolution. Instead, the accurate assignment of transcripts to cells becomes crucial, which brings cell segmentation into focus. In recent years, cell segmentation has been significantly improved using deep learning approaches in combination with multiplexed images, resulting in algorithms such as Cellpose [39, 40] or Mesmer [41]. However, the accuracy of cell segmentation can vary between tissue types, and alternative approaches such as Baysor [42], Bering [43] or BIDCell [44] exploit the transcript locations to augment the image data and improve cell segmentation. Although deconvolution is not required in imaging-based SRT methods, the integration of SRT data and scT data using tools such as TACCO [37] or Tangram [36] facilitates the transfer of cell type labels from annotated scT data to SRT data, thus combining the strengths of both technologies.
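Once a segmentation is available, assigning transcripts to cells amounts to a lookup of each transcript position in the segmentation result. The following minimal sketch assumes that the segmentation is given as a label mask in which 0 marks background and positive integers mark individual cells, and that transcript coordinates are given in pixel units of the same image. The mask, coordinates, and function name are hypothetical, and the sketch does not reproduce the transcript-aware approaches cited above.

```python
from collections import Counter, defaultdict
from math import floor


def assign_transcripts(transcripts, label_mask):
    """Count transcripts per cell using a segmentation label mask.

    transcripts: iterable of (x, y, gene) in pixel coordinates.
    label_mask: 2D list indexed as label_mask[row][col];
                0 = background, positive integers = cell IDs.
    Returns (dict cell ID -> gene counts, number of unassigned transcripts).
    """
    n_rows, n_cols = len(label_mask), len(label_mask[0])
    counts = defaultdict(Counter)
    unassigned = 0
    for x, y, gene in transcripts:
        row, col = floor(y), floor(x)
        if not (0 <= row < n_rows and 0 <= col < n_cols):
            unassigned += 1  # transcript lies outside the segmented image
            continue
        cell_id = label_mask[row][col]
        if cell_id == 0:
            unassigned += 1  # transcript lies in the background
        else:
            counts[cell_id][gene] += 1
    return {cell: dict(genes) for cell, genes in counts.items()}, unassigned


# Toy mask with two cells (IDs 1 and 2) separated by background.
label_mask = [
    [1, 1, 0, 2],
    [1, 1, 0, 2],
    [0, 0, 0, 2],
]
transcripts = [
    (0.4, 0.2, "EPCAM"),  # falls into cell 1
    (1.7, 1.1, "EPCAM"),  # falls into cell 1
    (3.2, 2.5, "PTPRC"),  # falls into cell 2
    (2.5, 0.5, "PTPRC"),  # falls into the background
    (9.0, 9.0, "EPCAM"),  # lies outside the image
]
cell_counts, n_unassigned = assign_transcripts(transcripts, label_mask)
```

The fraction of unassigned transcripts is a simple indicator of segmentation quality, since errors in the mask translate directly into missing or misassigned counts.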

Optimizing Gene Panels for Targeted Imaging-Based Transcriptomics

So far, the most widely used imaging-based methods are targeted and rely on the prior determination of genes of interest in so-called gene panels. These gene panels determine which processes and cell types can be measured in the experiment, and their design is therefore crucial. Various methods such as Spapros [45], SMaSH [46] or ActiveSVM [47] have been developed to derive optimal gene sets from scT datasets, making experiments more cost-effective and improving transferability between scT and SRT experiments.
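The idea of deriving a gene panel from an scT reference can be illustrated with a very simple heuristic: for each cell type, select the gene whose mean expression most exceeds its highest mean expression in any other cell type. This is only a sketch under the assumption that mean expression per cell type is available for at least two cell types. It is not the method of any of the tools cited above, which use considerably more elaborate selection criteria, and the expression values and function name are hypothetical.

```python
def select_panel(mean_expression, genes_per_type=1):
    """Pick the most specific genes for each cell type.

    mean_expression: dict cell type -> dict gene -> mean expression
                     in the reference (at least two cell types).
    Specificity of a gene for a cell type is its mean expression in that
    type minus its highest mean expression in any other type.
    """
    panel = []
    for cell_type, profile in mean_expression.items():

        def specificity(gene):
            others = [
                other[gene]
                for name, other in mean_expression.items()
                if name != cell_type
            ]
            return profile[gene] - max(others)

        ranked = sorted(profile, key=specificity, reverse=True)
        added = 0
        for gene in ranked:
            if added == genes_per_type:
                break
            if gene not in panel:  # avoid selecting the same gene twice
                panel.append(gene)
                added += 1
    return panel


# Hypothetical mean expression values from an annotated scT reference.
reference = {
    "epithelial": {"EPCAM": 9.0, "PTPRC": 0.1, "COL1A1": 0.3, "ACTB": 10.0},
    "immune": {"EPCAM": 0.2, "PTPRC": 8.0, "COL1A1": 0.1, "ACTB": 9.5},
    "fibroblast": {"EPCAM": 0.1, "PTPRC": 0.2, "COL1A1": 7.5, "ACTB": 9.8},
}
panel = select_panel(reference)
```

In this toy reference the highly but uniformly expressed gene ACTB is not selected, which illustrates why specificity rather than expression level alone should drive panel design.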

Addressing Batch Effects in Clinical Transcriptomic Data

Particularly in the context of clinical samples, the pre-processing steps and storage times can vary between samples, introducing batch effects in the resulting datasets that are independent of the methodology. For datasets from both imaging-based and sequencing-based methods, batch correction algorithms such as Harmony [48], Scanorama [49], or scVI [50] are available and have been compared in benchmarking studies [51].
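What a batch effect looks like, and what its removal means in the simplest case, can be shown with a toy example for a single gene. The sketch below subtracts the mean of each batch and adds back the global mean. This is only a conceptual illustration and not how Harmony, Scanorama, or scVI operate. It also implicitly assumes that all batches have a similar biological composition, which is often not the case for clinical samples. The values, batch labels, and function name are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def center_by_batch(values, batches):
    """Remove batch-specific shifts in the expression of one gene.

    Each value is shifted by the difference between the global mean
    and the mean of its own batch.
    """
    global_mean = mean(values)
    by_batch = defaultdict(list)
    for value, batch in zip(values, batches):
        by_batch[batch].append(value)
    batch_means = {batch: mean(vals) for batch, vals in by_batch.items()}
    return [
        value - batch_means[batch] + global_mean
        for value, batch in zip(values, batches)
    ]


# Toy expression of one gene; the second slide shows a constant offset.
expression = [1.0, 2.0, 3.0, 11.0, 12.0, 13.0]
batch_labels = ["slide_A"] * 3 + ["slide_B"] * 3
corrected = center_by_batch(expression, batch_labels)
```

After correction both slides share the same mean while the variation within each slide is preserved. If the slides differed in their cell type composition, this simple approach would also remove genuine biological differences, which is one reason why dedicated integration methods are used in practice.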

Advancements in Predictive Modeling for Gene Expression

Unlike scT methods, SRT technologies combine image data and transcriptomic data. This allows researchers to train deep neural networks to predict the gene expression from histological images as demonstrated in SpaGCN [52], SCHAF [53] or iSTAR [54]. In the future, such algorithms could facilitate the inference of transcriptomic profiles and cell types based on histological stainings and thereby partially replace transcriptomic readouts and complement immunohistochemical stainings.

Enhancing Pathological Diagnostics

Current diagnostic workflows in pathology focus on the analysis of either histological images or sequencing-based readouts. However, SRT methods produce large data sets combining different modalities, which makes the analysis computationally resource-intensive and complex, demanding a bioinformatic skill set from (molecular) pathologists. To facilitate access to SRT methods for clinical researchers, integrated analytical frameworks that enable rapid visualization and exploration of data while providing access to novel third-party analytical tools are important. The currently emerging frameworks, such as Seurat [55] and VoltRon [56] for R as well as Squidpy [57], SpatialData [58] and TissUUmaps [59] for Python, integrate the data modalities and offer a wide range of analyses, but require deeper bioinformatic knowledge, making integration into existing pathological frameworks difficult. The successful translation of knowledge from SRT experiments into improved treatments for patients will, however, rely on the integration of existing human expert knowledge with the rapidly evolving field of computational analyses. Therefore, the establishment and improvement of analytical frameworks are pivotal. Furthermore, for the correct interpretation of results and the integration of spatial transcriptomics methodologies into diagnostic workflows, training pathologists on these new data formats will be essential.

Conclusion

In conclusion, spatially resolved transcriptomics methods allow for the characterization of healthy and diseased tissue at an unprecedented depth and have the potential to revolutionize translational research. The currently observed increasing digitalization of pathological workflows [60] and the growing landscape of computational tools pose a challenge, but above all, present a unique opportunity to integrate these technologies into pathological research and improve the diagnosis and treatment of diseases.

Bibliography

1. Mazzarello P. A unifying concept: the history of cell theory. Nat Cell Biol 1999;1:E13–5. doi.org/10.1038/8964.
2. Arendt D et al. The origin and evolution of cell types. Nat Rev Genet 2016;17:744–57. doi.org/10.1038/nrg.2016.127.
3. Achim K, Arendt D. Structural evolution of cell types by step-wise assembly of cellular modules. Curr Opin Genet Dev 2014;27:102–8. doi.org/10.1016/j.gde.2014.05.001.
4. Regev A et al. The human cell atlas. eLife 2017;6. doi.org/10.7554/eLife.27041.
5. Marx V. Method of the Year: spatially resolved transcriptomics. Nat Methods 2021;18:9–14. doi.org/10.1038/s41592-020-01033-y.
6. Isenberg G, Bielser W, Meier-Ruge W, Remy E. Cell surgery by laser micro-dissection: A preparative method. J Microsc 1976;107:19–24. doi.org/10.1111/j.1365-2818.1976.tb02419.x.
7. Holler K, Junker JP. RNA Tomography for Spatially Resolved Transcriptomics (Tomo-Seq). Methods Mol Biol Clifton NJ 2019;1920:129–41. doi.org/10.1007/978-1-4939-9009-2_9.
8. Merritt CR et al. Multiplex digital spatial profiling of proteins and RNA in fixed tissue. Nat Biotechnol 2020;38:586–99. doi.org/10.1038/s41587-020-0472-9.
9. Ståhl PL et al. Visualization and analysis of gene expression in tissue sections by spatial transcriptomics. Science 2016;353:78–82. doi.org/10.1126/science.aaf2403.
10. Vickovic S et al. High-definition spatial transcriptomics for in situ tissue profiling. Nat Methods 2019;16:987–90. doi.org/10.1038/s41592-019-0548-y.
11. Rodriques SG et al. Slide-seq: A scalable technology for measuring genome-wide expression at high spatial resolution. Science 2019;363:1463–7. doi.org/10.1126/science.aaw1219.
12. Stickels RR et al. Highly sensitive spatial transcriptomics at near-cellular resolution with Slide-seqV2. Nat Biotechnol 2021;39:313–9. doi.org/10.1038/s41587-020-0739-1.
13. Fu X et al. Continuous Polony Gels for Tissue Mapping with High Resolution and RNA Capture Efficiency. Genomics; 2021. doi.org/10.1101/2021.03.17.435795.
14. Cho C-S et al. Microscopic examination of spatial transcriptome using Seq-Scope. Cell 2021;184:1–14. doi.org/10.1016/j.cell.2021.05.010.
15. Chen A, Liao S, Cheng M, Liu L, Xu X, Wang J. Spatiotemporal transcriptomic atlas of mouse organogenesis using DNA nanoball-patterned arrays 2022. doi.org/10.1016/j.cell.2022.04.003.
16. Liu Y et al. High-Spatial-Resolution Multi-Omics Sequencing via Deterministic Barcoding in Tissue. Cell 2020;183:1665-1681.e18. doi.org/10.1016/j.cell.2020.10.026.
17. Zhao H, Tian G, Hu A. Matrix-seq: An adjustable-resolution spatial transcriptomics via microfluidic matrix-based barcoding 2022:2022.08.05.502952. doi.org/10.1101/2022.08.05.502952.
18. Srivatsan SR et al. Embryo-scale, single-cell spatial transcriptomics. Science 2021;373:111–7. doi.org/10.1126/science.abb9536.
19. Lee Y et al. XYZeq: Spatially resolved single-cell RNA sequencing reveals expression heterogeneity in the tumor microenvironment. Sci Adv 2021;7:eabg4755. doi.org/10.1126/sciadv.abg4755.
20. Eng CHL et al. Transcriptome-scale super-resolved imaging in tissues by RNA seqFISH+. Nature 2019;568:235–9. doi.org/10.1038/s41586-019-1049-y.
21. Chen KH, Boettiger AN, Moffitt JR, Wang S, Zhuang X. Spatially resolved, highly multiplexed RNA profiling in single cells. Science 2015;348:aaa6090. doi.org/10.1126/science.aaa6090.
22. Ke R et al. In situ sequencing for RNA analysis in preserved tissue and cells. Nat Methods 2013;10:857. doi.org/10.1038/nmeth.2563.
23. Lee JH et al. Highly Multiplexed Subcellular RNA Sequencing in Situ. Science 2014;343:1360–3. doi.org/10.1126/science.1250212.
24. Wang X et al. Three-dimensional intact-tissue sequencing of single-cell transcriptional states. Science 2018;361:eaat5691. doi.org/10.1126/science.aat5691.
25. Liu S et al. Barcoded oligonucleotides ligated on RNA amplified for multiplexed and parallel in situ analyses. Nucleic Acids Res 2021;49:e58–e58. doi.org/10.1093/nar/gkab120.
26. Williams CG, Lee HJ, Asatsuma T, Vento-Tormo R, Haque A. An introduction to spatial transcriptomics for biomedical research. Genome Med 2022;14:68. doi.org/10.1186/s13073-022-01075-1.
27. Salas SM et al. Optimizing Xenium In Situ data utility by quality assessment and best practice analysis workflows 2023:2023.02.13.528102. doi.org/10.1101/2023.02.13.528102.
28. Janesick A et al. High resolution mapping of the breast cancer tumor microenvironment using integrated single cell, spatial and in situ analysis of FFPE tissue 2022:2022.10.06.510405. doi.org/10.1101/2022.10.06.510405.
29. Cook DP et al. A Comparative Analysis of Imaging-Based Spatial Transcriptomics Platforms 2023:2023.12.13.571385. doi.org/10.1101/2023.12.13.571385.
30. Hartman A, Satija R. Comparative analysis of multiplexed in situ gene expression profiling technologies 2024:2024.01.11.575135. doi.org/10.1101/2024.01.11.575135.
31. Wang H et al. Systematic benchmarking of imaging spatial transcriptomics platforms in FFPE tissues 2023:2023.12.07.570603. doi.org/10.1101/2023.12.07.570603.
32. Andersson A et al. Single-cell and spatial transcriptomics enables probabilistic inference of cell type topography. Commun Biol 2020;3:1–8. doi.org/10.1038/s42003-020-01247-y.
33. Kleshchevnikov V et al. Cell2location maps fine-grained cell types in spatial transcriptomics. Nat Biotechnol 2022;40:661–71. doi.org/10.1038/s41587-021-01139-4.
34. Elosua-Bayes M, Nieto P, Mereu E, Gut I, Heyn H. SPOTlight: seeded NMF regression to deconvolute spatial transcriptomics spots with single-cell transcriptomes. Nucleic Acids Res 2021;49:e50. doi.org/10.1093/nar/gkab043.
35. Lopez R et al. DestVI identifies continuums of cell types in spatial transcriptomics data. Nat Biotechnol 2022;40:1360–9. doi.org/10.1038/s41587-022-01272-8.
36. Biancalani T et al. Deep learning and alignment of spatially resolved single-cell transcriptomes with Tangram. Nat Methods 2021;18:1352–62. doi.org/10.1038/s41592-021-01264-7.
37. Mages S et al. TACCO unifies annotation transfer and decomposition of cell identities for single-cell and spatial omics. Nat Biotechnol 2023:1–9. doi.org/10.1038/s41587-023-01657-3.
38. Miller BF, Huang F, Atta L, Sahoo A, Fan J. Reference-free cell type deconvolution of multi-cellular pixel-resolution spatially resolved transcriptomics data. Nat Commun 2022;13:2339. doi.org/10.1038/s41467-022-30033-z.
39. Stringer C, Wang T, Michaelos M, Pachitariu M. Cellpose: a generalist algorithm for cellular segmentation. Nat Methods 2021;18:100–6. doi.org/10.1038/s41592-020-01018-x.
40. Pachitariu M, Stringer C. Cellpose 2.0: how to train your own model. Nat Methods 2022:1–8. doi.org/10.1038/s41592-022-01663-4.
41. Greenwald NF et al. Whole-cell segmentation of tissue images with human-level performance using large-scale data annotation and deep learning. Nat Biotechnol 2022;40:555–65. doi.org/10.1038/s41587-021-01094-0.
42. Petukhov V, Soldatov RA, Khodosevich K, Kharchenko PV. Bayesian segmentation of spatially resolved transcriptomics data. bioRxiv 2020:2020.10.05.326777. doi.org/10.1101/2020.10.05.326777.
43. Jin K et al. Bering: joint cell segmentation and annotation for spatial transcriptomics with transferred graph embeddings 2023:2023.09.19.558548. doi.org/10.1101/2023.09.19.558548.
44. Fu X et al. Biologically-informed self-supervised learning for segmentation of subcellular spatial transcriptomics data 2023:2023.06.13.544733. doi.org/10.1101/2023.06.13.544733.
45. Kuemmerle LB et al. Probe set selection for targeted spatial transcriptomics 2022:2022.08.16.504115. doi.org/10.1101/2022.08.16.504115.
46. Nelson ME, Riva SG, Cvejic A. SMaSH: a scalable, general marker gene identification framework for single-cell RNA-sequencing. BMC Bioinformatics 2022;23:328. doi.org/10.1186/s12859-022-04860-2.
47. Chen X, Chen S, Thomson M. Minimal gene set discovery in single-cell mRNA-seq datasets with ActiveSVM. Nat Comput Sci 2022;2:387–98. doi.org/10.1038/s43588-022-00263-8.
48. Korsunsky I et al. Fast, sensitive and accurate integration of single-cell data with Harmony. Nat Methods 2019;16:1289–96. doi.org/10.1038/s41592-019-0619-0.
49. Hie B, Bryson B, Berger B. Efficient integration of heterogeneous single-cell transcriptomes using Scanorama. Nat Biotechnol 2019;37:685–91. doi.org/10.1038/s41587-019-0113-3.
50. Lopez R, Regier J, Cole MB, Jordan MI, Yosef N. Deep generative modeling for single-cell transcriptomics. Nat Methods 2018;15:1053–8. doi.org/10.1038/s41592-018-0229-2.
51. Luecken MD et al. Benchmarking atlas-level data integration in single-cell genomics. Nat Methods 2022;19:41–50. doi.org/10.1038/s41592-021-01336-8.
52. Hu J et al. SpaGCN: Integrating gene expression, spatial location and histology to identify spatial domains and spatially variable genes by graph convolutional network. Nat Methods 2021;18:1342–51. doi.org/10.1038/s41592-021-01255-8.
53. Comiter C et al. Inference of single cell profiles from histology stains with the Single-Cell omics from Histology Analysis Framework (SCHAF) 2023:2023.03.21.533680. doi.org/10.1101/2023.03.21.533680.
54. Zhang D et al. Inferring super-resolution tissue architecture by integrating spatial transcriptomics with histology. Nat Biotechnol 2024:1–6. doi.org/10.1038/s41587-023-02019-9.
55. Hao Y et al. Dictionary learning for integrative, multimodal and scalable single-cell analysis. Nat Biotechnol 2024;42:293–304. doi.org/10.1038/s41587-023-01767-y.
56. Manukyan A et al. VoltRon: A Spatial Omics Analysis Platform for Multi-Resolution and Multi-omics Integration using Image Registration 2023:2023.12.15.571667. doi.org/10.1101/2023.12.15.571667.
57. Palla G et al. Squidpy: a scalable framework for spatial omics analysis. Nat Methods 2022;19:171–8. doi.org/10.1038/s41592-021-01358-2.
58. Marconato L et al. SpatialData: an open and universal data framework for spatial omics 2023:2023.05.05.539647. doi.org/10.1101/2023.05.05.539647.
59. Pielawski N et al. TissUUmaps 3: Improvements in interactive visualization, exploration, and quality assessment of large-scale spatial omics data. Heliyon 2023;9:e15306. doi.org/10.1016/j.heliyon.2023.e15306.
60. Iwuajoku V et al. [Digital transformation of a routine histopathology lab : Dos and don’ts!]. Pathol Heidelb Ger 2024. doi.org/10.1007/s00292-023-01291-5.