ULTRA SCALABLE METHODS FOR DIFFERENTIAL TESTING OF SPATIAL-OMIC DATA

Degree type
PhD
Graduate group
Statistics and Data Science
Discipline
Statistics and Probability
Copyright date
01/01/2025
Author
Mason, Kaishu
Abstract

Spatial-omic technologies are revolutionizing our ability to study tissue organization and function by preserving the spatial context of molecular measurements. Unlike traditional single-cell assays, which dissociate cells from their native environments, spatial methods retain information about each cell’s physical location and neighborhood. This spatial context enables researchers to explore how cellular processes such as gene regulation are influenced by microenvironmental cues like intercellular communication, which is critical for understanding development, immune responses, disease progression, and tissue regeneration. However, a key limitation of current spatial-omic technologies is their resolution: most platforms capture measurements at the level of spatial “spots,” each of which may contain multiple cells. This poses a challenge for traditional statistical tools, such as generalized linear models, which typically assume that each observation comes from a single cell. As spatial datasets continue to grow in size and complexity, there is a pressing need for computational methods that can robustly infer cell type-specific molecular patterns from mixed-resolution data while scaling to large datasets.
To address these problems, we introduce two key innovations. The first is SpotGLM, a statistical framework for modeling niche-differential patterns in spatial-omic data. At its core, SpotGLM applies a mixture-based generalized linear model to test for molecular features that vary in a cell type-specific manner across spatial niches. This general pipeline supports a wide range of spatial-omic modalities, including gene expression, chromatin accessibility, and RNA splicing, and can be applied at multiple spatial resolutions. A key application of SpotGLM is niche-differential expression (niche-DE) analysis, which identifies genes that are differentially expressed within a specific cell type depending on its spatial context. We extend this analysis with niche-LR, a method that uncovers ligand-receptor interactions that may underlie niche-specific gene regulation. The second innovation, which ensures scalability, is SPARROW, a power-preserving data reduction technique that enables efficient inference across millions of spatial coordinates. Together, SpotGLM and SPARROW comprise a generalized and ultra-scalable computational pipeline for dissecting spatially organized molecular programs across diverse spatial-omic platforms.

Advisor
Zhang, Nancy R.
Date of degree
2025