Rectifying Batch Effects in Histology Images for Spatial Transcriptomics Analysis

Penn collection
The Wharton School::Wharton Undergraduate Research::Wharton Research Scholars
The Wharton School::Wharton Undergraduate Research::Summer Program for Undergraduate Research (SPUR)
Discipline
Bioinformatics
Subject
Spatial transcriptomics
Computational pathology
Foundation model
Batch effect
Single-cell gene expression
Copyright date
2025
Author
Luo, Tianhao
Contributor
Li, Mingyao
Abstract

Spatial transcriptomics (ST) has revolutionized tissue-level gene expression analysis, but its high cost and the resulting data scarcity limit widespread application and make it necessary to integrate data from multiple sources, which introduces strong batch effects. We present an approach that addresses these challenges by combining (1) stain normalization and augmentation across multiple color spaces and (2) pathology foundation models pre-trained on diverse histopathological datasets. Our method improves the performance and generalizability of iStar, which predicts near-single-cell spatial gene expression by integrating ST data with histology images. Hierarchical image feature extraction with foundation models lets our approach capture both local and global tissue structures, mimicking the process of pathological examination. Notably, foundation models trained on large, diverse histology image datasets prove robust against batch effects. We evaluate our method on data from various ST platforms and demonstrate significant improvements in cell type segmentation and out-of-sample gene expression prediction. Our model addresses the complex nature of batch effects in ST data analysis while offering a cost-effective way to expand the utility of existing ST datasets. By improving the accuracy of tissue characterization, our AI-driven approach advances precision medicine and could improve cancer diagnosis and treatment selection.
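
To make the idea of stain augmentation in a stain-specific color space concrete, the following is a minimal sketch and not the authors' implementation. It assumes the hematoxylin-eosin-DAB (HED) space as one possible color space and uses the widely cited Ruifrok-Johnston stain vectors; the function name `augment_stain` and the parameter `sigma` are hypothetical and chosen only for illustration.

```python
import numpy as np

# Ruifrok-Johnston stain vectors (rows: hematoxylin, eosin, DAB).
# Assumption: HED is used here as one example color space; the abstract
# does not say which color spaces the method actually uses.
RGB_FROM_HED = np.array([
    [0.65, 0.70, 0.29],
    [0.07, 0.99, 0.11],
    [0.27, 0.57, 0.78],
])
HED_FROM_RGB = np.linalg.inv(RGB_FROM_HED)


def augment_stain(rgb, sigma=0.05, rng=None):
    """Randomly perturb stain concentrations of a float RGB image in [0, 1].

    Hypothetical helper: `sigma` bounds the random per-stain scale and shift.
    """
    rng = np.random.default_rng() if rng is None else rng
    # Avoid log(0) by clipping very dark pixels.
    rgb = np.clip(np.asarray(rgb, dtype=np.float64), 1e-6, 1.0)
    od = -np.log(rgb)                    # optical density (Beer-Lambert law)
    stains = od @ HED_FROM_RGB           # per-pixel stain concentrations
    alpha = rng.uniform(1.0 - sigma, 1.0 + sigma, size=3)  # per-stain scale
    beta = rng.uniform(-sigma, sigma, size=3)              # per-stain shift
    stains = stains * alpha + beta
    # Map the perturbed concentrations back to RGB and keep a valid range.
    return np.clip(np.exp(-(stains @ RGB_FROM_HED)), 0.0, 1.0)
```

Applying `augment_stain` with a fresh random draw to each training tile simulates staining variation between laboratories; setting `sigma=0.0` leaves the image unchanged, which is a convenient sanity check.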

Publication date
2025-05-03