Publications
Publications by category in reverse chronological order. Generated by jekyll-scholar.
2024
- Disentangling Interpretable Factors with Supervised Independent Subspace Principal Component Analysis. Jiayu Su†, David A. Knowles†, and Raul Rabadan†. In Advances in Neural Information Processing Systems, 2024.
The success of machine learning models relies heavily on effectively representing high-dimensional data. However, ensuring data representations capture human-understandable concepts remains difficult, often requiring the incorporation of prior knowledge and decomposition of data into multiple subspaces. Traditional linear methods fall short in modeling more than one space, while more expressive deep learning approaches lack interpretability. Here, we introduce Supervised Independent Subspace Principal Component Analysis (sisPCA), a PCA extension designed for multi-subspace learning. Leveraging the Hilbert-Schmidt Independence Criterion (HSIC), sisPCA incorporates supervision and simultaneously ensures subspace disentanglement. We demonstrate sisPCA’s connections with autoencoders and regularized linear regression and showcase its ability to identify and separate hidden data structures through extensive applications, including breast cancer diagnosis from image features, learning aging-associated DNA methylation changes, and single-cell analysis of malaria infection. Our results reveal distinct functional pathways associated with malaria colonization, underscoring the essentiality of explainable representation in high-dimensional data analysis.
2023
- Smoother: a unified and modular framework for incorporating structural dependency in spatial omics data. Jiayu Su†, Jean-Baptiste Reynier, Xi Fu, and 8 more authors. Genome Biology, 2023.
Spatial omics technologies can help identify spatially organized biological processes, but existing computational approaches often overlook structural dependencies in the data. Here, we introduce Smoother, a unified framework that integrates positional information into non-spatial models via modular priors and losses. In simulated and real datasets, Smoother enables accurate data imputation, cell-type deconvolution, and dimensionality reduction with remarkable efficiency. In colorectal cancer, Smoother-guided deconvolution reveals plasma cell and fibroblast subtype localizations linked to tumor microenvironment restructuring. Additionally, joint modeling of spatial and single-cell human prostate data with Smoother allows for spatial mapping of reference populations with significantly reduced ambiguity.
- A transcriptome-based single-cell biological age model and resource for tissue-specific aging measures. Shulin Mao*, Jiayu Su*, Longteng Wang, and 3 more authors. Genome Research, 2023.
Accurately measuring biological age is crucial for improving healthcare for the elderly population. However, the complexity of aging biology poses challenges in how to robustly estimate aging and interpret the biological significance of the traits used for estimation. Here we present SCALE, a statistical pipeline that quantifies biological aging in different tissues using explainable features learned from literature and single-cell transcriptomic data. Applying SCALE to the “Mouse Aging Cell Atlas” (Tabula Muris Senis) data, we identified tissue-level transcriptomic aging programs for more than 20 murine tissues and created a multitissue resource of mouse quantitative aging-associated genes. We observe that SCALE correlates well with other age indicators, such as the accumulation of somatic mutations, and can distinguish subtle differences in aging even in cells of the same chronological age. We further compared SCALE with other transcriptomic and methylation “clocks” in data from aging muscle stem cells, Alzheimer’s disease, and heterochronic parabiosis. Our results confirm that SCALE is more generalizable and reliable in assessing biological aging in aging-related diseases and rejuvenating interventions. Overall, SCALE represents a valuable advancement in our ability to measure aging accurately, robustly, and interpretably in single cells.
- Single-cell multi-omics defines the cell-type-specific impact of splicing aberrations in human hematopoietic clonal outgrowths. Mariela Cortés-López, Paulina Chamely, Allegra G. Hawkins, and 31 more authors. Cell Stem Cell, 2023.
RNA splicing factors are recurrently mutated in clonal blood disorders, but the impact of dysregulated splicing in hematopoiesis remains unclear. To overcome technical limitations, we integrated genotyping of transcriptomes (GoT) with long-read single-cell transcriptomics and proteogenomics for single-cell profiling of transcriptomes, surface proteins, somatic mutations, and RNA splicing (GoT-Splice). We applied GoT-Splice to hematopoietic progenitors from myelodysplastic syndrome (MDS) patients with mutations in the core splicing factor SF3B1. SF3B1mut cells were enriched in the megakaryocytic-erythroid lineage, with expansion of SF3B1mut erythroid progenitor cells. We uncovered distinct cryptic 3′ splice site usage in different progenitor populations and stage-specific aberrant splicing during erythroid differentiation. Profiling SF3B1-mutated clonal hematopoiesis samples revealed that erythroid bias and cell-type-specific cryptic 3′ splice site usage in SF3B1mut cells precede overt MDS. Collectively, GoT-Splice defines the cell-type-specific impact of somatic mutations on RNA splicing, from early clonal outgrowths to overt neoplasia, directly in human samples.
2021
- Immunotherapy for breast cancer using EpCAM aptamer tumor-targeted gene knockdown. Ying Zhang, Xuemei Xie, Pourya Naderi Yeganeh, and 8 more authors. Proceedings of the National Academy of Sciences, 2021.
Immunotherapy benefits some aggressive breast cancers, but many breast tumors do not respond to checkpoint blockade. Novel strategies to increase breast cancer immunogenicity are needed to improve immunotherapy. Here, we used epithelial cell adhesion molecule (EpCAM) aptamer-linked small-interfering RNA chimeras (AsiC) to selectively knock down genes in mouse breast cancers to induce tumor neoantigens or overcome immune evasion. Individual gene knockdown markedly delayed tumor growth and enhanced antitumor immunity. Cd47 and Parp1 AsiCs outperformed anti-CD47 antibody and the PARP1 inhibitor Olaparib, respectively. Combining EpCAM-AsiCs targeting multiple pathways worked better than single agents and enhanced tumor inhibition by a checkpoint inhibitor. EpCAM-AsiCs have the potential to boost immunity to tumors that are poorly responsive to checkpoint blockade.

New strategies for cancer immunotherapy are needed since most solid tumors do not respond to current approaches. Here we used epithelial cell adhesion molecule EpCAM (a tumor-associated antigen highly expressed on common epithelial cancers and their tumor-initiating cells) aptamer-linked small-interfering RNA chimeras (AsiCs) to knock down genes selectively in EpCAM+ tumors with the goal of making cancers more visible to the immune system. Knockdown of genes that function in multiple steps of cancer immunity was evaluated in aggressive triple-negative and HER2+ orthotopic, metastatic, and genetically engineered mouse breast cancer models. Gene targets were chosen whose knockdown was predicted to promote tumor neoantigen expression (Upf2, Parp1, Apex1), phagocytosis, and antigen presentation (Cd47), reduce checkpoint inhibition (Cd274), or cause tumor cell death (Mcl1). Four of the six AsiC (Upf2, Parp1, Cd47, and Mcl1) potently inhibited tumor growth and boosted tumor-infiltrating immune cell functions. AsiC mixtures were more effective than individual AsiC and could synergize with anti–PD-1 checkpoint inhibition.
2020
- Single-cell transcriptome profiling reveals neutrophil heterogeneity in homeostasis and infection. Xuemei Xie*, Qiang Shi*, Peng Wu, and 14 more authors. Nature Immunology, 2020.
The full neutrophil heterogeneity and differentiation landscape remains incompletely characterized. Here, we profiled >25,000 differentiating and mature mouse neutrophils using single-cell RNA sequencing to provide a comprehensive transcriptional landscape of neutrophil maturation, function and fate decision in their steady state and during bacterial infection. Eight neutrophil populations were defined by distinct molecular signatures. The three mature peripheral blood neutrophil subsets arise from distinct maturing bone marrow neutrophil subsets. Driven by both known and uncharacterized transcription factors, neutrophils gradually acquire microbicidal capability as they traverse the transcriptional landscape, representing an evolved mechanism for fine-tuned regulation of an effective but balanced neutrophil response. Bacterial infection reprograms the genetic architecture of neutrophil populations, alters dynamic transitions between subpopulations and primes neutrophils for augmented functionality without affecting overall heterogeneity. In summary, these data establish a reference model and general framework for studying neutrophil-related disease mechanisms, biomarkers and therapeutic targets at single-cell resolution.