Imputing abundance of over 2,500 surface proteins from single-cell transcriptomes with context-agnostic zero-shot deep ensembles.

Authors :
Chen R
Zhou J
Chen B
Source :
Cell systems [Cell Syst] 2024 Sep 18; Vol. 15 (9), pp. 869-884.e6. Date of Electronic Publication: 2024 Sep 06.
Publication Year :
2024

Abstract

Cell surface proteins serve as primary drug targets and cell identity markers. Techniques such as CITE-seq (cellular indexing of transcriptomes and epitopes by sequencing) have enabled the simultaneous quantification of surface protein abundance and transcript expression within individual cells. The published data have been utilized to train machine learning models for predicting surface protein abundance solely from transcript expression. However, the small scale of proteins predicted and the poor generalization ability of these computational approaches across diverse contexts (e.g., different tissues/disease states) impede their widespread adoption. Here, we propose SPIDER (surface protein prediction using deep ensembles from single-cell RNA sequencing), a context-agnostic zero-shot deep ensemble model, which enables large-scale protein abundance prediction and generalizes better to various contexts. Comprehensive benchmarking shows that SPIDER outperforms other state-of-the-art methods. Using the predicted surface abundance of >2,500 proteins from single-cell transcriptomes, we demonstrate the broad applications of SPIDER, including cell type annotation, biomarker/target identification, and cell-cell interaction analysis in hepatocellular carcinoma and colorectal cancer. A record of this paper's transparent peer review process is included in the supplemental information.

Competing Interests: The authors declare no competing interests.

(Copyright © 2024 Elsevier Inc. All rights reserved.)

Details

Language :
English
ISSN :
2405-4720
Volume :
15
Issue :
9
Database :
MEDLINE
Journal :
Cell systems
Publication Type :
Academic Journal
Accession number :
39243755
Full Text :
https://doi.org/10.1016/j.cels.2024.08.006