Technical Report Learning Meaningful Representations of Life Workshop · Singapore, Singapore Apr 28, 2025

Transformer-Based Integrative Patient Representations from Single-Cell RNA Data

Benedikt von Querfurth, Johannes Lohmöller, Jan Pennekamp, Tore Bleckwehl, Rafael Kramann, Klaus Wehrle, Sikander Hayat

Abstract

Single-cell RNA sequencing (scRNA-Seq) is a powerful tool for exploring cellular heterogeneity in healthy and diseased states, yet its translation into clinical insights has been limited. To bridge the gap between detailed cellular analysis and patient-level representations usable for phenotyping, we introduce a transformer-based architecture that embeds single-cell data into patient-level embeddings. The approach uses a self-supervised learning phase to construct integrative patient representations, which are then refined with contrastive learning. On a dataset covering 7 million cells across 1223 individuals with diverse disease states, we show that the learned embeddings are meaningful representations for a variety of downstream analytical tasks. Our approach proves robust to unbalanced datasets and shows indications of learning similarities between related diseases, such as COVID-19 and flu.
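
The abstract does not spell out the contrastive objective, so the following is only a minimal illustrative sketch of how a contrastive refinement of patient-level embeddings could look, not the authors' implementation. It assumes (hypothetically) that each patient is a bag of per-cell embedding vectors, that mean pooling stands in for the transformer aggregator, that two random cell subsets of the same patient form a positive pair, and that a generic InfoNCE loss is used. All function names, the temperature, and the toy data are assumptions made for illustration.

```python
import math
import random


def mean_pool(cells):
    """Average per-cell vectors into one patient-level vector.
    Assumption: a simple stand-in for the transformer aggregator."""
    dim = len(cells[0])
    return [sum(c[i] for c in cells) / len(cells) for i in range(dim)]


def l2_normalize(v):
    """Scale a vector to unit length (zero vectors are left unchanged)."""
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]


def info_nce(view_a, view_b, temperature=0.1):
    """Generic InfoNCE loss: view_a[i] should match view_b[i] and no other row."""
    a = [l2_normalize(v) for v in view_a]
    b = [l2_normalize(v) for v in view_b]
    total = 0.0
    for i, ai in enumerate(a):
        # Cosine similarity of patient i (view A) to every patient (view B)
        logits = [sum(x * y for x, y in zip(ai, bj)) / temperature for bj in b]
        m = max(logits)  # subtract the max for a stable log-sum-exp
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_denom - logits[i]
    return total / len(a)


def two_views(cells, rng, frac=0.5):
    """Hypothetical augmentation: two random cell subsets of one patient."""
    k = max(1, int(len(cells) * frac))
    return mean_pool(rng.sample(cells, k)), mean_pool(rng.sample(cells, k))


rng = random.Random(0)

# Toy data (not from the paper): 4 patients, 50 cells each, 8-dim cell embeddings
patients = []
for _ in range(4):
    center = [rng.gauss(0, 1) for _ in range(8)]
    patients.append([[c + rng.gauss(0, 0.1) for c in center] for _ in range(50)])

views = [two_views(cells, rng) for cells in patients]
view_a = [v[0] for v in views]
view_b = [v[1] for v in views]
loss = info_nce(view_a, view_b)
print(f"contrastive loss on toy patients: {loss:.4f}")
```

In this sketch, minimizing the loss pulls the two views of the same patient together and pushes different patients apart; a label-aware variant (for example, treating patients with the same disease state as positives) would be a different design choice that the abstract neither confirms nor rules out.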

Cite this paper
@misc{querfurth2025transformer-based,
  title     = {Transformer-Based Integrative Patient Representations from Single-Cell RNA Data},
  author    = {Benedikt von Querfurth and Johannes Lohmöller and Jan Pennekamp and Tore Bleckwehl and Rafael Kramann and Klaus Wehrle and Sikander Hayat},
  booktitle = {Learning Meaningful Representations of Life Workshop},
  year      = {2025},
  doi       = {10.18154/RWTH-2025-03883},
}