A recent publication in Nature Communications, led by Massimo Andreatta and directed by Santiago Carmona, examines how prior knowledge of cell types can be incorporated into the integration of single-cell transcriptomics data. The approach allows diverse samples to be integrated while retaining biological variability.
Combining data from different samples and sources makes it possible to explore biological diversity across tissues and conditions. However, integrating single-cell omics data carries the risk of "overcorrection," in which the integration step removes genuine biological differences between samples and causes a significant loss of biological variability. In this study, the researchers introduce a novel algorithm that incorporates prior knowledge of the expected cell types during sample integration, thereby preserving biological diversity. They advocate widespread adoption of this approach, highlighting its effectiveness in guiding dataset integration.
This work was supported by the Swiss National Science Foundation, the Swiss Cancer Research foundation, and the ISREC foundation.