The analysis of tumour microenvironment cell composition from a single-cell perspective.


Author(s): Laura Masatti, Stefania Pirrotta, Nicolò Gnoato, Matteo Marchetti, Robert Fruscio, Lorenzo Ceppi, Laura Mannarino, Chiara Romualdi, Maurizio D'Incalci, Sergio Marchini, Roberto Tozzi, Enrica Calura

Affiliation(s): Department of Biology, University of Padova, Italy.



Ovarian cancer (OC) is a common gynecologic cancer and a major concern in women's health. Despite ongoing efforts, it remains difficult to diagnose and treat effectively. Its most aggressive subtype, high-grade serous ovarian cancer (HGSOC), is one of the leading causes of mortality among women because of its invasiveness and tendency to metastasise. Moreover, the lack of effective early-stage screening leads to a poor outlook, with almost 80% of patients diagnosed at an advanced stage. One of the defining traits of the disease, and a reason for its aggressiveness, is the extensive heterogeneity of the tumour cells and their tumour microenvironment (TME). The intricate interplay among the different cell types acting in the tumour mass is still one of the major missing pieces in the reconstruction of this cancer's mechanisms. Defining these interactions and dissecting the mechanisms of action of TME cells remains limited by the difficulty of identifying the specific cell types and subtypes. Analysing and interpreting large single-cell datasets relies heavily on cell-type annotation. Despite recent progress in sequencing technologies, it remains difficult to assign the correct cell type to each cell without a large collection of different omics for the same samples, such as single-cell transcriptomic data paired with spatial transcriptomic data. The currently predominant approaches each have limitations: manual cell annotation requires expert knowledge and is slow; reference-based methods such as SingleR, CIBERSORTx and others need a specific reference containing all the cell types expected in the sample, which in some cases (e.g. HGSOC) still needs to be identified; and marker-based methods, although they can define different cell types from manually curated marker sets, are less accurate and are affected by markers that are expressed, at different levels, in several cell types at the same time.

We will evaluate the annotations produced by the methods proposed in the literature, benchmarking the tools developed for cell assignment and annotation in single-cell data. We run this benchmark on public OC datasets and on an HGSOC dataset composed of 17 single-cell samples collected from different patients at distinct body sites, to evaluate how cell composition varies with the metastatic site. We then plan to perform an unsupervised clustering analysis, identify the differentially expressed genes of each cluster and, based on their statistics, subdivide the clusters according to their significant marker genes. This step will help us confirm the annotation accuracy for known cell types and identify the unknown subtypes composing the samples. Then, by comparing the annotations, we will establish the concordant ones, select the cells assigned the same cell type across methods, and annotate their shared markers. The analysis has two purposes: to identify the specific set of markers for known and unknown cell types and subtypes, and to pinpoint the cells that best represent the different cell subtypes by their gene expression, thereby collecting a set of cells from which to construct a specific reference containing all the cell types expected in our samples. We hope our pipeline will define common steps for assessing accuracy that improve the strategy for cell-type annotation, relying on a shared, understandable method that is stronger from both a biological and a statistical point of view.
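
The step of comparing annotations and keeping the concordant ones could be sketched as follows. This is only an illustrative sketch, not the pipeline used in this work: the function name `consensus_labels`, the `min_agreement` threshold, the method names and the cell labels are all hypothetical, and it assumes that the labels from the different methods have already been harmonised to a common vocabulary.

```python
from collections import Counter


def consensus_labels(annotations, min_agreement=1.0):
    """Keep cells whose label is shared by enough annotation methods.

    annotations: dict mapping method name -> dict of cell id -> label
                 (hypothetical input layout, assumed for this sketch).
    min_agreement: fraction of methods that must agree; 1.0 means unanimity.
                   Values above 0.5 guarantee that the winning label is unique.
    """
    if not annotations:
        return {}
    methods = list(annotations)
    # Only cells annotated by every method can be compared.
    shared_cells = set.intersection(*(set(annotations[m]) for m in methods))
    consensus = {}
    for cell in sorted(shared_cells):
        votes = Counter(annotations[m][cell] for m in methods)
        label, count = votes.most_common(1)[0]
        if count / len(methods) >= min_agreement:
            consensus[cell] = label
    return consensus


# Toy example with made-up cells and labels.
example = {
    "manual": {"c1": "T cell", "c2": "Fibroblast", "c3": "Epithelial"},
    "reference_based": {"c1": "T cell", "c2": "Fibroblast", "c3": "Macrophage"},
    "marker_based": {"c1": "T cell", "c2": "Endothelial", "c3": "Epithelial", "c4": "B cell"},
}
unanimous = consensus_labels(example)
majority = consensus_labels(example, min_agreement=0.6)
```

Under these assumptions, the unanimously labelled cells would be the natural candidates for a sample-specific reference, while a looser threshold trades stringency for the number of cells retained.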