ALLSTAR: un nuovo algoritmo bioinformatico per inferire regole causali tra mutazioni somatiche e fenotipi tumorali / Collesei, Antonio. - (2024 May 14).

ALLSTAR: un nuovo algoritmo bioinformatico per inferire regole causali tra mutazioni somatiche e fenotipi tumorali

COLLESEI, ANTONIO
2024

Abstract

Motivation: Recent advances in DNA sequencing technologies have allowed the detailed characterization of genomes in large cohorts of tumors, highlighting their extreme heterogeneity: no two tumors share the same complement of somatic mutations. Such heterogeneity hinders our ability to identify somatic mutations important for the disease, including mutations that determine clinically relevant phenotypes (e.g., cancer subtypes). Several tools have been developed to identify somatic mutations related to cancer phenotypes. However, such tools identify correlations between somatic mutations and cancer phenotypes, with no guarantee of highlighting causal relations.

Results: This thesis is centered on ALLSTAR, a novel tool I developed in a collaboration between the Veneto Institute of Oncology and the Department of Information Engineering at the University of Padova. The tool infers reliable causal relations between combinations of somatic mutations and cancer phenotypes. ALLSTAR ranks causal rules by their impact, measured as the average effect on the phenotype. Having proved that the underlying computational problem is NP-hard, I developed a branch-and-bound approach that employs protein-protein interaction networks and novel bounds to prune the search space, while properly correcting for multiple hypothesis testing. An extensive experimental evaluation on synthetic data shows that ALLSTAR is able to identify reliable causal relations in large cancer cohorts. Moreover, the reliable causal rules identified in cancer data show that my approach retrieves several somatic mutations known to be relevant for cancer phenotypes, as well as novel biologically meaningful relations.

Availability and Implementation: Code, data, and scripts to reproduce the experiments are available at https://github.com/VandinLab/ALLSTAR.
ALLSTAR: A Novel Bioinformatic Algorithm to Infer Causal Rules between Somatic Mutations and Cancer Phenotypes
14 May 2024
Files in this item:
File: tesi_definitiva_Antonio_Collesei.pdf
Access: open access
Description: tesi_definitiva_Antonio_Collesei
Type: Doctoral thesis
Size: 3.25 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11577/3520422