Deep learning is significantly advancing the analysis of electroencephalography (EEG) data, with recent studies reporting remarkably high accuracy across various applications. However, these impressive accuracies are often the result of methodological flaws that lead to inflated and unrealistic estimates. When properly evaluated, state-of-the-art EEG deep learning models often demonstrate poor generalizability. This issue stems from several stages of the training pipeline, including data preprocessing, data partitioning, random seed selection, and cross-validation strategy. Generalizability issues and methodological errors hinder progress in the field by creating unrealistic expectations and limiting the reliability of proposed systems. Conducting rigorous analyses is therefore essential, not only to assess the true capabilities of state-of-the-art models and training approaches but also to guide research toward reliable outcomes. The aim of this thesis is to investigate model generalizability in deep learning-based EEG analysis. First, critical steps in the training pipeline are examined to identify which features of the EEG signals are extracted and exploited by the model during inference on unseen subjects. Insights from these analyses are then used to develop novel architectures and pretraining strategies aimed at mitigating current limitations. This thesis showcases findings from multiple research studies published in scientific journals. It is structured as follows. Chapter 1 outlines the motivation, objectives, and structure of the thesis. Chapter 2 provides an in-depth literature review of self-supervised learning, a promising approach to improve model generalizability by learning meaningful patterns from large unlabeled datasets. Chapter 3 introduces novel software packages for scalable and reproducible EEG deep learning research. Chapters 4 and 5 present two comprehensive analyses of data preprocessing and partitioning, showing how poor design in these steps can lead to data leakage and misuse of inherent signal characteristics (e.g., artifacts, biometric features). Chapter 6 introduces TransformEEG, a novel convolutional-transformer model aimed at reducing performance variability in clinical applications. Chapter 7 evaluates whether self-supervised learning can further improve TransformEEG's generalizability. In conclusion, this thesis offers a comprehensive examination of current generalizability challenges in EEG-based deep learning. It proposes rigorous evaluation protocols, introduces novel modeling and training strategies, and demonstrates that generalizability can be substantially improved through methodological rigor and innovation.

Understanding Model Generalizability in Deep Learning-Based Electroencephalography Analysis and How Self-Supervised Learning Can Improve It / Del Pup, Federico. - (2026 Mar 20).

Understanding Model Generalizability in Deep Learning-Based Electroencephalography Analysis and How Self-Supervised Learning Can Improve It

DEL PUP, FEDERICO
2026

Files in this item:
PhD_Thesis_Del_Pup.pdf
Access: open access
Description: tesi_Federico_DelPup
Type: Doctoral thesis
Size: 9.9 MB
Format: Adobe PDF
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11577/3591225