Understanding Model Generalizability in Deep Learning-Based Electroencephalography Analysis and How Self-Supervised Learning Can Improve It

Del Pup, Federico

Deep learning is significantly advancing the analysis of electroencephalography (EEG) data, with recent studies reporting remarkably high accuracy across various applications. However, impressive accuracies are often the result of methodological flaws that lead to inflated and unrealistic estimates. When properly evaluated, state-of-the-art EEG deep learning models often demonstrate poor generalizability. This issue stems from several stages of the training pipeline, including data preprocessing, data partitioning, random seed selection, and cross-validation strategy. Generalizability issues and methodological errors hinder progress in the field by creating unrealistic expectations and limiting the reliability of proposed systems. Conducting rigorous analyses is therefore essential, not only to assess the true capabilities of state-of-the-art models and training approaches but also to guide research toward reliable outcomes. The aim of this thesis is to investigate model generalizability in deep learning-based EEG analysis. First, critical steps in the training pipeline are examined to identify which features of the EEG signals are extracted and exploited by the model during inference on unseen subjects. Insights from these analyses are then used to develop novel architectures and pretraining strategies aimed at mitigating current limitations. This thesis showcases findings from multiple research studies published in scientific journals. It is structured as follows. Chapter 1 outlines the motivation, objectives, and structure of the thesis. Chapter 2 provides an in-depth literature review of self-supervised learning, a promising approach to improve model generalizability by learning meaningful patterns from large unlabeled datasets. Chapter 3 introduces novel software packages for scalable and reproducible EEG deep learning research. 
Chapters 4 and 5 present two comprehensive analyses of data preprocessing and partitioning, showing how poor design in these steps can lead to data leakage and misuse of inherent signal characteristics (e.g., artifacts, biometric features). Chapter 6 introduces TransformEEG, a novel convolutional-transformer model aimed at reducing performance variability in clinical applications. Chapter 7 evaluates whether self-supervised learning can further improve TransformEEG's generalizability. In conclusion, this thesis offers a comprehensive examination of current generalizability challenges in EEG-based deep learning. It proposes rigorous evaluation protocols, introduces novel modeling and training strategies, and demonstrates that generalizability can be substantially improved through methodological rigor and innovation.

Understanding Model Generalizability in Deep Learning-Based Electroencephalography Analysis and How Self-Supervised Learning Can Improve It / Del Pup, Federico. - (2026 Mar 20).

	Title in English
	
				Understanding Model Generalizability in Deep Learning-Based Electroencephalography Analysis and How Self-Supervised Learning Can Improve It
			
	Date of defense
	
				20 Mar 2026
			
	Citation
	
				Understanding Model Generalizability in Deep Learning-Based Electroencephalography Analysis and How Self-Supervised Learning Can Improve It / Del Pup, Federico. - (2026 Mar 20).
			
	Appears in collections:
	
				08.01 - UNIPD Doctoral Theses (Legal Deposit)

Files in this item:

File	Size	Format
PhD_Thesis_Del_Pup.pdf (open access; description: tesi_Federico_DelPup; type: doctoral thesis)	9.9 MB	Adobe PDF