All-for-One and One-for-All: Deep Learning-Based Feature Fusion for Synthetic Speech Detection

Mari, D.; Salvi, D.; Bestagini, P.; Milani, S.

doi:10.1007/978-3-031-74627-7_39

Recent advances in deep learning and computer vision have made the synthesis and counterfeiting of multimedia content more accessible than ever, exposing users to threats from malicious actors. In the audio field, speech deepfake generation techniques are growing rapidly, which calls for synthetic speech detection algorithms that can counter malicious uses such as fraud or identity theft. In this paper, we consider three different feature sets proposed in the literature for the synthetic speech detection task and present a model that fuses them, achieving better overall performance than state-of-the-art solutions. The system was tested in different scenarios and on different datasets to assess its robustness to anti-forensic attacks and its generalization capabilities.
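As a rough illustration of what fusing several feature sets can mean in practice, the sketch below concatenates three per-utterance feature vectors and scores the result with a plain logistic function. This is not the authors' model: the abstract does not describe the architecture, and the function names, vector sizes, and weights here are all hypothetical. A deep-learning system would replace the logistic scorer with a trained network.

```python
import math

def fuse(feature_sets):
    """Concatenate several feature vectors into one fused vector."""
    fused = []
    for vec in feature_sets:
        fused.extend(vec)
    return fused

def score(fused, weights, bias=0.0):
    """Logistic score in (0, 1); a stand-in for a trained classifier."""
    logit = sum(w * x for w, x in zip(weights, fused)) + bias
    return 1.0 / (1.0 + math.exp(-logit))

# Three hypothetical feature sets for one utterance (values are made up).
feats_a = [0.2, -0.1, 0.4]
feats_b = [1.0, 0.5]
feats_c = [-0.3, 0.0, 0.7, 0.1]

fused = fuse([feats_a, feats_b, feats_c])
weights = [0.1] * len(fused)   # placeholder weights, not learned values
print(len(fused), score(fused, weights))
```

Concatenation is only one possible fusion strategy; combining the scores of separate per-feature detectors is a common alternative.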

Mari D.; Salvi D.; Bestagini P.; Milani S. (Supervision)

Year: 2025
Book title: Communications in Computer and Information Science
Series: Communications in Computer and Information Science
DOI: https://dx.doi.org/10.1007/978-3-031-74627-7_39
Scopus ID: 2-s2.0-85215589508
Publication type: 02.01 - Contribution in volume (Chapter or Essay)


Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11577/3566399
