Can Correspondence Analysis Challenge Transformers in Authorship Attribution Tasks?

Using a large corpus of 76 contemporary Italian popular mystery novels by 16 different authors, this study assesses the performance of large language models in an authorship attribution test. The results obtained with transformer-based and correspondence analysis vector representations are compared and contrasted in machine learning classification tasks. Although previous works have shown transformers to perform better than the alternatives, in this case correspondence analysis wins the challenge. The results support the hypothesis that specialized large corpora require tailor-made representations.
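As a rough illustration of what a correspondence analysis vector representation can look like, the sketch below computes principal row coordinates from a document-by-term count matrix through a singular value decomposition of the standardized residuals. This is not taken from the paper: the function name `ca_row_coordinates`, the toy matrix, and the choice of two components are assumptions made only for illustration, and the authors' actual preprocessing and classifiers are not described in this record.

```python
import numpy as np

def ca_row_coordinates(counts, n_components=2):
    """Principal row coordinates of a simple correspondence analysis (illustrative sketch)."""
    N = np.asarray(counts, dtype=float)
    P = N / N.sum()                      # correspondence matrix (relative frequencies)
    r = P.sum(axis=1)                    # row masses (one per text)
    c = P.sum(axis=0)                    # column masses (one per word type)
    expected = np.outer(r, c)            # independence model r c^T
    S = (P - expected) / np.sqrt(expected)   # standardized residuals
    U, sigma, _ = np.linalg.svd(S, full_matrices=False)
    # Scale left singular vectors by singular values and inverse root row masses
    F = (U * sigma) / np.sqrt(r)[:, None]
    # Squared singular values are the principal inertias of each axis
    return F[:, :n_components], sigma[:n_components] ** 2

# Hypothetical toy data: 4 text chunks (rows) x 5 word types (columns)
toy_counts = np.array([
    [10, 2, 0, 3, 5],
    [ 9, 3, 1, 2, 6],
    [ 1, 8, 7, 0, 2],
    [ 0, 9, 6, 1, 3],
])
coords, inertia = ca_row_coordinates(toy_counts)
print(coords)
print(inertia)
```

In a setting like the one the abstract describes, each row of `coords` could serve as the feature vector of a text and be passed to any standard classifier. That pairing is an assumption here, not a description of the study's pipeline.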

Sciandra A.; Tuzzi A.

2024

Year	2024
Book title	Proceedings of the Statistics and Data Science 2024 Conference - New perspectives on Statistics and Data Science
Conference title	Statistics and Data Science 2024 Conference
ISBN	978-88-5509-645-4
Appears in types	04.01 - Contribution in conference proceedings

Files in this record:

File	Size	Format
Atti-SDS-2024-Sciandra_Tuzzi.pdf (open access; Description: Full text; Type: Published (Publisher's Version of Record); License: Free access)	619.46 kB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11577/3518541
