Adversarial training reduces information and improves transferability

Matteo Terzi;Alessandro Achille;Marco Maggipinto;Gian Antonio Susto

2021

Abstract

Recent results show that features of adversarially trained networks for classification, in addition to being robust, enable desirable properties such as invertibility. The latter property may seem counter-intuitive, as it is widely accepted by the community that classification models should only capture the minimal information (features) required for the task. Motivated by this discrepancy, we investigate the dual relationship between Adversarial Training and Information Theory. We show that Adversarial Training can improve linear transferability to new tasks, from which arises a new tradeoff between transferability of representations and accuracy on the source task. We validate our results on several datasets, employing robust networks trained on CIFAR-10, CIFAR-100 and ImageNet. Moreover, we show that Adversarial Training reduces Fisher information of representations about the input and of the weights about the task, and we provide a theoretical argument which explains the invertib...

Year	2021
Book title	AAAI Conference on Artificial Intelligence
Series	PROCEEDINGS OF THE ... AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE
Conference title	35th AAAI Conference on Artificial Intelligence, AAAI 2021
DOI	https://dx.doi.org/10.1609/aaai.v35i3.16371
WOS code	WOS:000680423502086
Scopus code	2-s2.0-85122476609
ISBN	9781713835974
Appears in types	04.01 - Contribution in conference proceedings

Files in this item:

File	Size	Format
16371-Article Text-19865-1-2-20210518.pdf (open access; Type: Published (Publisher's Version of Record); License: Free access)	1.45 MB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11577/3371789
