Towards Lower Bounds on the Depth of ReLU Neural Networks

We contribute to a better understanding of the class of functions that is represented by a neural network with ReLU activations and a given architecture. Using techniques from mixed-integer optimization, polyhedral theory, and tropical geometry, we provide a mathematical counterbalance to the universal approximation theorems which suggest that a single hidden layer is sufficient for learning tasks. In particular, we investigate whether the class of exactly representable functions strictly increases by adding more layers (with no restrictions on size). This problem has potential impact on algorithmic and statistical aspects because of the insight it provides into the class of functions represented by neural hypothesis classes. However, to the best of our knowledge, this question has not been investigated in the neural network literature. We also present upper bounds on the sizes of neural networks required to represent functions in these neural hypothesis classes.

Hertrich C.; Basu A.; Di Summa M.; Skutella M.

2021

Year: 2021
Book title: Advances in Neural Information Processing Systems
Series: Advances in Neural Information Processing Systems
Conference title: 35th Conference on Neural Information Processing Systems, NeurIPS 2021
WOS code: WOS:000925183302072
Scopus code: 2-s2.0-85128527699
Appears in types: 04.01 - Contribution in conference proceedings

Files in this item:

File	Size	Format
Relu.pdf (open access; Description: Article; Type: Published (publisher's version); License: Free access)	372.28 kB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11577/3455162
