Robotic Object Sorting via Deep Reinforcement Learning: A generalized approach

This work proposes a general formulation of the Object Sorting problem, suitable for describing any non-deterministic environment characterized by friendly and adversarial interference. Coupled with a Deep Reinforcement Learning algorithm, this formulation allows training policies that solve different sorting tasks without adjusting the architecture or modifying the learning method. Briefly, the environment is subdivided into a clutter, where objects are freely located, and a set of clusters, where objects should be placed according to predefined ordering and classification rules. A 3D grid discretizes this environment: the state of each cell is given by the properties of the object it contains, which include object category and order. The problem is formulated as a Markov Decision Process: at each time step, the state of the cells fully defines the state of the environment. Users can custom-define object classes, ordering priorities, and failure rules, the latter by assigning a non-uniform risk probability to each cell. In the experiments, a Deep Reinforcement Learning model was successfully trained and validated to solve several sorting tasks while minimizing the number of moves and the failure probability. The results demonstrate that the system can handle non-deterministic events, such as failures, and unpredictable external disturbances, such as interventions by human users.
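
As an illustration of the grid-based state described in the abstract, the following is a minimal Python sketch, not the authors' implementation. The names `SortObject`, `GridState`, and `move` are hypothetical, and the failure model (a move fails with the risk probability of the target cell, leaving the object in place) is an assumption made only for this example; the abstract does not specify these details.

```python
import random
from dataclasses import dataclass, field
from typing import Dict, Tuple

Cell = Tuple[int, int, int]  # (x, y, z) index of a cell in the 3D grid


@dataclass(frozen=True)
class SortObject:
    category: str  # user-defined object class
    order: int     # user-defined ordering priority


@dataclass
class GridState:
    """Hypothetical state container: the contents of all cells define the state."""
    shape: Tuple[int, int, int]
    cells: Dict[Cell, SortObject] = field(default_factory=dict)
    risk: Dict[Cell, float] = field(default_factory=dict)  # per-cell failure probability

    def move(self, src: Cell, dst: Cell, rng: random.Random) -> bool:
        """Try to move the object in src to dst; return True on success."""
        if src not in self.cells or dst in self.cells:
            raise ValueError("src must be occupied and dst must be empty")
        if not all(0 <= c < s for c, s in zip(dst, self.shape)):
            raise ValueError("dst lies outside the grid")
        # Assumed failure model: the move fails with the risk of the target cell.
        if rng.random() < self.risk.get(dst, 0.0):
            return False  # the object stays in src
        self.cells[dst] = self.cells.pop(src)
        return True


# Usage: one object in the clutter, moved to a cell with a 20% assumed risk.
state = GridState(shape=(3, 3, 1))
state.cells[(0, 0, 0)] = SortObject(category="bolt", order=1)
state.risk[(2, 2, 0)] = 0.2
succeeded = state.move((0, 0, 0), (2, 2, 0), random.Random(0))
```

In a reinforcement learning setting, a step function built on such a state would typically also return a reward that penalizes each move and each failure, which matches the stated goal of minimizing both; the exact reward used in the paper is not given in this record.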

Nicola G.; Tagliapietra L.; Tosello E.; Navarin N.; Ghidoni S.; Menegatti E.

2020


Year: 2020
Book title: 29th IEEE International Conference on Robot and Human Interactive Communication, RO-MAN 2020
Conference title: 29th IEEE International Conference on Robot and Human Interactive Communication, RO-MAN 2020
DOI: https://dx.doi.org/10.1109/RO-MAN47096.2020.9223484
WOS ID: WOS:000598571700184
Scopus ID: 2-s2.0-85095746167
ISBN: 9781728160757
Appears in types: 04.01 - Conference proceedings contribution

Files in this item:

File	Size	Format
09223484.pdf (restricted access; description: main article; type: Published (Publisher's Version of Record); license: private access, not public)	4.18 MB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11577/3358161
