Contextual Bandit Approach for Energy Saving and Interference Coordination in HetNets

Alcaraz Espin, Juan José; Zanella, Andrea; Zorzi, Michele
2018

Abstract

This paper addresses the joint problem of energy saving and interference coordination in heterogeneous networks (HetNets) using a contextual bandit formulation. We propose a semi-distributed scheme consisting of a learning agent and local controllers. The learning agent comprises a neural network (NN) classifier and a Multi-Armed Bandit (MAB) algorithm. The NN classifier is dynamically trained to choose a subset of configurations (i.e., configurations that are feasible in terms of QoS) based on the context information (network state). Then, the MAB algorithm picks one control (i.e., a set of global configuration parameters) among those selected by the NN classifier, with the aim of improving the energy efficiency. These global configurations are interpreted by the local controllers at each network sector. This scheme allows the learning agent to progressively learn the best policy by observing the network state and the performance of the chosen configurations in terms of energy consumption and QoS. Our numerical results show an energy saving close to 20% with respect to a default policy and an improvement of 13% with respect to addressing energy saving and interference coordination separately.
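
The abstract describes a two-stage learning agent: a classifier that filters the configurations expected to satisfy QoS in the current context, and a bandit that picks one of the remaining configurations to improve energy efficiency. The sketch below is purely illustrative and is not the authors' implementation: it replaces the NN classifier with a per-arm logistic feasibility model and uses a simple epsilon-greedy bandit; all names (N_ARMS, CTX_DIM, feasible_subset, the toy environment, and the reward) are assumptions introduced here for illustration only.

    # Illustrative sketch (not the paper's code): classifier-gated contextual bandit.
    import numpy as np

    rng = np.random.default_rng(0)

    N_ARMS = 8      # hypothetical number of global configurations
    CTX_DIM = 4     # hypothetical dimension of the network-state context
    EPS = 0.1       # exploration rate of the epsilon-greedy bandit

    # Stand-in for the NN classifier: one logistic model per configuration that
    # estimates the probability of meeting the QoS requirement in this context.
    W = np.zeros((N_ARMS, CTX_DIM))

    def feasible_subset(ctx, threshold=0.5):
        """Return the arms predicted to satisfy QoS for the observed context."""
        p = 1.0 / (1.0 + np.exp(-W @ ctx))
        subset = np.flatnonzero(p >= threshold)
        return subset if subset.size else np.arange(N_ARMS)  # fall back to all arms

    # Bandit state: per-arm running average of the observed reward
    # (an energy-efficiency proxy in this toy setting).
    counts = np.zeros(N_ARMS)
    values = np.zeros(N_ARMS)

    def select_arm(subset):
        if rng.random() < EPS:
            return rng.choice(subset)          # explore within the feasible subset
        return subset[np.argmax(values[subset])]  # exploit the best feasible arm

    def update(arm, ctx, reward, qos_ok, lr=0.05):
        # Bandit update: incremental mean of the reward for the chosen arm.
        counts[arm] += 1
        values[arm] += (reward - values[arm]) / counts[arm]
        # Classifier update: one SGD step on the observed QoS feasibility label.
        p = 1.0 / (1.0 + np.exp(-W[arm] @ ctx))
        W[arm] += lr * (float(qos_ok) - p) * ctx

    # Toy interaction loop standing in for the network environment.
    for t in range(1000):
        ctx = rng.normal(size=CTX_DIM)            # observed network state
        arm = select_arm(feasible_subset(ctx))
        qos_ok = rng.random() < 0.8               # toy QoS outcome
        reward = rng.normal(loc=arm / N_ARMS)     # toy energy-efficiency reward
        update(arm, ctx, reward, qos_ok)

The gating step reflects the design choice stated in the abstract: restricting the bandit to configurations predicted to be QoS-feasible shrinks the exploration space and reduces the chance of testing configurations that would violate QoS while the energy-efficiency policy is still being learned.
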
IEEE International Conference on Communications (ICC)
ISBN: 9781538631805

Use this identifier to cite or link to this document: https://hdl.handle.net/11577/3300537