Contextual Bandit Approach for Energy Saving and Interference Coordination in HetNets

Alcaraz Espin, Juan José; Zanella, Andrea; Zorzi, Michele
2018

Abstract

This paper addresses the joint problem of energy saving and interference coordination in heterogeneous networks (HetNets) using a contextual bandit formulation. We propose a semi-distributed scheme consisting of a learning agent and local controllers. The learning agent comprises a neural network (NN) classifier and a Multi-Armed Bandit (MAB) algorithm. The NN classifier is dynamically trained to choose a subset of configurations (i.e., configurations that are feasible in terms of QoS) based on the context information (network state). Then, the MAB algorithm picks one control (i.e., a set of global configuration parameters) among those selected by the NN classifier, with the aim of improving the energy efficiency. These global configurations are interpreted by the local controllers at each network sector. This scheme allows the learning agent to progressively learn the best policy by observing the network state and the performance of the chosen configurations in terms of energy consumption and QoS. Our numerical results show an energy saving close to 20% with respect to a default policy and an improvement of 13% with respect to addressing energy saving and interference coordination separately.
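
The abstract describes a two-stage learning agent: a classifier that filters the configurations expected to satisfy QoS in the current context, and a bandit that picks one of the remaining configurations to improve energy efficiency. The sketch below is purely illustrative and is not the authors' implementation: it replaces the NN classifier with a per-arm logistic feasibility model and uses a simple epsilon-greedy bandit; all names (N_ARMS, CTX_DIM, feasible_subset, the toy environment, and the reward) are assumptions introduced here for illustration only.

    # Illustrative sketch (not the paper's code): classifier-gated contextual bandit.
    import numpy as np

    rng = np.random.default_rng(0)

    N_ARMS = 8      # hypothetical number of global configurations
    CTX_DIM = 4     # hypothetical dimension of the network-state context
    EPS = 0.1       # exploration rate of the epsilon-greedy bandit

    # Stand-in for the NN classifier: one logistic model per configuration that
    # estimates the probability of meeting the QoS requirement in this context.
    W = np.zeros((N_ARMS, CTX_DIM))

    def feasible_subset(ctx, threshold=0.5):
        """Return the arms predicted to satisfy QoS for the observed context."""
        p = 1.0 / (1.0 + np.exp(-W @ ctx))
        subset = np.flatnonzero(p >= threshold)
        return subset if subset.size else np.arange(N_ARMS)  # fall back to all arms

    # Bandit state: per-arm running average of the observed reward
    # (an energy-efficiency proxy in this toy setting).
    counts = np.zeros(N_ARMS)
    values = np.zeros(N_ARMS)

    def select_arm(subset):
        if rng.random() < EPS:
            return rng.choice(subset)          # explore within the feasible subset
        return subset[np.argmax(values[subset])]  # exploit the best feasible arm

    def update(arm, ctx, reward, qos_ok, lr=0.05):
        # Bandit update: incremental mean of the reward for the chosen arm.
        counts[arm] += 1
        values[arm] += (reward - values[arm]) / counts[arm]
        # Classifier update: one SGD step on the observed QoS feasibility label.
        p = 1.0 / (1.0 + np.exp(-W[arm] @ ctx))
        W[arm] += lr * (float(qos_ok) - p) * ctx

    # Toy interaction loop standing in for the network environment.
    for t in range(1000):
        ctx = rng.normal(size=CTX_DIM)            # observed network state
        arm = select_arm(feasible_subset(ctx))
        qos_ok = rng.random() < 0.8               # toy QoS outcome
        reward = rng.normal(loc=arm / N_ARMS)     # toy energy-efficiency reward
        update(arm, ctx, reward, qos_ok)

The gating step reflects the design choice stated in the abstract: restricting the bandit to configurations predicted to be QoS-feasible shrinks the exploration space and reduces the chance of testing configurations that would violate QoS while the energy-efficiency policy is still being learned.
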
IEEE International Conference on Communications (ICC)
ISBN: 9781538631805

Use this identifier to cite or link to this document: https://hdl.handle.net/11577/3300537