Decentralized LLM Inference over Edge Networks with Energy Harvesting

Khoshsirat, Aria;Perin, Giovanni;Rossi, Michele
2024

Abstract

Large language models have significantly transformed multiple fields with their exceptional performance in natural language tasks, but their deployment in resource-constrained environments like edge networks presents an ongoing challenge. Decentralized techniques for inference have emerged, distributing the model blocks among multiple devices to improve flexibility and cost effectiveness. However, energy limitations remain a significant concern for edge devices. We propose a sustainable model for collaborative inference on interconnected, battery-powered edge devices with energy harvesting. A semi-Markov model is developed to describe the states of the devices, considering processing parameters and average green energy arrivals. This informs the design of scheduling algorithms that aim to minimize device downtimes and maximize network throughput. Through empirical evaluations and simulated runs, we validate the effectiveness of our approach, paving the way for energy-efficient decentralized inference over edge networks.
Proceedings of the 2024 IEEE Global Communications Conference
GLOBECOM 2024 - 2024 IEEE Global Communications Conference
   SmaRt, AutOmated, and ReliaBle SecUrity Service PlaTform for 6G
   ROBUST-6G
   European Commission
   Horizon Europe Framework Programme
   101139068

   Taming the environmental impact of mobile networks through GREEN EDGE computing platforms
   GREENEDGE
   European Commission
   Horizon 2020 Framework Programme
   953775

   "Telecommunications of the future"
   RESTART
   European Commission
   Italian NRRP
   PE0000001 - program “RESTART”
Files in this item:
File: 2408.15907v1.pdf
Access: open access
Type: Preprint (AM - Author's Manuscript - submitted)
License: Creative Commons
Size: 370.51 kB
Format: Adobe PDF
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11577/3550237
Citations
  • PubMed Central: not available
  • Scopus: 4
  • Web of Science: 3
  • OpenAlex: 4