Server Selection and Inference Rate Optimization in AoI-Driven Distributed Systems
Badia L.;
2025
Abstract
Many of today's user applications are both time-critical and computationally intensive. A typical example is provided by assisted- and self-driving systems, where the data collected by onboard sensors must be fused over network computing elements, possibly using artificial intelligence (AI) tools, to accurately reconstruct a vehicle's environment in a sufficiently short time to guarantee safe operations. Our study considers this example, but also covers more general cases, and extends to any system in which independent sources generate time-critical queries for networked services. Obtaining good performance in these cases requires the careful engineering of both communication networks and computing facilities. In addition, when multiple computation facilities are available to run AI processes (in the fog, edge or cloud, or even on the device itself), users running those time-critical and computationally intensive applications experience the dilemma of which remote resource to use so as to obtain results within the limited available time budget. This does not necessarily imply the choice of the fastest servers, as they may end up getting congested by multiple requests. In this paper, we use optimization and game theory to analyze the balance of user updates among remote AI engines, as well as the choice of the intensity of user traffic, trying to optimize the age of information (AoI) that users experience on their time-critical AI-assisted processes. We show that targeting the minimization of AoI leads to non-trivial server selection and data injection policies, and that the unavoidable price of anarchy of systems that enforce a distributed AI server selection can be low, as long as autonomous adaptation of the individual injection rate of the users is properly kept under control.Pubblicazioni consigliate
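As a concrete illustration of the server-selection dilemma described in the abstract, the following is a minimal sketch, not taken from the paper: it assumes each AI server behaves as an M/M/1 FCFS queue and uses the classical average-AoI formula of Kaul, Yates, and Gruteser (2012), AoI = (1/mu)(1 + 1/rho + rho^2/(1 - rho)) with rho = lambda/mu. The rates mu, lam_bg, and lam_u are hypothetical values chosen only to show that the fastest server is not always the AoI-optimal one.

```python
# Hedged sketch of AoI-aware server selection (assumptions, not the paper's model):
# each server j is an M/M/1 FCFS queue with service rate mu[j]; lam_bg[j] is the
# aggregate update rate already directed at server j; a new user adds rate lam_u.

def avg_aoi_mm1(lam: float, mu: float) -> float:
    """Average AoI of an M/M/1 FCFS server (Kaul-Yates-Gruteser formula)."""
    rho = lam / mu
    if rho >= 1.0:
        return float("inf")  # unstable queue: the age grows without bound
    return (1.0 / mu) * (1.0 + 1.0 / rho + rho**2 / (1.0 - rho))

def best_server(mu, lam_bg, lam_u):
    """Pick the server that minimizes the user's average AoI given background load."""
    aoi = [avg_aoi_mm1(b + lam_u, m) for m, b in zip(mu, lam_bg)]
    j = min(range(len(mu)), key=lambda k: aoi[k])
    return j, aoi[j]

if __name__ == "__main__":
    mu = [10.0, 6.0]      # server 0 is the fastest...
    lam_bg = [8.0, 1.0]   # ...but already heavily loaded by other users
    j, a = best_server(mu, lam_bg, lam_u=1.0)
    print(f"chosen server: {j}, expected average AoI: {a:.3f}")
    # Here the slower but lightly loaded server 1 yields lower AoI
    # (~0.69 vs ~1.02), matching the abstract's point that AoI minimization
    # does not reduce to picking the fastest server.
```

Under these assumptions, a selfish user evaluates each server's post-join AoI rather than its raw speed; comparing the resulting distributed choices against a centrally computed optimum is what the paper's price-of-anarchy analysis quantifies.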