Unveiling Trustworthy AI Challenges: Characterizing Prediction Reliability
Trescato, Isotta;Guazzo, Alessandro;Longato, Enrico;Tavazzi, Erica;Vettoretti, Martina;Di Camillo, Barbara
2024
Abstract
Ensuring trust in machine learning (ML) models has become increasingly important, especially in sensitive areas such as healthcare. To this end, this study proposes a methodological framework to characterize prediction reliability while training ML models. The framework relies on bootstrap resampling to compute a metric based on the rank difference between true and predicted event risks. This approach allows a population of patients to be stratified into groups according to the reliability of their predictions, expressed as a function of the number of input variables considered by the model. Finally, the characteristics of the resulting groups are inspected from two perspectives: the model perspective and the variable perspective. The first analysis uses Shapley values to inspect how the model relies on the input variables when making predictions for patients assigned to different groups. The second investigates differences in variable distributions to ensure that the groups do not represent different populations. To showcase the potential of this approach, the paper includes a case study on the development and prediction reliability characterization of models that predict death due to amyotrophic lateral sclerosis.
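The abstract does not give the exact definition of the reliability metric, so the following is only an illustrative sketch of the general idea of a bootstrap-based rank-difference score, not the authors' implementation. It assumes that each patient is scored on out-of-bag predictions and that the score is the mean absolute difference between true-risk rank and predicted-risk rank, normalized to [0, 1]. The names `ranks`, `bootstrap_rank_difference`, and `fit_linear`, as well as the toy one-variable linear model, are hypothetical.

```python
import random
from statistics import mean


def ranks(values):
    """Return 0-based ranks (0 = lowest value); ties are broken by position."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, idx in enumerate(order):
        result[idx] = rank
    return result


def bootstrap_rank_difference(X, true_risk, fit, n_boot=200, seed=0):
    """Per-patient mean normalized |rank(true) - rank(predicted)|.

    Assumption (not from the paper): patients are scored only when they are
    out-of-bag, i.e. absent from the bootstrap sample used for training.
    Returns None for a patient who was never out-of-bag.
    """
    rng = random.Random(seed)
    n = len(X)
    diffs = [[] for _ in range(n)]
    for _ in range(n_boot):
        sample = [rng.randrange(n) for _ in range(n)]  # draw with replacement
        in_bag = set(sample)
        oob = [i for i in range(n) if i not in in_bag]
        if len(oob) < 2:
            continue  # ranks are meaningless with fewer than two patients
        predict = fit([X[i] for i in sample], [true_risk[i] for i in sample])
        r_true = ranks([true_risk[i] for i in oob])
        r_pred = ranks([predict(X[i]) for i in oob])
        for k, i in enumerate(oob):
            diffs[i].append(abs(r_true[k] - r_pred[k]) / (len(oob) - 1))
    return [mean(d) if d else None for d in diffs]


def fit_linear(xs, ys):
    """Hypothetical stand-in model: least-squares line on the first variable."""
    mx, my = mean(x[0] for x in xs), mean(ys)
    sxx = sum((x[0] - mx) ** 2 for x in xs)
    slope = (
        sum((x[0] - mx) * (y - my) for x, y in zip(xs, ys)) / sxx if sxx else 0.0
    )
    return lambda x: my + slope * (x[0] - mx)


if __name__ == "__main__":
    data_rng = random.Random(42)
    X = [[float(i)] for i in range(30)]
    # Synthetic "true risk": linear in the variable plus noise.
    risk = [2.0 * x[0] + data_rng.gauss(0, 5) for x in X]
    scores = bootstrap_rank_difference(X, risk, fit_linear, n_boot=100, seed=1)
    for i, s in enumerate(scores[:5]):
        print(i, s)
```

Under these assumptions, a lower score means a patient's predicted position in the risk ordering tends to agree with the true position. Patients could then be grouped by thresholding the score, although the grouping rule used in the paper is not stated in the abstract.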