Data imputation for gait analysis of children with Fragile X Syndrome

Beghetti, Federica; Spolaor, Fabiola; Varagnolo, Damiano; Sawacha, Zimi

doi:10.1016/j.ifacol.2025.12.030

Missing values are a critical challenge in gait analysis datasets: because variability across subjects is high, a large number of trials is needed to represent an individual's gait pattern, yet that many trials are rarely available for pathological subjects. This study analyzes the possibility of obtaining a homogeneous number of trials across subjects by applying imputation algorithms. Specifically, Partial Least Squares (PLS) regression is applied to gait analysis data of children with Fragile X Syndrome and healthy controls and compared with traditional mean imputation strategies. For this purpose, missing values were introduced at varying percentages, and six different missingness scenarios were analyzed. The effectiveness of each imputation method was assessed by quantifying the Kullback-Leibler divergence between the imputed and original datasets. Results demonstrate that, on the available dataset, PLS regression consistently outperforms mean imputation strategies across all conditions, maintaining a lower divergence while remaining computationally efficient. The present findings suggest that the more missing data a dataset exhibits, the more important it is to choose PLS regression over mean imputation approaches.
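
The comparison described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the data are synthetic, the helper names `pls1_fit` and `gaussian_kl` are hypothetical, the 40% missingness rate and the two PLS components are arbitrary choices, only a single column-mean strategy is shown, and the Kullback-Leibler divergence is approximated by fitting a univariate Gaussian to the original and imputed columns (the abstract does not state which estimator or direction was used; KL(original || imputed) is assumed here). PLS is implemented as single-response PLS1 via NIPALS using NumPy only.

```python
import numpy as np


def pls1_fit(X, y, n_components):
    """Fit single-response PLS (PLS1) with NIPALS (hypothetical helper).

    Returns (x_mean, y_mean, coef) such that
    y_hat = (X_new - x_mean) @ coef + y_mean.
    """
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xr, yr = X - x_mean, y - y_mean      # centered copies, deflated in the loop
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xr.T @ yr
        w = w / np.linalg.norm(w)        # weight vector
        t = Xr @ w                       # scores
        tt = t @ t
        p = Xr.T @ t / tt                # X loadings
        c = yr @ t / tt                  # y loading
        Xr = Xr - np.outer(t, p)         # deflate X
        yr = yr - c * t                  # deflate y
        W.append(w)
        P.append(p)
        q.append(c)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    coef = W @ np.linalg.solve(P.T @ W, q)
    return x_mean, y_mean, coef


def gaussian_kl(a, b):
    """KL(N_a || N_b) for univariate Gaussians fitted to samples a and b.

    Assumption: a Gaussian approximation stands in for the divergence
    estimator, which the abstract does not specify.
    """
    m0, s0 = a.mean(), a.std()
    m1, s1 = b.mean(), b.std()
    return np.log(s1 / s0) + (s0**2 + (m0 - m1) ** 2) / (2 * s1**2) - 0.5


# Synthetic stand-in for gait features: three predictors and one target
# column, all driven by a shared latent variable z.
rng = np.random.default_rng(0)
n = 200
z = rng.normal(size=n)
X = np.column_stack([z + 0.3 * rng.normal(size=n) for _ in range(3)])
y = 2.0 * z + 0.3 * rng.normal(size=n)

# Remove roughly 40% of the target column completely at random.
missing = rng.random(n) < 0.4
obs = ~missing

# Mean imputation: fill every gap with the mean of the observed values.
y_mean_imp = y.copy()
y_mean_imp[missing] = y[obs].mean()

# PLS imputation: fit on complete rows, predict the missing entries.
x_mean, y_mean, coef = pls1_fit(X[obs], y[obs], n_components=2)
y_pls_imp = y.copy()
y_pls_imp[missing] = (X[missing] - x_mean) @ coef + y_mean

kl_mean = gaussian_kl(y, y_mean_imp)
kl_pls = gaussian_kl(y, y_pls_imp)
print(f"KL, mean imputation: {kl_mean:.4f}")
print(f"KL, PLS imputation:  {kl_pls:.4f}")
```

In this toy setting mean imputation shrinks the variance of the imputed column, which the divergence penalizes, whereas the regression-based fill retains most of the variance that the predictors explain. Whether the same gap appears on real gait data depends on how strongly the features are correlated.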