We introduce the QuadratiK package that incorporates innovative data analysis methodologies. The presented software, implemented in both R and Python, offers a comprehensive set of goodness-of-fit tests and clustering techniques using kernel-based quadratic distances, thereby bridging the gap between the statistical and machine learning literatures. Our software implements one, two and k-sample tests for goodness of fit, providing an efficient and mathematically sound way to assess the fit of probability distributions. Expanded capabilities of our software include supporting tests for uniformity on the d-dimensional Sphere based on Poisson kernel densities. Particularly noteworthy is the incorporation of a unique clustering algorithm specifically tailored for spherical data that leverages a mixture of Poisson kernel-based densities on the sphere. Alongside this, our software includes additional graphical functions, aiding the users in validating, as well as visualizing and representing clustering results. This enhances interpretability and usability of the analysis. In summary, our R and Python packages serve as a powerful suite of tools, offering researchers and practitioners the means to delve deeper into their data, draw robust inference, and conduct potentially impactful analyses and inference across a wide array of disciplines.

Goodness-of-Fit and Clustering of Spherical Data: the QuadratiK package in R and Python

Saraceno, G.;
2024

Abstract

We introduce the QuadratiK package that incorporates innovative data analysis methodologies. The presented software, implemented in both R and Python, offers a comprehensive set of goodness-of-fit tests and clustering techniques using kernel-based quadratic distances, thereby bridging the gap between the statistical and machine learning literatures. Our software implements one, two and k-sample tests for goodness of fit, providing an efficient and mathematically sound way to assess the fit of probability distributions. Expanded capabilities of our software include supporting tests for uniformity on the d-dimensional Sphere based on Poisson kernel densities. Particularly noteworthy is the incorporation of a unique clustering algorithm specifically tailored for spherical data that leverages a mixture of Poisson kernel-based densities on the sphere. Alongside this, our software includes additional graphical functions, aiding the users in validating, as well as visualizing and representing clustering results. This enhances interpretability and usability of the analysis. In summary, our R and Python packages serve as a powerful suite of tools, offering researchers and practitioners the means to delve deeper into their data, draw robust inference, and conduct potentially impactful analyses and inference across a wide array of disciplines.
File in questo prodotto:
Non ci sono file associati a questo prodotto.
Pubblicazioni consigliate

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11577/3533807
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus ND
  • ???jsp.display-item.citation.isi??? ND
  • OpenAlex ND
social impact