Fast and accurate implementation of the Dirichlet multinomial log-likelihood function

Languasco, Alessandro; Migliardi, Mauro
2025

Abstract

Fitting overdispersed count data accurately and efficiently is instrumental for statistical modeling. Failing to capture the overdispersed nature of the data leads to inaccurate estimates of model parameters and errors, which jeopardizes the validity and reliability of statistical inferences. In this work, we expand on previous studies on efficient and accurate evaluation of the Dirichlet multinomial distribution log-likelihood function. A faster, accurate implementation of a state-of-the-art technique is presented, and its performance is compared to implementations of the same function based on Python's math module and on SciPy. Experiments conducted on nine different public datasets, one of which is not overdispersed, show that the proposed technique achieves up to an 11-fold gain in speed over both Python implementations of the log-likelihood function.
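
For orientation, the kind of baseline the abstract refers to can be sketched with the standard library alone. The snippet below is a minimal illustration, assuming the textbook log-gamma form of the Dirichlet multinomial log-probability; it is not the authors' proposed technique, and the function names `dm_log_pmf` and `dm_log_likelihood` are hypothetical names chosen for this example.

```python
import math
from typing import Sequence


def dm_log_pmf(counts: Sequence[int], alpha: Sequence[float]) -> float:
    """Log-probability of one count vector under a Dirichlet multinomial.

    Textbook form (an assumption of this sketch, not the paper's method):
      log P = lgamma(n + 1) + lgamma(A) - lgamma(n + A)
              + sum_k [lgamma(x_k + a_k) - lgamma(a_k) - lgamma(x_k + 1)]
    where n = sum of counts and A = sum of alpha.
    """
    if len(counts) != len(alpha):
        raise ValueError("counts and alpha must have the same length")
    n = sum(counts)
    a_total = sum(alpha)
    result = math.lgamma(n + 1) + math.lgamma(a_total) - math.lgamma(n + a_total)
    for x_k, a_k in zip(counts, alpha):
        result += math.lgamma(x_k + a_k) - math.lgamma(a_k) - math.lgamma(x_k + 1)
    return result


def dm_log_likelihood(rows: Sequence[Sequence[int]], alpha: Sequence[float]) -> float:
    """Sum of per-observation log-probabilities over a dataset of count vectors."""
    return sum(dm_log_pmf(row, alpha) for row in rows)
```

With two categories and alpha = (1, 1), the distribution reduces to a uniform distribution over the n + 1 possible splits, so `dm_log_pmf([2, 1], [1.0, 1.0])` should equal log(1/4) up to floating-point rounding. Every term here costs one log-gamma evaluation, which is why the cost of this direct form grows with the number of categories and observations.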
Proc. of International Conference on Artificial Intelligence, Computer, Data Sciences and Applications (ACDSA 2025)

Use this identifier to cite or link to this document: https://hdl.handle.net/11577/3522082