Low-Rank Analysis of Topic Quality: Comparing LDA, CTM, and Fuzzy-LSA methods

Antonio Calcagnì
2025

Abstract

This study evaluates the quality of topic solutions generated by Latent Dirichlet Allocation (LDA), the Correlated Topic Model (CTM), and fuzzy Latent Semantic Analysis (fLSA). It introduces the CL, RL, and HO indices, which target structural properties of a topic solution such as oversimplification, redundancy, and homogeneity, and which are intended to complement traditional metrics like coherence and perplexity. The resulting framework adds a structural perspective to the assessment of topic quality.
BOOK OF SHORT PAPERS
IES 2025 - Innovation & Society: Statistics and Data Science for Evaluation and Quality
ISBN 978 88 5495 849 4
Use this identifier to cite or link to this document: https://hdl.handle.net/11577/3556030