EXPEDITE: EXPress closED ITemset Enumeration

DI PIETRO, ROBERTO;
2015

Abstract

In this paper, we introduce EXPress closED ITemset Enumeration (Expedite), a new frequent closed itemset (FCI) miner designed to speed up the extraction of FCIs from a dataset of transactions. Compared to the state of the art, Expedite saves up to two orders of magnitude of CPU time without compromising other dimensions of performance (e.g., memory). It is fast because it wastes less time mining intermediate itemsets that are discarded in later phases of the algorithm. More specifically, it cuts down the number of both duplicate FCIs (those generated multiple times by the algorithm) and infrequent itemsets (those with low support or no supporting transactions). This property, which benefits both sparse and dense datasets, is first motivated analytically and then supported experimentally by extensive tests on real datasets. As a further contribution, we propose two alternative implementations of Expedite that perform even better than the basic version, although they rely on particular features of the input dataset.
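
To make the terms in the abstract concrete, the following is a minimal brute-force sketch of what a frequent closed itemset is. It is not the Expedite algorithm, whose details are not given here. The function names (`support`, `closure`, `naive_fcis`) and the toy transactions are hypothetical and chosen only for illustration. The sketch assumes the standard definitions: an itemset is frequent if at least `min_support` transactions contain it, and its closure is the intersection of all transactions that contain it.

```python
from itertools import combinations


def support(itemset, transactions):
    """Count the transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


def closure(itemset, transactions):
    """Intersect all transactions containing `itemset`.

    Assumes at least one supporting transaction exists.
    """
    supporting = [t for t in transactions if itemset <= t]
    return frozenset.intersection(*supporting)


def naive_fcis(transactions, min_support):
    """Brute-force FCI enumeration (exponential; illustration only)."""
    transactions = [frozenset(t) for t in transactions]
    items = sorted(set().union(*transactions))
    found = {}
    for size in range(1, len(items) + 1):
        for candidate in combinations(items, size):
            candidate = frozenset(candidate)
            count = support(candidate, transactions)
            if count < min_support:
                continue  # infrequent itemset: work spent for nothing
            # Several candidates may map to the same closure; each repeat
            # is a duplicate FCI that simply overwrites the same entry.
            found[closure(candidate, transactions)] = count
    return found


if __name__ == "__main__":
    toy = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b"}]
    for itemset, count in naive_fcis(toy, min_support=2).items():
        print(sorted(itemset), count)
```

In this toy dataset the candidates `{c}` and `{a, c}` both close to `{a, c}`, so the closed itemset is generated twice, while `{b, c}` and `{a, b, c}` are examined and then discarded as infrequent. These are the two kinds of wasted work the abstract says Expedite reduces.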

Use this identifier to cite or link to this document: https://hdl.handle.net/11577/3157913