Permuting a vector is a fundamental primitive which arises in many applications. In particular, rational permutations, which are defined by permutations of the bits of the binary representations of the vector indices, are widely used. Matrix transposition and bit-reversal are notable examples of rational permutations. In this paper we contribute a number of results regarding the execution of these permutations in cache hierarchies, with particular emphasis on the cache-oblivious setting. We first bound from below the work needed to execute a rational permutation with an optimal cache complexity. Then, we develop a cache-oblivious algorithm to perform any rational permutation, which exhibits optimal work and cache complexities under the tall cache assumption. We finally show that for certain families of rational permutations (including matrix transposition and bit reversal) no cache-oblivious algorithm can exhibit optimal cache complexity for all values of the cache parameters. This latter result specializes the one proved by Brodal and Fagerberg for general permutations to the case of rational permutations, and provides further evidence that the tall cache assumption is often necessary to attain cache optimality in the context of cache-oblivious algorithms.
On the limits of cache-oblivious rational permutations
SILVESTRI, FRANCESCO
2008
Abstract
Permuting a vector is a fundamental primitive which arises in many applications. In particular, rational permutations, which are defined by permutations of the bits of the binary representations of the vector indices, are widely used. Matrix transposition and bit-reversal are notable examples of rational permutations. In this paper we contribute a number of results regarding the execution of these permutations in cache hierarchies, with particular emphasis on the cache-oblivious setting. We first bound from below the work needed to execute a rational permutation with an optimal cache complexity. Then, we develop a cache-oblivious algorithm to perform any rational permutation, which exhibits optimal work and cache complexities under the tall cache assumption. We finally show that for certain families of rational permutations (including matrix transposition and bit reversal) no cache-oblivious algorithm can exhibit optimal cache complexity for all values of the cache parameters. This latter result specializes the one proved by Brodal and Fagerberg for general permutations to the case of rational permutations, and provides further evidence that the tall cache assumption is often necessary to attain cache optimality in the context of cache-oblivious algorithms.File | Dimensione | Formato | |
---|---|---|---|
Silvestri08.pdf
accesso aperto
Descrizione: Pre-print of the paper
Tipologia:
Preprint (submitted version)
Licenza:
Accesso libero
Dimensione
246.64 kB
Formato
Adobe PDF
|
246.64 kB | Adobe PDF | Visualizza/Apri |
Pubblicazioni consigliate
I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.