We study data structures for storing a set of polygonal curves in R^d such that, given a query curve, we can efficiently retrieve similar curves from the set, where similarity is measured using the discrete Fréchet distance or the dynamic time warping distance. To this end we devise the first locality-sensitive hashing schemes for these distance measures. A major challenge is posed by the fact that these distance measures internally optimize the alignment between the curves. We give solutions for different types of alignments including constrained and unconstrained versions. For unconstrained alignments, we improve over a result by Indyk [SoCG 2002] for short curves. Let n be the number of input curves and let m be the maximum complexity of a curve in the input. In the particular case where m <= (a/(4d)) log n, for some fixed a>0, our solutions imply an approximate near-neighbor data structure for the discrete Fréchet distance that uses space in O(n^(1+a) log n) and achieves query time in O(n^a log^2 n) and constant approximation factor. Furthermore, our solutions provide a trade-off between approximation quality and computational performance: for any parameter k in [m], we can give a data structure that uses space in O(2^(2k) m^(k-1) n log n + nm), answers queries in O( 2^(2k) m^(k) log n) time and achieves approximation factor in O(m/k).
Locality-sensitive hashing of curves
SILVESTRI, FRANCESCO
2017
Abstract
We study data structures for storing a set of polygonal curves in R^d such that, given a query curve, we can efficiently retrieve similar curves from the set, where similarity is measured using the discrete Fréchet distance or the dynamic time warping distance. To this end we devise the first locality-sensitive hashing schemes for these distance measures. A major challenge is posed by the fact that these distance measures internally optimize the alignment between the curves. We give solutions for different types of alignments including constrained and unconstrained versions. For unconstrained alignments, we improve over a result by Indyk [SoCG 2002] for short curves. Let n be the number of input curves and let m be the maximum complexity of a curve in the input. In the particular case where m <= (a/(4d)) log n, for some fixed a>0, our solutions imply an approximate near-neighbor data structure for the discrete Fréchet distance that uses space in O(n^(1+a) log n) and achieves query time in O(n^a log^2 n) and constant approximation factor. Furthermore, our solutions provide a trade-off between approximation quality and computational performance: for any parameter k in [m], we can give a data structure that uses space in O(2^(2k) m^(k-1) n log n + nm), answers queries in O( 2^(2k) m^(k) log n) time and achieves approximation factor in O(m/k).Pubblicazioni consigliate
I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.