We study the complexity of local graph-centrality estimation, with the goal of approximating the centrality score of a given target node while exploring only a sublinear number of nodes/arcs of the graph and performing a sublinear number of elementary operations. We develop a technique, that we apply to PageRank and Heat Kernel, for constructing a low-variance score estimator through a local exploration of the graph. We obtain an algorithm that, given any node in any graph of $n$ nodes and $m$ arcs, with probability $(1-\delta)$ computes a multiplicative $(1\pm\epsilon)$-approximation of its score by examining only $\tilde{O}(\min(n^{1/2} \dmax^{1/2}, n^{1/2} m^{1/4}))$ nodes/arcs, where $\dmax$ is the maximum outdegree of the graph (omitting for readability $\poly(\epsilon^{-1})$ and $\polylog(\delta^{-1})$ factors). A similar bound holds for computational cost. We also prove a lower bound of $\Omega(\min(n^{1/2} \dmax^{1/2}, \, n^{1/3} m^{1/3}))$ for both query complexity and computational complexity. Moreover, in the jump-and-crawl graph-access model, our technique yields a $\tilde{O}(\min(n^{1/2}\dmax^{1/2}, n^{2/3}))$-queries algorithm; we show this algorithm is optimal up to a logarithmic factor -- in fact, sublogarithmic in the case of PageRank. These are the first algorithms with sublinear worst-case bounds for general directed graphs and any choice of the target node.
Sublinear Algorithms for Local Graph-Centrality Estimation
Peserico, Enoch;
2023
Abstract
We study the complexity of local graph-centrality estimation, with the goal of approximating the centrality score of a given target node while exploring only a sublinear number of nodes/arcs of the graph and performing a sublinear number of elementary operations. We develop a technique, that we apply to PageRank and Heat Kernel, for constructing a low-variance score estimator through a local exploration of the graph. We obtain an algorithm that, given any node in any graph of $n$ nodes and $m$ arcs, with probability $(1-\delta)$ computes a multiplicative $(1\pm\epsilon)$-approximation of its score by examining only $\tilde{O}(\min(n^{1/2} \dmax^{1/2}, n^{1/2} m^{1/4}))$ nodes/arcs, where $\dmax$ is the maximum outdegree of the graph (omitting for readability $\poly(\epsilon^{-1})$ and $\polylog(\delta^{-1})$ factors). A similar bound holds for computational cost. We also prove a lower bound of $\Omega(\min(n^{1/2} \dmax^{1/2}, \, n^{1/3} m^{1/3}))$ for both query complexity and computational complexity. Moreover, in the jump-and-crawl graph-access model, our technique yields a $\tilde{O}(\min(n^{1/2}\dmax^{1/2}, n^{2/3}))$-queries algorithm; we show this algorithm is optimal up to a logarithmic factor -- in fact, sublogarithmic in the case of PageRank. These are the first algorithms with sublinear worst-case bounds for general directed graphs and any choice of the target node.Pubblicazioni consigliate
I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.