Similarity-based PageRank

The similarity-based PageRank is a variant of the PageRank algorithm designed to identify influential nodes by guiding random walks according to the structural similarity between nodes [2, 3].
Let \(d_{\max}\) denote the largest degree in the graph \(G\). For each node \(i\), a \((d_{\max}+1)\)-dimensional vector is constructed containing the degree of node \(i\) and the degrees of its \(|\mathcal{N}(i)|\) neighbors; any remaining entries are set to a small constant \(10^{-8}\). The vector entries are sorted to ensure a consistent alignment for KL-divergence computation. This vector is then normalized to form a probability vector \(p(i)\).
The dissimilarity between nodes \(i\) and \(j\) is quantified using the symmetric Kullback-Leibler (KL) divergence:
\[
r_{ij} = D_{\mathrm{KL}}(p(i)\,\|\,p(j)) + D_{\mathrm{KL}}(p(j)\,\|\,p(i)),
\]
where larger values of \(r_{ij}\) indicate greater structural differences between nodes \(i\) and \(j\).
The structural similarity between nodes \(i\) and \(j\) is defined as
\[
s_{ij} = 1 - \frac{r_{ij}}{r_{\max}+1},
\]
where \(r_{\max} = \max_{(i,j)} r_{ij}\). Higher structural similarity \(s_{ij}\) corresponds to a higher probability of transition between adjacent nodes. The resulting similarity matrix \(S = [s_{ij}]\) is then used to generate the transition probability matrix \(P\) for the PageRank algorithm, which in turn determines the importance of each node.

References

[1] Shvydun, S. (2025). Zoo of Centralities: Encyclopedia of Node Metrics in Complex Networks. arXiv: 2511.05122 https://doi.org/10.48550/arXiv.2511.05122

[2] Zhang, Q., Li, M., & Deng, Y. (2018). Measure the structure similarity of nodes in complex networks based on relative entropy. Physica A: Statistical Mechanics and its Applications, 491, 749-763. doi: 10.1016/j.physa.2017.09.042.

[3] Zhao, J., Song, Y., Liu, F., & Deng, Y. (2021). The identification of influential nodes based on structure similarity. Connection science, 33(2), 201-218. doi: 10.1080/09540091.2020.1806203.