# BiRank

#### Definition

$BiRank$ targets the problem of ranking the vertices of a bipartite graph, based on the graph's link structure as well as prior information about the vertices (termed the query vector). $BiRank$ iteratively assigns scores to vertices and converges to a unique stationary ranking. In contrast to traditional random walk-based methods, $BiRank$ iterates towards optimizing a regularization function, which smooths the ranking over the graph under the guidance of the query vector.

Consider a bipartite graph $G = (U \cup P, E)$ with weight matrix $W$. Query vectors $u^0$ and $p^0$ encode the prior belief about the vertices in $U$ and $P$, respectively, with respect to the ranking criterion.

To rank vertices based on the graph structure, seminal algorithms like $PageRank$ and $HITS$ have been proposed. Motivated by their design, the intuition for bipartite graph ranking is that vertex scores should follow a smoothness convention: a vertex (from one side) should be ranked high if it is connected to higher-ranked vertices (from the other side). This rule defines a mutually reinforcing relationship, which is naturally implemented as an iterative process that refines each vertex's score as the sum of the contributions from its connected vertices:

$$p_j=\sum_{i=1}^{|U|} w_{ij}\, u_i; \qquad u_i=\sum_{j=1}^{|P|} w_{ij}\, p_j$$
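The mutually reinforcing update above can be sketched in a few lines (Python with NumPy; the toy graph and initial scores are illustrative, not from the paper):

```python
import numpy as np

# Toy bipartite graph: 3 U-vertices, 2 P-vertices.
# W[i, j] is the weight of edge (u_i, p_j).
W = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [0.0, 1.0]])

u = np.ones(3)  # initial scores for U
p = np.ones(2)  # initial scores for P

# One round of the additive mutual-reinforcement update:
p = W.T @ u  # p_j = sum_i w_ij * u_i
u = W @ p    # u_i = sum_j w_ij * p_j
```

Repeating this update makes the scores grow without bound, which is why the normalization discussed next is needed.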

As this is an additive update rule, normalization is necessary to ensure convergence and stability. $BiRank$ adopts a symmetric normalization scheme, inspired by semi-supervised learning on graphs. The idea is to smooth an edge weight by the degrees of its two endpoint vertices simultaneously:

$$p_j=\sum_{i=1}^{|U|} \frac{w_{ij}}{\sqrt{d_i}\sqrt{d_j}}\, u_i; \qquad u_i=\sum_{j=1}^{|P|} \frac{w_{ij}}{\sqrt{d_i}\sqrt{d_j}}\, p_j$$

where $d_i$ and $d_j$ are the weighted degrees of vertices $u_i$ and $p_j$, respectively. Symmetric normalization is a key characteristic of $BiRank$: edges connected to a high-degree vertex are suppressed through normalization, lessening the contribution of high-degree vertices. This tones down the dependence of the top rankings on high-degree vertices, a known defect of random walk-based diffusion methods, and yields better-quality results.
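As a concrete sketch (Python with NumPy; the toy weight matrix is illustrative), the symmetrically normalized weights $w_{ij}/(\sqrt{d_i}\sqrt{d_j})$ can be assembled in one step:

```python
import numpy as np

# Toy bipartite weight matrix: rows = U vertices, cols = P vertices.
W = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [0.0, 1.0]])

d_u = W.sum(axis=1)  # weighted degrees d_i of the U vertices
d_p = W.sum(axis=0)  # weighted degrees d_j of the P vertices

# Divide each edge weight by sqrt(d_i) * sqrt(d_j) simultaneously.
S = W / np.sqrt(np.outer(d_u, d_p))
```

Note how an edge incident to the degree-2 vertex $u_2$ ends up with weight $0.5$, smaller than the $1/\sqrt{2}$ of edges between two degree-1 vertices.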

To account for the query vectors $u^0$ and $p^0$ that encode the prior belief on the importance of the vertices, one can either 1) combine the graph ranking results with the query vector in post-processing (a.k.a. late fusion), or 2) factor the query vector directly into the ranking process. Post-processing yields a ranking that is a compromise between two rankings; when the query vector defines a full ranking of the vertices, this ensemble approach may be suitable. However, when the query vector provides only partial information, it fails to identify an optimal ranking. $BiRank$ therefore takes the second option, factoring the query vector directly into the ranking process, which has the advantage of using the query vector to guide each iteration:

$$p_j= \alpha \sum_{i=1}^{|U|} \frac{w_{ij}}{\sqrt{d_i}\sqrt{d_j}}\, u_i +(1- \alpha)\, p_j^0$$

$$u_i= \beta \sum_{j=1}^{|P|} \frac{w_{ij}}{\sqrt{d_i}\sqrt{d_j}}\, p_j +(1- \beta)\, u_i^0$$

where $\alpha$ and $\beta$ are hyper-parameters in $[0, 1]$ that weight the importance of the graph structure against the prior query vector. To keep notation simple, the iteration can also be expressed in its equivalent matrix form:

$$p = \alpha S^T u + (1-\alpha) p^0;$$

$$u = \beta S p + (1-\beta) u^0;$$

where $S = D_u^{-{1 \over 2}} W D_p^{-{1 \over 2}}$ is the symmetric normalization of the weight matrix $W$, with $D_u$ and $D_p$ the diagonal matrices of the weighted degrees of $U$ and $P$. This set of update rules is called the $BiRank$ iteration, which forms the core of the iterative $BiRank$ algorithm.
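The matrix-form iteration lends itself to a compact implementation. Below is a minimal sketch in Python with NumPy; the function name, toy graph, uniform priors, convergence criterion, and default $\alpha=\beta=0.85$ are all illustrative choices, not prescribed by the paper:

```python
import numpy as np

def birank(W, u0, p0, alpha=0.85, beta=0.85, tol=1e-8, max_iter=1000):
    """Sketch of the BiRank iteration on weight matrix W (|U| x |P|)."""
    d_u = W.sum(axis=1)  # weighted degrees of U vertices
    d_p = W.sum(axis=0)  # weighted degrees of P vertices
    # Symmetrically normalized weight matrix S = D_u^{-1/2} W D_p^{-1/2}.
    S = W / np.sqrt(np.outer(d_u, d_p))
    u, p = u0.copy(), p0.copy()
    for _ in range(max_iter):
        # One BiRank iteration: update p, then u, blending in the priors.
        p_new = alpha * (S.T @ u) + (1 - alpha) * p0
        u_new = beta * (S @ p_new) + (1 - beta) * u0
        if np.abs(p_new - p).sum() + np.abs(u_new - u).sum() < tol:
            return u_new, p_new
        u, p = u_new, p_new
    return u, p

# Toy example with uniform priors.
W = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [0.0, 1.0]])
u0 = np.ones(3) / 3
p0 = np.ones(2) / 2
u, p = birank(W, u0, p0)
```

On this symmetric toy graph with uniform priors, the two $P$ vertices receive equal scores, and the degree-2 vertex $u_2$ outranks its degree-1 neighbours.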

$BiRank$ ranks vertices by accounting for both the graph structure and prior knowledge. It is theoretically guaranteed to converge to a stationary solution, and can be explained by both a regularization view and a Bayesian view.

#### References

- He X., Gao M., Kan M.Y., Wang D., 2017. BiRank: Towards Ranking on Bipartite Graphs. IEEE Transactions on Knowledge and Data Engineering, 29(1), pp.57-71. DOI: 10.1109/TKDE.2016.2611584