SSC - Source/Sink Centrality


Definition

SSC-Source/Sink Centrality is a directed graph framework, to address the limitations of standard models. $SSC$ separately measures the importance of a node in the upstream and the downstream of a pathway, as a sender and a receiver of biological signals, and combines the two terms for evaluating the centrality.

This method apples as an extension for standard centrality models .PageRank Centrality is a spectral centrality measure where the importance of a node is a function of the centrality of its neighbors. In its original definition, PageRank describes the probability distribution of a uniform random walk with restart being present at each node of a graph after a large number of steps. In graph theory terms, the PageRank of a node $v$ is based on the PageRank of the nodes with links to $v$, divided by their out degrees. Formally:

$$C_{pgr}(v_i)=\beta_i + {\underset{v_j|v_i \in N_G(v_i)}{\sum}} {C_{pgr}(v_j)\over |N_G(v_j)|}$$

$\beta^i$’s are constant values that relate the probability of restarting at node $v_i$. The parameter $\alpha$ is a dampening factor that relates to the transition probability of the random walk. The previous Formula can be expressed in a vectorized format as following:
$$C_{pgr}=\beta + \alpha A^T D^{-1} C_{pgr}$$

where $C_{pgr}$ is the vector of centralities and $\beta$ is the vector of initial values. $D$ is the diagonal (out) degree matrix such that $[D]_{ii} = max (deg_{(+)}(v_i), 1)$. A closed form solution of the first Formula is achieved by solving for $C_{pgr}$. Formally:

$$C_{pgr}= (I − \alpha A^T D^{−1})^ {−1} \beta$$

In the Sink component of the PageRank, the downstream nodes have the higher importance. This is because a random walk will not be present at any node without incoming edges, unless by a restart event. The PageRank Sink centrality captures the importance of a node as a receiver of information. Formally the Sink PageRank centrality $(C_{pgr}^{Si})$ defined as:

$$(C_{pgr}^{Si})(v):= C_{pgr}(v)$$

To modify PageRank in such a way that captures the importance of nodes as source of signal, we derive a PageRank score when applied to the transpose of a graph. Formally, we define the PageRank Source $C_{pgr}^{So}$ as:

$$C_{pgr}^{So} (v_i) = \beta_i + \alpha {\underset{v_j|v_i \in N_G(v_i)}{\sum}} {C_{pgr}^{So} (v_j)\over |N_{GT} (v_j)|}$$

$\beta_i$ and $\alpha$ are constants that relate to the restart and transition probabilities. The PageRank Source of a node is calculated based on the centrality of a its neighbors in the transposed graphs. Define the diagonal in-degree matrix, $D^\prime$, of $G$ such that $[D^\prime]_{ii} = max(1, deg^- (vi))$. Similar to the equations for deriving the standard PageRank, the Source component can be solved as following:

$$C_{pgr}^{So}= (I − \alpha AD^{\prime−1})^{-1} \beta$$

Directed centralities only gives importance to either upstream nodes or downstream ones. To address this issue the Source/Sink PageRank defined . The fundamental concept of Source/Sink modeling is to measure the centrality of nodes as both sources and sinks of information. The Source/Sink concept to the PageRank adapted by calculating Source and Sink Centrality values individually and summing them:

$$C_{pgr}^{SS}(v) = C_{pgr}^{So}(v) + C_{pgr}^{Si}(v)$$

The above definition has no limitation of using different constant parameters for $C_{pgr}^{So}$ and $C_{pgr}^{Si}$.

This method considers a better topological description of the pathways that accounts for the importance of the pathway elements with respect to the upstream and downstream positions.


References

  • Yeganeh, P.N., Richardson, C., Saule, E., Loraine, A. and Mostafavi, M.T., 2020. Revisiting the use of graph centrality models in biological pathway analysis. BioData mining, 13(1), pp.1-23. DOI: 10.1186/s13040-020-00214-x Publisher web site


Comments

There are no comment yet.

Add your comment

Name:
Email:
Sum of    and  

The rendering mode: