# Super Mediator

#### Definition

Super-mediators are nodes that appear frequently in long diffusion sequences but much less frequently in short ones. There are two approaches to identifying super-mediator nodes in a network: data-driven and model-driven. Data-driven super-mediators are those identified by the super-mediator degree computed from observed data, while model-driven super-mediators are those identified by the super-mediator degree computed from a diffusion model.

The super-mediator degree of a node $w$ based on observed data, $SMD_{data}(w)$, is defined as the following expected F-measure:

$$SMD_{data}(w)={\underset{v\in V}{\sum}}F(w;v)k(v)$$

where $k(v)$ is the probability that node $v$ becomes an information source node, that is, an initial active node. It can be estimated empirically by $k(v)=M(v)/ \sum_{u\in V} M(u)$.

The F-measure $F(w;v)$, a measure widely used in information retrieval, is the harmonic mean of the recall and precision of node $w$ for node $v$.
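The data-driven degree can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reference implementation: $M(v)$ is treated as a nonnegative count attached to each node, and the precision and recall of $w$ for each source $v$ are passed in as precomputed values because the text above does not specify how they are obtained. The function names (`source_probabilities`, `f_measure`, `smd_data`) and the `pr` dictionary layout are hypothetical.

```python
from typing import Dict, Hashable, Tuple

Node = Hashable


def source_probabilities(M: Dict[Node, float]) -> Dict[Node, float]:
    """Estimate k(v) = M(v) / sum_u M(u) from per-node counts M (assumed input)."""
    total = sum(M.values())
    if total <= 0:
        raise ValueError("the counts M(v) must sum to a positive value")
    return {v: m / total for v, m in M.items()}


def f_measure(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; defined as 0 when both are 0."""
    if precision + recall == 0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)


def smd_data(
    w: Node,
    pr: Dict[Tuple[Node, Node], Tuple[float, float]],
    k: Dict[Node, float],
) -> float:
    """SMD_data(w) = sum_v F(w; v) * k(v).

    pr[(w, v)] holds the (precision, recall) of w for source v.
    Missing pairs are treated as (0, 0), which is an assumption of this sketch.
    """
    return sum(
        f_measure(*pr.get((w, v), (0.0, 0.0))) * kv
        for v, kv in k.items()
    )


# Toy usage with made-up numbers.
M = {"a": 3, "b": 1}
k = source_probabilities(M)
pr = {("w", "a"): (0.5, 0.5), ("w", "b"): (1.0, 0.0)}
score = smd_data("w", pr, k)
```

Because the sum is weighted by $k(v)$, sources that rarely initiate diffusion contribute little to the score, even when the F-measure for them is high.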

In the model-driven case, if a node $w$ is a super-mediator, removing it substantially decreases the average influence degree derived from the underlying information diffusion model. $SMD_{model}(w)$ denotes the difference in the average influence degree before and after removing node $w$:

$$SMD_{model}(w)={\underset{v\in V}{\sum}}\phi(v;G)k(v) -{\underset{v\in V{\backslash}\{w\}}{\sum}} \phi (v;G{\backslash}\{w\}) k(v)$$

Here $\phi(v;G)$ is the influence degree of node $v$ in the network $G$, and $G{\backslash}\{w\}$ is the network with node $w$ removed. Finding the most influential super-mediator amounts to finding the node that maximizes this difference, and super-mediators can be ranked by the size of the difference. This definition requires an information diffusion model to estimate the influence degree of each node, and thus it is model-driven. The other, more empirical definition given earlier is that super-mediators are those nodes that appear frequently in long diffusion sequences but much less frequently in short diffusion sequences. That definition does not require any model but does require abundant information diffusion data, and thus it is data-driven.
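The model-driven degree and the resulting ranking can be illustrated with a short Python sketch. The diffusion model is not specified here, so the example substitutes a deliberately simple stand-in for the influence degree $\phi(v;G)$: the number of nodes reachable from $v$, including $v$ itself. That choice, the adjacency-dictionary graph format, and all function names are assumptions made for illustration only. Any other influence estimator can be passed through the `influence` parameter.

```python
from collections import deque
from typing import Callable, Dict, Hashable, List, Tuple

Node = Hashable
Graph = Dict[Node, List[Node]]  # every node must appear as a key


def reachable_count(graph: Graph, source: Node) -> float:
    """Stand-in for phi(v; G): number of nodes reachable from source (inclusive)."""
    seen = {source}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for x in graph.get(u, ()):
            if x not in seen:
                seen.add(x)
                queue.append(x)
    return float(len(seen))


def remove_node(graph: Graph, w: Node) -> Graph:
    """Return G \\ {w}: drop w and every edge pointing to it."""
    return {u: [x for x in nbrs if x != w] for u, nbrs in graph.items() if u != w}


def smd_model(
    graph: Graph,
    k: Dict[Node, float],
    w: Node,
    influence: Callable[[Graph, Node], float] = reachable_count,
) -> float:
    """SMD_model(w): weighted influence on G minus weighted influence on G \\ {w}."""
    before = sum(influence(graph, v) * k.get(v, 0.0) for v in graph)
    reduced = remove_node(graph, w)
    # The second sum runs over V \ {w}; k(v) is used as-is, not renormalized.
    after = sum(influence(reduced, v) * k.get(v, 0.0) for v in reduced)
    return before - after


def rank_super_mediators(
    graph: Graph,
    k: Dict[Node, float],
    influence: Callable[[Graph, Node], float] = reachable_count,
) -> List[Tuple[Node, float]]:
    """Sort nodes by SMD_model, largest difference first."""
    scores = {w: smd_model(graph, k, w, influence) for w in graph}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy usage: node "b" relays everything from "a" and "d" to "c".
graph = {"a": ["b"], "b": ["c"], "c": [], "d": ["b"]}
k = {v: 0.25 for v in graph}
ranking = rank_super_mediators(graph, k)
```

In this toy network, removing the relay node `"b"` causes the largest drop in the weighted influence, so it ranks first. With a stochastic diffusion model, `influence` would typically be an expected value estimated by simulation rather than a deterministic count.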
