4.7. Multiple Network Models#
Up to this point, you have studied network models which are useful for a single network. What do you do if you have multiple networks?
Let's imagine that you have a company with 45 employees: 25 on the machine learning team, 10 in administration, and 10 in marketing. You ask every employee which of their colleagues they are connected to on three social networking sites, and you obtain three networks on the same 45 nodes: one for Facebook, one for Twitter, and one for Linkedin. The code below simulates such a collection of networks, producing one adjacency matrix per platform:
from graphbook_code import heatmap
from graspologic.simulations import sbm
import numpy as np
import matplotlib.pyplot as plt
K = 3
B1 = np.zeros((K,K))
B1[0,0] = 0.3
B1[0,1] = .05
B1[1,0] = .05
B1[0,2] = .05
B1[2,0] = .05
B1[1,1] = 0.3
B1[1,2] = .05
B1[2,1] = .05
B1[2,2] = 0.3
B2 = B1  # Twitter uses the same block matrix as Facebook
B3 = np.copy(B1)
B3[1,1] = 0.9  # on Linkedin, the administrative team is much more densely connected
ns = [25, 10, 10]  # community sizes: ML, administration, marketing
A1 = sbm(n=ns, p=B1, directed=False, loops=False)
ys = ["ML" for i in range(0, ns[0])] + ["Adm." for i in range(ns[0],ns[0] + ns[1])] + ["Mar." for i in range(ns[0] + ns[1],ns[0] + ns[1] + ns[2])]
A2 = sbm(n=ns, p=B2, directed=False, loops=False)
A3 = sbm(n=ns, p=B3, directed=False, loops=False)
fig, axs = plt.subplots(1,3, figsize=(30, 6))
heatmap(A1, ax=axs[0], title="$A^{(1)}$ Facebook", color="sequential", xticklabels=False, yticklabels=False, inner_hier_labels=ys);
heatmap(A2, ax=axs[1], title="$A^{(2)}$ Twitter", color="sequential", xticklabels=False, yticklabels=False, inner_hier_labels=ys);
heatmap(A3, ax=axs[2], title="$A^{(3)}$ Linkedin", color="sequential", xticklabels=False, yticklabels=False, inner_hier_labels=ys);

Remember that a random network has an adjacency matrix denoted by a boldfaced uppercase $\mathbf{A}$. Since you have three networks, you have three random networks $\mathbf{A}^{(1)}$, $\mathbf{A}^{(2)}$, and $\mathbf{A}^{(3)}$, one per platform. The most obvious approach would be to model each of them separately; for instance, as three separate RDPGs, each with its own latent position matrix $X^{(1)}$, $X^{(2)}$, and $X^{(3)}$.
What this description lacks is that the three networks share a lot of common information. You might expect that, on some level, the latent position matrices should also show some sort of common structure. However, since you used a unique latent position matrix for each network, nothing in this description ties the three models together, and that shared structure is lost.
In the descriptions below, you will tend to build off the Random Dot Product Graph (RDPG) and closely related variations of it. Why? As it turns out, the RDPG is extremely flexible, in that you can represent both ER and SBM networks as RDPGs, too. Recall that the IER network unifies independent-edge random networks, as it is the most general independent-edge random network model: all of the models discussed so far, and many not discussed but found in the appendix, can be conceptualized as special cases of the IER network. Stated another way, a probability matrix is the fundamental unit needed to describe an independent-edge random network model. For the models covered so far, all GRDPGs are IER networks (by taking $P = XY^\top$), all RDPGs are GRDPGs (by taking the right latent positions $Y$ equal to the left latent positions $X$), all SBMs are RDPGs, and all ER networks are SBMs, whereas the converses are not true in general. Because the ER network is a special case of the SBM and SBMs are special cases of RDPG networks, the RDPG can describe any network which is an RDPG, an SBM, or an ER random network. This means that building off the RDPG gives you multiple-network models that are inherently flexible. Further, as you will see in the later section on Learning Network Representations, the RDPG is extremely well suited to the situation in which you want to analyze SBMs but do not know ahead of time which communities the nodes belong to. This situation is extremely common across numerous disciplines of network machine learning, such as social networking, neuroscience, and many other fields.
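As a concrete instance of this nesting, here is a short worked derivation showing that an SBM whose block matrix $B$ is positive semidefinite is an RDPG (the symbol $C$, for the $n \times K$ one-hot community-assignment matrix, is introduced here just for this derivation):

$$P = C B C^\top = (C\sqrt{B})(C\sqrt{B})^\top = X X^\top, \qquad X = C\sqrt{B},$$

where $\sqrt{B}$ is any matrix with $\sqrt{B}\,\sqrt{B}^\top = B$ (for instance, obtained from the eigendecomposition of $B$). The SBM's probability matrix is therefore exactly the probability matrix of an RDPG whose latent position matrix is $X = C\sqrt{B}$: each node's latent position is simply the row of $\sqrt{B}$ corresponding to its community.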
So, how can you model your collection of random networks with shared structure?
4.7.1. Joint Random Dot Product Graphs (JRDPG) Model#
In our example on social networks, notice that the Facebook and Twitter connections do not look too qualitatively different. They have relatively similar connectivity patterns between the different employee working groups, and you might even suspect that the underlying random networks that govern these two social networks are identical. In statistical science, when every network in a collection of random networks has the same underlying distribution, we call the collection homogeneous. A homogeneous collection of independent-edge random networks is therefore one in which all of the networks share the same probability matrix. The probability matrices for Facebook and Twitter are indeed identical, as you can see by plotting $P^{(1)}$, $P^{(2)}$, and their absolute difference $|P^{(1)} - P^{(2)}|$, which is zero everywhere:
import seaborn as sns
from graphbook_code import cmaps
import pandas as pd
def p_from_block(B, ns):
    # build the node-by-node probability matrix from the block matrix B and the community sizes ns
    P = np.zeros((np.sum(ns), np.sum(ns)))
    ns_cumsum = np.append(0, np.cumsum(ns))
    for k, n1 in enumerate(ns_cumsum[0:(len(ns_cumsum)-1)]):
        for l, n2 in enumerate(ns_cumsum[k:(len(ns_cumsum)-1)]):
            P[ns_cumsum[k]:ns_cumsum[k+1], ns_cumsum[l+k]:ns_cumsum[l+k+1]] = B[k, l+k]
            P[ns_cumsum[l+k]:ns_cumsum[l+k+1], ns_cumsum[k]:ns_cumsum[k+1]] = B[l+k, k]
    return P
P1 = p_from_block(B1, ns)
U1, S1, V1 = np.linalg.svd(P1)
X1 = U1[:,0:3] @ np.sqrt(np.diag(S1[0:3]))
P2 = p_from_block(B2, ns)
U2, S2, V2 = np.linalg.svd(P2)
X2 = U2[:,0:3] @ np.sqrt(np.diag(S2[0:3]))
P3 = p_from_block(B3, ns)
U3, S3, V3 = np.linalg.svd(P3)
X3 = U3[:,0:3] @ np.sqrt(np.diag(S3[0:3]))
def plot_prob_block_annot(X, title="", nodename="Employee", nodetix=None,
                          nodelabs=None, ax=None, annot=True):
    if ax is None:
        fig, ax = plt.subplots(figsize=(8, 6))
    with sns.plotting_context("talk", font_scale=1):
        if annot:
            # annotate one representative entry per block of the probability matrix
            X_annot = np.empty(X.shape, dtype='U4')
            X_annot[13, 13] = str("{:.3f}".format(X[13, 13]))
            X_annot[30, 30] = str("{:.3f}".format(X[30, 30]))
            X_annot[40, 40] = str("{:.3f}".format(X[40, 40]))
            X_annot[13, 30] = str("{:.3f}".format(X[13, 30]))
            X_annot[30, 13] = str("{:.3f}".format(X[30, 13]))
            X_annot[13, 40] = str("{:.3f}".format(X[13, 40]))
            X_annot[40, 13] = str("{:.3f}".format(X[40, 13]))
            X_annot[30, 40] = str("{:.3f}".format(X[30, 40]))
            X_annot[40, 30] = str("{:.3f}".format(X[40, 30]))
            ax = sns.heatmap(X, cmap="Purples",
                             ax=ax, cbar_kws=dict(shrink=1), yticklabels=False,
                             xticklabels=False, vmin=0, vmax=1, annot=pd.DataFrame(X_annot),
                             fmt='')
        else:
            ax = sns.heatmap(X, cmap="Purples",
                             ax=ax, cbar_kws=dict(shrink=1), yticklabels=False,
                             xticklabels=False, vmin=0, vmax=1)
        ax.set_title(title)
        cbar = ax.collections[0].colorbar
        ax.set(ylabel=nodename, xlabel=nodename)
        if (nodetix is not None) and (nodelabs is not None):
            ax.set_yticks(nodetix)
            ax.set_yticklabels(nodelabs)
            ax.set_xticks(nodetix)
            ax.set_xticklabels(nodelabs)
        cbar.ax.set_frame_on(True)
    return
fig, axs = plt.subplots(1,3, figsize=(25, 6))
plot_prob_block_annot(P1, ax=axs[0], title="$P^{(1)}$ Facebook Prob. Matrix")
plot_prob_block_annot(P2, ax=axs[1], title="$P^{(2)}$ Twitter Prob. Matrix")
plot_prob_block_annot(np.abs(P1 - P2), ax=axs[2], annot=False, title="$|P^{(1)} - P^{(2)}|$")

The Joint Random Dot Product Graphs (JRDPG) model is the simplest way to extend the RDPG random network model to multiple random networks. The way you can think of the JRDPG model is that, for each of your $M$ networks, the underlying random network is an RDPG with the same latent position matrix $X$. In other words, the JRDPG has a single parameter, the shared latent position matrix $X$, and the networks $\mathbf{A}^{(1)}, \dots, \mathbf{A}^{(M)}$ are sampled independently from RDPGs with latent positions $X$.
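As a minimal sketch of what sampling from a JRDPG looks like in code, reusing the latent position matrix X1 computed above and graspologic's rdpg sampler, each network is just an independent RDPG draw with the same latent positions:

from graspologic.simulations import rdpg

M = 3  # number of networks in the collection
# under the JRDPG, every network shares the same latent position matrix X1,
# so the collection is M independent RDPG samples with identical parameters
jrdpg_networks = [rdpg(X1, directed=False, loops=False) for _ in range(M)]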
4.7.1.1. The JRDPG model does not allow you to convey unique aspects about the networks#
Under the JRDPG model, each of the $M$ random networks shares the same latent position matrix $X$, and hence the same probability matrix $P = XX^\top$. This means the JRDPG can only describe homogeneous collections of networks: there is no parameter left over to convey anything unique about any individual network in the collection.
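In symbols, using $\mathrm{RDPG}_n(X)$ to denote an RDPG with $n$ nodes and latent position matrix $X$, one way to write this is:

$$\mathbf{A}^{(1)}, \dots, \mathbf{A}^{(M)} \overset{\text{ind.}}{\sim} \mathrm{RDPG}_n(X), \qquad \text{so} \qquad P^{(1)} = P^{(2)} = \dots = P^{(M)} = X X^\top.$$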
For an RDPG, since the probability matrix is determined entirely by the latent position matrix through $P = XX^\top$, identical latent position matrices imply identical probability matrices. You can see that Facebook and Twitter do share a latent position matrix by plotting $X^{(1)}$, $X^{(2)}$, and their absolute difference:
from numpy.linalg import svd
def plot_latent(X, title="", nodename="Employee", ylabel="Latent Dimension",
                nodetix=None, vmin=None, vmax=None, nodelabs=None, ax=None, dimtix=None, dimlabs=None):
    if ax is None:
        fig, ax = plt.subplots(figsize=(6, 6))
    with sns.plotting_context("talk", font_scale=1):
        lim_max = np.max(np.abs(X))
        if vmin is None:
            vmin = -lim_max
        if vmax is None:
            vmax = lim_max
        # annotate one representative node per community, for each latent dimension
        X_annot = np.empty((45, 3), dtype='U6')
        X_annot[13, 0] = str("{:.3f}".format(X[13, 0]))
        X_annot[30, 0] = str("{:.3f}".format(X[30, 0]))
        X_annot[40, 0] = str("{:.3f}".format(X[40, 0]))
        X_annot[13, 1] = str("{:.3f}".format(X[13, 1]))
        X_annot[30, 1] = str("{:.3f}".format(X[30, 1]))
        X_annot[40, 1] = str("{:.3f}".format(X[40, 1]))
        X_annot[13, 2] = str("{:.3f}".format(X[13, 2]))
        X_annot[30, 2] = str("{:.3f}".format(X[30, 2]))
        X_annot[40, 2] = str("{:.3f}".format(X[40, 2]))
        ax = sns.heatmap(X, cmap=cmaps["divergent"],
                         ax=ax, cbar_kws=dict(shrink=1), yticklabels=False,
                         xticklabels=False, vmin=vmin, vmax=vmax,
                         annot=X_annot, fmt='')
        ax.set_title(title)
        cbar = ax.collections[0].colorbar
        ax.set(ylabel=nodename, xlabel=ylabel)
        if (nodetix is not None) and (nodelabs is not None):
            ax.set_yticks(nodetix)
            ax.set_yticklabels(nodelabs)
        if (dimtix is not None) and (dimlabs is not None):
            ax.set_xticks(dimtix)
            ax.set_xticklabels(dimlabs)
        cbar.ax.set_frame_on(True)
    return
fig, axs = plt.subplots(1,3, figsize=(20, 6))
vmin = np.min([X1,X2,X3]); vmax = np.max([X1, X2, X3])
plot_latent(X1, nodetix=[0,24,34,44], vmin=vmin, vmax=vmax, nodelabs=["1", "25", "35", "45"],
dimtix=[0.5,1.5,2.5], ax=axs[0], dimlabs=["1", "2", "3"], title="$X^{(1)}$ Facebook LPM")
plot_latent(X2, nodetix=[0,24,34,44], nodelabs=["1", "25", "35", "45"],
dimtix=[0.5,1.5,2.5], ax=axs[1], vmin=vmin, vmax=vmax, dimlabs=["1", "2", "3"], title="$X^{(2)}$ Twitter LPM")
plot_latent(np.abs(X1 - X2), nodetix=[0,24,34,44], vmin=vmin, vmax=vmax, nodelabs=["1", "25", "35", "45"],
dimtix=[0.5,1.5,2.5], ax=axs[2], dimlabs=["1", "2", "3"], title="$|X^{(1)} - X^{(2)}|$")

So the latent position matrices for Facebook and Twitter are exactly identical. Therefore, the collection of random networks $\{\mathbf{A}^{(1)}, \mathbf{A}^{(2)}\}$ for Facebook and Twitter is homogeneous, and can be described by the JRDPG model with shared latent position matrix $X = X^{(1)} = X^{(2)}$.
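Because B2 was defined to be the very same block matrix as B1, this is not just a visual impression; a quick numerical check with the matrices computed above confirms it:

print(np.allclose(P1, P2))  # True: identical probability matrices
print(np.allclose(X1, X2))  # True: identical latent position matrices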
What about Linkedin? Well, as it turns out, Linkedin had a different latent position matrix from both Facebook and Twitter:
fig, axs = plt.subplots(1,3, figsize=(20, 6))
vmin = np.min([X1,X2,X3]); vmax = np.max([X1, X2, X3])
plot_latent(X1, nodetix=[0,24,34,44], vmin=vmin, vmax=vmax, nodelabs=["1", "25", "35", "45"],
dimtix=[0.5,1.5,2.5], ax=axs[0], dimlabs=["1", "2", "3"], title="$X^{(1)}$ Facebook LPM")
plot_latent(X3, nodetix=[0,24,34,44], nodelabs=["1", "25", "35", "45"],
dimtix=[0.5,1.5,2.5], ax=axs[1], vmin=vmin, vmax=vmax, dimlabs=["1", "2", "3"], title="$X^{(3)}$ Linkedin LPM")
plot_latent(np.abs(X1 - X3), nodetix=[0,24,34,44], vmin=vmin, vmax=vmax, nodelabs=["1", "25", "35", "45"],
dimtix=[0.5,1.5,2.5], ax=axs[2], dimlabs=["1", "2", "3"], title="$|X^{(1)} - X^{(3)}|$")

So, $X^{(3)}$ differs from $X^{(1)}$ and $X^{(2)}$: the latent positions of the administrative-team nodes change on Linkedin. The full collection of three networks is therefore not homogeneous, and the JRDPG cannot describe it, because the JRDPG would force Linkedin to have the same probability matrix as Facebook and Twitter.
4.7.2. Common Subspace Independent Edge (COSIE) Model#
In your example on social networks, notice that the Facebook and Linkedin connections looked relatively similar, but had an important difference: on Linkedin, there was an administrative meeting, and the employees on the administrative team exchanged far more connections than usual among one another. It turns out that, in fact, the random networks underlying Facebook and Linkedin have different probability matrices, and the difference is concentrated entirely in the block of connections within the administrative team:
fig, axs = plt.subplots(1,3, figsize=(25, 6))
plot_prob_block_annot(P1, ax=axs[0], title="$P^{(1)}$ Facebook Prob. Matrix")
plot_prob_block_annot(P3, ax=axs[1], title="$P^{(3)}$ Linkedin Prob. Matrix")
plot_prob_block_annot(np.abs(P1 - P3), ax=axs[2], annot=True, title="$|P^{(1)} - P^{(3)}|$")

Notice that the difference in the probability of a connection being shared between two members of the administrative team is $0.6$: within that team, the block probability is $0.9$ on Linkedin versus $0.3$ on Facebook, while every other block probability is unchanged. This is exactly the situation the COSIE model is built for: the networks share common structure (here, the same three communities), but each network is allowed its own differences on top of that shared structure. Under COSIE, the networks share a common subspace, represented by a matrix $V$ with orthonormal columns, and each network $m$ has its own score matrix $R^{(m)}$, so that its probability matrix is $P^{(m)} = V R^{(m)} V^\top$ [3].
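Here is a minimal sketch, not an estimation procedure, of how the two probability matrices above can be written in this shared-subspace form. It reuses U1, P1, and P3 from the earlier code and takes the common subspace to be the span of the top three singular vectors of $P^{(1)}$, which works here because both block models are built on the same three communities:

# shared basis for the common subspace: top three singular vectors of P1.
# both P1 and P3 are block-constant over the same three communities, so both
# of their column spaces lie inside this single three-dimensional subspace.
V = U1[:, 0:3]

# network-specific score matrices: project each probability matrix onto the subspace
R1 = V.T @ P1 @ V  # Facebook scores
R3 = V.T @ P3 @ V  # Linkedin scores

# under COSIE, each probability matrix factors as V R^(m) V^T
print(np.allclose(V @ R1 @ V.T, P1))  # True
print(np.allclose(V @ R3 @ V.T, P3))  # True
print(np.allclose(R1, R3))            # False: the scores capture the Linkedin difference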
4.7.4. References#
[1] Avanti Athreya, Donniell E. Fishkind, Minh Tang, Carey E. Priebe, Youngser Park, Joshua T. Vogelstein, Keith Levin, Vince Lyzinski, and Yichen Qin. Statistical inference on random dot product graphs: a survey. J. Mach. Learn. Res., 18(1):8393–8484, January 2017. doi:10.5555/3122009.3242083.
[2] Keith Levin, A. Athreya, M. Tang, V. Lyzinski, and C. Priebe. A Central Limit Theorem for an Omnibus Embedding of Multiple Random Dot Product Graphs. 2017 IEEE International Conference on Data Mining Workshops (ICDMW), 2017. URL: https://www.semanticscholar.org/paper/A-Central-Limit-Theorem-for-an-Omnibus-Embedding-of-Levin-Athreya/fdc658ee1a25c511d7da405a9df7b30b613e8dc8.
[3] Jesús Arroyo, Avanti Athreya, Joshua Cape, Guodong Chen, Carey E. Priebe, and Joshua T. Vogelstein. Inference for Multiple Heterogeneous Networks with a Common Invariant Subspace. Journal of Machine Learning Research, 22(142):1–49, 2021. URL: https://jmlr.org/papers/v22/19-558.html.
[4] Vince Lyzinski, Donniell E. Fishkind, and Carey E. Priebe. Seeded graph matching for correlated Erdős–Rényi graphs. J. Mach. Learn. Res., 15(1):3513–3540, January 2014. doi:10.5555/2627435.2750357.
[5] Konstantinos Pantazis, Avanti Athreya, Jesús Arroyo, William N. Frost, Evan S. Hill, and Vince Lyzinski. The Importance of Being Correlated: Implications of Dependence in Joint Spectral Inference across Multiple Networks. Journal of Machine Learning Research, 23(141):1–77, 2022. URL: https://jmlr.org/papers/v23/20-944.html.