Kernels

We build our simple kernels using global, node-wise, and edge-wise properties of the network. Both datasets have the shape (N, n, n), where N is the number of networks in the dataset and n is the number of nodes in each network. Each graph is represented by an (n x n) adjacency matrix, where an edge between node i and node j is denoted by a nonzero value at the (i, j)-th element.
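
As a minimal illustration (the array below is hypothetical, not drawn from either dataset), a dataset of N = 2 weighted graphs on n = 3 nodes would look like:

import numpy as np

# hypothetical toy dataset: N = 2 weighted graphs, each on n = 3 nodes
graphs = np.array([
    [[0, 1, 0],
     [1, 0, 2],
     [0, 2, 0]],
    [[0, 0, 3],
     [0, 0, 1],
     [3, 1, 0]],
])
print(graphs.shape)  # (2, 3, 3), i.e. (N, n, n)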

For Matched Networks

Attached is the code for calculating the dissimilarity matrix using different kernels for matched networks.

import numpy as np

def calculate_dissim(graphs, method="density", norm=None, normalize=True):
    """ Calculate the dissimilarity matrix using the input kernel. """
    glob = False
    node = False
    edge = False

    if method == "density":
        # global: ratio of edges present to possible edges
        glob = True
        num_nodes = graphs.shape[1]
        num_edges_possible = num_nodes ** 2 - num_nodes

        metric = np.zeros(len(graphs))
        for i, graph in enumerate(graphs):
            num_edges = np.count_nonzero(graph)
            metric[i] = num_edges / num_edges_possible

    elif method == "avgedgeweight":
        # global: sum of edge weights divided by number of edges present
        glob = True
        metric = np.zeros(len(graphs))
        for i, graph in enumerate(graphs):
            num_edges = np.count_nonzero(graph)
            metric[i] = np.sum(graph) / num_edges

    elif method == "avgadjmatrix":
        # global: average over all entries of the adjacency matrix
        glob = True
        metric = np.zeros(len(graphs))
        for i, graph in enumerate(graphs):
            metric[i] = np.average(graph)

    elif method == "degree":
        # node-wise: number of edges incident to each node
        node = True
        metric = np.zeros((graphs.shape[0], graphs.shape[1]))
        for i, graph in enumerate(graphs):
            for j, row in enumerate(graph):
                metric[i, j] = np.count_nonzero(row)

    elif method == "strength":
        # node-wise: sum of the weights of the edges incident to each node
        node = True
        metric = np.zeros((graphs.shape[0], graphs.shape[1]))
        for i, graph in enumerate(graphs):
            for j, row in enumerate(graph):
                metric[i, j] = np.sum(row)

    elif method == "edgeweight":
        # edge-wise: compare adjacency matrices directly
        edge = True
        metric = graphs

    else:
        raise ValueError("Not a valid kernel name.")

    dissim_matrix = np.zeros((len(graphs), len(graphs)))

    for i, metric1 in enumerate(metric):
        for j, metric2 in enumerate(metric):
            if glob and norm is None:
                # scalar metrics: absolute difference
                diff = np.abs(metric1 - metric2)
            elif (node or edge) and norm == "l1":
                diff = np.linalg.norm(metric1 - metric2, ord=1)
            elif (node or edge) and norm == "l2":
                diff = np.linalg.norm(metric1 - metric2, ord=2)
            else:
                raise ValueError("L1 and L2 norms only apply to node- or edge-wise kernels.")

            dissim_matrix[i, j] = diff

    if normalize:
        # scale values from 0 to 1
        dissim_matrix = dissim_matrix / np.max(dissim_matrix)

    return dissim_matrix
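
A usage sketch (assuming `graphs` is an (N, n, n) array as described above):

# hedged usage sketch; `graphs` is assumed to be an (N, n, n) array
density_dissim = calculate_dissim(graphs, method="density", norm=None)
degree_dissim = calculate_dissim(graphs, method="degree", norm="l2")
edgeweight_dissim = calculate_dissim(graphs, method="edgeweight", norm="l1")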

Global Properties

For each of these kernels, each element of the (N x N) dissimilarity matrix is the absolute difference between the metrics of the corresponding pair of networks, where N is the number of networks in the dataset.

Density

Graph density is the ratio of the number of edges present in the graph to the number of possible edges, and is calculated by

$$\text{Density} = \frac{\text{Number of Edges Present}}{n(n-1)}$$

where n is the number of nodes in the graph.
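
For example, a hypothetical directed graph on $n = 3$ nodes with 4 edges present has density $\frac{4}{3(3-1)} = \frac{2}{3}$.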

Average Edge Weight

To calculate average edge weight, we sum all the nonzero values in the adjacency matrix and divide by the number of edges present.

$$\text{Average Edge Weight} = \frac{\text{Sum of Edge Weights}}{\text{Number of Edges Present}}$$

Average of the Adjacency Matrix

To calculate the average of the adjacency matrix, we sum all the nonzero values in the adjacency matrix and divide by the total number of possible edges. Note that if the graph is unweighted, the average of the adjacency matrix would be the same as graph density.

$$\text{Average of Adj. Matrix} = \frac{\text{Sum of Edge Weights}}{n(n-1)}$$

where n is the number of nodes in the graph.

Node-wise Properties

For each of these kernels, two (N x N) dissimilarity matrices are constructed: one whose elements are the L1-norms, and one whose elements are the L2-norms, of the difference vector between the two networks' metrics.

Note

The L1-norm of a vector is the sum of the magnitudes of its elements. Formally,

$$|x|_1 = \sum_{i=1}^{n} |x_i|$$

where $x_i$ for $i = 1, 2, \ldots, n$ are the elements of the vector $x$ [Weisstein, n.d.].

Note

The L2-norm of a vector is the Euclidean distance from the origin to the vector. Formally,

$$|x|_2 = \sqrt{\sum_{i=1}^{n} |x_i|^2}$$

where $x_i$ for $i = 1, 2, \ldots, n$ are the elements of the vector $x$ [Weisstein, n.d.].
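
For example, for the vector $x = (3, -4)$, the L1-norm is $|3| + |-4| = 7$ and the L2-norm is $\sqrt{3^2 + (-4)^2} = 5$.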

Node Degree

The degree of a node is the number of edges incident to it. In an (n x n) adjacency matrix $A$, the degree of node $i$ is calculated by $\sum_{j=1}^{n} a_j$, where $a_j = 1$ if $A_{ij} \neq 0$ and $a_j = 0$ if $A_{ij} = 0$. Thus for each graph we have an (n x 1) vector whose $i$-th element is the degree of node $i$, and the dissimilarity matrix is constructed based on these vectors.

Node Strength

The strength of a node is the sum of the weights of the edges incident to it. In an (n x n) adjacency matrix $A$, the strength of node $i$ is calculated by $\sum_{j=1}^{n} A_{ij}$. Thus for each graph we have an (n x 1) vector whose $i$-th element is the strength of node $i$, and the dissimilarity matrix is constructed based on these vectors.
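
Both node-wise metrics can also be computed without explicit Python loops; a minimal NumPy sketch, where `graph` is a hypothetical single (n x n) adjacency matrix:

import numpy as np

# vectorized equivalents of the per-node loops in the code above
degrees = np.count_nonzero(graph, axis=1)  # number of edges per node
strengths = graph.sum(axis=1)              # summed edge weights per node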

Edge-wise Properties

Similar to the kernels above, two (N x N) dissimilarity matrices are constructed based on the L1-matrix norm and L2-matrix norm of the difference matrix between the metrics.

Note

The L1-matrix norm is also called the maximum absolute column sum norm and is calculated by

$$||A||_1 = \max_j \sum_{i=1}^{n} |A_{ij}|$$

[Weisstein, n.d.].

Note

The L2-matrix norm is also called the spectral norm and is calculated by

$$||A||_2 = \left(\text{maximum eigenvalue of } A^H A\right)^{1/2}$$

where $A^H$ is the conjugate transpose of $A$. In our case, $A$ is an adjacency matrix and thus always real, so $A^H = A^T$ [Weisstein, n.d.].
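
Both matrix norms are available directly in NumPy; a small check on a hypothetical matrix:

import numpy as np

# hypothetical 2 x 2 matrix illustrating the two matrix norms
A = np.array([[1.0, -2.0],
              [3.0,  4.0]])

l1 = np.linalg.norm(A, ord=1)  # maximum absolute column sum: max(1+3, 2+4) = 6
l2 = np.linalg.norm(A, ord=2)  # spectral norm: largest singular value of A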

Edge Weight

For each pair of (n x n) graphs, the difference in edge weights is calculated as the difference of their adjacency matrices, yielding an (n x n) difference matrix. The dissimilarity matrices are then constructed based on the L1- and L2-matrix norms of these difference matrices.

For Unmatched Networks

The kernels we use above assume the nodes are matched. However, when we have no known matching of the nodes, we cannot use kernels based on the node-wise or edge-wise properties of the network in the same way, because we cannot calculate a difference vector or matrix directly and take its L1- or L2-norm. Thus, for those kernels we flatten each network's metric vector or matrix and compute the two-sample Kolmogorov-Smirnov test statistic between the two resulting “distributions”. The metrics themselves remain the same. The modified code is attached below.
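
Note

The two-sample Kolmogorov-Smirnov statistic is

$$D = \sup_x |F_1(x) - F_2(x)|$$

where $F_1$ and $F_2$ are the empirical distribution functions of the two samples.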

from scipy.stats import ks_2samp

def calculate_dissim_unmatched(graphs, method="degree", normalize=True):
    """ Calculate the dissimilarity matrix using the input kernel. """
    if method == "degree":
        # node-wise: number of edges incident to each node
        metric = np.zeros((graphs.shape[0], graphs.shape[1]))
        for i, graph in enumerate(graphs):
            for j, row in enumerate(graph):
                metric[i, j] = np.count_nonzero(row)

    elif method == "strength":
        # node-wise: sum of the weights of the edges incident to each node
        metric = np.zeros((graphs.shape[0], graphs.shape[1]))
        for i, graph in enumerate(graphs):
            for j, row in enumerate(graph):
                metric[i, j] = np.sum(row)

    elif method == "edgeweight":
        # edge-wise: the flattened nonzero edge weights of each graph
        metric = []
        for graph in graphs:
            metric.append(graph[np.nonzero(graph)])

    else:
        raise ValueError("Not a valid kernel name.")

    dissim_matrix = np.zeros((len(graphs), len(graphs)))
    for i, metric1 in enumerate(metric):
        for j, metric2 in enumerate(metric):
            # two-sample KS statistic between the two metric "distributions"
            diff, _ = ks_2samp(np.array(metric1), np.array(metric2), mode='asymp')
            dissim_matrix[i, j] = diff

    if normalize:
        dissim_matrix = dissim_matrix / np.max(dissim_matrix)

    return dissim_matrix
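
A usage sketch, again assuming `graphs` is an (N, n, n) array:

# hedged usage sketch for the unmatched kernels
degree_dissim = calculate_dissim_unmatched(graphs, method="degree")
edgeweight_dissim = calculate_dissim_unmatched(graphs, method="edgeweight")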

Established Kernels

The following kernels use existing algorithms and/or equations that are much more complex and computationally expensive than those described above. We use them as a point of comparison for the performance of our “simple” kernels.

Laplacian Spectral Distance

We compute the I-DAD Laplacian of each graph, which is calculated by

$$L = I - D^{-1/2} A D^{-1/2}$$

where $I$ is the identity matrix, $A$ is the adjacency matrix, and $D$ is the diagonal matrix whose $i$-th diagonal entry is the degree of node $i$.
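
A minimal NumPy sketch of this computation for a single hypothetical adjacency matrix `A`, assuming no isolated nodes (the code below instead uses `to_laplacian` from graspologic):

import numpy as np

# I-DAD Laplacian of a single adjacency matrix A
deg = A.sum(axis=1)                     # degree of each node
D_inv_sqrt = np.diag(1 / np.sqrt(deg))  # D^{-1/2}
L = np.eye(A.shape[0]) - D_inv_sqrt @ A @ D_inv_sqrt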

A graph can often be characterized by its Laplacian eigenspectrum, which makes this a useful metric to consider when assessing similarity between graphs [Das, 2004]. The I-DAD Laplacian is particularly useful because all of its eigenvalues are real, and for undirected graphs, all eigenvalues are guaranteed to be in the range [0, 2] [Suárez et al., 2022].

We implement the Laplacian spectral kernel by applying pass-to-ranks transformations on the networks, sorting the eigenvalues of the graph Laplacian without any smoothing, then calculating the L2-norm of the difference vector to construct the dissimilarity matrix.

Note

Using the code attached below, there are several ways we can derive and use the Laplacian eigenspectrum. We can transform the graphs through pass-to-ranks or binarization before finding the graph Laplacian, and we can simply sort its eigenvalues or smooth them using a Gaussian kernel to construct the eigenspectrum. We also have multiple ways to quantify dissimilarity, such as calculating the cosine distance between two eigenspectra, or finding the L1-norm or L2-norm of the difference vector between two eigenspectra. The implementation we state above is based on the combination that yields the highest discriminability value on the matched networks. More details can be found in the Laplacian section in the Appendix.

We use the Laplacian spectral kernel for both matched and unmatched networks.

import numpy as np
from graspologic.utils import pass_to_ranks, to_laplacian
from sklearn.neighbors import KernelDensity
from scipy.spatial.distance import cosine

def laplacian_dissim(graphs, transform: str=None, metric: str='l2',
                     smooth_eigvals: bool=False, normalize=True):
    """ Calculate the dissimilarity matrix from the I-DAD Laplacian eigenspectra. """
    graphs = np.copy(graphs)  # avoid transforming the caller's array in place
    if transform == 'pass-to-ranks':
        for i, graph in enumerate(graphs):
            graphs[i] = pass_to_ranks(graph)
    elif transform == 'binarize':
        graphs[graphs != 0] = 1
    elif transform is not None:
        raise ValueError('Supported transformations are "pass-to-ranks" (simple-nonzero), "binarize", or None.')
    
    eigs = []
    for i, graph in enumerate(graphs):
        # calculate laplacian
        lap = to_laplacian(graph, 'I-DAD')

        # find and sort eigenvalues
        w = np.linalg.eigvals(lap)
        w = np.sort(w)

        if smooth_eigvals:
            kde = KernelDensity(kernel='gaussian', bandwidth=0.015).fit(w.reshape(-1, 1))
            xs = np.linspace(0, 2, 2000)
            xs = xs[:, np.newaxis]
            log_dens = kde.score_samples(xs)
            eigs.append(np.exp(log_dens))
        else:
            eigs.append(w)

    dissim_matrix = np.zeros((len(graphs), len(graphs)))
    for i, eig1 in enumerate(eigs):
        for j, eig2 in enumerate(eigs):
            if metric == 'cosine':
                diff = cosine(eig1, eig2)
            elif metric == 'l1':
                diff = np.linalg.norm(eig1 - eig2, ord=1)
            elif metric == 'l2':
                diff = np.linalg.norm(eig1 - eig2, ord=2)
            else:
                raise ValueError('Supported metrics are "cosine", "l1", or "l2".')
            dissim_matrix[i, j] = diff

    if normalize:
        dissim_matrix = dissim_matrix / np.max(dissim_matrix)
    
    return dissim_matrix
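
A usage sketch matching the implementation described above, assuming `graphs` is an (N, n, n) array:

# pass-to-ranks transform, sorted (unsmoothed) eigenvalues, L2-norm
lap_dissim = laplacian_dissim(graphs, transform='pass-to-ranks', metric='l2',
                              smooth_eigvals=False)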

For Matched Networks - Omnibus Embedding

We generate omnibus embeddings of the matched networks, then calculate the Frobenius norm of the difference between two embeddings to construct the dissimilarity matrix.

Note

The Frobenius norm of a matrix A is given by

$$||A||_F = \left( \sum_{i,j} |A_{ij}|^2 \right)^{1/2}$$

[Golub and Van Loan, 1985].
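
In NumPy this is `np.linalg.norm(A, ord="fro")`, which the code below uses; a one-line check on a hypothetical matrix:

import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
# Frobenius norm: sqrt(1 + 4 + 9 + 16) = sqrt(30)
print(np.isclose(np.linalg.norm(A, ord="fro"), np.sqrt(30)))  # True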

from graspologic.embed import OmnibusEmbed

# embed using Omni
embedder = OmnibusEmbed(n_elbows=4)
omni_embedding = embedder.fit_transform(graphs)

omni_matrix = np.zeros((len(graphs), len(graphs)))
for i, embedding1 in enumerate(omni_embedding):
    for j, embedding2 in enumerate(omni_embedding):
        dist = np.linalg.norm(embedding1 - embedding2, ord="fro")
        omni_matrix[i, j] = dist

# scale values from 0 to 1
scaled_omni_dissim = omni_matrix / np.max(omni_matrix)

For Unmatched Networks - Latent Distribution Test Statistic

The latent distribution test is a two-sample hypothesis test of whether two graphs have the same distribution of latent positions. This test does not assume any known matching, as it can use alignment techniques such as seedless procrustes to find the best orthogonal alignment between two embeddings.

First we generate adjacency spectral embeddings of the largest connected component of the unmatched graphs, then calculate the latent distribution test statistic for every pair to construct the dissimilarity matrix using parameters shown in the code below.

from graspologic.inference import latent_distribution_test
from graspologic.embed import AdjacencySpectralEmbed
from graspologic.utils import largest_connected_component
from tqdm import tqdm
import warnings
warnings.filterwarnings("ignore")

ase_graphs = []
for i, graph in enumerate(graphs):
    lcc_graph = largest_connected_component(graph)
    ase_graph = AdjacencySpectralEmbed(n_components=4).fit_transform(lcc_graph)
    ase_graphs.append(ase_graph)

# calculate alignments
latent_dissim = np.zeros((len(ase_graphs), len(ase_graphs)))
Qs = []
for j in tqdm(range(0, len(ase_graphs) - 1)):
    for i in range(j+1, len(ase_graphs)):
        statistic, _, misc_dict = latent_distribution_test(ase_graphs[i], ase_graphs[j], test='mgc', metric='euclidean', \
            n_bootstraps=0, align_type='seedless_procrustes', input_graph=False)
        latent_dissim[i, j] = statistic
        Qs.append(misc_dict['Q'])

# scale values from 0 to 1 and make matrix symmetric
scaled_latent_dissim = latent_dissim / np.max(latent_dissim)
scaled_latent_dissim = scaled_latent_dissim + scaled_latent_dissim.T

References

Laura E. Suárez, Yossi Yovel, Martijn P. van den Heuvel, Olaf Sporns, Yaniv Assaf, Guillaume Lajoie, and Bratislav Misic. A connectomics-based taxonomy of mammals. bioRxiv, March 2022. doi:10.1101/2022.03.11.483995.

Eric W. Weisstein. L^1-norm. Accessed May 16, 2022 [Online]. URL: https://mathworld.wolfram.com/L1-Norm.html.

Eric W. Weisstein. L^2-norm. Accessed May 16, 2022 [Online]. URL: https://mathworld.wolfram.com/L2-Norm.html.

Eric W. Weisstein. Matrix norm. Accessed May 16, 2022 [Online]. URL: https://mathworld.wolfram.com/MatrixNorm.html.

K. Ch. Das. The Laplacian spectrum of a graph. Computers & Mathematics with Applications, 48(5-6):715–724, September 2004. doi:10.1016/j.camwa.2004.05.005.

Gene H. Golub and Charles F. Van Loan. Matrix Computations. The Johns Hopkins University Press, Baltimore, MD, 1985.