Kernels
We build our simple kernels using global, node-wise, and edge-wise properties of the network. Both datasets have the shape \((N, n, n)\), where \(N\) is the number of networks in the dataset and \(n\) is the number of nodes in each network. Each graph is represented by an \((n \times n)\) adjacency matrix, where the existence of an edge between node \(i\) and node \(j\) is denoted by a nonzero value at the \((i, j)\)-th element.
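For concreteness, here is a minimal sketch of this data layout, using a hypothetical dataset of \(N = 3\) random binary graphs on \(n = 5\) nodes (the variable name `graphs` matches the code below):
import numpy as np

rng = np.random.default_rng(0)
N, n = 3, 5  # hypothetical dataset sizes
graphs = (rng.random((N, n, n)) < 0.3).astype(float)  # shape (N, n, n)
for graph in graphs:
    np.fill_diagonal(graph, 0)  # no self-loops, so the diagonal is zero
print(graphs.shape)  # (3, 5, 5)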
For Matched Networks
Below is the code for calculating the dissimilarity matrix using different kernels for matched networks.
# HIDE CODE
import numpy as np


def calculate_dissim(graphs, method="density", norm=None, normalize=True):
    """Calculate the dissimilarity matrix using the input kernel."""
    glob = False
    node = False
    edge = False
    if method == "density":
        glob = True
        num_nodes = graphs.shape[1]
        num_edges_possible = num_nodes ** 2 - num_nodes
        metric = np.zeros(len(graphs))
        for i, graph in enumerate(graphs):
            num_edges = np.count_nonzero(graph)
            metric[i] = num_edges / num_edges_possible
    elif method == "avgedgeweight":
        glob = True
        metric = np.zeros(len(graphs))
        for i, graph in enumerate(graphs):
            num_edges = np.count_nonzero(graph)
            metric[i] = np.sum(graph) / num_edges
    elif method == "avgadjmatrix":
        glob = True
        metric = np.zeros(len(graphs))
        for i, graph in enumerate(graphs):
            metric[i] = np.average(graph)
    elif method == "degree":
        node = True
        metric = np.zeros((graphs.shape[0], graphs.shape[1]))
        for i, graph in enumerate(graphs):
            for j, row in enumerate(graph):
                metric[i, j] = np.count_nonzero(row)
    elif method == "strength":
        node = True
        metric = np.zeros((graphs.shape[0], graphs.shape[1]))
        for i, graph in enumerate(graphs):
            for j, row in enumerate(graph):
                metric[i, j] = np.sum(row)
    elif method == "edgeweight":
        edge = True
        metric = graphs
    else:
        raise ValueError("Not a valid kernel name.")

    dissim_matrix = np.zeros((len(graphs), len(graphs)))
    for i, metric1 in enumerate(metric):
        for j, metric2 in enumerate(metric):
            if glob and norm is None:
                # global kernels: absolute difference of scalar metrics
                diff = np.abs(metric1 - metric2)
            elif (node or edge) and norm == "l1":
                diff = np.linalg.norm(metric1 - metric2, ord=1)
            elif (node or edge) and norm == "l2":
                diff = np.linalg.norm(metric1 - metric2, ord=2)
            else:
                raise ValueError("Use norm=None for global kernels, and "
                                 "norm='l1' or 'l2' for node- or edge-wise kernels.")
            dissim_matrix[i, j] = diff
    if normalize:
        dissim_matrix = dissim_matrix / np.max(dissim_matrix)
    return dissim_matrix
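As a usage sketch (assuming `graphs` is an \((N, n, n)\) array of matched networks, such as the hypothetical example above):
# node-degree kernel, comparing degree vectors with the L2 norm
degree_dissim = calculate_dissim(graphs, method="degree", norm="l2")

# global density kernel; scalar metrics need no norm
density_dissim = calculate_dissim(graphs, method="density", norm=None)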
Global Properties
For each of these kernels, each element of the \((N \times N)\) dissimilarity matrix is the absolute difference between the metrics, where \(N\) is the number of networks in the dataset.
Density
Graph density refers to the ratio between the number of edges present in the graph and the number of possible edges and is calculated by
\[d = \frac{m}{n^2 - n}\]
where \(m\) is the number of edges present (the number of nonzero entries of the adjacency matrix) and \(n\) is the number of nodes in the graph.
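For example, an undirected, unweighted graph on \(n = 3\) nodes with edges between nodes 1 and 2 and between nodes 2 and 3 has four nonzero adjacency entries, giving a density of \(4 / (3^2 - 3) = 4/6 \approx 0.67\).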
Average Edge Weight
To calculate average edge weight, we sum all the nonzero values in the adjacency matrix and divide by the number of edges present.
Average of the Adjacency Matrix
To calculate the average of the adjacency matrix, we sum all the values in the adjacency matrix and divide by the total number of entries:
\[\bar{A} = \frac{1}{n^2} \sum_{i=1}^{n} \sum_{j=1}^{n} A_{ij}\]
where \(n\) is the number of nodes in the graph. Note that if the graph is unweighted, the average of the adjacency matrix is nearly identical to graph density; the two differ only in the denominator (\(n^2\) versus \(n^2 - n\)).
Node-wise Properties
For each of these kernels, two \((N \times N)\) dissimilarity matrices are constructed: one whose elements are the \(L^1\)-norms of the difference vectors between the metrics, and one whose elements are the \(L^2\)-norms.
Note
The \(L^1\)-norm of a vector is the sum of the magnitudes of its elements. Formally,
\[\|\boldsymbol{x}\|_1 = \sum_{i=1}^{n} |x_i|\]
where \(x_i\) for \(i = 1, 2, ..., n\) are the elements of the vector \(\boldsymbol{x}\) [Weisstein, n.d.].
Note
The \(L^2\)-norm of a vector is its Euclidean distance from the origin. Formally,
\[\|\boldsymbol{x}\|_2 = \sqrt{\sum_{i=1}^{n} x_i^2}\]
where \(x_i\) for \(i = 1, 2, ..., n\) are the elements of the vector \(\boldsymbol{x}\) [Weisstein, n.d.].
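For example, for \(\boldsymbol{x} = (1, -2, 3)\), \(\|\boldsymbol{x}\|_1 = |1| + |-2| + |3| = 6\), while \(\|\boldsymbol{x}\|_2 = \sqrt{1 + 4 + 9} = \sqrt{14} \approx 3.74\).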
Node Degree
The degree of a node is the number of edges incident to it. In an \((n \times n)\) adjacency matrix \(A\), the degree of node \(i\) is calculated by \(\sum_{j=1}^n a_{ij}\), where \(a_{ij} = 1\) if \(A_{ij} \neq 0\) and \(a_{ij} = 0\) if \(A_{ij} = 0\). Thus for each graph, we have an \((n \times 1)\) vector whose \(i\)-th element is the degree of node \(i\), and the dissimilarity matrix is constructed based on these vectors.
Node Strength
The strength of a node is the sum of the weights of the edges incident to it. In an \((n \times n)\) adjacency matrix \(A\), the strength of node \(i\) is calculated by \(\sum_{j=1}^n A_{ij}\). Thus for each graph, we have an \((n \times 1)\) vector whose \(i\)-th element is the strength of node \(i\), and the dissimilarity matrix is constructed based on these vectors.
Edge-wise Properties
Similar to the kernels above, two \((N \times N)\) dissimilarity matrices are constructed, based on the \(L^1\)-matrix norm and the \(L^2\)-matrix norm of the difference matrix between the metrics.
Note
The \(L^1\)-matrix norm is also called the maximum absolute column sum norm and is calculated by
\[\|A\|_1 = \max_{j} \sum_{i=1}^{n} |a_{ij}|\]
[Weisstein, n.d.].
Note
The \(L^2\)-matrix norm is also called the spectral norm and is calculated by
\[\|A\|_2 = \sqrt{\lambda_{\max}\left(A^H A\right)}\]
where \(A^H\) is the conjugate transpose of \(A\) and \(\lambda_{\max}\) denotes the largest eigenvalue. In our case, matrix \(A\) is an adjacency matrix and thus always real, so \(A^H = A^T\) [Weisstein, n.d.].
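For example, for \(A = \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix}\), the \(L^1\)-matrix norm is \(\max(|1| + |3|, |2| + |4|) = 6\), and the \(L^2\)-matrix norm is \(\sqrt{\lambda_{\max}(A^T A)} \approx 5.46\).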
Edge Weight
For each pair of \((n \times n)\) graphs, the difference in edge weights is calculated as the difference of their adjacency matrices, yielding an \((n \times n)\) difference matrix. The dissimilarity matrix is then constructed based on the \(L^1\)- and \(L^2\)-matrix norms of these difference matrices.
For Unmatched Networks
The kernels above assume the nodes are matched. When no matching of the nodes is known, we cannot use kernels based on the node-wise or edge-wise properties of the network directly, because we cannot compute a difference vector or matrix and take its \(L^1\)- or \(L^2\)-norm. Instead, for those kernels we flatten each graph's metric values into a sample and compute the two-sample Kolmogorov-Smirnov test statistic between the two "distributions". The underlying metrics remain the same. The modified code is below.
# HIDE CODE
import numpy as np
from scipy.stats import ks_2samp


def calculate_dissim_unmatched(graphs, method="degree", normalize=True):
    """Calculate the dissimilarity matrix using the input kernel."""
    if method == "degree":
        metric = np.zeros((graphs.shape[0], graphs.shape[1]))
        for i, graph in enumerate(graphs):
            for j, node in enumerate(graph):
                metric[i, j] = np.count_nonzero(node)
    elif method == "strength":
        metric = np.zeros((graphs.shape[0], graphs.shape[1]))
        for i, graph in enumerate(graphs):
            for j, node in enumerate(graph):
                metric[i, j] = np.sum(node)
    elif method == "edgeweight":
        metric = []
        for graph in graphs:
            # collect each graph's nonzero edge weights
            # (np.nonzero alone would return indices, not values)
            metric.append(graph[np.nonzero(graph)])
    else:
        raise ValueError("Not a valid kernel name.")

    dissim_matrix = np.zeros((len(graphs), len(graphs)))
    for i, metric1 in enumerate(metric):
        for j, metric2 in enumerate(metric):
            # two-sample KS statistic between the two metric "distributions"
            diff, _ = ks_2samp(np.array(metric1), np.array(metric2), mode="asymp")
            dissim_matrix[i, j] = diff
    if normalize:
        dissim_matrix = dissim_matrix / np.max(dissim_matrix)
    return dissim_matrix
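As a usage sketch (again assuming `graphs` is an array of networks with no known node matching):
# KS-statistic dissimilarity between the edge-weight distributions
edgeweight_dissim = calculate_dissim_unmatched(graphs, method="edgeweight")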
Established Kernels
The following kernels use existing algorithms and/or equations that are much more complex and computationally expensive than those described above. We use them as a point of comparison for the performance of our "simple" kernels.
Laplacian Spectral Distance
We compute the I-DAD Laplacian of each graph, which is calculated by
\[L = I - D^{-1/2} A D^{-1/2}\]
where \(D\) is the diagonal matrix of node degrees, \(I\) is the identity matrix, and \(A\) is the adjacency matrix.
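As a minimal numpy sketch of this formula (assuming `A` is the adjacency matrix of a graph with no isolated nodes, and using weighted degrees; the code below instead relies on graspologic's `to_laplacian`):
import numpy as np

def idad_laplacian(A):
    """Compute L = I - D^{-1/2} A D^{-1/2} directly from the definition."""
    degrees = A.sum(axis=1)                # (weighted) degree of each node
    d_inv_sqrt = np.diag(degrees ** -0.5)  # D^{-1/2}
    return np.eye(A.shape[0]) - d_inv_sqrt @ A @ d_inv_sqrt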
Graphs can often be characterized by their Laplacian eigenspectra, which makes this a useful metric to consider when assessing similarity between graphs [Das, 2004]. The I-DAD Laplacian is particularly useful because all of its eigenvalues are real, and for undirected graphs, all eigenvalues are guaranteed to lie in the range \([0, 2]\) [Suárez et al., 2022].
We implement the Laplacian spectral kernel by applying pass-to-ranks transformations on the networks, sorting the eigenvalues of the graph Laplacian without any smoothing, then calculating the \(L^2\)-norm of the difference vector to construct the dissimilarity matrix.
Note
Using the code attached below, there are several ways to derive and use the Laplacian eigenspectrum. We can transform the graphs through pass-to-ranks or binarization before finding the graph Laplacian, and we can either simply sort the eigenvalues or smooth them with a Gaussian kernel to construct the eigenspectrum. We also have multiple ways to quantify dissimilarity, such as the cosine distance between two eigenspectra, or the \(L^1\)- or \(L^2\)-norm of the difference vector between two eigenspectra. The implementation stated above is the combination that yields the highest discriminability value on the matched networks. More details can be found in the Laplacian section in the Appendix.
We use the Laplacian spectral kernel for both matched and unmatched networks.
# HIDE CODE
import numpy as np
from graspologic.utils import pass_to_ranks, to_laplacian
from sklearn.neighbors import KernelDensity
from scipy.spatial.distance import cosine


def laplacian_dissim(graphs, transform: str = None, metric: str = 'l2',
                     smooth_eigvals: bool = False, normalize=True):
    """Calculate the dissimilarity matrix from the Laplacian eigenspectra."""
    if transform == 'pass-to-ranks':
        for i, graph in enumerate(graphs):
            graphs[i] = pass_to_ranks(graph)
    elif transform == 'binarize':
        graphs[graphs != 0] = 1
    elif transform is not None:
        raise ValueError('Supported transformations are "pass-to-ranks" '
                         '(simple-nonzero), "binarize", or None.')

    eigs = []
    for graph in graphs:
        # calculate the I-DAD Laplacian
        lap = to_laplacian(graph, 'I-DAD')
        # find and sort its eigenvalues
        w = np.sort(np.linalg.eigvals(lap))
        if smooth_eigvals:
            # smooth the eigenspectrum with a Gaussian kernel density estimate
            kde = KernelDensity(kernel='gaussian', bandwidth=0.015).fit(w.reshape(-1, 1))
            xs = np.linspace(0, 2, 2000)[:, np.newaxis]
            log_dens = kde.score_samples(xs)
            eigs.append(np.exp(log_dens))
        else:
            eigs.append(w)

    dissim_matrix = np.zeros((len(graphs), len(graphs)))
    for i, eig1 in enumerate(eigs):
        for j, eig2 in enumerate(eigs):
            if metric == 'cosine':
                diff = cosine(eig1, eig2)
            elif metric == 'l1':
                diff = np.linalg.norm(eig1 - eig2, ord=1)
            elif metric == 'l2':
                diff = np.linalg.norm(eig1 - eig2, ord=2)
            else:
                raise ValueError('Supported metrics are "cosine", "l1", or "l2".')
            dissim_matrix[i, j] = diff
    if normalize:
        dissim_matrix = dissim_matrix / np.max(dissim_matrix)
    return dissim_matrix
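As a usage sketch, the implementation described above (pass-to-ranks, sorted raw eigenvalues, \(L^2\)-norm) corresponds to:
laplacian_dissim_matrix = laplacian_dissim(graphs, transform='pass-to-ranks',
                                           metric='l2', smooth_eigvals=False)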
For Matched Networks - Omnibus Embedding
We generate omnibus embeddings of the matched networks, then calculate the Frobenius norm of the difference between each pair of embeddings to construct the dissimilarity matrix.
Note
The Frobenius norm of an \((m \times n)\) matrix \(A\) is given by
\[\|A\|_F = \sqrt{\sum_{i=1}^{m} \sum_{j=1}^{n} |a_{ij}|^2}\]
[Golub and Van Loan, 1985].
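For example, \(\left\| \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix} \right\|_F = \sqrt{1 + 4 + 9 + 16} = \sqrt{30} \approx 5.48\).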
# HIDE CODE
from graspologic.embed import OmnibusEmbed

# embed all graphs jointly using the omnibus embedding
embedder = OmnibusEmbed(n_elbows=4)
omni_embedding = embedder.fit_transform(graphs)

omni_matrix = np.zeros((len(graphs), len(graphs)))
for i, embedding1 in enumerate(omni_embedding):
    for j, embedding2 in enumerate(omni_embedding):
        # Frobenius norm of the difference between the two embeddings
        dist = np.linalg.norm(embedding1 - embedding2, ord="fro")
        omni_matrix[i, j] = dist

# scale values from 0 to 1
scaled_omni_dissim = omni_matrix / np.max(omni_matrix)
For Unmatched Networks - Latent Distribution Test Statistic
The latent distribution test is a two-sample hypothesis test of whether two graphs have the same distribution of latent positions. This test does not assume any known matching, as it can use alignment techniques such as seedless Procrustes to find the best orthogonal alignment between two embeddings.
First, we generate adjacency spectral embeddings of the largest connected component of each unmatched graph, then calculate the latent distribution test statistic for every pair to construct the dissimilarity matrix, using the parameters shown in the code below.
# HIDE CODE
from graspologic.inference import latent_distribution_test
from graspologic.embed import AdjacencySpectralEmbed
from graspologic.utils import largest_connected_component
from tqdm import tqdm
import warnings

warnings.filterwarnings("ignore")

# embed the largest connected component of each graph
ase_graphs = []
for graph in graphs:
    lcc_graph = largest_connected_component(graph)
    ase_graph = AdjacencySpectralEmbed(n_components=4).fit_transform(lcc_graph)
    ase_graphs.append(ase_graph)

# calculate the test statistic (with seedless-procrustes alignment) for each pair
latent_dissim = np.zeros((len(ase_graphs), len(ase_graphs)))
Qs = []
for j in tqdm(range(0, len(ase_graphs) - 1)):
    for i in range(j + 1, len(ase_graphs)):
        statistic, _, misc_dict = latent_distribution_test(
            ase_graphs[i], ase_graphs[j], test="mgc", metric="euclidean",
            n_bootstraps=0, align_type="seedless_procrustes", input_graph=False)
        latent_dissim[i, j] = statistic
        Qs.append(misc_dict["Q"])

# scale values from 0 to 1 and make the matrix symmetric
scaled_latent_dissim = latent_dissim / np.max(latent_dissim)
scaled_latent_dissim = scaled_latent_dissim + scaled_latent_dissim.T
References
1. Laura E. Suárez, Yossi Yovel, Martijn P. Van den Heuvel, Olaf Sporns, Yaniv Assaf, Guillaume Lajoie, and Bratislav Misic. A connectomics-based taxonomy of mammals. bioRxiv, 03 2022. doi:10.1101/2022.03.11.483995.
2. Eric W. Weisstein. L^1-norm. Accessed May 16, 2022 [Online]. URL: https://mathworld.wolfram.com/L1-Norm.html.
3. Eric W. Weisstein. L^2-norm. Accessed May 16, 2022 [Online]. URL: https://mathworld.wolfram.com/L2-Norm.html.
4. Eric W. Weisstein. Matrix norm. Accessed May 16, 2022 [Online]. URL: https://mathworld.wolfram.com/MatrixNorm.html.
5. K. Ch. Das. The Laplacian spectrum of a graph. Computers & Mathematics with Applications, 48(5-6):715–724, 09 2004. doi:10.1016/j.camwa.2004.05.005.
6. Gene H. Golub and Charles F. Van Loan. Matrix Computations. The Johns Hopkins University Press, Baltimore, MD, 1985.