For details on the model, please see SIEM model.
In this notebook, we determine whether there exists a difference in connectivity \(p_i\) ipsi-laterally (same hemisphere) vs \(p_c\) contra-laterally (opposite hemisphere) connectivity within a particular modality.
Our test for this notebook is as follows:
\[\begin{align*} H_0: p_i \leq p_c \\ H_A: p_i > p_c \end{align*}\]in words, whether the connectivity ipsi-laterally exceeds the connectivity contra-laterally. We will do this in 2 ways:
We will perform this experiment for both dMRI and fMRI-derived connectomes.
For the data, we compute the weighted mean functional (rank of each edge) and diffusion (number of fibers). For the functional connectome, we threshold such that the largest 50% of edges are set to connected, and the smallest 50% set to disconnected. For the diffusion (which are natively sparse) we just threshold edges that are present to connected, and edges that are not present to disconnected (threshold about 0).
The data below can be downloaded and moved to appropriate folders as follows (note that the below section requires sudo access):
sudo mkdir /data/
sudo chmod -R 777 /data
cd /data
wget http://openconnecto.me/mrdata/share/derivatives/dwi_edgelists.tar.gz
wget http://openconnecto.me/mrdata/share/derivatives/fmri_edgelists.tar.gz
wget http://openconnecto.me/mrdata/share/connectome_stats/connectome_stats.zip
mkdir -p /data/connectome_stats /data/all_mr /data/all_mr/dwi/edgelists /data/all_mr/fmri/ranked/edgelists
mv dwi_edgelists.tar.gz /data/dwi/edgelists
cd /data/dwi/edgelists
tar -xvzf dwi_edgelists.tar.gz
mv /data/fmri_edgelists.tar.gz /data/fmri/ranked/edgelists
cd /data/fmri/ranked/edgelists
tar -xvzf fmri_edgelists.tar.gz
mv /data/connectome_stats.zip /data/connectome_stats.zip
cd /data/connectome_stats
unzip connectome_stats.zip
basepath = '/data/connectome_stats/'
fmri_gr = read_graph(file.path(basepath, 'fmrimean_1709.edgelist'), format="ncol")
vset <- V(fmri_gr)
ordered_v <- order(vset)
fmri_gr = read_graph(file.path(basepath, 'fmrimean_1709.edgelist'), format="ncol", predef=ordered_v)
fmri_mean = get.adjacency(fmri_gr, type="both", sparse=FALSE, attr='weight')
dwi_gr = read_graph(file.path(basepath, 'dwimean_2861.edgelist'), format="ncol", predef=ordered_v)
dwi_mean = get.adjacency(dwi_gr, type="both", sparse=FALSE, attr='weight')
fmri_thresh = thresh_matrix(fmri_mean)
dwi_thresh = thresh_matrix(dwi_mean, thresh=0)
gs.plot.plot_matrix(fmri_thresh, title = "Mean Thresholded Functional Connectome", legend.name = "connection", ffactor = TRUE) +
theme(panel.background = element_rect(fill = '#ffffff'))
gs.plot.plot_matrix(dwi_thresh, title = "Mean Thresholded DWI Connectome", legend.name = "connection", ffactor = TRUE) +
theme(panel.background = element_rect(fill = '#ffffff'))
here, we will compute the probability of an edge existing in each of 4 quadrants (2 ipsilateral quadrants; 2 contralateral quadrants):
nroi <- 70
group1 <- 1:35
group2 <- 36:70
groups <- list(group1, group2)
fmri_block = block_data(fmri_thresh, groups)
dwi_block = block_data(dwi_thresh, groups)
colnames(fmri_block) <- c("Left", "Right")
rownames(fmri_block) <- c("Left", "Right")
colnames(dwi_block) <- c("Left", "Right")
rownames(dwi_block) <- c("Left", "Right")
gs.plot.plot_matrix(fmri_block, title = "Blocked Functional Connectome", xlabel = "Hemisphere",
ylabel="Hemisphere", legend.name = "p", vfactor=TRUE, vlist = c("Left", "Right"), limits=c(0, 1)) +
theme(panel.background = element_rect(fill = '#ffffff'))
gs.plot.plot_matrix(dwi_block, title = "Blocked DWI Connectome", xlabel = "Hemisphere",
ylabel="Hemisphere", legend.name = "p", vfactor=TRUE, vlist = c("Left", "Right"), limits=c(0, 1)) +
theme(panel.background = element_rect(fill = '#ffffff'))
nroi <- 70
dwi.dsets = c('BNU1', 'BNU3', 'HNU1', 'KKI2009', 'NKI1', 'NKIENH', 'MRN1313', 'Templeton114', 'Templeton255', 'SWU4')
dwi.atlas = 'desikan'
dwi.basepath = '/data/all_mr/dwi/edgelists'
graphobj = fmriu.io.collection.open_graphs(basepath = dwi.basepath, atlases = dwi.atlas, datasets = dwi.dsets,
gname = 'graphs', fmt='edgelist', rtype = 'array')
## [1] "opening graphs for BNU1 dataset and desikan parcellation atlas..."
## [1] "opening graphs for BNU3 dataset and desikan parcellation atlas..."
## [1] "opening graphs for HNU1 dataset and desikan parcellation atlas..."
## [1] "opening graphs for KKI2009 dataset and desikan parcellation atlas..."
## [1] "opening graphs for NKI1 dataset and desikan parcellation atlas..."
## [1] "opening graphs for NKIENH dataset and desikan parcellation atlas..."
## [1] "opening graphs for MRN1313 dataset and desikan parcellation atlas..."
## [1] "opening graphs for Templeton114 dataset and desikan parcellation atlas..."
## [1] "opening graphs for Templeton255 dataset and desikan parcellation atlas..."
## [1] "opening graphs for SWU4 dataset and desikan parcellation atlas..."
dwi.graphs = graphobj$graphs
dwi.datasets = graphobj$dataset
dwi.subjects = graphobj$subjects
ne = 1225
nroi <- 70
group1 <- c() # edges in same hemisphere
group2 <- c() # edges across hemispheres
for (i in 1:nroi) {
for (j in 1:nroi) {
idx <- (i - 1)*nroi + j
if ((i <= 35 & j <= 35) | (i > 35 & j > 35)) {
group1 <- c(group1, idx)
} else {
group2 <- c(group2, idx)
}
}
}
Es <- list(group1, group2)
dwi.models <- sapply(1:dim(dwi.graphs)[1], function(i) {
gs.siem.fit(thresh_matrix(dwi.graphs[i,,], 0), Es, alt='greater')
}, simplify = FALSE)
We might want to visualize the distribution of \(\delta = \hat{p}_{ipsi} - \hat{p}_{contra}\) under the analytical model and compare to our empirical model:
ne = 1225
dwi.ips.phat <- sapply(dwi.models, function(model) model$pr[1])
dwi.contr.phat <- sapply(dwi.models, function(model)model$pr[2])
dwi.diff.distr.emp.mod = density(as.numeric(sapply(dwi.models, function(model) model$dpr[1,2])))
# variances sum
dwi.diff.distr.ana = dnorm(dwi.diff.distr.emp.mod$x, mean=mean(abs(dwi.ips.phat - dwi.contr.phat)),
sd=sqrt(model.var(mean(dwi.ips.phat), ne) + model.var(mean(dwi.contr.phat), ne)))
n_diff = length(dwi.diff.distr.emp.mod$x)
dwi.diff.dat = data.frame(x = c(dwi.diff.distr.emp.mod$x, dwi.diff.distr.emp.mod$x), y = c(dwi.diff.distr.emp.mod$y, dwi.diff.distr.ana),
distribution=c(rep("empirical", n_diff), rep("analytical", n_diff)))
dwi.diff.dat$distribution = factor(dwi.diff.dat$distribution)
ggplot(dat=dwi.diff.dat, aes(x=x, y=y, ymax=y, fill=distribution, color=distribution, group=distribution)) +
geom_ribbon(ymin=0, alpha=0.5) +
ylab('Density') +
xlab(TeX('$\\delta$')) +
ggtitle(TeX('Distribution of $\\delta = \\hat{p}_{ipsi} - \\hat{p}_{contr}$, DWI')) +
theme(panel.background = element_rect(fill = '#ffffff'))
which clearly shows a strong difference in the mean contra-laterally compared to ipsi-laterally, as our \(\delta\) is generally quite high. Performing a paired t-test between the ipsi-lateral and contra-lateral \(\hat{p}\), we find:
t.test(dwi.ips.phat, dwi.contr.phat, alternative="greater", var.equal=FALSE, paired=TRUE)
##
## Paired t-test
##
## data: dwi.ips.phat and dwi.contr.phat
## t = 256.61, df = 2857, p-value < 2.2e-16
## alternative hypothesis: true difference in means is greater than 0
## 95 percent confidence interval:
## 0.3080553 Inf
## sample estimates:
## mean of the differences
## 0.3100433
which as we can see, indicates a significant difference in ipsi-lateral connectivity compared to contra-lateral connectivity with \(p < 2.2\times 10^{-16}\) for the diffusion connectomes. However, in this case, we note that the model is not very representative of the actual data observed. This is likely due to the fact that much of the data (50%) is acquired from 2 of the sites, so there likely are strong batch-effects present in the \(\hat{p}\), or that the diffusion connectivity is much more structured than the functional connectivity, and thus using a random block model may not be ideal.
Below, we look at the distribution of our \(p-\)values wehre we estimate one p-value per graph:
dwi.per.p <- sapply(dwi.models, function(model) model$pv[1,2])
dwi.p.dat = data.frame(p=dwi.per.p, dataset = dwi.datasets, modality='DWI')
dwi.p.dat$dataset = factor(dwi.p.dat$dataset)
ggplot(data=dwi.p.dat, aes(x=dataset, y=p, color=dataset, group=dataset)) +
geom_jitter() +
coord_trans(y = "log10") +
ggtitle(TeX(sprintf('DWI Per-subject P-value (1 graph), %.2f percent have $p < .05$', 100*sum(dwi.per.p < .05)/length(dwi.per.p)))) +
xlab('Dataset') +
ylab('p-value') +
theme(axis.text.x = element_text(angle=45), legend.position=NaN)
As we can see, with just \(1\) graph, we still see that ipsi-lateral connectivity exceeds contra-lateral connectivity with \(p < .05\) for just about all of the individual graphs. With just one graph, we can identify a significant difference with ipsi-lateral exceeding contra-lateral connectivity for the diffusion connectomes in \(99.9\%\) of the graphs at \(\alpha = .05\).
nroi <- 70
fmri.dsets = c('BNU1', 'BNU2', 'BNU3', 'HNU1', 'IBATRT', 'IPCAS1', 'IPCAS2', 'IPCAS5', 'IPCAS6', 'IPCAS8', 'MRN1', 'NYU1', 'SWU1', 'SWU2', 'SWU3', 'SWU4', 'UWM', 'XHCUMS')
fmri.atlas = 'desikan-2mm'
fmri.basepath = '/data/all_mr/fmri/ranked/edgelists/'
graphobj = fmriu.io.collection.open_graphs(basepath = fmri.basepath, atlases = fmri.atlas, datasets=fmri.dsets, fmt='edgelist', rtype = 'array')
## [1] "opening graphs for BNU1 dataset and desikan-2mm parcellation atlas..."
## [1] "opening graphs for BNU2 dataset and desikan-2mm parcellation atlas..."
## [1] "opening graphs for BNU3 dataset and desikan-2mm parcellation atlas..."
## [1] "opening graphs for HNU1 dataset and desikan-2mm parcellation atlas..."
## [1] "opening graphs for IBATRT dataset and desikan-2mm parcellation atlas..."
## [1] "opening graphs for IPCAS1 dataset and desikan-2mm parcellation atlas..."
## [1] "opening graphs for IPCAS2 dataset and desikan-2mm parcellation atlas..."
## [1] "opening graphs for IPCAS5 dataset and desikan-2mm parcellation atlas..."
## [1] "opening graphs for IPCAS6 dataset and desikan-2mm parcellation atlas..."
## [1] "opening graphs for IPCAS8 dataset and desikan-2mm parcellation atlas..."
## [1] "opening graphs for MRN1 dataset and desikan-2mm parcellation atlas..."
## [1] "opening graphs for NYU1 dataset and desikan-2mm parcellation atlas..."
## [1] "opening graphs for SWU1 dataset and desikan-2mm parcellation atlas..."
## [1] "opening graphs for SWU2 dataset and desikan-2mm parcellation atlas..."
## [1] "opening graphs for SWU3 dataset and desikan-2mm parcellation atlas..."
## [1] "opening graphs for SWU4 dataset and desikan-2mm parcellation atlas..."
## [1] "opening graphs for UWM dataset and desikan-2mm parcellation atlas..."
## [1] "opening graphs for XHCUMS dataset and desikan-2mm parcellation atlas..."
fmri.graphs = graphobj$graphs
fmri.datasets = graphobj$dataset
fmri.subjects = graphobj$subjects
ne = 1225
nroi <- 70
group1 <- c() # edges in same hemisphere
group2 <- c() # edges across hemispheres
for (i in 1:nroi) {
for (j in 1:nroi) {
idx <- (i - 1)*nroi + j
if ((i <= 35 & j <= 35) | (i > 35 & j > 35)) {
group1 <- c(group1, idx)
} else {
group2 <- c(group2, idx)
}
}
}
Es <- list(group1, group2)
fmri.models <- sapply(1:dim(fmri.graphs)[1], function(i) {
gs.siem.fit(thresh_matrix(fmri.graphs[i,,], 0.5), Es, alt='greater')
}, simplify = FALSE)
We might want to visualize the distribution of \(\hat{p}_{contra}\) and \(\hat{p}_{ipsi}\) under the analytical model and compare to our empirical model:
ne = 1225
fmri.ips.phat <- sapply(fmri.models, function(model) model$pr[1])
fmri.contr.phat <- sapply(fmri.models, function(model)model$pr[2])
fmri.diff.distr.emp.mod = density(as.numeric(fmri.ips.phat - fmri.contr.phat))
# variances sum
fmri.diff.distr.ana = dnorm(fmri.diff.distr.emp.mod$x, mean=mean(abs(fmri.ips.phat - fmri.contr.phat)),
sd=sqrt(model.var(mean(fmri.ips.phat), ne) + model.var(mean(fmri.contr.phat), ne)))
n_diff = length(fmri.diff.distr.emp.mod$x)
fmri.diff.dat = data.frame(x = c(fmri.diff.distr.emp.mod$x, fmri.diff.distr.emp.mod$x), y = c(fmri.diff.distr.emp.mod$y, fmri.diff.distr.ana),
distribution=c(rep("empirical", n_diff), rep("analytical", n_diff)))
fmri.diff.dat$distribution = factor(fmri.diff.dat$distribution)
ggplot(dat=fmri.diff.dat, aes(x=x, y=y, ymax=y, fill=distribution, color=distribution, group=distribution)) +
geom_ribbon(ymin=0, alpha=0.5) +
ylab('Density') +
xlab(TeX('$\\delta$')) +
ggtitle(TeX('Distribution of $\\delta = \\hat{p}_{ipsi} - \\hat{p}_{contr}$, fMRI')) +
theme(panel.background = element_rect(fill = '#ffffff'))
which clearly shows a much less strong difference in the means ipsi-laterally compared to contra-laterally, but still a present difference. Performing a t-test, we find:
t.test(fmri.ips.phat, fmri.contr.phat, alternative="greater", var.equal=FALSE, paired=TRUE)
##
## Paired t-test
##
## data: fmri.ips.phat and fmri.contr.phat
## t = 81.738, df = 1796, p-value < 2.2e-16
## alternative hypothesis: true difference in means is greater than 0
## 95 percent confidence interval:
## 0.05045877 Inf
## sample estimates:
## mean of the differences
## 0.05149558
similar to the diffusion connectomes, the functional connectomes again exhibit a higher ipsi-lateral connectivity than contra-lateral connectivity that is significant with \(p < 2.2\times 10^{-16}\). The fit here is much better, likely due to the fact that fMRI connectivity is less structured and more random than diffusion connectivity.
Below, we look at the distribution of our \(p-\)values wehre we estimate one p-value per graph:
fmri.per.p <- sapply(fmri.models, function(model) model$pv[1,2])
fmri.p.dat = data.frame(p=fmri.per.p, dataset = fmri.datasets, modality='fMRI')
fmri.p.dat$dataset = factor(fmri.p.dat$dataset)
ggplot(data=fmri.p.dat, aes(x=dataset, y=p, color=dataset, group=dataset)) +
geom_jitter() +
ggtitle(TeX(sprintf('fMRI Per-subject P-value (1 graph), %.2f percent have $p < .05$', 100*sum(fmri.per.p < .05)/length(fmri.per.p)))) +
coord_trans(y = "log10") +
xlab('Dataset') +
ylab('p-value') +
theme(axis.text.x = element_text(angle=45), legend.position=NaN, panel.background = element_rect(fill = '#ffffff'))
As we can see, with just \(1\) graph, we do not have nearly the confidence that the ipsi-lateral connectivity exceeds the contra-lateral connectivity that we did with the diffusion graphs. With just one graph, we can identify a significant difference in ipsi-lateral vs. contra-lateral connectivity for the functional connectomes in just \(6.4\%\) of the graphs at \(\alpha = .05\).
We can compare the results from the fMRI and DWI simultaneously as appropriately colored density estimates to show the difference in the \(p\)-values:
dual.p.dat = rbind(dwi.p.dat, fmri.p.dat)
vline = data.frame(x=.05, type="sig")
labs = lapply(levels(dual.p.dat$modality), function(mod) {
pmod= dual.p.dat[dual.p.dat$modality == mod, ]$p
TeX(paste(sprintf('%s: %.2f', mod, 100*sum(pmod < .05)/length(pmod)), '% < $\\alpha$', sep=""))
})
dual.p.dat$grouping = paste(dual.p.dat$dataset, dual.p.dat$modality) # for the datasets that are shared
ggplot(data=dual.p.dat, aes(p, group=grouping, color=modality)) +
geom_line(stat="density", size=1, adjust=1.5) +
scale_x_log10(limits=c(.005, 1)) +
geom_vline(data=vline, aes(xintercept = x, linetype=type)) +
scale_color_discrete(name="Modality", breaks=levels(dual.p.dat$modality)) +
scale_linetype_manual(values=c("dashed"), name="Cutoff", breaks=c("sig"), labels=lapply(c("$\\alpha = 0.05$"), TeX)) +
xlab(TeX('$log(p)$')) +
ylab("Density") +
theme(panel.background = element_rect(fill = '#ffffff')) +
ggtitle("Hemispheric Intra-Modality")
Here, we again perform a test on 1 graph, except here the graphs used are the average functional and diffusion connectomes (the megameans). We feed this into a simple t-test with the appropriate assumptions (unequal variance, goal is to test for ipsilateral connectivity exceeding contralateral connectivity). The question here that we seek to answer is, given the average connectome for a particular modality, can we identify a significant difference in ipsi-lateral vs. contra-lateral connectivity?
fmri.agg.mod <- gs.siem.fit(fmri_thresh, Es)
print(fmri.agg.mod$pv[1,2])
## [1] 0.05781516
dwi.agg.mod <- gs.siem.fit(dwi_thresh, Es)
print(dwi.agg.mod$pv[1,2])
## [1] 0.01157091
As we can see above, the diffusion connectome is significant with \(p=.012\), whereas the functional connectome is significant with just \(p=.057\). Note that for this test, we only have one observation of each \(\hat{p}\), so we use a t-test but for the degrees of freedom to \(1\) (since it would otherwise be 0). At \(\alpha=0.5\), only the diffusion megamean connectome shows significance.