GitHub - ngmaclaren/observe: Tools and analysis for observing the dynamical state of network systems using sentinel nodes.

This repository contains code to select optimized sentinel node sets which can closely approximate the state of all network nodes, as well as simulation and analysis code which produced the findings in our manuscript. See below for an example analysis.

optNS

The optNS package is a small R package which implements our optimization algorithm; the name of the package is simply short for "optimize node sets." The ./optNS/ directory contains the code, which you can build with

R CMD build optNS

The built package is also included as optNS_0.1-1.tar.gz. To install, use

R CMD INSTALL optNS_0.1-1.tar.gz

Please contact me with any questions.

Shell scripts for high-performance computing clusters

Our method should run in reasonable time for small networks on a standard laptop (I used a laptop with four Intel i3-5010U CPUs at 2.10GHz). In order to support larger networks we used a high-performance computing cluster to simulate node states across ranges of control parameters (our "ground truth" simulations) and to optimize node sets. The shell scripts we used to do this are in ./shell/ and will need to be modified for your parallel computing environment.

Dependencies

In addition to the optNS package, we use an in-house differential equation simulation package called sdn. We use the following standard R repository packages:

deSolve
ROI
ROI.plugin.qpoases
igraph
parallel
optparse
latex2exp (for some plots in ./analysis)

Example

What follows is a simple demonstration of our method using the functions in optNS. The code below is also included in the repository as demo.R.

As tested, we rely on the following packages (setting the number of cores, color palette, and random seed for convenience):

library(parallel)
ncores <- detectCores()-1
library(igraph)
library(optNS)
library(sdn)
palette("R4")
set.seed(1234)

First, we choose a dynamics (our standard coupled double-well model) and generate a small network.

dynamics <- "doublewell"

N <- 50
g <- sample_pa(N, m = 2, directed = FALSE, start.graph = make_full_graph(3))

Then, we simulate the equilibrium states of all nodes. We will treat these states as the ground truth in the remaining analysis. We used our sdn package to do the ground truth simulations.

AL <- as_adj_list(g, "all")
times <- 0:15
params <- c(
    get(paste0(".", dynamics)),
    list(AL = AL)
)
control <- list(times = times, ncores = ncores)
model <- get(dynamics)
system.time(
    Y <- solve_in_range(
        params$Ds, "D", model, rep(params$xinit.low, N), params, control, "ode", method = "adams", maxsteps = 20000
    )
)

Simulating the ground truth of large networks can be slow. For this network, simulating the ground truth took 15.840 seconds on my laptop.

To run our optimization algorithm, we need the number of nodes in the node set n, the network g, and the "ground truth" simulation results to use for optimization. The matrix Y is the "ground truth" simulation results for the coupled double-well dynamics on the dolphin network: the rows correspond to control parameter values (here, $D \in [0, 1]$) and the columns to the nodes in the network.

y <- rowMeans(Y) # actual average
n <- floor(log(N)) # a small number in comparison to the size of the network
system.time(
    soln <- select_optimized(n, g, y, Y)
)

To produce one optimized nodeset we call optNS::select_optimized(), which took 2.623 seconds on my laptop. Note that larger networks require far more time and/or computing resources. We run the simulated annealing algorithm for $50N$ iterations, which we found to be sufficient to find a local optimum. Note that we hard-coded this value: if you wish to change it, see ./optNS/R/make-nodesets.R and the function select_optimized().

The output of select_optimized() is a list of vertex indices, approximation error, and degree sequence; optimized node weights are null unless you ask select_optimized() to use the quadratic optimization procedure.

str(soln)

To compute the approximation we select the columns from Y that correspond to the nodes our optimization algorithm returned:

Z <- Y[, soln$vs]
z <- rowMeans(Z)

We can visualize the results as follows:

par(mar = c(5, 5, 1, 1), pty = "s")
matplot(
    params$Ds, Y,
    type = "l", lty = 1, lwd = 0.25, col = "gray50",
    xlab = "D", ylab = expression(italic(x[i])), font.lab = 3,
    cex.lab = 2, cex.axis = 1.5
)
lines(params$Ds, y, lty = 1, lwd = 8, col = 1)
matlines(params$Ds, Z, lty = 1, lwd = 4, col = 2)
lines(params$Ds, z, lty = 1, lwd = 8, col = 3)

To make several node sets, we need a bit more time and a standardized function optNS::make_dataset() which can produce optimized or random node sets, among other options.

ntrials <- 10

system.time(
    opts <- make_dataset(
        ntrials = ntrials, ns.type = "opt", ncores = ncores,
        n = n, g = g, y = y, Y = Y
    )
)

That took 12.754 seconds on my laptop. Making random node sets takes very little time.

rands <- make_dataset(
    ntrials = 10*ntrials, ns.type = "rand", ncores = ncores,
    n = n, g = g, y = y, Y = Y
)

optNS also includes some convenience functions to retrieve information from lists of node sets. We can use those functions to make a plot that summarizes our simulation results:

error <- list(
    opt = get_error(opts),
    rand = get_error(rands)
)

xlim <- range(unlist(error))

dev.new(height = 5)
par(mar = c(5, 9, 1, 1))
plot(
    NULL, xlim = xlim, ylim = c(0.5, 2.5), xlab = "Approximation error", ylab = "", log = "x", yaxt = "n",
    cex.lab = 2, cex.axis = 1.5
)
points(error$opt, jitter(rep(2, length(error$opt)), amount = 0.1), pch = 1, col = 2)
points(error$rand, jitter(rep(1, length(error$rand)), amount = 0.1), pch = 0, col = 1)
axis(2, at = c(2, 1), tick = FALSE, labels = c("Optimized", "Random"), las = 2, cex.axis = 2)

Sentinel node analysis in empirical data

We used functional magnetic resonance imaging (fMRI) data from the Human Connectome Project to investigate the performance of our method in an empirical setting. We used the HCP Young Adult data set, described here. Registration is required to access this data, so we have not included it in our project repositories. We are including the files we used to analyze the data. The file collect-all-fmri.sh and its associated .sh and ../sims/*.R files forms networks, generates simulations, and selects node sets for each participant. Analysis and plotting can be found in ./analysis/analyze-fmri.R and ./analysis/plot-fmri.R, respectively. The data (.rds) files are available on Zenodo.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

optNS

Shell scripts for high-performance computing clusters

Dependencies

Example

Sentinel node analysis in empirical data

About

Uh oh!

Releases 1

Packages

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 6 Commits
analysis		analysis
optNS		optNS
shell		shell
sims		sims
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
demo.R		demo.R
optNS_0.1-1.tar.gz		optNS_0.1-1.tar.gz

License

ngmaclaren/observe

Folders and files

Latest commit

History

Repository files navigation

optNS

Shell scripts for high-performance computing clusters

Dependencies

Example

Sentinel node analysis in empirical data

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases 1

Packages 0

Languages

Packages