sneumann edited this page Oct 21, 2016 · 10 revisions

This page collects updates on current xcms development.

Parallel processing in XCMS

XCMS has supported parallel processing since 2008 in several functions that promise a linear speed-up when run in parallel on multiple input files, e.g. findPeaks() as used in the xcmsSet() function. The parallelism was controlled by the nSlaves argument.

Several mechanisms were supported. The first one, the Message Passing Interface (MPI), is the most powerful, as it is the standard on big HPC cluster systems. It also integrates with the Sun Grid Engine cluster queueing framework, and can "glue" your cluster into a seemingly single big machine. At one stage, I was able to do a peak picking with nSlaves=100 :-) Later, other backend packages like SNOW and parallel were added as well. The backends were tried in a fixed order until one was found to be installed, which was not very flexible.

In 2012 Martin Morgan started the BiocParallel package to provide a common interface to a number of different approaches for (massively) parallel execution. In the current xcms3 development efforts, Johannes Rainer has now changed the xcms parallel execution to use the new interface. The benefit is that you now have much more control over the parallel processing in XCMS.
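To illustrate what a common interface means in practice, here is a minimal sketch using BiocParallel on its own, without xcms. The function process_file and the file names are hypothetical stand-ins for per-file work such as peak detection. Only the backend object changes between the runs, while the call itself stays the same.

```r
library(BiocParallel)

# Hypothetical stand-in for per-file work (e.g. peak detection on one file).
process_file <- function(f) {
  Sys.sleep(0.1)   # pretend to do something expensive
  nchar(f)         # return a small per-file result
}

# Hypothetical file names, used only as input values here.
files <- c("a.mzML", "bb.mzML", "ccc.mzML", "dddd.mzML")

# Run serially, useful for debugging.
res_serial <- bplapply(files, process_file, BPPARAM = SerialParam())

# Run on two socket-based worker processes (available on all platforms).
res_snow <- bplapply(files, process_file, BPPARAM = SnowParam(workers = 2))

# Alternatively, register a default backend once; later calls that do not
# pass BPPARAM explicitly pick it up via bpparam().
register(SnowParam(workers = 2))
res_default <- bplapply(files, process_file)
```

On Linux and macOS, MulticoreParam(workers = 2) could be used in place of SnowParam to fork the worker processes instead of starting new R sessions.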

Examples

some_code_snippets
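As a hedged sketch of what such a snippet could look like, the following passes a BiocParallel backend to xcmsSet() through its BPPARAM argument. It assumes an xcms version that already has the BPPARAM argument, and it assumes the faahKO data package is installed to provide example CDF files. Neither the data set nor the worker count comes from this page.

```r
library(xcms)
library(BiocParallel)

# Assumption: example CDF files from the faahKO data package.
cdfpath <- system.file("cdf", package = "faahKO")
files <- list.files(cdfpath, recursive = TRUE, full.names = TRUE)[1:2]

# Old style, controlled by the number of workers only:
#   xset <- xcmsSet(files, nSlaves = 2)

# New style: choose the backend explicitly, one worker per file here.
xset <- xcmsSet(files, BPPARAM = SnowParam(workers = 2))

# For debugging, the same call can be run serially:
#   xset <- xcmsSet(files, BPPARAM = SerialParam())
```

If BPPARAM is left out, the backend registered with BiocParallel (see bpparam()) should be used, so a single register() call at the top of a script can control all later calls.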

We will deprecate the nSlaves argument in April 2017 and remove it in October 2017.
