Commit 08601c7

Authored by alanbchristie
Merge pull request #8 from InformaticsMatters/7
Pull docker execution changes
2 parents 773ecdd + b2bed34 commit 08601c7

File tree

10 files changed: +1080, -771 lines


.gitignore

Lines changed: 1 addition & 0 deletions
@@ -12,3 +12,4 @@ work
 .nextflow.log*
 /tmp
 **/*.egg-info
+**/.DS_Store

README.md

Lines changed: 65 additions & 11 deletions
@@ -32,6 +32,16 @@ installed as normal:
 released when more invasive tests have been written. In the meantime we
 use this repository as a Git [submodule] in our existing pipelines.
 
+### Redirecting output
+Normally pipeline output files are written to a `tmp` directory inside
+the working copy of the repository you're running in. Alternatively you
+can write test output to your own directory (e.g. `/tmp/blob`) using
+the environment variable `POUT`: -
+
+    $ export POUT=/tmp/blob/
+
+Output files are removed when the test suite starts and when it passes.
+
 ### From within a pipeline repository
 You will find this repository located as a submodule. When checking the
 pipeline out (for example [Pipelines]) you will need to initialise the
@@ -51,25 +61,57 @@ When tests fail it logs as much as it can and continues. When all the tests
 have finished it prints a summary including a list of failed tests along with
 the number of test files and individual tests that were executed: -
 
-    -------
-    Summary
-    -------
-    Test Files   : 20
-    Tests        : 30
-    Tests ignored: 0
-    Tests failed : 0
-    -------
-    Passed: TRUE
-
+    +----------------+
+    : Summary:
+    +----------------+
+    : Test files: 29
+    : Tests found: 39
+    : Tests passed: 20
+    : Tests failed: -
+    : Tests skipped: 19
+    : Tests ignored: 3
+    : Warnings: -
+    +----------------+
+    : Result: SUCCESS
+    +----------------+
+
+Fields in the summary should be self-explanatory but a couple might benefit
+from further explanation: -
+
+- `Tests skipped` are the tests that were found but not executed.
+  Normally these are the tests found during a _Container_ test
+  that can't be run (because the test has no corresponding container image).
+- `Tests ignored` are tests found that are not run because they have
+  been marked for non-execution as the test name begins with `ignore_`.
+
 ### From here
 If you have working copies of all your pipeline repositories checked-out
 in the same directory as this repository you can execute all the tests
 across all the repositories by running the tester from here. Simply
 run the following Gradle command from here: -
 
     $ ./gradlew runPipelineTester
+
+### In Docker
+You can run the pipeline tests in Docker using their expected container
+image (defined in the service descriptor). Doing this gives you added
+confidence that your pipeline will work when deployed.
+
+You can use the docker-specific Gradle task: -
+
+    $ ./gradlew runDockerPipelineTester
 
-### Debugging test failures
+Or, by adding the `-d` or `--indocker` command-line argument into the basic
+task. To pass command-line options through Gradle into the underlying task
+you can also run the Docker tests like this: -
+
+    $ ./gradlew runPipelineTester -Pptargs=-d
+
+> When you run _in docker_ only the tests that can run in Docker (those with
+  a defined image name) will be executed. Tests that cannot be executed in
+  Docker will be _skipped_.
+
+## Debugging test failures
 Ideally your tests will pass. When they don't the test framework prints
 the collected log to the screen as it happens but also keeps all the files
 generated (by all the tests) in case they are of use for diagnostics.
@@ -80,6 +122,11 @@ for every file and test combination. For example, if the test file
 `pbf_ev.test` contains the test `test_pbf_ev_raw` any files it generates
 will be found in the directory `pbf_ev-test_pbf_ev_raw`.
 
+You can redirect the test output to an existing directory of your choice by
+defining the environment variable `POUT`: -
+
+    $ export POUT=/tmp/my-output-dir/
+
 Some important notes: -
 
 - Files generated by the pipelines are removed when the tester is
@@ -102,6 +149,13 @@ in order to create a set of tests for a new pipeline.
 > At the moment the tester only permits one test file per pipeline so all
   tests for a given pipeline need to be composed in one file.
 
+## Testing the pipeline utilities
+The pipeline utilities consist of a number of Python-based modules
+that can be tested using `setup.py`. To test these modules run the
+following from the `src/python` directory: -
+
+    $ python setup.py test
+
 ---
 
 [Conda]: https://conda.io/docs/
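
The `POUT` redirection added to the README can be sketched in a few lines. This is an illustration only (Python; `output_dir` is a hypothetical helper, not part of the tester): the tester is assumed to fall back to the working copy's `tmp` directory when `POUT` is unset.

```python
import os

def output_dir(default="tmp"):
    # Return the test-output directory: the POUT environment variable
    # when set (as the README changes describe), otherwise the default
    # `tmp` directory inside the working copy.
    return os.environ.get("POUT", default)

# With POUT unset, output lands in the repository's `tmp` directory...
os.environ.pop("POUT", None)
print(output_dir())            # -> tmp

# ...and with POUT set, it is redirected to the chosen directory.
os.environ["POUT"] = "/tmp/blob/"
print(output_dir())            # -> /tmp/blob/
```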

build.gradle

Lines changed: 9 additions & 0 deletions
@@ -34,3 +34,12 @@ task runPipelineTester(type: Exec) {
     commandLine 'groovy', 'PipelineTester.groovy', "$args"
 
 }
+
+task runDockerPipelineTester(type: Exec) {
+
+    description 'Runs the PipelineTester Docker tests'
+
+    workingDir 'src/groovy'
+    commandLine 'groovy', 'PipelineTester.groovy', '-indocker'
+
+}

pipeline.test.template

Lines changed: 8 additions & 1 deletion
@@ -60,6 +60,13 @@
 //
 // If a test is not working and you want to keep it and avoid running it
 // then simply prefix the test section name with `ignore_`.
+//
+// The location of any input data you provide in your test is normally
+// located in the project's `data` directory or mounted as `/data`
+// when running in a container. You can safely refer to data in both cases
+// with the PIN environment variable created by the PipelineTester.
+// To allow execution from the command-line and in a container the input
+// data file `blob.dat` should be referred to using `${PIN}blob.dat`.
 
 test_1 = [
 
@@ -71,7 +78,7 @@
     // If you provide a command you **cannot** provide parameters
     // (see the params section below).
 
-    command: ```python my_own_command
+    command: ```python my_own_command -i ${PIN}blob.dat
     --my-own-param-1 32
     --my-own-param-2 18.5```,
 
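The `${PIN}blob.dat` convention described in the template comments works because `PIN`, when defined, carries its own trailing slash. A minimal sketch of that resolution (Python, illustrative only; `resolve` is a hypothetical helper, not part of the tester):

```python
def resolve(reference, env):
    # Expand a `${PIN}file` style reference. Because PIN (when defined)
    # ends in a trailing slash, the same reference works whether the
    # test runs from the command line (PIN unset -> local file) or in
    # a container (PIN=/data/ -> absolute path).
    return reference.replace("${PIN}", env.get("PIN", ""))

print(resolve("${PIN}blob.dat", {}))                 # -> blob.dat
print(resolve("${PIN}blob.dat", {"PIN": "/data/"}))  # -> /data/blob.dat
```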
Lines changed: 79 additions & 0 deletions
@@ -0,0 +1,79 @@
+#!/usr/bin/env groovy
+
+/**
+ * Copyright (c) 2018 Informatics Matters Ltd.
+ *
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+/**
+ * The PipelineTester Container Executor class. The class is responsible for
+ * executing a pipeline command in the supplied Docker container image.
+ */
+class ContainerExecutor {
+
+    /**
+     * Executes the given command in the supplied container image. The data
+     * directory (pin) is mounted as `/data` in the running container
+     * and the designated output directory (pout) is mounted in the container
+     * as `/output`. Two environment variables are defined: PIN and POUT and
+     * are set to `/data` and `/output` respectively. The script is executed in
+     * the output directory in the container.
+     *
+     * @param command The command to run
+     * @param imageName The image to run the command in
+     * @param pin The pipeline input directory
+     *            (used to define the PIN environment variable)
+     * @param pout The pipeline output directory
+     *             (used to define the POUT environment variable)
+     * @param timeoutSeconds The time to allow for the command to execute
+     * @return A list containing the STDOUT and STDERR encapsulated in a
+     *         StringBuilder(), an integer command exit code and a timeout
+     *         boolean (set if the program execution timed out)
+     */
+    static execute(String command, String imageName,
+                   String pin, String pout,
+                   int timeoutSeconds) {
+
+        StringBuilder sout = new StringBuilder()
+        StringBuilder serr = new StringBuilder()
+
+        // Note: PIN and POUT have trailing forward-slashes for now
+        //       to allow a migratory use of ${PIN}file references
+        //       rather than insisting on ${PIN}/file which would fail if
+        //       PIN wasn't defined - it's about lowest risk changes.
+
+        String cmd = "docker run -v $pin:/data -v $pout:/output" +
+                     " -w /output -e PIN=/data/ -e POUT=/output/ $imageName" +
+                     " sh -c '$command'"
+
+        def proc = ['sh', '-c', cmd].execute(null, new File('.'))
+        proc.consumeProcessOutput(sout, serr)
+        proc.waitForOrKill((long)timeoutSeconds * 1000)
+        int exitValue = proc.exitValue()
+
+        // Timeout?
+        //
+        // Some exit codes have a special meaning.
+        //
+        // We can, for example, assume that the process was killed
+        // if the exit code is 143, since 143 (128 + 15) means
+        // the program died with signal 15 (SIGTERM).
+        // See http://www.tldp.org/LDP/abs/html/exitcodes.html.
+        boolean timeout = exitValue == 143 ? true : false
+
+        return [sout, serr, exitValue, timeout]
+
+    }
+
+}
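
For readers outside Groovy, what `ContainerExecutor.execute` assembles can be mirrored in a short Python sketch: the `docker run` string with the `/data` and `/output` mounts, and the exit-code-143 timeout interpretation. Names here (`build_docker_command`, `timed_out`, the example image) are illustrative, not part of the commit.

```python
def build_docker_command(command, image_name, pin, pout):
    # Mirror the Groovy string: mount pin at /data and pout at /output,
    # run in /output, and export PIN/POUT with trailing slashes.
    return (f"docker run -v {pin}:/data -v {pout}:/output"
            f" -w /output -e PIN=/data/ -e POUT=/output/ {image_name}"
            f" sh -c '{command}'")

def timed_out(exit_value):
    # 143 = 128 + 15 (SIGTERM): the process was killed,
    # so the tester treats this exit code as a timeout.
    return exit_value == 143

cmd = build_docker_command("python pipeline.py", "example/image",
                           "/tmp/in", "/tmp/out")
print(cmd)
```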

0 commit comments
