Add initial integration test code #3

@@ -1,2 +1,90 @@
# spark-integration
Integration tests for Spark
---
layout: global
title: Spark on Kubernetes Integration Tests
---

# Running the Kubernetes Integration Tests

Note that the integration test framework is currently being heavily revised and
is subject to change.

Note that currently the integration tests only run with Java 8.

Running the integration tests requires a Spark distribution package tarball that
contains Spark jars, submission clients, etc. You can download a tarball from
http://spark.apache.org/downloads.html. Alternatively, you can create a distribution
from source using `make-distribution.sh`. For example:

```
$ git clone git@github.com:apache/spark.git
$ cd spark
$ ./dev/make-distribution.sh --tgz \
  -Phadoop-2.7 -Pkubernetes -Pkinesis-asl -Phive -Phive-thriftserver
```

The above command creates a tarball like `spark-2.3.0-SNAPSHOT-bin.tgz` in the
top-level directory. For more details, see the related section in
[building-spark.md](https://github.com/apache/spark/blob/master/docs/building-spark.md#building-a-runnable-distribution).
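
If you use a downloaded release instead of building one, any distribution tarball that includes Kubernetes support works; the command below is only an illustrative sketch (replace `X.Y.Z` with a real release version):

```
$ wget https://archive.apache.org/dist/spark/spark-X.Y.Z/spark-X.Y.Z-bin-hadoop2.7.tgz
```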

The integration tests also need a local path to the directory that
contains `Dockerfile`s. In the main Spark repo, the path is
`/spark/resource-managers/kubernetes/docker/src/main/dockerfiles`.
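
For example, assuming the Spark repo was cloned into `./spark` as shown above, you can confirm the directory is present with:

```
$ ls spark/resource-managers/kubernetes/docker/src/main/dockerfiles
```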

Once you have prepared the inputs, the integration tests can be executed with Maven or
your IDE. Note that when running tests from an IDE, the `pre-integration-test`
phase must be run every time the Spark main code changes. When running tests
from the command line, the `pre-integration-test` phase is automatically
invoked when the `integration-test` phase is run.
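
When working from an IDE, one way to re-run just that setup after changing the Spark main code is to invoke the phase directly from the command line. This is only a sketch; the property values are the same placeholders used in the command below:

```
$ mvn pre-integration-test \
  -Dspark-distro-tgz=spark/spark-2.3.0-SNAPSHOT-bin.tgz \
  -Dspark-dockerfiles-dir=spark/resource-managers/kubernetes/docker/src/main/dockerfiles
```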

With Maven, the integration tests can be run using the following command:

```
$ mvn clean integration-test \
  -Dspark-distro-tgz=spark/spark-2.3.0-SNAPSHOT-bin.tgz \
  -Dspark-dockerfiles-dir=spark/resource-managers/kubernetes/docker/src/main/dockerfiles
```

# Running against an arbitrary cluster

In order to run against any cluster, use the following:
```sh
$ mvn clean integration-test \
  -Dspark-distro-tgz=spark/spark-2.3.0-SNAPSHOT-bin.tgz \
  -Dspark-dockerfiles-dir=spark/resource-managers/kubernetes/docker/src/main/dockerfiles \
  -DextraScalaTestArgs="-Dspark.kubernetes.test.master=k8s://https://<master> -Dspark.docker.test.driverImage=<driver-image> -Dspark.docker.test.executorImage=<executor-image>"
```
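
For illustration, a hypothetical invocation with the placeholders filled in might look like the following (the API server address and image names are made up):

```
$ mvn clean integration-test \
  -Dspark-distro-tgz=spark/spark-2.3.0-SNAPSHOT-bin.tgz \
  -Dspark-dockerfiles-dir=spark/resource-managers/kubernetes/docker/src/main/dockerfiles \
  -DextraScalaTestArgs="-Dspark.kubernetes.test.master=k8s://https://192.168.99.100:8443 -Dspark.docker.test.driverImage=myrepo/spark-driver:latest -Dspark.docker.test.executorImage=myrepo/spark-executor:latest"
```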

# Preserve the Minikube VM

The integration tests make use of
[Minikube](https://github.com/kubernetes/minikube), which fires up a virtual
machine and sets up a single-node Kubernetes cluster within it. By default the VM
is destroyed after the tests are finished. If you want to preserve the VM, e.g.
to reduce the running time of tests during development, you can pass the
property `spark.docker.test.persistMinikube` to the test process:

```
$ mvn clean integration-test \
  -Dspark-distro-tgz=spark/spark-2.3.0-SNAPSHOT-bin.tgz \
  -Dspark-dockerfiles-dir=spark/resource-managers/kubernetes/docker/src/main/dockerfiles \
  -DextraScalaTestArgs=-Dspark.docker.test.persistMinikube=true
```
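
With the VM preserved, you can point a local Docker client at Minikube's Docker daemon to inspect the images the tests built. These are standard Minikube commands, shown only as an illustration:

```
$ minikube status
$ eval $(minikube docker-env)
$ docker images
```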

# Reuse the previous Docker images

The integration tests build a number of Docker images, which takes some time.
By default, the images are built every time the tests run. You may want to skip
re-building those images during development if the distribution package did not
change since the last run. To do so, pass the property
`spark.docker.test.skipBuildImages` to the test process. This works only if
you also set the property `spark.docker.test.persistMinikube` in the previous
run, since the Docker daemon runs inside the Minikube environment. Here
is an example:

```
$ mvn clean integration-test \
  -Dspark-distro-tgz=spark/spark-2.3.0-SNAPSHOT-bin.tgz \
  -Dspark-dockerfiles-dir=spark/resource-managers/kubernetes/docker/src/main/dockerfiles \
  "-DextraScalaTestArgs=-Dspark.docker.test.persistMinikube=true -Dspark.docker.test.skipBuildImages=true"
```

@@ -0,0 +1,250 @@
<?xml version="1.0" encoding="UTF-8"?>
<!--
  ~ Licensed to the Apache Software Foundation (ASF) under one or more
  ~ contributor license agreements. See the NOTICE file distributed with
  ~ this work for additional information regarding copyright ownership.
  ~ The ASF licenses this file to You under the Apache License, Version 2.0
  ~ (the "License"); you may not use this file except in compliance with
  ~ the License. You may obtain a copy of the License at
  ~
  ~    http://www.apache.org/licenses/LICENSE-2.0
  ~
  ~ Unless required by applicable law or agreed to in writing, software
  ~ distributed under the License is distributed on an "AS IS" BASIS,
  ~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  ~ See the License for the specific language governing permissions and
  ~ limitations under the License.
-->
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>

  <artifactId>spark-kubernetes-integration-tests_2.11</artifactId>
  <groupId>spark-kubernetes-integration-tests</groupId>
  <version>0.1-SNAPSHOT</version>
  <properties>
    <commons-lang3.version>3.5</commons-lang3.version>
    <commons-logging.version>1.1.1</commons-logging.version>
    <docker-client.version>5.0.2</docker-client.version>
    <download-maven-plugin.version>1.3.0</download-maven-plugin.version>
    <exec-maven-plugin.version>1.4.0</exec-maven-plugin.version>
    <extraScalaTestArgs></extraScalaTestArgs>
    <guava.version>18.0</guava.version>
    <jsr305.version>1.3.9</jsr305.version>
    <kubernetes-client.version>3.0.0</kubernetes-client.version>
    <log4j.version>1.2.17</log4j.version>
    <scala.version>2.11.8</scala.version>
    <scala.binary.version>2.11</scala.binary.version>
    <scala-maven-plugin.version>3.2.2</scala-maven-plugin.version>
    <scalatest.version>2.2.6</scalatest.version>
    <scalatest-maven-plugin.version>1.0</scalatest-maven-plugin.version>
    <slf4j-log4j12.version>1.7.24</slf4j-log4j12.version>
    <sbt.project.name>kubernetes-integration-tests</sbt.project.name>
    <spark-distro-tgz>YOUR-SPARK-DISTRO-TARBALL-HERE</spark-distro-tgz>
    <spark-dockerfiles-dir>YOUR-DOCKERFILES-DIR-HERE</spark-dockerfiles-dir>
    <test.exclude.tags></test.exclude.tags>
  </properties>
  <packaging>jar</packaging>
  <name>Spark Project Kubernetes Integration Tests</name>

  <dependencies>
    <dependency>
      <groupId>commons-logging</groupId>
      <artifactId>commons-logging</artifactId>
      <version>${commons-logging.version}</version>
    </dependency>
    <dependency>
      <groupId>com.google.code.findbugs</groupId>
      <artifactId>jsr305</artifactId>
      <version>${jsr305.version}</version>
    </dependency>
    <dependency>
      <groupId>com.google.guava</groupId>
      <artifactId>guava</artifactId>
      <scope>test</scope>
      <!-- For compatibility with Docker client. Should be fine since this is just for tests. -->
      <version>${guava.version}</version>
    </dependency>
    <dependency>
      <groupId>com.spotify</groupId>
      <artifactId>docker-client</artifactId>
      <version>${docker-client.version}</version>
      <scope>test</scope>
    </dependency>
    <dependency>
      <groupId>io.fabric8</groupId>
      <artifactId>kubernetes-client</artifactId>
      <version>${kubernetes-client.version}</version>
    </dependency>
    <dependency>
      <groupId>log4j</groupId>
      <artifactId>log4j</artifactId>
      <version>${log4j.version}</version>
    </dependency>
    <dependency>
      <groupId>org.apache.commons</groupId>
      <artifactId>commons-lang3</artifactId>
      <version>${commons-lang3.version}</version>
    </dependency>
    <dependency>
      <groupId>org.scala-lang</groupId>
      <artifactId>scala-library</artifactId>
      <version>${scala.version}</version>
    </dependency>
    <dependency>
      <groupId>org.scalatest</groupId>
      <artifactId>scalatest_${scala.binary.version}</artifactId>
      <version>${scalatest.version}</version>
      <scope>test</scope>
    </dependency>
    <dependency>
      <groupId>org.slf4j</groupId>
      <artifactId>slf4j-log4j12</artifactId>
      <version>${slf4j-log4j12.version}</version>
      <scope>test</scope>
    </dependency>
  </dependencies>

  <build>
    <plugins>
      <plugin>
        <groupId>net.alchim31.maven</groupId>
        <artifactId>scala-maven-plugin</artifactId>
        <version>${scala-maven-plugin.version}</version>
        <executions>
          <execution>
            <goals>
              <goal>compile</goal>
              <goal>testCompile</goal>
            </goals>
          </execution>
        </executions>
      </plugin>
      <plugin>
        <groupId>org.codehaus.mojo</groupId>
        <artifactId>exec-maven-plugin</artifactId>
        <version>${exec-maven-plugin.version}</version>
        <executions>
          <execution>
            <id>unpack-spark-distro</id>
            <phase>pre-integration-test</phase>
            <goals>
              <goal>exec</goal>
            </goals>
            <configuration>
              <workingDirectory>${project.build.directory}</workingDirectory>
              <executable>/bin/sh</executable>
              <arguments>
                <argument>-c</argument>
                <argument>rm -rf spark-distro; mkdir spark-distro-tmp; cd spark-distro-tmp; tar xfz ${spark-distro-tgz}; mv * ../spark-distro; cd ..; rm -rf spark-distro-tmp</argument>
              </arguments>
            </configuration>
          </execution>
          <execution>
            <!-- TODO: Remove this hack once the upstream is fixed -->
            <id>copy-dockerfiles-if-missing</id>
            <phase>pre-integration-test</phase>
            <goals>
              <goal>exec</goal>
            </goals>
            <configuration>
              <workingDirectory>${project.build.directory}/spark-distro</workingDirectory>
              <executable>/bin/sh</executable>
              <arguments>
                <argument>-c</argument>
                <argument>test -d dockerfiles || cp -pr ${spark-dockerfiles-dir} dockerfiles</argument>
              </arguments>
            </configuration>
          </execution>
          <execution>
            <!-- TODO: Remove this hack once upstream is fixed by SPARK-22777 -->
            <id>set-exec-bit-on-docker-entrypoint-sh</id>
Review comments on this line:
- Just set it in the
- I think that's better to be done by the upstream code. It's hard for this integration code to surgically do an in-place edit of the Dockerfile.
- +1
- The PR has merged now.
            <phase>pre-integration-test</phase>
            <goals>
              <goal>exec</goal>
            </goals>
            <configuration>
              <workingDirectory>${project.build.directory}/spark-distro/dockerfiles</workingDirectory>
              <executable>/bin/chmod</executable>
              <arguments>
                <argument>+x</argument>
                <argument>spark-base/entrypoint.sh</argument>
              </arguments>
            </configuration>
          </execution>
        </executions>
      </plugin>
      <plugin>
        <groupId>com.googlecode.maven-download-plugin</groupId>
        <artifactId>download-maven-plugin</artifactId>
        <version>${download-maven-plugin.version}</version>
        <executions>
          <execution>
            <id>download-minikube-linux</id>
            <phase>pre-integration-test</phase>
            <goals>
              <goal>wget</goal>
            </goals>
            <configuration>
              <url>https://storage.googleapis.com/minikube/releases/v0.22.0/minikube-linux-amd64</url>
              <outputDirectory>${project.build.directory}/minikube-bin/linux-amd64</outputDirectory>
              <outputFileName>minikube</outputFileName>
            </configuration>
          </execution>
          <execution>
            <id>download-minikube-darwin</id>
            <phase>pre-integration-test</phase>
            <goals>
              <goal>wget</goal>
            </goals>
            <configuration>
              <url>https://storage.googleapis.com/minikube/releases/v0.22.0/minikube-darwin-amd64</url>
              <outputDirectory>${project.build.directory}/minikube-bin/darwin-amd64</outputDirectory>
              <outputFileName>minikube</outputFileName>
            </configuration>
          </execution>
        </executions>
      </plugin>
      <plugin>
        <!-- Triggers scalatest plugin in the integration-test phase instead of
             the test phase. -->
        <groupId>org.scalatest</groupId>
        <artifactId>scalatest-maven-plugin</artifactId>
        <version>${scalatest-maven-plugin.version}</version>
        <configuration>
          <reportsDirectory>${project.build.directory}/surefire-reports</reportsDirectory>
          <junitxml>.</junitxml>
          <filereports>SparkTestSuite.txt</filereports>
          <argLine>-ea -Xmx3g -XX:ReservedCodeCacheSize=512m ${extraScalaTestArgs}</argLine>
          <stderr/>
          <systemProperties>
            <log4j.configuration>file:src/test/resources/log4j.properties</log4j.configuration>
            <java.awt.headless>true</java.awt.headless>
          </systemProperties>
          <tagsToExclude>${test.exclude.tags}</tagsToExclude>
        </configuration>
        <executions>
          <execution>
            <id>test</id>
            <goals>
              <goal>test</goal>
            </goals>
            <configuration>
              <!-- The negative pattern below prevents integration tests such as
                   KubernetesSuite from running in the test phase. -->
              <suffixes>(?<!Suite)</suffixes>
            </configuration>
          </execution>
          <execution>
            <id>integration-test</id>
            <phase>integration-test</phase>
            <goals>
              <goal>test</goal>
            </goals>
          </execution>
        </executions>
      </plugin>
    </plugins>

  </build>

</project>
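
For reference, the `pre-integration-test` executions configured above amount to roughly the following shell steps. This is only a sketch of what the build does (the tarball and Dockerfiles paths are placeholders), not an alternative entry point:

```
# Run from the project's target/ directory; paths below are placeholders.
rm -rf spark-distro
mkdir spark-distro-tmp && cd spark-distro-tmp
tar xfz /path/to/spark-distro.tgz           # unpack the Spark distribution tarball
mv * ../spark-distro && cd .. && rm -rf spark-distro-tmp
# Copy the Dockerfiles into the distro if it does not already ship them:
test -d spark-distro/dockerfiles || cp -pr /path/to/dockerfiles spark-distro/dockerfiles
# Make the base image entrypoint executable (workaround until SPARK-22777):
chmod +x spark-distro/dockerfiles/spark-base/entrypoint.sh
```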

@@ -0,0 +1,31 @@
#
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
#    http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#

# Set everything to be logged to the file target/integration-tests.log
log4j.rootCategory=INFO, file
log4j.appender.file=org.apache.log4j.FileAppender
log4j.appender.file.append=true
log4j.appender.file.file=target/integration-tests.log
log4j.appender.file.layout=org.apache.log4j.PatternLayout
log4j.appender.file.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss.SSS} %t %p %c{1}: %m%n

# Ignore messages below warning level from a few verbose libraries.
log4j.logger.com.sun.jersey=WARN
log4j.logger.org.apache.hadoop=WARN
log4j.logger.org.eclipse.jetty=WARN
log4j.logger.org.mortbay=WARN
log4j.logger.org.spark_project.jetty=WARN

Review comments:
- Can we change this default behavior now? I think we wanted to remove the minikube lifecycle management from these tests. cc/ @mccheah
- We can. But existing Jenkins jobs require minikube to be cleaned up when they are done, so they need to set this flag to true. I am not sure it's worth the effort now, given that we are going to incorporate apache-spark-on-k8s/spark#521 in the near future.
- Fair enough. Thanks for clarifying.