Use a pre-installed Minikube instance -- porting over logic from PR 521 #14

Merged
merged 14 commits into from
Jan 12, 2018
Changes from 8 commits
34 changes: 11 additions & 23 deletions README.md
@@ -10,13 +10,17 @@ is subject to change.

Note that currently the integration tests only run with Java 8.

Integration tests first require installing [Minikube](https://kubernetes.io/docs/getting-started-guides/minikube/) on
your machine, and for the `minikube` binary to be on your `PATH`. Refer to the Minikube documentation for instructions
on how to install it. It is recommended to allocate at least 8 CPUs and 8GB of memory to the Minikube cluster.

Running the integration tests requires a Spark distribution package tarball that
contains Spark jars, submission clients, etc. You can download a tarball from
http://spark.apache.org/downloads.html. Or, you can create a distribution from
source code using `make-distribution.sh`. For example:

```
$ git clone git@github.com:apache/spark.git
$ https://github.com/apache/spark.git
Member

What does this line mean only with a URL?

$ cd spark
$ ./dev/make-distribution.sh --tgz \
-Phadoop-2.7 -Pkubernetes -Pkinesis-asl -Phive -Phive-thriftserver
@@ -46,37 +50,21 @@ In order to run against any cluster, use the following:
```sh
$ mvn clean integration-test \
-Dspark-distro-tgz=spark/spark-2.3.0-SNAPSHOT-bin.tgz \
-DextraScalaTestArgs="-Dspark.kubernetes.test.master=k8s://https://<master> -Dspark.docker.test.driverImage=<driver-image> -Dspark.docker.test.executorImage=<executor-image>"
```

# Preserve the Minikube VM

The integration tests make use of
[Minikube](https://github.com/kubernetes/minikube), which fires up a virtual
machine and sets up a single-node Kubernetes cluster within it. By default the VM
is destroyed after the tests are finished. If you want to preserve the VM, e.g.
to reduce the running time of tests during development, you can pass the
property `spark.docker.test.persistMinikube` to the test process:

```
$ mvn clean integration-test \
-Dspark-distro-tgz=spark/spark-2.3.0-SNAPSHOT-bin.tgz \
-DextraScalaTestArgs=-Dspark.docker.test.persistMinikube=true
-DextraScalaTestArgs="-Dspark.kubernetes.test.master=k8s://https://<master>
Member

Is the master parameter now required? If so, explain how people can get it from a minikube?

Contributor

kubectl cluster-info is already discussed above.

Contributor

Master parameter shouldn't be required.

Contributor

...But this section is under the "running against an arbitrary cluster" section, so in this context the parameter would be required.

```

# Reuse the previous Docker images
# Specify existing docker images via image:tag

The integration tests build a number of Docker images, which takes some time.
By default, the images are built every time the tests run. You may want to skip
re-building those images during development, if the distribution package did not
change since the last run. You can pass the property
`spark.docker.test.skipBuildImages` to the test process. This will work only if
you have been setting the property `spark.docker.test.persistMinikube`, in the
previous run since the docker daemon run inside the minikube environment. Here
is an example:
`spark.kubernetes.test.imageDockerTag` to the test process and specify the Docker
image tag that is appropriate.
Here is an example:

```
$ mvn clean integration-test \
-Dspark-distro-tgz=spark/spark-2.3.0-SNAPSHOT-bin.tgz \
"-DextraScalaTestArgs=-Dspark.docker.test.persistMinikube=true -Dspark.docker.test.skipBuildImages=true"
-Dspark.kubernetes.test.imageDockerTag=latest
```
32 changes: 0 additions & 32 deletions integration-test/pom.xml
@@ -40,7 +40,6 @@
<slf4j-log4j12.version>1.7.24</slf4j-log4j12.version>
<sbt.project.name>kubernetes-integration-tests</sbt.project.name>
<spark-distro-tgz>YOUR-SPARK-DISTRO-TARBALL-HERE</spark-distro-tgz>
<spark-dockerfiles-dir>YOUR-DOCKERFILES-DIR-HERE</spark-dockerfiles-dir>
<test.exclude.tags></test.exclude.tags>
</properties>
<packaging>jar</packaging>
@@ -141,37 +140,6 @@
</execution>
</executions>
</plugin>
<plugin>
<groupId>com.googlecode.maven-download-plugin</groupId>
<artifactId>download-maven-plugin</artifactId>
<version>${download-maven-plugin.version}</version>
<executions>
<execution>
<id>download-minikube-linux</id>
<phase>pre-integration-test</phase>
<goals>
<goal>wget</goal>
</goals>
<configuration>
<url>https://storage.googleapis.com/minikube/releases/v0.22.0/minikube-linux-amd64</url>
<outputDirectory>${project.build.directory}/minikube-bin/linux-amd64</outputDirectory>
<outputFileName>minikube</outputFileName>
</configuration>
</execution>
<execution>
<id>download-minikube-darwin</id>
<phase>pre-integration-test</phase>
<goals>
<goal>wget</goal>
</goals>
<configuration>
<url>https://storage.googleapis.com/minikube/releases/v0.22.0/minikube-darwin-amd64</url>
<outputDirectory>${project.build.directory}/minikube-bin/darwin-amd64</outputDirectory>
<outputFileName>minikube</outputFileName>
</configuration>
</execution>
</executions>
</plugin>
<plugin>
<!-- Triggers scalatest plugin in the integration-test phase instead of
the test phase. -->
@@ -30,7 +30,9 @@ import org.scalatest.concurrent.{Eventually, PatienceConfiguration}
import org.scalatest.time.{Minutes, Seconds, Span}

import org.apache.spark.deploy.k8s.integrationtest.backend.IntegrationTestBackendFactory
import org.apache.spark.deploy.k8s.integrationtest.constants.SPARK_DISTRO_PATH
import org.apache.spark.deploy.k8s.integrationtest.backend.minikube.MinikubeTestBackend
import org.apache.spark.deploy.k8s.integrationtest.constants._
import org.apache.spark.deploy.k8s.integrationtest.config._

private[spark] class KubernetesSuite extends FunSuite with BeforeAndAfterAll with BeforeAndAfter {

@@ -52,6 +54,9 @@ private[spark] class KubernetesSuite extends FunSuite with BeforeAndAfterAll wit
before {
sparkAppConf = kubernetesTestComponents.newSparkAppConf()
.set("spark.kubernetes.driver.label.spark-app-locator", APP_LOCATOR_LABEL)
.set(DRIVER_DOCKER_IMAGE, tagImage("spark-driver"))
.set(EXECUTOR_DOCKER_IMAGE, tagImage("spark-executor"))
.set(INIT_CONTAINER_DOCKER_IMAGE, tagImage("spark-init"))
kubernetesTestComponents.createNamespace()
}

@@ -60,21 +65,25 @@ private[spark] class KubernetesSuite extends FunSuite with BeforeAndAfterAll wit
}

test("Run SparkPi with no resources") {
doMinikubeCheck
runSparkPiAndVerifyCompletion()
}

test("Run SparkPi with a very long application name.") {
doMinikubeCheck
sparkAppConf.set("spark.app.name", "long" * 40)
runSparkPiAndVerifyCompletion()
}

test("Run SparkPi with a master URL without a scheme.") {
doMinikubeCheck
val url = kubernetesTestComponents.kubernetesClient.getMasterUrl
sparkAppConf.set("spark.master", s"k8s://${url.getHost}:${url.getPort}")
runSparkPiAndVerifyCompletion()
}

test("Run SparkPi with custom driver pod name, labels, annotations, and environment variables.") {
doMinikubeCheck
sparkAppConf
.set("spark.kubernetes.driver.pod.name", "spark-integration-spark-pi")
.set("spark.kubernetes.driver.label.label1", "label1-value")
@@ -143,6 +152,10 @@ private[spark] class KubernetesSuite extends FunSuite with BeforeAndAfterAll wit
}
}
}
private def doMinikubeCheck(): Unit = {
assume(testBackend == MinikubeTestBackend)
Member

Hmm. I remember @foxish just deleted this line recently so integration tests can run against GCE. Can you check with @foxish?

Contributor @mccheah, Jan 8, 2018

(Comment was for the wrong section)

}
private def tagImage(image: String): String = s"$image:${testBackend.dockerImageTag()}"

private def doBasicDriverPodCheck(driverPod: Pod): Unit = {
assert(driverPod.getMetadata.getLabels.get("spark-role") === "driver")
@@ -19,13 +19,41 @@ package org.apache.spark.deploy.k8s.integrationtest
import java.io.Closeable
import java.net.URI

import java.io.{IOException,InputStream,OutputStream}

object Utils extends Logging {

  def tryWithResource[R <: Closeable, T](createResource: => R)(f: R => T): T = {
    val resource = createResource
    try f.apply(resource) finally resource.close()
  }

  def tryWithSafeFinally[T](block: => T)(finallyBlock: => Unit): T = {
    var originalThrowable: Throwable = null
    try {
      block
    } catch {
      case t: Throwable =>
        // Purposefully not using NonFatal, because even fatal exceptions
        // we don't want to have our finallyBlock suppress
        originalThrowable = t
        throw originalThrowable
    } finally {
      try {
        finallyBlock
      } catch {
        case t: Throwable =>
          if (originalThrowable != null) {
            originalThrowable.addSuppressed(t)
            logWarning(s"Suppressing exception in finally: " + t.getMessage, t)
            throw originalThrowable
          } else {
            throw t
          }
      }
    }
  }
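
To illustrate what `tryWithSafeFinally` buys over a plain `try`/`finally`, here is a minimal, self-contained sketch. It re-implements the helper without logging so that it runs on its own; `SafeFinallySketch` and `demo` are hypothetical names used only for this illustration and are not part of the PR.

```scala
object SafeFinallySketch {
  // Simplified copy of the helper, without logging, so the sketch is standalone.
  def tryWithSafeFinally[T](block: => T)(finallyBlock: => Unit): T = {
    var original: Throwable = null
    try {
      block
    } catch {
      case t: Throwable =>
        original = t
        throw t
    } finally {
      try {
        finallyBlock
      } catch {
        // Only intercept when the body already failed; otherwise the
        // exception from the finally block propagates unchanged.
        case t: Throwable if original != null =>
          original.addSuppressed(t)
          throw original
      }
    }
  }

  // Both the body and the cleanup throw; returns the surfaced message
  // together with the messages of any suppressed exceptions.
  def demo(): (String, Seq[String]) =
    try {
      tryWithSafeFinally[Unit] {
        throw new IllegalStateException("body failed")
      } {
        throw new java.io.IOException("close failed")
      }
      ("no exception", Seq.empty)
    } catch {
      case e: IllegalStateException =>
        (e.getMessage, e.getSuppressed.toSeq.map(_.getMessage))
    }
}
```

With a plain `try`/`finally`, the exception thrown by the cleanup would replace the one thrown by the body; here the body's exception is the one that surfaces and the cleanup failure is attached to it as a suppressed exception.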

def checkAndGetK8sMasterUrl(rawMasterURL: String): String = {
require(rawMasterURL.startsWith("k8s://"),
"Kubernetes master URL must start with k8s://.")
@@ -57,4 +85,30 @@ object Utils extends Logging {

s"k8s://$resolvedURL"
}

class RedirectThread(
in: InputStream,
out: OutputStream,
name: String,
propagateEof: Boolean = false) extends Thread(name) {
setDaemon(true)
override def run() {
scala.util.control.Exception.ignoring(classOf[IOException]) {
// FIXME: We copy the stream on the level of bytes to avoid encoding problems.
Utils.tryWithSafeFinally {
val buf = new Array[Byte](1024)
Member

Should this line 99-105 go into its own subroutine?

Member

As an alternative, do we have Guava available? Could we just use ByteStreams.copy() instead of the entire body here?

Member Author

These Utils are taken from spark core.... should I modify them?

Member

I'd argue that the cleanest modification (using Guava) is something we should do. There's no reason to replicate this code from the Spark core.

Contributor

I don't see this class being used anywhere anymore actually, so we can remove this.

var len = in.read(buf)
while (len != -1) {
Member

This input reading loop is typically done with a break command inside an infinite loop. (So there aren't two reads in the code)

Not that it's too important to fix this, but have you considered using an approach like this so it can use breaks?

https://alvinalexander.com/scala/break-continue-for-while-loops-in-scala-examples-how-to

Member Author

Ditto to above ^^

out.write(buf, 0, len)
out.flush()
len = in.read(buf)
}
} {
if (propagateEof) {
out.close()
}
}
}
}
}
}
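
The review comments on `RedirectThread` raise two points: the copy loop calls `read` in two places, and a library routine such as Guava's `ByteStreams.copy` could replace it. As a rough sketch of the first point using only the standard library, the loop can be written with a single `read` call. `StreamCopySketch` and `copy` are hypothetical names for illustration, not code from this PR.

```scala
import java.io.{InputStream, OutputStream}

object StreamCopySketch {
  // Copies all bytes from `in` to `out` and returns the number of bytes copied.
  // The single `in.read(buf)` expression is evaluated once per iteration.
  def copy(in: InputStream, out: OutputStream, bufSize: Int = 1024): Long = {
    val buf = new Array[Byte](bufSize)
    Iterator
      .continually(in.read(buf))
      .takeWhile(_ != -1) // -1 signals end of stream
      .map { len =>
        out.write(buf, 0, len)
        out.flush()
        len.toLong
      }
      .sum
  }
}
```

Copying at the byte level, as the original loop does, avoids any character-encoding concerns. Whether the helper is needed at all is a separate question, since the last comment in the thread notes the class may no longer be used.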
@@ -20,7 +20,7 @@ import io.fabric8.kubernetes.client.{ConfigBuilder, DefaultKubernetesClient}

import org.apache.spark.deploy.k8s.integrationtest.Utils
import org.apache.spark.deploy.k8s.integrationtest.backend.IntegrationTestBackend
import org.apache.spark.deploy.k8s.integrationtest.constants.GCE_TEST_BACKEND
import org.apache.spark.deploy.k8s.integrationtest.config._

private[spark] class GCETestBackend(val master: String) extends IntegrationTestBackend {
private var defaultClient: DefaultKubernetesClient = _
@@ -37,5 +37,7 @@ private[spark] class GCETestBackend(val master: String) extends IntegrationTestB
defaultClient
}

override def name(): String = GCE_TEST_BACKEND
override def dockerImageTag(): String = {
return System.getProperty(KUBERNETES_TEST_DOCKER_TAG_SYSTEM_PROPERTY, "latest")
Member

Why not generate a random ID like minikube backend code does? i.e. UUID.randomUUID().toString.replaceAll("-", "")

Contributor

In the Minikube case we're building these images from scratch. In the GCE case, we don't create a Docker manager and hence are not building the images there. But this in itself seems to contradict this section of our readme:

If you're using a non-local cluster, you must provide an image repository which you have write access to, using the -i option, in order to store docker images generated during the test.

which indicates that GCE-backed tests should be building images as well. Is this correct @foxish?

Member

That readme section is meant to highlight that we push the images to an image repository only in the cloud testing case, and don't have to in the minikube case since the images are built in the minikube VM's docker environment. That documentation pertains only to the use of the script, which avoids using maven for building images.

Contributor

The problem then with using a random ID tag here is that it's impossible for this tag to actually match anything. Using "latest" at least guarantees that we pick up some image in the default case.

We can be more strict here and require the tag be explicitly specified.

Contributor

Looking a little closer I think the miscommunication is because the docker image manager isn't serving the image tag but is instead being handed the tag by the test backend. The responsibilities thus aren't clear and the coupling of the provision of a custom tag vs. a generated tag, and how that influences whether or not images are built or deleted, is unclear.

I'm moving the generation of the tag vs. using the user-provided one into the docker manager. This should hopefully clarify the connection.

}
}
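
The thread on `dockerImageTag` ends with the idea of letting a single component choose between a user-provided tag and a generated one. A minimal sketch of that decision might look like the following. `ImageTagSketch`, `ResolvedTag`, and `resolve` are hypothetical names, and passing properties as a `Map` is an assumption made to keep the sketch testable; only the property name comes from the README hunk in this PR.

```scala
import java.util.UUID

object ImageTagSketch {
  // Property name as it appears in the README changes of this PR.
  val TagProperty = "spark.kubernetes.test.imageDockerTag"

  // `userProvided` tells the caller whether images with this tag are expected
  // to exist already (true) or still have to be built (false).
  final case class ResolvedTag(tag: String, userProvided: Boolean)

  def resolve(props: Map[String, String]): ResolvedTag =
    props.get(TagProperty) match {
      case Some(tag) => ResolvedTag(tag, userProvided = true)
      case None =>
        // Same generation scheme as suggested in the review comment.
        ResolvedTag(UUID.randomUUID().toString.replaceAll("-", ""), userProvided = false)
    }

  // Mirrors the `tagImage` helper added to KubernetesSuite.
  def tagImage(image: String, resolved: ResolvedTag): String =
    s"$image:${resolved.tag}"
}
```

Carrying the `userProvided` flag next to the tag makes the coupling discussed in the thread explicit: a generated tag implies that images must be built (and can later be deleted), while a user-supplied tag implies they are taken as-is.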
@@ -23,16 +23,16 @@ import org.apache.spark.deploy.k8s.integrationtest.backend.GCE.GCETestBackend
import org.apache.spark.deploy.k8s.integrationtest.backend.minikube.MinikubeTestBackend

private[spark] trait IntegrationTestBackend {
  def name(): String
  def initialize(): Unit
  def getKubernetesClient(): DefaultKubernetesClient
  def dockerImageTag(): String
  def cleanUp(): Unit = {}
}

private[spark] object IntegrationTestBackendFactory {
  def getTestBackend(): IntegrationTestBackend = {
    Option(System.getProperty("spark.kubernetes.test.master"))
      .map(new GCETestBackend(_))
      .getOrElse(new MinikubeTestBackend())
      .getOrElse(MinikubeTestBackend)
  }
}
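
The factory change from `new MinikubeTestBackend()` to the singleton `MinikubeTestBackend` is what allows the suite to compare `testBackend == MinikubeTestBackend`. The selection logic can be sketched in isolation as follows. The backend types here are simplified stand-ins with hypothetical names, and the master URL is passed in as an `Option` rather than read from a system property.

```scala
object BackendSelectionSketch {
  sealed trait Backend { def name: String }

  // Stand-in for a backend that targets an externally provided cluster.
  final class RemoteBackend(val master: String) extends Backend {
    val name = "remote"
  }

  // A singleton, so equality checks against it are well defined.
  case object LocalMinikubeBackend extends Backend {
    val name = "minikube"
  }

  // A master URL, when present, selects the remote backend;
  // otherwise the local Minikube backend is used.
  def select(master: Option[String]): Backend =
    master.map(new RemoteBackend(_)).getOrElse(LocalMinikubeBackend)
}
```

In this shape the master parameter is optional for the caller, which matches the point made in the README discussion: it is only required when running against an arbitrary cluster.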
@@ -20,73 +20,37 @@ import java.nio.file.Paths

import io.fabric8.kubernetes.client.{ConfigBuilder, DefaultKubernetesClient}

import org.apache.commons.lang3.SystemUtils
import org.apache.spark.deploy.k8s.integrationtest.{Logging, ProcessUtils}

// TODO support windows
private[spark] object Minikube extends Logging {
private val MINIKUBE_EXECUTABLE_DEST = if (SystemUtils.IS_OS_MAC_OSX) {
Paths.get("target", "minikube-bin", "darwin-amd64", "minikube").toFile
} else if (SystemUtils.IS_OS_WINDOWS) {
throw new IllegalStateException("Executing Minikube based integration tests not yet " +
" available on Windows.")
} else {
Paths.get("target", "minikube-bin", "linux-amd64", "minikube").toFile
}

private val EXPECTED_DOWNLOADED_MINIKUBE_MESSAGE = "Minikube is not downloaded, expected at " +
s"${MINIKUBE_EXECUTABLE_DEST.getAbsolutePath}"

private val MINIKUBE_STARTUP_TIMEOUT_SECONDS = 60

// NOTE: This and the following methods are synchronized to prevent deleteMinikube from
Member

We are deleting this note. Maybe we don't need "synchronized" any more. Kill "synchronized" below?

// destroying the minikube VM while other methods try to use the VM.
// Such a race condition can corrupt the VM or some VM provisioning tools like VirtualBox.
def startMinikube(): Unit = synchronized {
assert(MINIKUBE_EXECUTABLE_DEST.exists(), EXPECTED_DOWNLOADED_MINIKUBE_MESSAGE)
if (getMinikubeStatus != MinikubeStatus.RUNNING) {
executeMinikube("start", "--memory", "6000", "--cpus", "8")
} else {
logInfo("Minikube is already started.")
}
}

def getMinikubeIp: String = synchronized {
assert(MINIKUBE_EXECUTABLE_DEST.exists(), EXPECTED_DOWNLOADED_MINIKUBE_MESSAGE)
val outputs = executeMinikube("ip")
.filter(_.matches("^\\d{1,3}\\.\\d{1,3}\\.\\d{1,3}\\.\\d{1,3}$"))
assert(outputs.size == 1, "Unexpected amount of output from minikube ip")
outputs.head
}

def getMinikubeStatus: MinikubeStatus.Value = synchronized {
assert(MINIKUBE_EXECUTABLE_DEST.exists(), EXPECTED_DOWNLOADED_MINIKUBE_MESSAGE)
val statusString = executeMinikube("status")
.filter(_.contains("minikube: "))
.filter(line => line.contains("minikubeVM: ") || line.contains("minikube:"))
.head
.replaceFirst("minikubeVM: ", "")
.replaceFirst("minikube: ", "")
MinikubeStatus.unapply(statusString)
.getOrElse(throw new IllegalStateException(s"Unknown status $statusString"))
}

def getDockerEnv: Map[String, String] = synchronized {
assert(MINIKUBE_EXECUTABLE_DEST.exists(), EXPECTED_DOWNLOADED_MINIKUBE_MESSAGE)
executeMinikube("docker-env", "--shell", "bash")
.filter(_.startsWith("export"))
.map(_.replaceFirst("export ", "").split('='))
.map(arr => (arr(0), arr(1).replaceAllLiterally("\"", "")))
.toMap
}
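
The parsing done by `getDockerEnv` can be exercised without a running Minikube by separating it from the process call. The following is a sketch under that assumption. `DockerEnvSketch` and `parse` are hypothetical names, and it differs from the method above in one respect: it splits on the first `=` only, so a value that itself contains `=` is kept whole.

```scala
object DockerEnvSketch {
  // Turns lines of the form `export KEY="value"` into a key/value map,
  // ignoring every line that does not start with `export `.
  def parse(lines: Seq[String]): Map[String, String] =
    lines
      .filter(_.startsWith("export "))
      .map(_.stripPrefix("export ").split("=", 2)) // split on the first '=' only
      .collect { case Array(key, value) => key -> value.replace("\"", "") }
      .toMap
}
```

The `collect` step also drops malformed `export` lines that have no `=`, instead of failing with an index error.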

def deleteMinikube(): Unit = synchronized {
assert(MINIKUBE_EXECUTABLE_DEST.exists, EXPECTED_DOWNLOADED_MINIKUBE_MESSAGE)
if (getMinikubeStatus != MinikubeStatus.NONE) {
executeMinikube("delete")
} else {
logInfo("Minikube was already not running.")
}
}

def getKubernetesClient: DefaultKubernetesClient = synchronized {
val kubernetesMaster = s"https://${getMinikubeIp}:8443"
val userHome = System.getProperty("user.home")
@@ -105,13 +69,8 @@ private[spark] object Minikube extends Logging {
}

private def executeMinikube(action: String, args: String*): Seq[String] = {
if (!MINIKUBE_EXECUTABLE_DEST.canExecute) {
if (!MINIKUBE_EXECUTABLE_DEST.setExecutable(true)) {
throw new IllegalStateException("Failed to make the Minikube binary executable.")
}
}
ProcessUtils.executeProcess(Array(MINIKUBE_EXECUTABLE_DEST.getAbsolutePath, action) ++ args,
MINIKUBE_STARTUP_TIMEOUT_SECONDS)
ProcessUtils.executeProcess(
Array("minikube", action) ++ args, MINIKUBE_STARTUP_TIMEOUT_SECONDS)
}
}
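
With the download plugin removed, `executeMinikube` depends on a `minikube` binary being resolvable through `PATH`. A pre-flight check along the following lines could turn a missing binary into a clearer failure message. `PathCheckSketch` and `findOnPath` are hypothetical helpers, not part of this PR, and the check assumes a platform where the executable bit is meaningful.

```scala
import java.io.File

object PathCheckSketch {
  // Returns the first executable file named `executable` found in the
  // directories listed in `path` (defaults to the PATH environment variable).
  def findOnPath(
      executable: String,
      path: String = sys.env.getOrElse("PATH", "")): Option[File] =
    path
      .split(File.pathSeparator)
      .toSeq
      .filter(_.nonEmpty)
      .map(dir => new File(dir, executable))
      .find(f => f.isFile && f.canExecute)
}
```

A caller could, for example, run `PathCheckSketch.findOnPath("minikube")` before the first Minikube command and fail fast with a message pointing at the installation instructions in the README.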
