This repository was archived by the owner on Jan 9, 2020. It is now read-only.

Commit a8330eb

Merge pull request #388 from apache-spark-on-k8s/branch-2.2-kubernetes-g
Branch 2.2 kubernetes
2 parents: a2c7b21 + beb1361

File tree: 181 files changed, +13281 / −80 lines


.travis.yml

Lines changed: 17 additions & 4 deletions
@@ -25,10 +25,22 @@
 sudo: required
 dist: trusty
 
-# 2. Choose language and target JDKs for parallel builds.
+# 2. Choose language, target JDK and env's for parallel builds.
 language: java
 jdk:
   - oraclejdk8
+env: # Used by the install section below.
+  # Configure the unit test build for spark core and kubernetes modules,
+  # while excluding some flaky unit tests using a regex pattern.
+  - PHASE=test \
+    PROFILES="-Pmesos -Pyarn -Phadoop-2.7 -Pkubernetes" \
+    MODULES="-pl core,resource-managers/kubernetes/core -am" \
+    ARGS="-Dtest=none -Dsuffixes='^org\.apache\.spark\.(?!ExternalShuffleServiceSuite|SortShuffleSuite$|rdd\.LocalCheckpointSuite$|deploy\.SparkSubmitSuite$|deploy\.StandaloneDynamicAllocationSuite$).*'"
+  # Configure the full build.
+  - PHASE=install \
+    PROFILES="-Pmesos -Pyarn -Phadoop-2.7 -Pkubernetes -Pkinesis-asl -Phive -Phive-thriftserver" \
+    MODULES="" \
+    ARGS="-T 4 -q -DskipTests"
 
 # 3. Setup cache directory for SBT and Maven.
 cache:
@@ -40,11 +52,12 @@ cache:
 notifications:
   email: false
 
-# 5. Run maven install before running lint-java.
+# 5. Run maven build before running lints.
 install:
   - export MAVEN_SKIP_RC=1
-  - build/mvn -T 4 -q -DskipTests -Pmesos -Pyarn -Pkinesis-asl -Phive -Phive-thriftserver install
+  - build/mvn ${PHASE} ${PROFILES} ${MODULES} ${ARGS}
 
-# 6. Run lint-java.
+# 6. Run lints.
 script:
   - dev/lint-java
+  - dev/lint-scala

README.md

Lines changed: 39 additions & 0 deletions
@@ -1,3 +1,42 @@
+# Apache Spark On Kubernetes
+
+This repository, located at https://github.com/apache-spark-on-k8s/spark, contains a fork of Apache Spark that enables running Spark jobs natively on a Kubernetes cluster.
+
+## What is this?
+
+This is a collaboratively maintained project working on [SPARK-18278](https://issues.apache.org/jira/browse/SPARK-18278). The goal is to bring native support for Spark to use Kubernetes as a cluster manager, in a fully supported way on par with the Spark Standalone, Mesos, and Apache YARN cluster managers.
+
+## Getting Started
+
+- [Usage guide](https://apache-spark-on-k8s.github.io/userdocs/) shows how to run the code
+- [Development docs](resource-managers/kubernetes/README.md) shows how to get set up for development
+- Code is primarily located in the [resource-managers/kubernetes](resource-managers/kubernetes) folder
+
+## Why does this fork exist?
+
+Adding native integration for a new cluster manager is a large undertaking. If poorly executed, it could introduce bugs into Spark when run on other cluster managers, cause release blockers slowing down the overall Spark project, or require hotfixes which divert attention away from development towards managing additional releases. Any work this deep inside Spark needs to be done carefully to minimize the risk of those negative externalities.
+
+At the same time, an increasing number of people from various companies and organizations desire to work together to natively run Spark on Kubernetes. The group needs a code repository, communication forum, issue tracking, and continuous integration, all in order to work together effectively on an open source product.
+
+We've been asked by an Apache Spark Committer to work outside of the Apache infrastructure for a short period of time to allow this feature to be hardened and improved without creating risk for Apache Spark. The aim is to rapidly bring it to the point where it can be brought into the mainline Apache Spark repository for continued development within the Apache umbrella. If all goes well, this should be a short-lived fork rather than a long-lived one.
+
+## Who are we?
+
+This is a collaborative effort by several folks from different companies who are interested in seeing this feature be successful. Companies active in this project include (alphabetically):
+
+- Bloomberg
+- Google
+- Haiwen
+- Hyperpilot
+- Intel
+- Palantir
+- Pepperdata
+- Red Hat
+
+--------------------
+
+(original README below)
+
 # Apache Spark
 
 Spark is a fast and general cluster computing system for Big Data. It provides

assembly/pom.xml

Lines changed: 11 additions & 1 deletion
@@ -21,7 +21,7 @@
   <parent>
     <groupId>org.apache.spark</groupId>
     <artifactId>spark-parent_2.11</artifactId>
-    <version>2.2.0</version>
+    <version>2.2.0-k8s-0.3.0-SNAPSHOT</version>
     <relativePath>../pom.xml</relativePath>
   </parent>
 
@@ -148,6 +148,16 @@
       </dependency>
     </dependencies>
   </profile>
+  <profile>
+    <id>kubernetes</id>
+    <dependencies>
+      <dependency>
+        <groupId>org.apache.spark</groupId>
+        <artifactId>spark-kubernetes_${scala.binary.version}</artifactId>
+        <version>${project.version}</version>
+      </dependency>
+    </dependencies>
+  </profile>
   <profile>
     <id>hive</id>
     <dependencies>

common/network-common/pom.xml

Lines changed: 1 addition & 1 deletion
@@ -22,7 +22,7 @@
   <parent>
     <groupId>org.apache.spark</groupId>
     <artifactId>spark-parent_2.11</artifactId>
-    <version>2.2.0</version>
+    <version>2.2.0-k8s-0.3.0-SNAPSHOT</version>
     <relativePath>../../pom.xml</relativePath>
   </parent>

common/network-shuffle/pom.xml

Lines changed: 1 addition & 1 deletion
@@ -22,7 +22,7 @@
   <parent>
     <groupId>org.apache.spark</groupId>
     <artifactId>spark-parent_2.11</artifactId>
-    <version>2.2.0</version>
+    <version>2.2.0-k8s-0.3.0-SNAPSHOT</version>
     <relativePath>../../pom.xml</relativePath>
   </parent>

common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/kubernetes/KubernetesExternalShuffleClient.java

Lines changed: 79 additions & 0 deletions

@@ -0,0 +1,79 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.network.shuffle.kubernetes;
+
+import org.apache.spark.network.client.RpcResponseCallback;
+import org.apache.spark.network.client.TransportClient;
+import org.apache.spark.network.sasl.SecretKeyHolder;
+import org.apache.spark.network.shuffle.ExternalShuffleClient;
+import org.apache.spark.network.shuffle.protocol.RegisterDriver;
+import org.apache.spark.network.util.TransportConf;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.IOException;
+import java.nio.ByteBuffer;
+
+/**
+ * A client for talking to the external shuffle service in Kubernetes cluster mode.
+ *
+ * This is used by each Spark executor to register with a corresponding external
+ * shuffle service on the cluster. The purpose is for cleaning up shuffle files
+ * reliably if the application exits unexpectedly.
+ */
+public class KubernetesExternalShuffleClient extends ExternalShuffleClient {
+  private static final Logger logger = LoggerFactory
+      .getLogger(KubernetesExternalShuffleClient.class);
+
+  /**
+   * Creates a Kubernetes external shuffle client that wraps the {@link ExternalShuffleClient}.
+   * Please refer to docs on {@link ExternalShuffleClient} for more information.
+   */
+  public KubernetesExternalShuffleClient(
+      TransportConf conf,
+      SecretKeyHolder secretKeyHolder,
+      boolean saslEnabled) {
+    super(conf, secretKeyHolder, saslEnabled);
+  }
+
+  public void registerDriverWithShuffleService(String host, int port)
+      throws IOException, InterruptedException {
+    checkInit();
+    ByteBuffer registerDriver = new RegisterDriver(appId, 0).toByteBuffer();
+    TransportClient client = clientFactory.createClient(host, port);
+    client.sendRpc(registerDriver, new RegisterDriverCallback());
+  }
+
+  private class RegisterDriverCallback implements RpcResponseCallback {
+    @Override
+    public void onSuccess(ByteBuffer response) {
+      logger.info("Successfully registered app " + appId + " with external shuffle service.");
+    }
+
+    @Override
+    public void onFailure(Throwable e) {
+      logger.warn("Unable to register app " + appId + " with external shuffle service. " +
+          "Please manually remove shuffle data after driver exit. Error: " + e);
+    }
+  }
+
+  @Override
+  public void close() {
+    super.close();
+  }
+}
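The registration flow in this new file is callback-based: the driver sends a RegisterDriver message over an RPC channel and learns the outcome asynchronously through onSuccess/onFailure. A minimal, self-contained sketch of that pattern follows. The RpcResponseCallback interface here is a locally declared stand-in, and sendRpc/register are hypothetical helpers, since Spark's real network classes are not available outside its build:

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Local stand-in for Spark's RpcResponseCallback interface, so the flow
// can run without the Spark network modules on the classpath.
interface RpcResponseCallback {
    void onSuccess(ByteBuffer response);
    void onFailure(Throwable e);
}

public class RegisterDriverFlow {
    // Hypothetical in-process "shuffle service" endpoint: it always
    // acknowledges the registration RPC. A real TransportClient would
    // send the message over the network instead.
    static void sendRpc(ByteBuffer message, RpcResponseCallback callback) {
        callback.onSuccess(ByteBuffer.wrap("OK".getBytes(StandardCharsets.UTF_8)));
    }

    // Mirrors registerDriverWithShuffleService above: encode a
    // RegisterDriver-style message, send it, report the outcome.
    static String register(String appId) {
        final String[] status = new String[1];
        ByteBuffer registerDriver = ByteBuffer.wrap(appId.getBytes(StandardCharsets.UTF_8));
        sendRpc(registerDriver, new RpcResponseCallback() {
            @Override public void onSuccess(ByteBuffer response) {
                status[0] = "registered " + appId;
            }
            @Override public void onFailure(Throwable e) {
                status[0] = "failed " + appId + ": " + e;
            }
        });
        return status[0];
    }

    public static void main(String[] args) {
        System.out.println(register("spark-app-001"));  // registered spark-app-001
    }
}
```

The asynchronous callback matters here because registration is best-effort: as the warn message above notes, a failure only means shuffle files must be cleaned up manually, so the executor does not block on the reply.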

common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/mesos/MesosExternalShuffleClient.java

Lines changed: 1 addition & 1 deletion
@@ -32,7 +32,7 @@
 import org.apache.spark.network.client.TransportClient;
 import org.apache.spark.network.sasl.SecretKeyHolder;
 import org.apache.spark.network.shuffle.ExternalShuffleClient;
-import org.apache.spark.network.shuffle.protocol.mesos.RegisterDriver;
+import org.apache.spark.network.shuffle.protocol.RegisterDriver;
 import org.apache.spark.network.util.TransportConf;
 
 /**

common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/protocol/BlockTransferMessage.java

Lines changed: 0 additions & 1 deletion
@@ -23,7 +23,6 @@
 import io.netty.buffer.Unpooled;
 
 import org.apache.spark.network.protocol.Encodable;
-import org.apache.spark.network.shuffle.protocol.mesos.RegisterDriver;
 import org.apache.spark.network.shuffle.protocol.mesos.ShuffleServiceHeartbeat;
 
 /**
common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/protocol/RegisterDriver.java

Lines changed: 2 additions & 3 deletions

@@ -15,19 +15,18 @@
  * limitations under the License.
  */
 
-package org.apache.spark.network.shuffle.protocol.mesos;
+package org.apache.spark.network.shuffle.protocol;
 
 import com.google.common.base.Objects;
 import io.netty.buffer.ByteBuf;
 
 import org.apache.spark.network.protocol.Encoders;
-import org.apache.spark.network.shuffle.protocol.BlockTransferMessage;
 
 // Needed by ScalaDoc. See SPARK-7726
 import static org.apache.spark.network.shuffle.protocol.BlockTransferMessage.Type;
 
 /**
- * A message sent from the driver to register with the MesosExternalShuffleService.
+ * A message sent from the driver to register with an ExternalShuffleService.
 */
 public class RegisterDriver extends BlockTransferMessage {
   private final String appId;
common/network-yarn/pom.xml

Lines changed: 1 addition & 1 deletion
@@ -22,7 +22,7 @@
   <parent>
     <groupId>org.apache.spark</groupId>
     <artifactId>spark-parent_2.11</artifactId>
-    <version>2.2.0</version>
+    <version>2.2.0-k8s-0.3.0-SNAPSHOT</version>
     <relativePath>../../pom.xml</relativePath>
   </parent>

0 commit comments