Commit bd2ebfd

vatsalmevadavelvia authored and committed
docs: Mesos config tips (spark-jobserver#832)
1 parent 947344e

File tree

2 files changed: +115 −1 lines

README.md

Lines changed: 1 addition & 1 deletion
@@ -6,7 +6,7 @@ spark-jobserver provides a RESTful interface for submitting and managing [Apache
 This repo contains the complete Spark job server project, including unit tests and deploy scripts.
 It was originally started at [Ooyala](http://www.ooyala.com), but this is now the main development repo.

-See [Troubleshooting Tips](doc/troubleshooting.md) as well as [Yarn tips](doc/yarn.md).
+Other useful links: [Troubleshooting Tips](doc/troubleshooting.md), [Yarn tips](doc/yarn.md), [Mesos tips](doc/mesos.md).

 Also see [Chinese docs / 中文](doc/chinese/job-server.md).

doc/mesos.md

Lines changed: 114 additions & 0 deletions
@@ -0,0 +1,114 @@
## Configuring Job Server for Mesos

### Mesos client mode

Configuring job-server for Mesos client mode is straightforward. All you need to change is the `spark.master` setting
in the job-server config file so that it points to the Mesos master URL.

Example config file (important settings are marked with # important):

    spark {
      master = <mesos master URL here>   # important, example: mesos://mesos-master:5050

      # Default # of CPUs for jobs to use for Spark standalone cluster
      job-number-cpus = 4

      jobserver {
        port = 8090
        jobdao = spark.jobserver.io.JobSqlDAO

        sqldao {
          # Directory where default H2 driver stores its data. Only needed for H2.
          rootdir = /database

          # Full JDBC URL / init string. Sorry, needs to match above.
          # Substitutions may be used to launch job-server, but leave it out here in the default or tests won't pass
          jdbc.url = "jdbc:h2:file:/database/h2-db"
        }
      }

      # Universal context configuration. These settings can be overridden, see README.md
      context-settings {
        num-cpu-cores = 2        # Number of cores to allocate. Required.
        memory-per-node = 512m   # Executor memory per node, -Xmx style, e.g. 512m, 1G, etc.
      }
    }
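Once job-server points at the Mesos master, a quick way to confirm the setup is to exercise the REST API and watch the executors appear in the Mesos master UI. A minimal sketch, assuming job-server runs on localhost:8090 and you have a sample jar `job.jar`; the app name and main class below are example values to adapt:

```shell
# Example host, app name, and job class; adjust to your deployment.
JOBSERVER="http://localhost:8090"
APP_NAME="test"
MAIN_CLASS="spark.jobserver.WordCountExample"

# Upload the application jar under the chosen app name
curl --data-binary @job.jar "$JOBSERVER/jars/$APP_NAME"

# Start a job in the default context; its executors should show up
# as a framework task in the Mesos master UI
curl -d 'input.string = "a b c a b"' \
  "$JOBSERVER/jobs?appName=$APP_NAME&classPath=$MAIN_CLASS"
```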
### Mesos cluster mode

Configuring job-server for Mesos cluster mode is a bit trickier than client mode.

Here is the checklist of changes needed:

- Start the Mesos dispatcher in your cluster by running `./sbin/start-mesos-dispatcher.sh`, which ships with the
  Spark package. This step is not specific to job-server; as mentioned in the [official Spark documentation](https://spark.apache.org/docs/latest/running-on-mesos.html#cluster-mode),
  it is required to submit Spark jobs in Mesos cluster mode.

- Add the following config at the end of job-server's settings.sh file:

  ```
  REMOTE_JOBSERVER_DIR=<path to job-server directory>   # copy the job-server directory to this location on all Mesos agent nodes
  MESOS_SPARK_DISPATCHER=<mesos dispatcher URL>         # example: mesos://mesos-dispatcher:7077
  ```

- Set the `spark.jobserver.driver-mode` property to `mesos-cluster` in the job-server config file.

- Also override the Akka defaults in the job-server config file to support fetching large remote jars; the frame size
  has to be set to a suitably large value, for example:

  ```
  akka.remote.netty.tcp {
    # Use the remote IP address to form the akka cluster, not 127.0.0.1. This should be the IP of the machine
    # where the file resides. That means for each Mesos agent (where the job-server directory is copied to the
    # REMOTE_JOBSERVER_DIR path), the hostname should be the remote IP of that node.
    hostname = "xxxxx"
    # This controls the maximum message size, including job results, that can be sent
    maximum-frame-size = 104857600b
  }
  ```

- Set `spark.master` to the Mesos master URL (and not the mesos-dispatcher URL).

- Set `spark.jobserver.context-per-jvm` to `true` in the job-server config file.
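The dispatcher and file-distribution steps of this checklist can be sketched as shell commands. This is a sketch, not part of the official setup: the Spark home, job-server path, and agent hostnames are assumptions to adapt to your cluster:

```shell
# Example paths and hostnames; adjust to your cluster.
SPARK_HOME=/opt/spark
JOBSERVER_DIR=/opt/spark-jobserver    # must match REMOTE_JOBSERVER_DIR in settings.sh
AGENTS="agent1 agent2 agent3"         # your Mesos agent hostnames

# 1. Start the Mesos cluster dispatcher (ships with the Spark package)
"$SPARK_HOME"/sbin/start-mesos-dispatcher.sh --master mesos://mesos-master:5050

# 2. Copy the job-server directory to the same path on every agent node,
#    so drivers launched there can find it under REMOTE_JOBSERVER_DIR
for host in $AGENTS; do
  rsync -a "$JOBSERVER_DIR"/ "$host:$JOBSERVER_DIR"/
done
```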
Example config file (important settings are marked with # important):

    spark {
      master = <mesos master URL here>   # important, example: mesos://mesos-master:5050

      # Default # of CPUs for jobs to use for Spark standalone cluster
      job-number-cpus = 4

      jobserver {
        port = 8090
        driver-mode = mesos-cluster   # important
        context-per-jvm = true        # important
        jobdao = spark.jobserver.io.JobSqlDAO

        sqldao {
          # Directory where default H2 driver stores its data. Only needed for H2.
          rootdir = /database

          # Full JDBC URL / init string. Sorry, needs to match above.
          # Substitutions may be used to launch job-server, but leave it out here in the default or tests won't pass
          jdbc.url = "jdbc:h2:file:/database/h2-db"
        }
      }

      # Universal context configuration. These settings can be overridden, see README.md
      context-settings {
        num-cpu-cores = 2        # Number of cores to allocate. Required.
        memory-per-node = 512m   # Executor memory per node, -Xmx style, e.g. 512m, 1G, etc.
      }
    }

    akka.remote.netty.tcp {
      # Use the remote IP address to form the akka cluster, not 127.0.0.1. This should be the IP of the machine
      # where the file resides. That means for each Mesos agent (where the job-server directory is copied to the
      # REMOTE_JOBSERVER_DIR path), the hostname should be the remote IP of that node.
      hostname = "xxxxx"                # important
      # This controls the maximum message size, including job results, that can be sent
      maximum-frame-size = 104857600b   # important
    }
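With `driver-mode = mesos-cluster` and `context-per-jvm = true`, each context runs in its own driver JVM launched through the dispatcher. A hedged smoke test using job-server's context endpoints; the host and context name here are example values:

```shell
# Example host and context name; adjust to your deployment.
JOBSERVER="http://localhost:8090"
CTX="mesos-test"

# Create a context; in mesos-cluster mode this should launch a
# separate driver via the Mesos dispatcher
curl -d "" "$JOBSERVER/contexts/$CTX?num-cpu-cores=2&memory-per-node=512m"

# List contexts to confirm the new one is up
curl "$JOBSERVER/contexts"
```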
