## Configuring Job Server for Mesos

See also running on [cluster](cluster.md), YARN in [client mode](yarn.md) and running on [EMR](EMR.md).

### Mesos client mode

Configuring job-server for Mesos client mode is straightforward. All you need to change is the `spark.master` setting
in the job-server config file so that it points to the Mesos master URL.

Example config file:

    spark {
      master = <mesos master URL here> # example: mesos://mesos-master:5050
    }
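
To sanity-check the setup you can start job-server and exercise its REST API; a minimal sketch, assuming the default port 8090 and a hypothetical application jar `job.jar` containing the `spark.jobserver.WordCountExample` job from the job-server tests:

```
# upload the application jar under the app name "test"
curl --data-binary @job.jar localhost:8090/jars/test

# run a job in the Mesos client-mode context and wait for the result
curl -d 'input.string = a b c a b' 'localhost:8090/jobs?appName=test&classPath=spark.jobserver.WordCountExample&sync=true'
```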

### Mesos cluster mode

Configuring job-server for Mesos cluster mode is a bit trickier than client mode.

You need to start the Mesos dispatcher in your cluster by running `./sbin/start-mesos-dispatcher.sh`, which ships with
the Spark package. This step is not specific to job-server; as mentioned in the [official Spark documentation](https://spark.apache.org/docs/latest/running-on-mesos.html#cluster-mode), it is needed
to submit Spark jobs in Mesos cluster mode.
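
For example, on a host in the cluster (a sketch; `mesos-master:5050` is a placeholder for your Mesos master URL):

```
# start the MesosClusterDispatcher, pointing it at the Mesos master
./sbin/start-mesos-dispatcher.sh --master mesos://mesos-master:5050
```

By default the dispatcher listens on port 7077, which is the port used in the `spark.master` example below.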

Add the following config to your job-server config file:
- set the `spark.master` property to the Mesos dispatcher URL (example: `mesos://mesos-dispatcher:7077`)
- set the `spark.submit.deployMode` property to `cluster`
- set `spark.jobserver.context-per-jvm` to `true`
- set `akka.remote.netty.tcp.hostname` to the cluster interface of the host running the frontend
- set `akka.remote.netty.tcp.maximum-frame-size` to a value large enough for fetching big remote jars

Example job-server config (replace `CLUSTER-IP` with the internal IP of the host running the job-server frontend):

    spark {
      master = <mesos dispatcher URL> # example: mesos://mesos-dispatcher:7077
      submit.deployMode = cluster

      jobserver {
        context-per-jvm = true

        # start an H2 DB server, reachable from your cluster
        sqldao {
          jdbc {
            url = "jdbc:h2:tcp://CLUSTER-IP:9092/h2-db;AUTO_RECONNECT=TRUE"
          }
        }
        startH2Server = false
      }
    }

    # start akka on this interface, reachable from your cluster
    akka {
      remote.netty.tcp {
        hostname = "CLUSTER-IP"

        # This controls the maximum message size, including job results, that can be sent
        maximum-frame-size = 100 MiB
      }
    }
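
Because `startH2Server = false`, job-server expects an H2 server to already be listening at `CLUSTER-IP:9092`. One way to start one is with the standard H2 tools; a sketch, assuming an H2 jar on the frontend host and the database files under `/database`:

```
# launch a standalone H2 TCP server that remote drivers can reach
java -cp h2-*.jar org.h2.tools.Server -tcp -tcpAllowOthers -tcpPort 9092 -baseDir /database
```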

Optional: add the following config at the end of job-server's `settings.sh` file:

```
REMOTE_JOBSERVER_DIR=<path to job-server directory> # copy of job-server directory on all mesos agent nodes
```
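
The job-server directory then has to be present at that path on every Mesos agent; a sketch using `rsync` with hypothetical host names and paths:

```
# copy the job-server directory to the same location on each agent
for agent in mesos-agent-1 mesos-agent-2; do
  rsync -a /opt/job-server/ "$agent:/opt/job-server/"
done
```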