This repository was archived by the owner on Jan 9, 2020. It is now read-only.

Commit b8a08f2

10110346 authored and cloud-fan committed
[SPARK-21506][DOC] The description of "spark.executor.cores" may be not correct
## What changes were proposed in this pull request?

The number of cores assigned to each executor is configurable. When this is not explicitly set, multiple executors from the same application may be launched on the same worker too.

## How was this patch tested?

N/A

Author: liuxian <[email protected]>

Closes apache#18711 from 10110346/executorcores.
1 parent 3b5c2a8 commit b8a08f2

File tree

5 files changed: +21 −10 lines


core/src/main/scala/org/apache/spark/deploy/client/StandaloneAppClient.scala

Lines changed: 1 addition & 1 deletion
@@ -170,7 +170,7 @@ private[spark] class StandaloneAppClient(

       case ExecutorAdded(id: Int, workerId: String, hostPort: String, cores: Int, memory: Int) =>
         val fullId = appId + "/" + id
-        logInfo("Executor added: %s on %s (%s) with %d cores".format(fullId, workerId, hostPort,
+        logInfo("Executor added: %s on %s (%s) with %d core(s)".format(fullId, workerId, hostPort,
           cores))
         listener.executorAdded(fullId, workerId, hostPort, cores, memory)

core/src/main/scala/org/apache/spark/deploy/master/Master.scala

Lines changed: 7 additions & 1 deletion
@@ -581,7 +581,13 @@ private[deploy] class Master(
    * The number of cores assigned to each executor is configurable. When this is explicitly set,
    * multiple executors from the same application may be launched on the same worker if the worker
    * has enough cores and memory. Otherwise, each executor grabs all the cores available on the
-   * worker by default, in which case only one executor may be launched on each worker.
+   * worker by default, in which case only one executor per application may be launched on each
+   * worker during one single schedule iteration.
+   * Note that when `spark.executor.cores` is not set, we may still launch multiple executors from
+   * the same application on the same worker. Consider appA and appB both have one executor running
+   * on worker1, and appA.coresLeft > 0, then appB is finished and release all its cores on worker1,
+   * thus for the next schedule iteration, appA launches a new executor that grabs all the free
+   * cores on worker1, therefore we get multiple executors from appA running on worker1.
    *
    * It is important to allocate coresPerExecutor on each worker at a time (instead of 1 core
    * at a time). Consider the following example: cluster has 4 workers with 16 cores each.
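The allocation rule described in this comment can be sketched in a few lines. This is a simplified illustrative model, not the Master's actual scheduling code; `executorsLaunched` and its parameter names are hypothetical:

```scala
// Simplified sketch of how spark.executor.cores shapes per-worker allocation
// in one schedule iteration. Illustrative only, not the real Master logic.
object ExecutorAllocationSketch {
  // freeCores: cores currently free on the worker
  // appCoresLeft: cores the application still wants
  // coresPerExecutor: Some(n) if spark.executor.cores is set, else None
  def executorsLaunched(freeCores: Int, appCoresLeft: Int,
                        coresPerExecutor: Option[Int]): Int = {
    val usable = math.min(freeCores, appCoresLeft)
    coresPerExecutor match {
      // Explicitly set: several executors can fit on one worker.
      case Some(n) => usable / n
      // Unset: a single executor grabs all free cores, so at most one
      // executor per application on this worker in this iteration.
      case None => if (usable > 0) 1 else 0
    }
  }

  def main(args: Array[String]): Unit = {
    assert(executorsLaunched(16, 16, Some(4)) == 4) // four 4-core executors
    assert(executorsLaunched(16, 16, None) == 1)    // one 16-core executor
  }
}
```

As the comment's appA/appB scenario notes, the one-per-iteration limit in the unset case does not prevent an application from accumulating executors on a worker across later iterations, once other applications release cores.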

core/src/main/scala/org/apache/spark/scheduler/cluster/StandaloneSchedulerBackend.scala

Lines changed: 1 addition & 1 deletion
@@ -153,7 +153,7 @@ private[spark] class StandaloneSchedulerBackend(

   override def executorAdded(fullId: String, workerId: String, hostPort: String, cores: Int,
     memory: Int) {
-    logInfo("Granted executor ID %s on hostPort %s with %d cores, %s RAM".format(
+    logInfo("Granted executor ID %s on hostPort %s with %d core(s), %s RAM".format(
       fullId, hostPort, cores, Utils.megabytesToString(memory)))
   }

docs/configuration.md

Lines changed: 4 additions & 7 deletions
@@ -1015,7 +1015,7 @@ Apart from these, the following properties are also available, and may be useful
   <td>0.5</td>
   <td>
     Amount of storage memory immune to eviction, expressed as a fraction of the size of the
-    region set aside by <code>s​park.memory.fraction</code>. The higher this is, the less
+    region set aside by <code>spark.memory.fraction</code>. The higher this is, the less
     working memory may be available to execution and tasks may spill to disk more often.
     Leaving this at the default value is recommended. For more detail, see
     <a href="tuning.html#memory-management-overview">this description</a>.
@@ -1041,7 +1041,7 @@ Apart from these, the following properties are also available, and may be useful
   <td><code>spark.memory.useLegacyMode</code></td>
   <td>false</td>
   <td>
-    Whether to enable the legacy memory management mode used in Spark 1.5 and before. 
+    Whether to enable the legacy memory management mode used in Spark 1.5 and before.
     The legacy mode rigidly partitions the heap space into fixed-size regions,
     potentially leading to excessive spilling if the application was not tuned.
     The following deprecated memory fraction configurations are not read unless this is enabled:
@@ -1115,11 +1115,8 @@ Apart from these, the following properties are also available, and may be useful
   <td>
     The number of cores to use on each executor.

-    In standalone and Mesos coarse-grained modes, setting this
-    parameter allows an application to run multiple executors on the
-    same worker, provided that there are enough cores on that
-    worker. Otherwise, only one executor per application will run on
-    each worker.
+    In standalone and Mesos coarse-grained modes, for more detail, see
+    <a href="spark-standalone.html#Executors Scheduling">this description</a>.
   </td>
   </tr>
   <tr>

docs/spark-standalone.md

Lines changed: 8 additions & 0 deletions
@@ -328,6 +328,14 @@ export SPARK_MASTER_OPTS="-Dspark.deploy.defaultCores=<value>"

 This is useful on shared clusters where users might not have configured a maximum number of cores
 individually.

+# Executors Scheduling
+
+The number of cores assigned to each executor is configurable. When `spark.executor.cores` is
+explicitly set, multiple executors from the same application may be launched on the same worker
+if the worker has enough cores and memory. Otherwise, each executor grabs all the cores available
+on the worker by default, in which case only one executor per application may be launched on each
+worker during one single schedule iteration.
+
 # Monitoring and Logging

 Spark's standalone mode offers a web-based user interface to monitor the cluster. The master and each worker has its own web UI that shows cluster and job statistics. By default you can access the web UI for the master at port 8080. The port can be changed either in the configuration file or via command-line options.
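As a concrete illustration of the scheduling behavior documented in `spark-standalone.md`, an application on a standalone cluster that wants multiple executors per worker would set `spark.executor.cores` explicitly, for example in `spark-defaults.conf`. The property names are real Spark settings; the values are purely illustrative:

```
# Each executor gets 4 cores; with spark.cores.max = 16 the application
# can receive up to 4 executors, possibly several on one worker if that
# worker has enough free cores and memory.
spark.executor.cores   4
spark.executor.memory  8g
spark.cores.max        16
```

Left unset, `spark.executor.cores` makes each executor grab all free cores on its worker, so a single schedule iteration launches at most one executor per application per worker.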

0 commit comments
