2 changes: 2 additions & 0 deletions CMakeLists.txt
@@ -140,6 +140,7 @@ add_library(libninja OBJECT
src/graph.cc
src/graphviz.cc
src/jobserver.cc
src/jobserver_pool.cc
src/json.cc
src/line_printer.cc
src/manifest_parser.cc
@@ -285,6 +286,7 @@ if(BUILD_TESTING)
src/explanations_test.cc
src/graph_test.cc
src/jobserver_test.cc
src/jobserver_pool_test.cc
src/json_test.cc
src/lexer_test.cc
src/manifest_parser_test.cc
2 changes: 2 additions & 0 deletions configure.py
@@ -551,6 +551,7 @@ def has_re2c() -> bool:
'graph',
'graphviz',
'jobserver',
'jobserver_pool',
'json',
'line_printer',
'manifest_parser',
@@ -655,6 +656,7 @@ def has_re2c() -> bool:
'explanations_test',
'graph_test',
'jobserver_test',
'jobserver_pool_test',
'json_test',
'lexer_test',
'manifest_parser_test',
63 changes: 51 additions & 12 deletions doc/manual.asciidoc
@@ -192,12 +192,34 @@ GNU Jobserver support

Since version 1.13, Ninja builds can follow the
https://www.gnu.org/software/make/manual/html_node/Job-Slots.html[GNU Make jobserver]
protocol.

The protocol is used to efficiently coordinate parallelism across a set of
concurrent, cooperating processes. This is useful when Ninja is invoked as
part of a larger build system controlled by a top-level Ninja or GNU Make
instance, or any other jobserver pool implementation.

Ninja becomes a protocol client automatically if it detects the right
values in the `MAKEFLAGS` environment variable (see exact conditions below).

Since version 1.14, Ninja can also be a protocol server, if needed, using
the `--jobserver-pool` command-line flag, or if `enable_jobserver_pool = 1`
is set in the Ninja build plan.

In jobserver-enabled builds, there is one top-level "server" process which:

- Sets up a shared pool of job tokens.
- Sets the `MAKEFLAGS` environment variable with special values
to reference the pool.
- Launches child processes (concurrent sub-commands).

These child processes can be protocol clients if they:

- Recognize the special `MAKEFLAGS` values specific to the protocol.
- Use them to access the shared pool to acquire and release job tokens
during the build.
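In the FIFO flavor of the protocol, acquiring and releasing a job token amounts to reading and writing one byte. Below is a minimal client-side sketch; the `parse_jobserver_fifo` helper is hypothetical, and an anonymous pipe stands in for the named FIFO so the example is self-contained:

```python
import os
import re

def parse_jobserver_fifo(makeflags):
    """Return the FIFO path advertised in MAKEFLAGS, or None.

    Matches the FIFO flavor of the protocol (GNU Make 4.4+), where
    MAKEFLAGS contains `--jobserver-auth=fifo:PATH`.
    """
    m = re.search(r"--jobserver-auth=fifo:([^\s]+)", makeflags)
    return m.group(1) if m else None

print(parse_jobserver_fifo("-j4 --jobserver-auth=fifo:/tmp/jspool"))  # → /tmp/jspool

# Token handling: the server pre-loads N-1 one-byte tokens, since every
# client already owns one implicit job slot.
rd, wr = os.pipe()
os.write(wr, b"+" * 3)    # server side: 3 extra tokens for -j4

token = os.read(rd, 1)    # client: block until one extra slot is available
os.write(wr, token)       # client: return the token when the job finishes
```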

Ninja automatically becomes a protocol client during builds when:

- Dry-run (i.e. `-n` or `--dry-run`) is not enabled.

@@ -208,18 +230,30 @@ This feature is automatically enabled under the following conditions:
jobserver mode using `--jobserver-auth=SEMAPHORE_NAME` on Windows, or
`--jobserver-auth=fifo:PATH` on Posix.

In this case, Ninja will use the jobserver pool of job slots to control
parallelism, instead of its default parallel implementation.

IMPORTANT: On Posix, only the FIFO-based version of the protocol, which is
implemented by GNU Make 4.4 and higher, is supported. Ninja will detect
when a pipe-based jobserver is being used (i.e. when `MAKEFLAGS` contains
`--jobserver-auth=<read>,<write>`) and will print a warning, but will
otherwise ignore it.

Using `--jobserver-pool` or setting `enable_jobserver_pool = 1` makes Ninja
act as a protocol server, unless any of the following is true:

- An existing pool was detected, as this keeps all processes cooperating
properly.

- `-j1` is used on the command line, as this explicitly asks Ninja not to
perform parallel builds.

- Dry-run is enabled.
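The conditions above can be paraphrased as a small predicate (illustrative only, not the actual ninja.cc logic):

```python
def should_create_jobserver_pool(
    pool_requested,    # --jobserver-pool flag or enable_jobserver_pool = 1
    existing_pool,     # a valid jobserver was already detected in MAKEFLAGS
    parallelism,       # -j<COUNT> value, or the auto-detected default
    dry_run,           # -n / --dry-run
):
    """Illustrative paraphrase of the server-mode conditions above."""
    if not pool_requested or dry_run:
        return False
    if existing_pool:      # cooperate with the existing pool instead
        return False
    if parallelism == 1:   # -j1 explicitly requests a serial build
        return False
    return True

print(should_create_jobserver_pool(True, False, 4, False))  # → True
print(should_create_jobserver_pool(True, True, 4, False))   # → False
```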

The size of the pool set up by Ninja matches its parallelism, as determined
by the `-j<COUNT>` option, or auto-detected if that option is not provided.

Load-average limitations (i.e. when using `-l<count>`) are still enforced
in both modes.


Environment variables
~~~~~~~~~~~~~~~~~~~~~

@@ -950,7 +984,7 @@ previous one, it closes the previous scope.
Top-level variables
~~~~~~~~~~~~~~~~~~~

Three variables are significant when declared in the outermost file scope.

`builddir`:: a directory for some Ninja output files. See <<ref_log,the
discussion of the build log>>. (You can also store other build output
@@ -959,6 +993,11 @@ Two variables are significant when declared in the outermost file scope.
`ninja_required_version`:: the minimum version of Ninja required to process
the build correctly. See <<ref_versioning,the discussion of versioning>>.

`enable_jobserver_pool`:: If set to `1` (any other value is ignored), enable
jobserver pool mode, as if `--jobserver-pool` had been passed on the command
line. Note that `0` does not disable the feature, and that the size of the
pool is determined by the parallel job count, which is either auto-detected
or controlled by the `-j<COUNT>` command-line option.
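For example, a minimal build file that opts into pool mode might look like this (rule and target names are purely illustrative):

----
enable_jobserver_pool = 1

rule touch
  command = touch $out

build a: touch
build b: touch
----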

[[ref_rule]]
Rule variables
134 changes: 130 additions & 4 deletions misc/jobserver_test.py
@@ -31,6 +31,7 @@

# Set this to True to debug command invocations.
_DEBUG = False
_DEBUG = True

default_env = dict(os.environ)
default_env.pop("NINJA_STATUS", None)
@@ -110,13 +111,13 @@ def span_output_file(span_n: int) -> str:
return "out%02d" % span_n


def generate_build_plan(command_count: int) -> str:
def generate_build_plan(command_count: int, prefix: str = "") -> str:
"""Generate a Ninja build plan for |command_count| parallel tasks.

Each task calls the test helper script which waits for 50ms
then writes its own start and end time to its output file.
"""
result = f"""
result = prefix + f"""
rule span
command = {sys.executable} -S {_JOBSERVER_TEST_HELPER_SCRIPT} --duration-ms=50 $out

@@ -272,7 +273,7 @@ def run_ninja_with_jobserver_pipe(args):
ret.check_returncode()
return ret.stdout, ret.stderr

output, error = run_ninja_with_jobserver_pipe(["all"])
output, error = run_ninja_with_jobserver_pipe(["-v", "all"])
if _DEBUG:
print(f"OUTPUT [{output}]\nERROR [{error}]\n", file=sys.stderr)
self.assertTrue(error.find("Pipe-based protocol is not supported!") >= 0)
@@ -282,14 +283,139 @@ def run_ninja_with_jobserver_pipe(args):

# Using an explicit -j<N> ignores the jobserver pool.
b.ninja_clean()
output, error = run_ninja_with_jobserver_pipe(["-j1", "all"])
output, error = run_ninja_with_jobserver_pipe(["-v", "-j1", "all"])
if _DEBUG:
print(f"OUTPUT [{output}]\nERROR [{error}]\n", file=sys.stderr)
self.assertFalse(error.find("Pipe-based protocol is not supported!") >= 0)

max_overlaps = compute_max_overlapped_spans(b.path, task_count)
self.assertEqual(max_overlaps, 1)

def test_jobserver_pool_mode_with_flag(self):
task_count = 4
build_plan = generate_build_plan(task_count)
with BuildDir(build_plan) as b:
# First, run all tasks with {task_count} tokens; this should allow all
# tasks to run in parallel.
ret = b.ninja_run(
ninja_args=["--jobserver-pool", "all"],
)
max_overlaps = compute_max_overlapped_spans(b.path, task_count)
self.assertEqual(max_overlaps, task_count)

# Second, use 2 tokens only, and verify that this was enforced by Ninja and
# that both a pool and a client were setup by Ninja.
b.ninja_clean()
ret = b.ninja_spawn(
["-j2", "--jobserver-pool", "--verbose", "all"],
capture_output=True,
)
self.assertEqual(ret.returncode, 0)
self.assertTrue(
"ninja: Creating jobserver pool for 2 parallel jobs" in ret.stdout,
msg="Ninja failed to setup jobserver pool!",
)
self.assertTrue(
"ninja: Jobserver mode detected: " in ret.stdout,
msg="Ninja failed to setup jobserver client!",
)
max_overlaps = compute_max_overlapped_spans(b.path, task_count)
self.assertEqual(max_overlaps, 2)

# Third, verify that --jobs=1 serializes all tasks.
b.ninja_clean()
b.ninja_run(
["--jobserver-pool", "-j1", "all"],
)
max_overlaps = compute_max_overlapped_spans(b.path, task_count)
self.assertEqual(max_overlaps, 1)

# On Linux, use taskset to limit the number of available cores to 1
# and verify that the jobserver overrides the default Ninja parallelism
# and that {task_count} tasks are still spawned in parallel.
if platform.system() == "Linux":
# First, run without a jobserver, with a single CPU, Ninja will
# use a parallelism of 2 in this case (GuessParallelism() in ninja.cc)
b.ninja_clean()
b.ninja_run(
["all"],
prefix_args=["taskset", "-c", "0"],
)
max_overlaps = compute_max_overlapped_spans(b.path, task_count)
self.assertEqual(max_overlaps, 2)

# Now with a jobserver with {task_count} tasks.
b.ninja_clean()
b.ninja_run(
["--jobserver-pool", f"-j{task_count}", "all"],
prefix_args=["taskset", "-c", "0"],
)
max_overlaps = compute_max_overlapped_spans(b.path, task_count)
self.assertEqual(max_overlaps, task_count)

def test_jobserver_pool_mode_ignored_with_existing_pool(self):
task_count = 4
build_plan = generate_build_plan(task_count)
with BuildDir(build_plan) as b:
# Set up a top-level pool with 2 jobs, and verify that `--jobserver-pool` respects it.
ret = b.ninja_run(
ninja_args=["--jobserver-pool", "all"],
prefix_args=[sys.executable, "-S", _JOBSERVER_POOL_SCRIPT, "--jobs=2"],
)
max_overlaps = compute_max_overlapped_spans(b.path, task_count)
self.assertEqual(max_overlaps, 2)

def test_jobserver_pool_mode_with_variable(self):
task_count = 4
build_plan = generate_build_plan(task_count, prefix="enable_jobserver_pool = 1\n")
with BuildDir(build_plan) as b:
# First, run all tasks with {task_count} tokens; this should allow all
# tasks to run in parallel.
ret = b.ninja_run(
ninja_args=["all"],
)
max_overlaps = compute_max_overlapped_spans(b.path, task_count)
self.assertEqual(max_overlaps, task_count)

# Second, use 2 tokens only, and verify that this was enforced by Ninja.
b.ninja_clean()
b.ninja_run(
["-j2", "all"],
)
max_overlaps = compute_max_overlapped_spans(b.path, task_count)
self.assertEqual(max_overlaps, 2)

# Third, verify that --jobs=1 serializes all tasks.
b.ninja_clean()
b.ninja_run(
["-j1", "all"],
)
max_overlaps = compute_max_overlapped_spans(b.path, task_count)
self.assertEqual(max_overlaps, 1)

# On Linux, use taskset to limit the number of available cores to 1
# and verify that the jobserver overrides the default Ninja parallelism
# and that {task_count} tasks are still spawned in parallel.
if platform.system() == "Linux":
# First, run without a jobserver, with a single CPU, Ninja will
# use a parallelism of 2 in this case (GuessParallelism() in ninja.cc)
b.ninja_clean()
b.ninja_run(
["all"],
prefix_args=["taskset", "-c", "0"],
)
max_overlaps = compute_max_overlapped_spans(b.path, task_count)
self.assertEqual(max_overlaps, 2)

# Now with a jobserver with {task_count} tasks.
b.ninja_clean()
b.ninja_run(
[f"-j{task_count}", "all"],
prefix_args=["taskset", "-c", "0"],
)
max_overlaps = compute_max_overlapped_spans(b.path, task_count)
self.assertEqual(max_overlaps, task_count)

def _test_MAKEFLAGS_value(
self, ninja_args: T.List[str] = [], prefix_args: T.List[str] = []
):
6 changes: 5 additions & 1 deletion src/build.h
@@ -184,8 +184,12 @@ struct BuildConfig {
};
Verbosity verbosity = NORMAL;
bool dry_run = false;
/// Number of concurrent jobs, auto-detected or specified explicitly.
int parallelism = 1;
bool disable_jobserver_client = false;
/// True if -j<count> was used on the command line.
bool explicit_parallelism = false;
/// True if --jobserver-pool was used on the command line.
bool jobserver_pool = false;
int failures_allowed = 1;
/// The maximum load average we must not exceed. A negative value
/// means that we do not have any limit.