Member

@gpshead gpshead commented Nov 10, 2022

This adds authentication to the forkserver control socket. In the past, only filesystem permissions protected this socket from code injection into the forkserver process, by limiting access to processes running as the same UID. That protection didn't exist when Linux abstract namespace sockets were used (see issue), meaning that any process in the same system network namespace could inject code. We've since stopped using abstract namespace sockets by default, but protecting our control sockets regardless of type seems desirable.

This reuses the HMAC-based shared key auth already used by multiprocessing.connection sockets for other purposes.

Doing this is useful so that filesystem permissions are not relied upon and trust isn't implied by default between all processes running as the same UID with access to the unix socket.
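
For readers unfamiliar with the shared key auth mentioned above, here is a minimal sketch of an HMAC challenge/response using the documented `deliver_challenge` and `answer_challenge` helpers from `multiprocessing.connection`. It runs both ends over an in-process `Pipe` with a thread standing in for the peer; the `handshake` function name and this setup are illustrative assumptions, not the code added to forkserver by this PR.

```python
import threading
from multiprocessing import AuthenticationError, Pipe
from multiprocessing.connection import answer_challenge, deliver_challenge


def handshake(server_key: bytes, client_key: bytes) -> bool:
    """Return True if the peer proves knowledge of the server's key.

    Hypothetical helper for illustration only: one end issues a random
    challenge, the other answers with an HMAC digest computed from its key.
    """
    server_end, client_end = Pipe()  # duplex pipe stands in for the socket

    def client():
        try:
            answer_challenge(client_end, client_key)
        except AuthenticationError:
            pass  # the server rejected our digest

    peer = threading.Thread(target=client)
    peer.start()
    try:
        deliver_challenge(server_end, server_key)
        accepted = True
    except AuthenticationError:
        accepted = False
    peer.join()
    server_end.close()
    client_end.close()
    return accepted


if __name__ == "__main__":
    print(handshake(b"shared secret", b"shared secret"))
    print(handshake(b"shared secret", b"some other key"))
```

The key itself never crosses the connection; only the random challenge and the digest do, which is why a process that can reach the socket but lacks the key is rejected.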

Tasks remaining

  • Clean up the file descriptor leak from the new tests.
  • Microbenchmarking
  • Q: Decide if this needs an off switch. A: Nope. If we find a reason to during the betas before 3.14, we can add one. Rationale: this change is in the noise for most uses. multiprocessing.Pool worker processes are long-lived by default, so spawning new ones via the start method is infrequent compared to the number of tasks they are given. Only applications using maxtasksperchild= with a very low value might be able to notice, but even then the amount of work done in a worker process should far exceed any additional overhead this security measure adds to requesting forkserver to spawn new processes.
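
The amortization argument in the rationale can be made concrete with a rough calculation. The sketch below takes the Zen 4 forkserver throughput figures reported in the microbenchmarks (888.99 and 873.57 Procs/sec) and converts them into an added cost per spawn. The 10 ms task duration is a made-up assumption for illustration, and because the benchmark keeps several processes active at once, the result is an aggregate estimate rather than a true single-spawn latency.

```python
def extra_cost_per_spawn(before_rate: float, after_rate: float) -> float:
    """Added wall-clock seconds per spawned process at the measured rates."""
    return 1.0 / after_rate - 1.0 / before_rate


def overhead_fraction(extra_seconds: float, task_seconds: float,
                      tasks_per_child: int) -> float:
    """Added spawn cost relative to the useful work one worker performs."""
    return extra_seconds / (task_seconds * tasks_per_child)


# Throughput figures from the Zen 4 runs (main branch vs. this PR).
extra = extra_cost_per_spawn(888.99, 873.57)

# Hypothetical workload: maxtasksperchild=1 with a 10 ms task.
worst_case = overhead_fraction(extra, task_seconds=0.010, tasks_per_child=1)

print(f"added cost per spawn: {extra * 1e6:.1f} microseconds")
print(f"overhead with one 10 ms task per worker: {worst_case:.2%}")
```

Under these assumptions the added cost is about 20 microseconds per spawn, or roughly 0.2% of a single 10 ms task, and it shrinks further as `maxtasksperchild` grows.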

pyperformance benchmarks

No significant changes, including in concurrent_imap, which exercises multiprocessing.Pool.imap in that suite.

Microbenchmarks

This does slightly slow down forkserver use. How much appears to depend on the platform: modern platforms and simple platforms are less impacted. This PR adds IPC round trips to the control socket each time forkserver is told to spawn a new process, so systems with potentially high-latency IPC are naturally impacted more.
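
To get a feel for what one IPC round trip costs on a given machine, a minimal sketch like the following can help. It times a one-byte echo over `socket.socketpair()` with an echo thread in the same process. The `average_round_trip` name is hypothetical, and a same-process thread is only a stand-in for the real cross-process control socket, so treat the number as a rough lower bound rather than a measurement of forkserver itself.

```python
import socket
import threading
import time


def average_round_trip(count: int = 1000) -> float:
    """Return the mean seconds per request/response over a socket pair."""
    near, far = socket.socketpair()

    def echo():
        # Send back whatever arrives until the other end closes.
        with far:
            while True:
                data = far.recv(64)
                if not data:
                    break
                far.sendall(data)

    worker = threading.Thread(target=echo, daemon=True)
    worker.start()
    with near:
        start = time.perf_counter()
        for _ in range(count):
            near.sendall(b"x")
            near.recv(64)
        elapsed = time.perf_counter() - start
    worker.join()
    return elapsed / count


if __name__ == "__main__":
    print(f"{average_round_trip() * 1e6:.1f} microseconds per round trip")
```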

Using my multiprocessing process-creation-benchmark.py:

I switched between this PR branch and main via a simple git checkout after my build; the changes are pure Python, so no rebuild is needed.
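
The benchmark script itself is not reproduced in this conversation. As a rough idea of what such a measurement involves, here is a minimal sketch that starts no-op processes with a chosen start method while capping how many are alive at once. All function names and defaults are assumptions made for this example, and it is not the script that produced the numbers below.

```python
import multiprocessing as mp
import statistics
import time


def _noop():
    """Child body: return immediately so mostly creation cost is timed."""


def procs_per_sec(ctx, total, max_active):
    """Start `total` no-op processes, keeping at most `max_active` alive."""
    active = []
    start = time.perf_counter()
    for _ in range(total):
        if len(active) >= max_active:
            active.pop(0).join()  # wait for the oldest child before adding one
        proc = ctx.Process(target=_noop)
        proc.start()
        active.append(proc)
    for proc in active:
        proc.join()
    return total / (time.perf_counter() - start)


def summarize(rates):
    """Return (mean, sample standard deviation) of per-iteration rates."""
    spread = statistics.stdev(rates) if len(rates) > 1 else 0.0
    return statistics.mean(rates), spread


def main(method="forkserver", total=32, iterations=3, max_active=3):
    ctx = mp.get_context(method)
    rates = [procs_per_sec(ctx, total, max_active) for _ in range(iterations)]
    mean, spread = summarize(rates)
    print(f"{method}: {total} procs, {mean:.2f} procs/sec (stdev {spread:.2f})")


if __name__ == "__main__":
    main()
```

Note that with forkserver the first iteration also pays for starting the server process itself, which is one reason small totals tend to show a lower rate and a larger standard deviation.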

On an AMD Zen 4 system:

889 Procs/sec dropped to 874, about 1.7% slower. Insignificant.

AMD 7800X3D, single CCD, 8 cores.

% ../b/python process-creation-benchmark.py 5 forkserver
Process Creation Microbenchmark (max 7 active processes) (5 iterations)
multiprocessing start method: forkserver
sys.version='3.14.0a1+ (~main branch~, Nov 10 2024) [GCC 13.2.0]'
--------------------------------------------------------------------------------
Total    Procs/sec   Time (s)     StdDev
--------------------------------------------------------------------------------
32          666.77      0.049     108.39
128         831.09      0.154      44.05
384         887.16      0.433       9.27
1024        886.02      1.156       1.37
2048        888.99      2.304       2.76

% ./b/python ~/Downloads/process-creation-benchmark.py 5 forkserver
Process Creation Microbenchmark (max 7 active processes) (5 iterations)
multiprocessing start method: forkserver
sys.version='3.14.0a1+ (heads/security/multiprocessing-forkserver-authkey-dirty:07c01d459f8, Nov 10 2024) [GCC 13.2.0]'
--------------------------------------------------------------------------------
Total    Procs/sec   Time (s)     StdDev
--------------------------------------------------------------------------------
32          640.53      0.052     130.27
128         809.62      0.158      38.79
384         867.22      0.443       7.66
1024        873.75      1.172       2.76
2048        873.57      2.344       2.85
Expand for baseline fork (2659 Procs/sec) and spawn (268) measurements.
% ../b/python ~/Downloads/process-creation-benchmark.py 13 fork
Process Creation Microbenchmark (max 7 active processes) (13 iterations)
multiprocessing start method: fork
sys.version='3.14.0a1+ (heads/security/multiprocessing-forkserver-authkey-dirty:07c01d459f8, Nov 10 2024) [GCC 13.2.0]'
--------------------------------------------------------------------------------
Total    Procs/sec   Time (s)     StdDev
--------------------------------------------------------------------------------
32        2,300.78      0.014      78.91
128       2,391.11      0.054     114.68
384       2,650.31      0.145      13.23
1024      2,646.28      0.387      16.47
2048      2,641.08      0.775      13.65
5120      2,659.42      1.925      11.82
% ../b/python ~/Downloads/process-creation-benchmark.py 13 spawn
Process Creation Microbenchmark (max 7 active processes) (13 iterations)
multiprocessing start method: spawn
sys.version='3.14.0a1+ (heads/security/multiprocessing-forkserver-authkey-dirty:07c01d459f8, Nov 10 2024) [GCC 13.2.0]'
--------------------------------------------------------------------------------
Total    Procs/sec   Time (s)     StdDev
--------------------------------------------------------------------------------
32          235.96      0.136      13.91
128         259.53      0.493       0.79
384         267.62      1.435       1.00
1024        267.89      3.822       0.35

On an Intel Broadwell Xeon E5-2698 v4 system:

828 Procs/sec dropped to 717, ~13% slower. Significant. BUT... when I dropped the active processes from 19 to 9, the difference was far smaller: 414 dropped to 398, ~4% slower. Moderate.

20 cores, 2 ring buses, 4 memory controllers, single socket. A large-die Broadwell Xeon is complicated. At high parallelism counts, interprocess communication latencies add up. I predict similar results from multi-core-complex-die Zen/EPYC parts and multi-socket systems, and probably also on big.LITTLE mixed power/perf core arrangements.

% ../b/python ~/process-creation-benchmark.py 13 forkserver
Process Creation Microbenchmark (max 19 active processes) (13 iterations)
multiprocessing start method: forkserver
sys.version='3.14.0a1+ (~main branch~, Nov 10 2024) [GCC 13.2.0]'
--------------------------------------------------------------------------------
Total    Procs/sec   Time (s)     StdDev
--------------------------------------------------------------------------------
32          535.14      0.062      77.23
128         735.49      0.174       6.53
384         798.69      0.481       4.43
1024        820.84      1.248       1.90
2048        827.63      2.475       4.31

% ../b/python ~/process-creation-benchmark.py 13 forkserver
Process Creation Microbenchmark (max 19 active processes) (13 iterations)
multiprocessing start method: forkserver
sys.version='3.14.0a1+ (heads/security/multiprocessing-forkserver-authkey-dirty:07c01d459f8, Nov 10 2024) [GCC 13.2.0]'
--------------------------------------------------------------------------------
Total    Procs/sec   Time (s)     StdDev
--------------------------------------------------------------------------------
32          449.24      0.073      63.19
128         614.39      0.208      16.66
384         668.49      0.575      11.36
1024        716.77      1.430      18.10
2048        716.73      2.858      13.12
Expand for baseline fork (1265 Procs/sec) and spawn (233) measurements.
% ../b/python ~/process-creation-benchmark.py 13 fork
Process Creation Microbenchmark (max 19 active processes) (13 iterations)
multiprocessing start method: fork
sys.version='3.14.0a1+ (heads/security/multiprocessing-forkserver-authkey-dirty:07c01d459f8, Nov 10 2024) [GCC 13.2.0]'
--------------------------------------------------------------------------------
Total    Procs/sec   Time (s)     StdDev
--------------------------------------------------------------------------------
32        1,241.39      0.026      51.43
128       1,259.44      0.102       5.01
384       1,254.59      0.306       3.86
1024      1,258.45      0.814       6.77
2048      1,265.48      1.618       8.34
% ./b/python ~/process-creation-benchmark.py 13 spawn
Process Creation Microbenchmark (max 19 active processes) (13 iterations)
multiprocessing start method: spawn
sys.version='3.14.0a1+ (heads/security/multiprocessing-forkserver-authkey-dirty:07c01d459f8, Nov 10 2024) [GCC 13.2.0]'
--------------------------------------------------------------------------------
Total    Procs/sec   Time (s)     StdDev
--------------------------------------------------------------------------------
32          188.08      0.170       2.58
128         221.20      0.579       0.75
384         227.56      1.687       0.85
1024        233.34      4.388       0.54

On a Raspberry Pi 5:

126 Procs/sec dropped to 121, a ~4% slowdown. Moderate.

Raspberry Pi 5 running 32-bit Raspbian.

% ./python ../process-creation-benchmark.py 
Process Creation Microbenchmark (max 3 active processes) (5 iterations)
multiprocessing start method: forkserver
sys.version='3.14.0a1+ (~main branch~, Nov 10 2024, 19:06:56) [GCC 12.2.0]'
--------------------------------------------------------------------------------
Total    Procs/sec   Time (s)     StdDev
--------------------------------------------------------------------------------
32          121.23      0.266       9.82
128         125.45      1.020       0.84
384         125.71      3.055       0.27

% ./python ../process-creation-benchmark.py 
Process Creation Microbenchmark (max 3 active processes) (5 iterations)
multiprocessing start method: forkserver
sys.version='3.14.0a1+ (heads/security/multiprocessing-forkserver-authkey:07c01d4, Nov 10 2024, 19:06:56) [GCC 12.2.0]'
--------------------------------------------------------------------------------
Total    Procs/sec   Time (s)     StdDev
--------------------------------------------------------------------------------
32          114.57      0.281      10.29
128         119.70      1.069       0.28
384         120.84      3.178       0.41
Expand for baseline fork (973 Procs/sec) and spawn (32) measurements.
% ./python ../process-creation-benchmark.py 5 fork
Process Creation Microbenchmark (max 3 active processes) (5 iterations)
multiprocessing start method: fork
sys.version='3.14.0a1+ (~main branch~, Nov 10 2024, 19:06:56) [GCC 12.2.0]'
--------------------------------------------------------------------------------
Total    Procs/sec   Time (s)     StdDev
--------------------------------------------------------------------------------
32          933.01      0.034      44.03
128         973.00      0.132       1.33
384         968.48      0.396       1.55
1024        972.78      1.053       0.77
% ./python ../process-creation-benchmark.py 5 spawn
Process Creation Microbenchmark (max 3 active processes) (5 iterations)
multiprocessing start method: spawn
sys.version='3.14.0a1+ (~main branch~, Nov 10 2024, 19:06:56) [GCC 12.2.0]'
--------------------------------------------------------------------------------
Total    Procs/sec   Time (s)     StdDev
--------------------------------------------------------------------------------
32           31.97      1.001       0.12
128          32.46      3.943       0.02

This adds authentication. In the past only filesystem permissions
protected this socket from code injection into the forkserver process by
limiting access to the same UID, which didn't exist when Linux abstract
namespace sockets were used (see issue) meaning that any process in the
same system network namespace could inject code.

This reuses the hmac based shared key auth already used on
multiprocessing sockets used for other purposes.

Doing this is useful so that filesystem permissions are not relied upon
and trust isn't implied by default between all processes running as the
same UID.
@gpshead gpshead added type-feature A feature request or enhancement 3.12 only security fixes topic-multiprocessing labels Nov 10, 2022
@gpshead gpshead self-assigned this Nov 10, 2022
@gpshead gpshead changed the title gh-97514: Authenticate the forkserver control socket. gh-97514: [3.12+] Authenticate the forkserver control socket. Nov 10, 2022
@gpshead gpshead added the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label Nov 11, 2022
@bedevere-bot

🤖 New build scheduled with the buildbot fleet by @gpshead for commit c83193d 🤖

If you want to schedule another build, you need to add the ":hammer: test-with-buildbots" label again.

@bedevere-bot bedevere-bot removed the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label Nov 11, 2022
@gpshead
Member Author

gpshead commented Nov 11, 2022

From the buildbots: the tests leak some file descriptors. Not too surprising given the bit of code the test pokes into. I'll see what can be done to manage those.

I can't add new testcases to test_multiprocessing_forkserver itself; I
had to put them within an existing _test_multiprocessing test class.  I
don't know why, but refleaks are fragile and that test suite is...
ridiculously complicated with all that it does.
I'm not sure _why_ the hang happened: the forkserver process wasn't exiting when
the alive_w fd was closed in the parent during tearDownModule(); instead it remained
in its selector() loop.  Regardless, removing this part of the test fixes it, and
it only happened on macOS.
@gpshead gpshead added the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label Nov 13, 2022
@bedevere-bot

🤖 New build scheduled with the buildbot fleet by @gpshead for commit ca47b6f 🤖

If you want to schedule another build, you need to add the ":hammer: test-with-buildbots" label again.

@bedevere-bot bedevere-bot removed the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label Nov 13, 2022
@gpshead gpshead requested a review from ambv December 11, 2022 00:09
@gpshead gpshead added 3.13 bugs and security fixes and removed 3.12 only security fixes labels Jun 21, 2023
@gpshead gpshead changed the title gh-97514: [3.12+] Authenticate the forkserver control socket. gh-97514: Authenticate the forkserver control socket. Jun 21, 2023
@gpshead gpshead removed the 3.13 bugs and security fixes label Sep 24, 2024
Member

@ericsnowcurrently ericsnowcurrently left a comment

This mostly looks good. The tests helped me understand a bit better. Sorry if some of my comments demonstrate ignorance about the multiprocessing implementation. Thanks for working on this.

@gpshead gpshead added the 🔨 test-with-refleak-buildbots Test PR w/ refleak buildbots; report in status section label Nov 10, 2024
@bedevere-bot

🤖 New build scheduled with the buildbot fleet by @gpshead for commit 6bb9db4 🤖

If you want to schedule another build, you need to add the 🔨 test-with-refleak-buildbots label again.

@bedevere-bot bedevere-bot removed the 🔨 test-with-refleak-buildbots Test PR w/ refleak buildbots; report in status section label Nov 10, 2024
@gpshead gpshead added 🔨 test-with-buildbots Test PR w/ buildbots; report in status section 🔨 test-with-refleak-buildbots Test PR w/ refleak buildbots; report in status section labels Nov 10, 2024
@bedevere-bot

🤖 New build scheduled with the buildbot fleet by @gpshead for commit 07c01d4 🤖

If you want to schedule another build, you need to add the 🔨 test-with-refleak-buildbots label again.

@bedevere-bot

🤖 New build scheduled with the buildbot fleet by @gpshead for commit 07c01d4 🤖

If you want to schedule another build, you need to add the 🔨 test-with-buildbots label again.

@bedevere-bot bedevere-bot removed 🔨 test-with-buildbots Test PR w/ buildbots; report in status section 🔨 test-with-refleak-buildbots Test PR w/ refleak buildbots; report in status section labels Nov 10, 2024
@gpshead gpshead added the type-security A security issue label Nov 10, 2024
Member

@picnixz picnixz left a comment

Looks good! thanks for addressing my comments.

Member

@ericsnowcurrently ericsnowcurrently left a comment

LGTM

@gpshead gpshead merged commit 7191b76 into python:main Nov 20, 2024
39 of 40 checks passed
@gpshead gpshead deleted the security/multiprocessing-forkserver-authkey branch November 20, 2024 16:19
ebonnal pushed a commit to ebonnal/cpython that referenced this pull request Jan 12, 2025
…-99309)

This adds authentication to the forkserver control socket. In the past only filesystem permissions protected this socket from code injection into the forkserver process by limiting access to the same UID, which didn't exist when Linux abstract namespace sockets were used (see issue) meaning that any process in the same system network namespace could inject code. We've since stopped using abstract namespace sockets by default, but protecting our control sockets regardless of type is a good idea.

This reuses the HMAC based shared key auth already used by `multiprocessing.connection` sockets for other purposes.

Doing this is useful so that filesystem permissions are not relied upon and trust isn't implied by default between all processes running as the same UID with access to the unix socket.

### pyperformance benchmarks

No significant changes. Including `concurrent_imap` which exercises `multiprocessing.Pool.imap` in that suite.

### Microbenchmarks

This does _slightly_ slow down forkserver use. How much so appears to depend on the platform. Modern platforms and simple platforms are less impacted. This PR adds additional IPC round trips to the control socket to tell forkserver to spawn a new process. Systems with potentially high latency IPC are naturally impacted more.

Typically a 1-4% slowdown on a very targeted process creation microbenchmark, with a worst case overloaded system slowdown of 20%.  There is no evidence that these slowdowns appear in practice.  See the PR for details.
Labels

3.14 bugs and security fixes topic-multiprocessing type-feature A feature request or enhancement type-security A security issue

4 participants