mitogen: Fix non-blocking IO errors in first stage of bootstrap #1307
|
Nothing in the test suite is currently making the first stage stdin non-blocking, based on 0a247a3 passing in https://github.com/mitogen-hq/mitogen/actions/runs/16418222551. |
|
Adding "Defaults log_output" to /etc/sudoers didn't reproduce it (based on 39b78df and https://github.com/mitogen-hq/mitogen/actions/runs/16419018956). |
|
Now have a failing test in 20e4be0 to form the basis of a regression test |
|
Using >>> import select, time
>>> f = open('/dev/null', 'r')
>>> f.read(1000)
''
>>> import time; time.time(); select.select([f], [], []); time.time()
1755249587.856334
([<_io.TextIOWrapper name='/dev/null' mode='r' encoding='UTF-8'>], [], [])
1755249587.856464 |
|
The first attempt passing all tests is 4ef8c9c. I switched the failing test from /dev/null to /dev/zero. |
We were not raising CalledProcessError when exit status != 0.
Prep for reusing it in non-Ansible tests
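As a rough illustration of the behaviour described in that commit, a test helper that runs a command can check the exit status itself and raise `subprocess.CalledProcessError` when it is non-zero. This is only a sketch: `run_checked` is a hypothetical name, not the helper used in the Mitogen test suite.

```python
import subprocess


def run_checked(argv):
    """Run argv, return its stdout, raise CalledProcessError on failure.

    Hypothetical helper for illustration only.
    """
    proc = subprocess.Popen(
        argv, stdout=subprocess.PIPE, stderr=subprocess.PIPE,
    )
    stdout, stderr = proc.communicate()
    if proc.returncode != 0:
        # Without this check a failing command would go unnoticed.
        raise subprocess.CalledProcessError(
            proc.returncode, argv, stdout, stderr,
        )
    return stdout
```

Raising rather than returning the status means a caller that forgets to inspect the result still sees the failure.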
Before
./preamble_size.py
SSH command size: 759
Bootstrap (mitogen.core) size: 18227 (17.80KiB)
                Original         Minimized             Compressed
mitogen.parent   98853  96.5KiB   51103 49.9KiB 51.7%  12881 12.6KiB 13.0%
mitogen.fork      8445   8.2KiB    4139  4.0KiB 49.0%   1652  1.6KiB 19.6%
mitogen.ssh      10827  10.6KiB    6893  6.7KiB 63.7%   2099  2.0KiB 19.4%
mitogen.sudo     12089  11.8KiB    5924  5.8KiB 49.0%   2249  2.2KiB 18.6%
mitogen.select   12325  12.0KiB    2929  2.9KiB 23.8%    964  0.9KiB  7.8%
mitogen.service  41581  40.6KiB   22398 21.9KiB 53.9%   5847  5.7KiB 14.1%
mitogen.fakessh  15767  15.4KiB    8149  8.0KiB 51.7%   2676  2.6KiB 17.0%
mitogen.master   55317  54.0KiB   28846 28.2KiB 52.1%   7528  7.4KiB 13.6%
After:
SSH command size: 759
Bootstrap (mitogen.core) size: 18227 (17.80KiB)
                Original         Minimized             Compressed
mitogen.parent   98853  96.5KiB   51103 49.9KiB 51.7%  12881 12.6KiB 13.0%
mitogen.fork      8445   8.2KiB    4139  4.0KiB 49.0%   1652  1.6KiB 19.6%
mitogen.ssh      10827  10.6KiB    6893  6.7KiB 63.7%   2099  2.0KiB 19.4%
mitogen.sudo     12089  11.8KiB    5924  5.8KiB 49.0%   2249  2.2KiB 18.6%
mitogen.select   12325  12.0KiB    2929  2.9KiB 23.8%    964  0.9KiB  7.8%
mitogen.service  41581  40.6KiB   22398 21.9KiB 53.9%   5847  5.7KiB 14.1%
mitogen.fakessh  15767  15.4KiB    8149  8.0KiB 51.7%   2676  2.6KiB 17.0%
mitogen.master   55317  54.0KiB   28846 28.2KiB 52.1%   7528  7.4KiB 13.6%
After:
SSH command size: 759
Preamble (mitogen.core + econtext) size: 18227 (17.80KiB)
                Original         Minimized             Compressed
mitogen.core    152218 148.7KiB   68437 66.8KiB 45.0%  18124 17.7KiB 11.9%
mitogen.parent   98853  96.5KiB   51103 49.9KiB 51.7%  12881 12.6KiB 13.0%
mitogen.fork      8445   8.2KiB    4139  4.0KiB 49.0%   1652  1.6KiB 19.6%
mitogen.ssh      10827  10.6KiB    6893  6.7KiB 63.7%   2099  2.0KiB 19.4%
mitogen.sudo     12089  11.8KiB    5924  5.8KiB 49.0%   2249  2.2KiB 18.6%
mitogen.select   12325  12.0KiB    2929  2.9KiB 23.8%    964  0.9KiB  7.8%
mitogen.service  41581  40.6KiB   22398 21.9KiB 53.9%   5847  5.7KiB 14.1%
mitogen.fakessh  15767  15.4KiB    8149  8.0KiB 51.7%   2676  2.6KiB 17.0%
mitogen.master   55317  54.0KiB   28846 28.2KiB 52.1%   7528  7.4KiB 13.6%
Previously the command size could vary depending on the current username, hostname, and process pid.
Before
```
SSH command size: 759
Preamble (mitogen.core + econtext) size: 18227 (17.80KiB)
...
```
After
```
SSH command size: 755
Preamble (mitogen.core + econtext) size: 18227 (17.80KiB)
...
```
This is mainly for peace of mind. With all this non-blocking IO investigation I'm getting a bit paranoid wrt file objects. refs mitogen-hq#712
When /etc/sudoers has log_output (or similar) enabled, the process spawned by
`ctx.sudo()` via `mitogen.parent.Connection.start_child()` receives a stdin
that is in non-blocking mode. The immediate symptom is that `os.fdopen(0,
...).read(n)` sometimes returns `None`, causing the first stage to raise an
unhandled TypeError.
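
The symptom can be reproduced in isolation under Python 3 without sudo. The sketch below is an assumption-laden stand-in, not the first stage itself: it uses an empty pipe in place of stdin and a hypothetical helper name. An unbuffered `read()` on a non-blocking descriptor with no data ready returns `None`, and treating that as bytes raises `TypeError`.

```python
import os


def read_from_empty_nonblocking_pipe(n=10):
    """Return what an unbuffered read() yields when no data is ready."""
    r, w = os.pipe()
    try:
        os.set_blocking(r, False)          # Unix, Python 3.5+
        # buffering=0 gives a raw FileIO object, as in fdopen(0, 'rb', 0)
        with os.fdopen(r, 'rb', 0) as f:
            return f.read(n)
    finally:
        os.close(w)


data = read_from_empty_nonblocking_pipe()
print(repr(data))                          # no data was ready
try:
    b'' + data                             # treating the result as bytes
except TypeError as exc:
    print('TypeError:', exc)
```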
The fix (for now) is to use `select.select()` in a while loop to read stdin.
This increases the command size slightly, but I think it's a reasonable
tradeoff until/unless the cause is more fully understood.
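
The general shape of such a loop can be sketched as follows. This is not the code in the bootstrap command, which is written to be as small as possible; `read_exactly` is a hypothetical name and the error handling is an assumption made for the example.

```python
import os
import select


def read_exactly(fd, n):
    """Read exactly n bytes from fd, even if fd is non-blocking."""
    chunks = []
    remaining = n
    while remaining > 0:
        # Block here until the descriptor reports readable data or EOF.
        select.select([fd], [], [])
        try:
            chunk = os.read(fd, remaining)
        except BlockingIOError:
            # Readiness can be spurious; wait again.
            continue
        if not chunk:
            raise EOFError('stream closed with %d bytes missing' % remaining)
        chunks.append(chunk)
        remaining -= len(chunk)
    return b''.join(chunks)
```

Waiting in `select.select()` before each read means the loop behaves the same whether the descriptor is blocking or not, and short reads are accumulated rather than treated as complete.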
All CI tests are now run with sudoers log_output enabled, in order to catch
regressions. `first_stage_test.CommandLineTest` has been amended, because it
relied on implementation details of the bootstrap process that are no longer
true.
Before
```
SSH command size: 755
Preamble (mitogen.core + econtext) size: 18227 (17.80KiB)
                Original         Minimized             Compressed
mitogen.core    152218 148.7KiB   68437 66.8KiB 45.0%  18124 17.7KiB 11.9%
mitogen.parent   98853  96.5KiB   51103 49.9KiB 51.7%  12881 12.6KiB 13.0%
mitogen.fork      8445   8.2KiB    4139  4.0KiB 49.0%   1652  1.6KiB 19.6%
mitogen.ssh      10827  10.6KiB    6893  6.7KiB 63.7%   2099  2.0KiB 19.4%
mitogen.sudo     12089  11.8KiB    5924  5.8KiB 49.0%   2249  2.2KiB 18.6%
mitogen.select   12325  12.0KiB    2929  2.9KiB 23.8%    964  0.9KiB  7.8%
mitogen.service  41581  40.6KiB   22398 21.9KiB 53.9%   5847  5.7KiB 14.1%
mitogen.fakessh  15767  15.4KiB    8149  8.0KiB 51.7%   2676  2.6KiB 17.0%
mitogen.master   55317  54.0KiB   28846 28.2KiB 52.1%   7528  7.4KiB 13.6%
```
After
```
SSH command size: 798
Preamble (mitogen.core + econtext) size: 18227 (17.80KiB)
                Original         Minimized             Compressed
mitogen.core    152218 148.7KiB   68437 66.8KiB 45.0%  18124 17.7KiB 11.9%
mitogen.parent   98944  96.6KiB   51180 50.0KiB 51.7%  12910 12.6KiB 13.0%
mitogen.fork      8445   8.2KiB    4139  4.0KiB 49.0%   1652  1.6KiB 19.6%
mitogen.ssh      10827  10.6KiB    6893  6.7KiB 63.7%   2099  2.0KiB 19.4%
mitogen.sudo     12089  11.8KiB    5924  5.8KiB 49.0%   2249  2.2KiB 18.6%
mitogen.select   12325  12.0KiB    2929  2.9KiB 23.8%    964  0.9KiB  7.8%
mitogen.service  41581  40.6KiB   22398 21.9KiB 53.9%   5847  5.7KiB 14.1%
mitogen.fakessh  15767  15.4KiB    8149  8.0KiB 51.7%   2676  2.6KiB 17.0%
mitogen.master   55317  54.0KiB   28846 28.2KiB 52.1%   7528  7.4KiB 13.6%
```
|
Rebased on 0.3.28dev |
When /etc/sudoers has log_output (or similar) enabled, the process spawned by `ctx.sudo()` via `mitogen.parent.Connection.start_child()` receives a stdin that is in non-blocking mode. The immediate symptom is that `os.fdopen(0, ...).read(n)` sometimes returns `None`, causing the first stage to raise an unhandled `TypeError`.

The fix (for now) is to use `select.select()` in a while loop to read stdin. This increases the command size slightly, but I think it's a reasonable tradeoff until/unless the cause is more fully understood.

All CI tests are now run with sudoers log_output enabled, in order to catch regressions. `first_stage_test.CommandLineTest` has been amended, because it relied on implementation details of the bootstrap process that are no longer true.

Fixes #1306, thanks to @Forgetyk for their detailed error reporting, diagnosis, and initial PR #1299.