-
-
Notifications
You must be signed in to change notification settings - Fork 99
Fix race conditions and improve robustness during socket I/O #779
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
887399d
to
4f1662e
Compare
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #779 +/- ##
==========================================
- Coverage 79.25% 79.19% -0.06%
==========================================
Files 29 29
Lines 4203 4292 +89
Branches 539 544 +5
==========================================
+ Hits 3331 3399 +68
- Misses 728 747 +19
- Partials 144 146 +2 |
4f1662e
to
4833dac
Compare
048d898
to
f0471ca
Compare
1af9bf3
to
5f1d7f7
Compare
cheroot/makefile.py
Outdated
except OSError as sock_err: | ||
error_code = sock_err.errno | ||
if error_code in _errors.acceptable_sock_shutdown_error_codes: | ||
# The socket is gone, so just ignore this error. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thinking of the refactoring being made with exception hierarchy, this would also be replaced.
I wonder, however, if it's right to ignore the problem here. I think that if the connection is broken, such an exception should bubble up to the layer where the reset of the data writing context exists. This would probably be the caller. And not just write()
but whatever attempts writing. That layer would know how it needs to handle connection errors.
If we just suppress them, then write()
would pretend to have written something into the socket successfully.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. Maybe just remove this branch then and let the errors rise.
cheroot/server.py
Outdated
self.conn.wfile.write(chunk) | ||
data = chunk | ||
|
||
with contextlib.suppress(ConnectionError): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking at the layering, I think that something up the stack should process this exception, not a low-ish level write method. What calls it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking deeper, I see that _process_connections_until_interrupted()
in thread pool is already set up to process connection errors. So this can just be dropped.
If an network error occurrs anytime while processing a single request, there's nothing to do but just log it. And such things should be happening on the top level. Let the connection errors bubble up all the way to that layer and don't suppress them.
The only thing we need to make sure of is that any network errors that present themselves in other ways, are turned into one of the specific connection error exceptions.
chunk_size_hex = hex(len(chunk))[2:].encode('ascii') | ||
buf = [chunk_size_hex, CRLF, chunk, CRLF] | ||
self.conn.wfile.write(EMPTY.join(buf)) | ||
data = EMPTY.join(buf) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
data = EMPTY.join(buf) | |
self.conn.wfile.write(EMPTY.join(buf)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I saw a few places in this module where socket.error
/ OSError
is being suppressed. I haven't checked them all but it seems they need to be unshielded too so that the connection error bubbles to the top properly.
cheroot/ssl/pyopenssl.py
Outdated
"""Handle SysCallError during close/shutdown.""" | ||
try: | ||
# Call the proxied method (e.g., self._ssl_conn.close()) | ||
return getattr(self._ssl_conn, method_name)() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's just move calling self._ssl_conn.close()
/ self._ssl_conn.shutdown()
into the methods above. And put the below exception handling/conversion into a decorator. It's quite easy to do with @contextlib.contextmanager
.
cheroot/ssl/pyopenssl.py
Outdated
) | ||
conn_err_cls = connection_error_map.get( | ||
error_code, | ||
errors.CherootConnectionError, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missed this place.
cheroot/ssl/pyopenssl.py
Outdated
def close(self): | ||
"""Close the connection, translating OpenSSL errors for shutdown.""" | ||
self._lock.acquire() | ||
try: | ||
return self._safe_close_call('close') | ||
finally: | ||
self._lock.release() | ||
|
||
def shutdown(self): | ||
"""Shutdown the connection, translating OpenSSL errors.""" | ||
self._lock.acquire() | ||
try: | ||
return self._safe_close_call('shutdown') | ||
finally: | ||
self._lock.release() | ||
|
||
def _safe_close_call(self, method_name): | ||
"""Handle SysCallError during close/shutdown.""" | ||
try: | ||
# Call the proxied method (e.g., self._ssl_conn.close()) | ||
return getattr(self._ssl_conn, method_name)() | ||
except SSL.SysCallError as ssl_syscall_err: | ||
connection_error_map = { | ||
errno.EBADF: ConnectionError, # socket is gone? | ||
errno.ECONNABORTED: ConnectionAbortedError, | ||
errno.ECONNREFUSED: ConnectionRefusedError, | ||
errno.ECONNRESET: ConnectionResetError, | ||
errno.ENOTCONN: ConnectionError, | ||
errno.EPIPE: BrokenPipeError, | ||
errno.ESHUTDOWN: BrokenPipeError, | ||
} | ||
error_code = ( | ||
ssl_syscall_err.args[0] if ssl_syscall_err.args else None | ||
) | ||
error_msg = ( | ||
os.strerror(error_code) | ||
if error_code is not None | ||
else repr(ssl_syscall_err) | ||
) | ||
conn_err_cls = connection_error_map.get( | ||
error_code, | ||
errors.CherootConnectionError, | ||
) | ||
raise conn_err_cls( | ||
error_code, | ||
f'Faied to {method_name!s} the PyOpenSSL connection: {error_msg!s}', | ||
) from ssl_syscall_err |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Something like this should do:
def close(self): | |
"""Close the connection, translating OpenSSL errors for shutdown.""" | |
self._lock.acquire() | |
try: | |
return self._safe_close_call('close') | |
finally: | |
self._lock.release() | |
def shutdown(self): | |
"""Shutdown the connection, translating OpenSSL errors.""" | |
self._lock.acquire() | |
try: | |
return self._safe_close_call('shutdown') | |
finally: | |
self._lock.release() | |
def _safe_close_call(self, method_name): | |
"""Handle SysCallError during close/shutdown.""" | |
try: | |
# Call the proxied method (e.g., self._ssl_conn.close()) | |
return getattr(self._ssl_conn, method_name)() | |
except SSL.SysCallError as ssl_syscall_err: | |
connection_error_map = { | |
errno.EBADF: ConnectionError, # socket is gone? | |
errno.ECONNABORTED: ConnectionAbortedError, | |
errno.ECONNREFUSED: ConnectionRefusedError, | |
errno.ECONNRESET: ConnectionResetError, | |
errno.ENOTCONN: ConnectionError, | |
errno.EPIPE: BrokenPipeError, | |
errno.ESHUTDOWN: BrokenPipeError, | |
} | |
error_code = ( | |
ssl_syscall_err.args[0] if ssl_syscall_err.args else None | |
) | |
error_msg = ( | |
os.strerror(error_code) | |
if error_code is not None | |
else repr(ssl_syscall_err) | |
) | |
conn_err_cls = connection_error_map.get( | |
error_code, | |
errors.CherootConnectionError, | |
) | |
raise conn_err_cls( | |
error_code, | |
f'Faied to {method_name!s} the PyOpenSSL connection: {error_msg!s}', | |
) from ssl_syscall_err | |
@_morph_syscall_to_connection_error('close') | |
def close(self): | |
"""Close the connection, translating OpenSSL errors for shutdown.""" | |
with self._lock: | |
return self._ssl_conn.close() | |
@_morph_syscall_to_connection_error('shutdown') | |
def shutdown(self): | |
"""Shutdown the connection, translating OpenSSL errors.""" | |
with self._lock: | |
return self._ssl_conn.shutdown() | |
@contextlib.contextmanager | |
def _morph_syscall_to_connection_error(method_name, /): | |
"""Handle :exc:`SSL.SysCallError` in a wrapped method.""" | |
try: | |
yield | |
except SSL.SysCallError as ssl_syscall_err: | |
connection_error_map = { | |
errno.EBADF: ConnectionError, # socket is gone? | |
errno.ECONNABORTED: ConnectionAbortedError, | |
errno.ECONNREFUSED: ConnectionRefusedError, | |
errno.ECONNRESET: ConnectionResetError, | |
errno.ENOTCONN: ConnectionError, | |
errno.EPIPE: BrokenPipeError, | |
errno.ESHUTDOWN: BrokenPipeError, | |
} | |
error_code = ( | |
ssl_syscall_err.args[0] if ssl_syscall_err.args else None | |
) | |
error_msg = ( | |
os.strerror(error_code) | |
if error_code is not None | |
else repr(ssl_syscall_err) | |
) | |
conn_err_cls = connection_error_map.get( | |
error_code, | |
ConnectionError, | |
) | |
raise conn_err_cls( | |
error_code, | |
f'Faied to {method_name!s} the PyOpenSSL connection: {error_msg!s}', | |
) from ssl_syscall_err | |
cheroot/makefile.py
Outdated
except OSError as sock_err: | ||
error_code = sock_err.errno | ||
if error_code in _errors.acceptable_sock_shutdown_error_codes: | ||
# The socket is gone, so just ignore this error. | ||
return | ||
raise | ||
else: | ||
# The 'try' block completed without an exception | ||
if n == 0: | ||
# This could happen with non-blocking write | ||
# when nothing was written | ||
break |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm now questioning the addition of this whole thing. Or at least the new except
block. Yet to hear your thoughts about the else
portion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've removed the OSError exception handling as I agree with your earlier point we should just let those errors rise. Regarding the else block - I've changed to a simple if statement to check for n == 0 to prevent an infinite loop.
@julianz- could you scan the modules more broadly for places where we could stop suppressing connection errors on the low level layers? I'm not looking into the tests until we figure out the architectural overview of the whole thing. But I feel like we're getting closer. I think we might need to split this PR into two at some point, will see. |
Thank you @webknjaz for all your great suggestions and points. I will take a look at everything. The layering is actually more complicated than I had realized. |
d9df8d9
to
b26544f
Compare
Fixes to make socket I/O more resilient during connection teardown. 1. BufferedWriter's write(): Added error handling to ignore common socket errors (e.g., ECONNRESET, EPIPE, ENOTCONN, EBADF) that occur when the underlying connection has been unexpectedly closed by the client or OS. This prevents a crash when attempting to write to a defunct socket. 2. BufferedWriters's close(): Made idempotent, allowing safe repeated calls without raising exceptions. 3. Needed to add explicit handling of WINDOWS environments as these are seen to throw Windows specific WSAENOTSOCK errors. Includes new unit tests to cover the idempotency and graceful handling of already closed underlying buffers.
b26544f
to
f7d8469
Compare
Fixed race conditions to make socket I/O more resilient during connection teardown.
write()
: Added error handling to ignore common socket errors (e.g.,ECONNRESET
,EPIPE
,ENOTCONN
,EBADF
) that occur when the underlying connection has been unexpectedly closed by the client or OS. This prevents a crash when attempting to write to a defunct socket.close()
: Made idempotent, allowing safe repeated calls without raising exceptions.WSAENOTSOCK
errors.Includes new unit tests to cover the idempotency and graceful handling of already closed underlying buffers.
This change is