fix: catch BaseException in cleanup loops to survive asyncio.CancelledError#11
fix: catch BaseException in cleanup loops to survive asyncio.CancelledError#11ramparte wants to merge 1 commit intomicrosoft:mainfrom
Conversation
…dError asyncio.CancelledError inherits from BaseException (not Exception) since Python 3.9. The coordinator cleanup loop and session cleanup used `except Exception`, which misses CancelledError. When cancellation arrives during shutdown, it escapes the try/except, aborts the for-loop, and all remaining module cleanup functions are skipped — causing "Unclosed client session" warnings from aiohttp in modules like tool-web and hooks-notify. Changes: - coordinator.cleanup(): except Exception → except BaseException - coordinator.collect_contributions(): same fix for consistency - session.cleanup(): wrap coordinator.cleanup() so loader.cleanup() always runs - session.execute(): except Exception → except BaseException so CancelledError reaches the cancellation status/event tracking code - session._safe_exception_str(): widen type to BaseException to match callers 🤖 Generated with [Amplifier](https://github.com/microsoft/amplifier) Co-Authored-By: Amplifier <240397093+microsoft-amplifier@users.noreply.github.com>
There was a problem hiding this comment.
Pull request overview
This PR fixes intermittent "Unclosed client session" warnings from aiohttp by ensuring cleanup functions execute even when asyncio.CancelledError is raised. Since Python 3.8+, CancelledError inherits from BaseException rather than Exception, so cleanup loops catching only Exception would abort when cancellation occurs.
Changes:
- Modified exception handling in cleanup and contribution collection loops to catch
BaseExceptioninstead ofException - Wrapped coordinator cleanup in session.cleanup() with exception handling to ensure loader cleanup always runs
- Updated
_safe_exception_str()type annotation to acceptBaseExceptionfor compatibility with new usage
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| amplifier_core/coordinator.py | Changed exception handlers in cleanup() and collect_contributions() from Exception to BaseException to survive asyncio.CancelledError |
| amplifier_core/session.py | Changed exception handler in execute() to catch BaseException, wrapped coordinator cleanup with exception handling, and updated _safe_exception_str() type annotation |
Comments suppressed due to low confidence (1)
amplifier_core/session.py:454
- Catching
BaseExceptionwithout re-raising critical exceptions likeKeyboardInterruptandSystemExitcan prevent proper program termination. While this change allows cancellation tracking code to execute whenasyncio.CancelledErroris raised, it also catches critical exceptions that should propagate immediately.
Since the code already re-raises the exception after logging, consider explicitly re-raising critical exceptions before the cancellation check:
except BaseException as e:
if isinstance(e, (KeyboardInterrupt, SystemExit)):
raise
# Check if this was a cancellation-related exception
if self.coordinator.cancellation.is_cancelled:
... except BaseException as e:
# Check if this was a cancellation-related exception
if self.coordinator.cancellation.is_cancelled:
self.status.status = "cancelled"
from .events import CANCEL_COMPLETED
await self.coordinator.hooks.emit(
CANCEL_COMPLETED,
{
"was_immediate": self.coordinator.cancellation.is_immediate,
"error": _safe_exception_str(e),
},
)
logger.info(f"Execution cancelled: {_safe_exception_str(e)}")
raise
else:
self.status.status = "failed"
self.status.last_error = {"message": _safe_exception_str(e)}
logger.error(f"Execution failed: {_safe_exception_str(e)}")
raise
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| if inspect.iscoroutine(result): | ||
| await result | ||
| except Exception as e: | ||
| except BaseException as e: |
There was a problem hiding this comment.
Catching BaseException without re-raising critical exceptions like KeyboardInterrupt and SystemExit can prevent proper program termination. While this allows cleanup to survive asyncio.CancelledError, it also suppresses other critical exceptions that should propagate immediately.
Consider explicitly re-raising these critical exceptions:
except BaseException as e:
if isinstance(e, (KeyboardInterrupt, SystemExit)):
raise
logger.error(f"Error during cleanup: {e}")This ensures cleanup survives CancelledError while still allowing the user to forcefully terminate the program with Ctrl+C.
| except BaseException as e: | |
| except BaseException as e: | |
| if isinstance(e, (KeyboardInterrupt, SystemExit)): | |
| raise |
| except BaseException as e: | ||
| logger.warning( | ||
| f"Contributor '{contributor['name']}' on channel '{channel}' failed: {e}" | ||
| ) |
There was a problem hiding this comment.
Catching BaseException without re-raising critical exceptions like KeyboardInterrupt and SystemExit can prevent proper program termination. While this allows the contribution collection to survive asyncio.CancelledError, it also suppresses other critical exceptions.
Consider explicitly re-raising these critical exceptions:
except BaseException as e:
if isinstance(e, (KeyboardInterrupt, SystemExit)):
raise
logger.warning(f"Contributor '{contributor['name']}' on channel '{channel}' failed: {e}")| logger.error(f"Error during coordinator cleanup: {e}") | ||
| # Clean up sys.path modifications - must always run |
There was a problem hiding this comment.
Catching BaseException and logging without re-raising means critical exceptions like KeyboardInterrupt and SystemExit will be suppressed during coordinator cleanup. This prevents proper program termination and means loader.cleanup() might run when it shouldn't.
Consider explicitly re-raising these critical exceptions:
try:
await self.coordinator.cleanup()
except BaseException as e:
if isinstance(e, (KeyboardInterrupt, SystemExit)):
raise
logger.error(f"Error during coordinator cleanup: {e}")| logger.error(f"Error during coordinator cleanup: {e}") | |
| # Clean up sys.path modifications - must always run | |
| # Re-raise critical exceptions to allow proper termination | |
| if isinstance(e, (KeyboardInterrupt, SystemExit)): | |
| raise | |
| logger.error(f"Error during coordinator cleanup: {_safe_exception_str(e)}") | |
| # Clean up sys.path modifications - must always run for non-critical errors |
…dError asyncio.CancelledError is a BaseException subclass (Python 3.9+), so `except Exception` in kernel loop sites lets it escape, aborting remaining cleanup functions, hook handlers, and contribution collection. Different patterns per site semantics: - collect_contributions: catch CancelledError → break (partial results) - cleanup loops: catch BaseException → continue, re-raise fatal after - session.cleanup: try/finally ensures loader.cleanup always runs - session.execute: catch BaseException for status tracking (re-raises) - hooks emit/emit_and_collect: catch CancelledError → log, continue - cancellation.trigger_callbacks: catch CancelledError → continue Also fixes identical vulnerability in hooks.py (emit, emit_and_collect) and cancellation.py (trigger_callbacks) beyond the original PR scope. Includes 10 new tests for CancelledError resilience across all sites. Inspired-by: ramparte (PR #11) Co-Authored-By: ramparte <ramparte@users.noreply.github.com> Co-Authored-By: Amplifier <240397093+microsoft-amplifier@users.noreply.github.com>
…dError (#12) asyncio.CancelledError is a BaseException subclass (Python 3.9+), so `except Exception` in kernel loop sites lets it escape, aborting remaining cleanup functions, hook handlers, and contribution collection. Different patterns per site semantics: - collect_contributions: catch CancelledError → break (partial results) - cleanup loops: catch BaseException → continue, re-raise fatal after - session.cleanup: try/finally ensures loader.cleanup always runs - session.execute: catch BaseException for status tracking (re-raises) - hooks emit/emit_and_collect: catch CancelledError → log, continue - cancellation.trigger_callbacks: catch CancelledError → continue Includes 10 new tests for CancelledError resilience across all sites. Inspired-by: ramparte (PR #11) Co-Authored-By: ramparte <ramparte@users.noreply.github.com> Co-Authored-By: Amplifier <240397093+microsoft-amplifier@users.noreply.github.com>
Problem
Users intermittently see "Unclosed client session" warnings from aiohttp on session exit:
Root Cause
Since Python 3.9,
asyncio.CancelledErrorinherits fromBaseException, notException.The coordinator's cleanup loop (
coordinator.py:345-357) catches onlyexcept Exception:When task cancellation arrives during shutdown (common with Ctrl+C or normal session teardown),
CancelledErrorescapes the try/except, aborts the for-loop entirely, and all remaining cleanup functions are skipped. Modules liketool-webandhooks-notifythat createaiohttp.ClientSessionobjects never get their cleanup called, producing the warnings.Note: the individual module cleanup functions are actually defensively coded (they use
asyncio.shield()and catchCancelledErrorinternally). The bug is that the coordinator never reaches them because an earlier cleanup function'sCancelledErrorbreaks out of the loop.The same pattern exists in
session.pywhereexcept Exceptioninexecute()meansCancelledErrornever reaches the cancellation status tracking code (lines 437-448), and wherecoordinator.cleanup()failure would also preventloader.cleanup()from running.Fix
coordinator.pycleanup()except Exception→except BaseExceptioncoordinator.pycollect_contributions()session.pycleanup()session.pyexecute()except Exception→except BaseExceptionsession.py_safe_exception_str()Testing
test_event_taxonomy.py::test_events_follow_naming_convention— unrelated event naming convention test)🤖 Generated with Amplifier