fix: Handle in-flight requests during stack shutdown more gracefully (#3310)
Closes #3300

@balegas this error was a symptom of the races during shutdown that we
discussed: requests that are in flight during the shutdown start
erroring because the resources they depend on are no longer present.
I have looked into the logs of the occurrences and have pinpointed two
causes and added handlers for them:
- After a live request timeout, we check the global last processed LSN
to attach to the up-to-date message, but if the stack goes down during
that 20 second wait (and is still down by the end of the 20 seconds), the
check fails with either the table not being present or no value for the LSN
  - I had originally fixed this by doing another `hold_until_stack_ready`
    after a live timeout, but @alco removed that in his PR that moved
    the shape status out of the connection subsystem. This is still
    an issue for how we do multitenancy/cloud, so I'm putting it back in
    without a timeout (it no longer wakes connections either)
  - Added a test to cover it, as the existing test was not actually
    doing the right thing
- An in-flight request might reach `await_snapshot_start`, which enters a
retry loop if it finds a shape but no shape process, as it expects the
process to exist shortly. That is not the case if the stack is shutting
down, so the request eventually errors with tables not found
  - I am now catching argument errors/tables not existing and returning a
    "prettier" error that will be turned into a 500 but without a Sentry
    error. Ultimately this is part of the broader question of what to do
    with in-flight requests
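
For readers who want the shape of the second handler without reading the diff, here is a minimal, language-neutral sketch in Python. It is not the code from this change: `TableMissing`, `Result`, `await_snapshot_start_sketch`, and `stack_gone` are hypothetical names, and the retry count and delay are made-up defaults. The only point it illustrates is that a missing-table failure inside the retry loop gets turned into a quiet 500 instead of an unhandled error.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional


class TableMissing(Exception):
    """Hypothetical stand-in for a lookup against a table that was torn down."""


@dataclass
class Result:
    status: int         # HTTP-style status the caller would respond with
    body: str           # short human-readable reason
    report_error: bool  # whether the error tracker should be notified


def await_snapshot_start_sketch(
    find_shape_process: Callable[[], Optional[str]],
    retries: int = 3,
    delay_s: float = 0.0,
) -> Result:
    """Retry until the shape process appears, treating a missing table as shutdown."""
    for _ in range(retries):
        try:
            process = find_shape_process()
        except TableMissing:
            # Assumption: a missing table means the stack is going down,
            # so answer with a 500 but do not report it as an error.
            return Result(500, "stack is shutting down", report_error=False)
        if process is not None:
            return Result(200, f"snapshot started by {process}", report_error=False)
        # Shape exists but its process does not yet: wait and try again.
        time.sleep(delay_s)
    # Retries exhausted without the table disappearing: a genuine failure.
    return Result(500, "shape process did not start", report_error=True)


def stack_gone() -> Optional[str]:
    """Hypothetical lookup that behaves as if the stack was already shut down."""
    raise TableMissing()


shutdown_result = await_snapshot_start_sketch(stack_gone)
```

In this sketch the shutdown case and the "process never appeared" case both produce a 500, but only the latter is flagged for reporting, which mirrors the distinction described above.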