@gregw gregw commented Oct 21, 2025

Fix #13470 by calling completeStream only once even when there is a failure in the channel callback.

gregw commented Oct 21, 2025

The issue is fundamentally that completeStream is being called twice.

For this to happen, we need a failure to be detected in the ChannelCallback.succeeded() method so that the following code is run:

if (failure != null)
{
    httpChannelState._callbackFailure = failure;
    if (!stream.isCommitted())
        errorResponse = new ErrorResponse(request);
    else
        completeStream = true;
}

This means that completeStream will be called even though the other "legs of the 3 legged stool" are not complete. Specifically, we may still be inside the call to HandlerInvoker.run(), as in this stack for thread 258:

2025-10-20 08:14:42,457 INFO  [WebServerImpl-258] trace.jetty.session.complete: complete() called on session [ManagedSession@4df349fb{id=MYSECRETSESSIONID,x=MYSECRETSESSIONID.node0,req=6,res=true}]
java.lang.Exception: complete stack
	...
	at org.eclipse.jetty.session.AbstractSessionManager.complete(AbstractSessionManager.java)
	at org.eclipse.jetty.session.AbstractSessionManager$SessionStreamWrapper.doComplete(AbstractSessionManager.java:1509)
	at org.eclipse.jetty.server.handler.ContextHandler$ScopedContext.run(ContextHandler.java:1518)
	at org.eclipse.jetty.session.AbstractSessionManager$SessionStreamWrapper.failed(AbstractSessionManager.java:1479)
	at org.eclipse.jetty.server.internal.HttpChannelState$HandlerInvoker.completeStream(HttpChannelState.java:788)
	at org.eclipse.jetty.server.internal.HttpChannelState$ChannelCallback.succeeded(HttpChannelState.java:1591)
	at org.eclipse.jetty.server.handler.gzip.GzipResponseAndCallback.succeeded(GzipResponseAndCallback.java:95)
	at org.eclipse.jetty.ee10.servlet.ServletChannel.onCompleted(ServletChannel.java:765)
	at org.eclipse.jetty.ee10.servlet.ServletChannel.handle(ServletChannel.java:429)
	at org.eclipse.jetty.ee10.servlet.ServletHandler.handle(ServletHandler.java:470)
	at org.eclipse.jetty.ee10.servlet.SessionHandler.handle(SessionHandler.java:717)
	at org.eclipse.jetty.server.handler.ContextHandler.handle(ContextHandler.java:1071)
	at org.eclipse.jetty.server.Handler$Wrapper.handle(Handler.java:740)
	at org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:138)
	at org.eclipse.jetty.server.handler.gzip.GzipHandler.handle(GzipHandler.java:611)
	at org.eclipse.jetty.server.Handler$Sequence.handle(Handler.java:805)
	at org.eclipse.jetty.server.Server.handle(Server.java:182)
	at org.eclipse.jetty.server.internal.HttpChannelState$HandlerInvoker.run(HttpChannelState.java:677)
	at org.eclipse.jetty.util.thread.Invocable$ReadyTask.run(Invocable.java:177)
	at org.eclipse.jetty.http2.server.internal.HttpStreamOverHTTP2$1.run(HttpStreamOverHTTP2.java:144)
	...

We can see that there is a failure detected in ChannelCallback.succeeded() because SessionStreamWrapper.failed is ultimately called. This means that there must have been one of the following application errors:

These are all plausible application errors, especially with something like server sent events.

So once thread 258 has called completeStream, it returns all the way out of handling and can be seen calling completeStream again in this stack trace:

java.lang.Exception: complete stack
	...
	at org.eclipse.jetty.session.AbstractSessionManager.complete(AbstractSessionManager.java)
	at org.eclipse.jetty.session.AbstractSessionManager$SessionStreamWrapper.doComplete(AbstractSessionManager.java:1509)
	at org.eclipse.jetty.server.handler.ContextHandler$ScopedContext.run(ContextHandler.java:1524)
	at org.eclipse.jetty.session.AbstractSessionManager$SessionStreamWrapper.failed(AbstractSessionManager.java:1479)
	at org.eclipse.jetty.server.internal.HttpChannelState$HandlerInvoker.completeStream(HttpChannelState.java:788)
	at org.eclipse.jetty.server.internal.HttpChannelState$HandlerInvoker.run(HttpChannelState.java:712)
	at org.eclipse.jetty.util.thread.Invocable$ReadyTask.run(Invocable.java:177)
	at org.eclipse.jetty.http2.server.internal.HttpStreamOverHTTP2$1.run(HttpStreamOverHTTP2.java:144)
	...

which is called from this code after the handler has been invoked:

try (AutoLock ignored = _lock.lock())
{
    stream = _stream;
    _handling = null;
    _handled = true;
    failure = _callbackFailure;
    callbackCompleted = _callbackCompleted;
    lastStreamSendComplete = lockedIsLastStreamSendCompleted();
    completeStream = callbackCompleted && lastStreamSendComplete;
    if (LOG.isDebugEnabled())
        LOG.debug("handler invoked: completeStream={} failure={} callbackCompleted={} {}", completeStream, failure, callbackCompleted, HttpChannelState.this);
}
if (LOG.isDebugEnabled())
    LOG.debug("stream={}, failure={}, callbackCompleted={}, completeStream={}", stream, failure, callbackCompleted, completeStream);
if (completeStream)
{
    if (LOG.isDebugEnabled())
        LOG.debug("completeStream({}, {})", stream, Objects.toString(failure));
    completeStream(stream, failure);
}

Note that for this code to actually call completeStream, both callbackCompleted and lastStreamSendComplete must be true. This is normally not the case for HTTP/1, because the first call to completeStream will have recycled the HttpChannelState, so callbackCompleted and lastStreamSendComplete will both be false. However, for H2, HttpChannelStates are reused after being recycled, so another request may have come in and set the fields of the state again, so that the second call to completeStream incorrectly completes that new request.
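To make the reuse hazard concrete, here is a minimal single-threaded sketch of it. It is not Jetty code: RecycledChannelSketch, bind() and scenario() are hypothetical names invented for illustration, and the real race involves a second thread rather than the sequential steps shown here.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, heavily simplified stand-in for a reusable channel state.
// It is NOT Jetty's HttpChannelState; all names here are illustrative.
class RecycledChannelSketch
{
    String request;              // the request currently bound to this state
    boolean callbackCompleted;
    boolean lastSendCompleted;
    final List<String> completed = new ArrayList<>();

    // Bind a new request to this (possibly recycled) state.
    void bind(String newRequest)
    {
        request = newRequest;
        callbackCompleted = false;
        lastSendCompleted = false;
    }

    // Complete whatever request is bound right now, then recycle the state.
    void completeStream()
    {
        completed.add(request);
        request = null;
        callbackCompleted = false;
        lastSendCompleted = false;
    }

    // The check made after the handler returns: it reads the fields as they are now,
    // so it cannot tell whether they belong to the request that was being handled.
    void completeAfterHandling()
    {
        if (callbackCompleted && lastSendCompleted)
            completeStream();
    }

    static List<String> scenario(boolean stateReused)
    {
        RecycledChannelSketch state = new RecycledChannelSketch();
        state.bind("request-1");
        // A failure is detected in the callback while still handling: first completion.
        state.completeStream();
        if (stateReused)
        {
            // The recycled state is handed to another request, which sets the fields again.
            state.bind("request-2");
            state.callbackCompleted = true;
            state.lastSendCompleted = true;
        }
        // The thread that was handling request-1 finally returns from the handler.
        state.completeAfterHandling();
        return state.completed;
    }

    public static void main(String[] args)
    {
        System.out.println(scenario(false)); // prints [request-1]
        System.out.println(scenario(true));  // prints [request-1, request-2]
    }
}
```

Without reuse, the second check sees reset fields and does nothing. With reuse, the same check completes request-2 from the thread that was handling request-1.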

Thus I believe the core fix is to not call completeStream whilst we are still handling. Furthermore, if we are to ignore the last write leg of the stool, we should explicitly force lastStreamSendComplete to true.
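The idea of deferring completion while still handling can be sketched as below. This is an illustration under stated assumptions, not the actual patch: DeferredCompletionSketch and its methods are hypothetical, locking is omitted, and the commit check and error response are left out.

```java
// Hypothetical sketch of "do not complete whilst still handling".
// Not Jetty's HttpChannelState; field and method names are illustrative only.
class DeferredCompletionSketch
{
    boolean handling = true;     // true while the handler invocation has not returned
    boolean callbackCompleted;
    boolean lastSendCompleted;
    Throwable callbackFailure;
    int completions;             // how many times the stream was completed

    // The callback completes with a failure after the response was committed.
    void onCallbackFailure(Throwable failure)
    {
        callbackCompleted = true;
        callbackFailure = failure;
        if (handling)
            lastSendCompleted = true; // force the write leg so the handler-return path completes
        else
            completeStream();         // no longer handling, so complete here
    }

    // The handler invocation returns.
    void onHandlerReturned()
    {
        handling = false;
        if (callbackCompleted && lastSendCompleted)
            completeStream();
    }

    void completeStream()
    {
        completions++;
    }

    public static void main(String[] args)
    {
        DeferredCompletionSketch duringHandling = new DeferredCompletionSketch();
        duringHandling.onCallbackFailure(new IllegalStateException("application error"));
        duringHandling.onHandlerReturned();
        System.out.println(duringHandling.completions); // prints 1

        DeferredCompletionSketch afterHandling = new DeferredCompletionSketch();
        afterHandling.onHandlerReturned();
        afterHandling.onCallbackFailure(new IllegalStateException("application error"));
        System.out.println(afterHandling.completions); // prints 1
    }
}
```

In both orderings exactly one path performs the completion, which is the property the fix is after. Whether pending writes may safely be ignored in the first branch is a separate question that this sketch does not settle.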

Unfortunately I have been unable to produce a unit test for this, as I believe it needs precisely unlucky timing and an application error.

gregw commented Oct 21, 2025

@sbordet @lorban Can you review the diagnosis that @janbartel and I have come up with? I'm 90% sure this is it, but I cannot reproduce it (any thoughts on how we might be able to do that?).

gregw commented Oct 21, 2025

Note that we added this completeStream call in #9684

@sbordet sbordet left a comment

ChannelCallback.failed() seems not entirely correct either.

Can we write test cases for this scenario?

Comment on lines 1583 to 1584
// We are committed and still handling, so let the HandlerInvoker complete, ignoring any pending reads/writes.
httpChannelState._streamSendState = StreamSendState.LAST_COMPLETE;
Contributor

Again, not sure we should ignore pending writes.

Contributor Author

Or pending reads... it is a difficult one. I'll at least change the code to enumerate the possible states.

else
else if (httpChannelState._handling == null)
// We are committed, but no longer handling, so will complete here, ignoring any pending reads/writes
completeStream = true;
Contributor

I don't think this is right.

If there is a write pending, we cannot complete the stream, or we bypass one leg.
I think we should call lockedIsLastStreamSendCompleted() here to figure out whether that leg is done.

gregw commented Oct 21, 2025

ChannelCallback.failed() seems not entirely correct either.

@sbordet I will look....

Can we write test cases for this scenario?

Very hard, because unless there is another thread racing, the second completeStream is not called. I'm open to suggestions.

gregw added 3 commits October 22, 2025 11:02
Fix #13470 by calling completeStream only once even when there is a failure in the channel callback.

refactor success and failure into single method.
Fix #13470 by calling completeStream only once even when there is a failure in the channel callback.

refactor success and failure into single method.

Improved EventSourceServlet

gregw commented Oct 22, 2025

@sbordet @lorban I'm getting concerned at the number of tests this PR is breaking in its current state.
I think we need to take time to consider the more "cleanup" changes and probably only make them in 12.1.x.

So I propose that this PR should simply be 83c1718 for 12.0.x (perhaps with the EventSourceServlet cleanups), and then we can do a wider cleanup and refactor in 12.1.x next month.

@gregw gregw requested a review from sbordet October 22, 2025 19:50
gregw added 3 commits October 23, 2025 07:33
Fix #13470 by calling completeStream only once even when there is a failure in the channel callback.

refactor success and failure into single method.

Improved EventSourceServlet

lorban commented Oct 23, 2025

Regarding the minimal 12.0 fix, I think httpChannelState._handling == null should actually be httpChannelState._handled

gregw commented Oct 24, 2025

@lorban this is passing tests now.... so let's go for this one?
@sbordet @lorban Review please!!

Comment on lines +1566 to +1568
Throwable unconsumed = stream.consumeAvailable();
if (failure != null)
ExceptionUtil.addSuppressedIfNotAssociated(failure, unconsumed);
Contributor

Just ExceptionUtil.combine(failure, stream.consumeAvailable()); as the line above?
But then, this else block is identical to the else-if above?
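For readers unfamiliar with the combine idiom being suggested, here is a minimal standalone sketch. CombineSketch.combine is a hypothetical helper written for illustration. Jetty's ExceptionUtil.combine may differ in detail, for example in how it decides that two throwables are already associated, whereas this version only checks identity.

```java
// Hypothetical stand-in for a combine-style helper; not Jetty's ExceptionUtil.
class CombineSketch
{
    // Returns whichever throwable is non-null; if both are, the second is
    // recorded as suppressed on the first and the first is returned.
    static Throwable combine(Throwable first, Throwable second)
    {
        if (first == null)
            return second;
        // addSuppressed rejects null and self-suppression, so guard against both.
        if (second != null && second != first)
            first.addSuppressed(second);
        return first;
    }

    public static void main(String[] args)
    {
        Throwable failure = new IllegalStateException("callback failure");
        Throwable unconsumed = new java.io.IOException("unconsumed content");
        Throwable combined = combine(failure, unconsumed);
        System.out.println(combined == failure);               // prints true
        System.out.println(combined.getSuppressed().length);   // prints 1
        System.out.println(combine(null, unconsumed) == unconsumed); // prints true
    }
}
```

One visible difference from the three lines under review: when failure is null, a combine-style helper returns the consumeAvailable() result instead of dropping it.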

* @param throwable The type of {@link Throwable} class to check association against. Can be null.
* @return true if the {@link Throwable} is associated with the provided type; otherwise, false.
*/
public static boolean isAssociated(Throwable failure, Class<? extends Throwable> throwable)
Contributor

Reads better hasAssociated() to me.

else if (response.lockedIsWriting())
{
// We are currently writing, so let the completion of that write handle the failure
httpChannelState._callbackFailure = failure;
Contributor

Unnecessary line.

else
{
// There has been no last write, but we will just fail the stream instead.
completeStream = true;
Contributor

Feels like it should be this, so the third leg is complete:

Suggested change
completeStream = true;
httpChannelState._streamSendState = StreamSendState.FAILED;
completeStream = true;

Comment on lines +1632 to +1636
else if (!httpChannelState.lockedIsLastStreamSendCompleted() && !response.lockedIsWriting())
{
// last write is not going to happen after failure, so we can just fail anyway
httpChannelState._streamSendState = StreamSendState.LAST_COMPLETE;
}
Contributor

If lockedIsWriting() I should do the same logic as above: steal the write callback, etc.

I would prefer this else-if to be written as just else, because the lack of a catch-all else branch feels like we're forgetting something: I prefer to have an explicit branch that says "here we should really do nothing".

private class HandlerInvoker implements Invocable.Task, Callback
// HandlerInvoker is used as the Response's _writeCallback when ChannelCallback is succeeded and the last send still
// needs to be done, i.e.: _streamSendState set to LAST_SENDING by lockedLastStreamSend().
private class HandlerInvoker implements Task, Callback
Contributor

I think we should split the functionality of this class.
Leave run() in HandlerInvoker, but move the Callback functionality into a LastStreamSendCallback class.

Comment on lines +1623 to +1624
failedCallback = response._writeCallback;
response._writeCallback = httpChannelState._handlerInvoker;
Contributor

In this way, we may wait forever for the write to complete and invoke the _handlerInvoker.

How about this:

Suggested change
failedCallback = response._writeCallback;
response._writeCallback = httpChannelState._handlerInvoker;
Runnable task = response.lockedFailWrite(failure);
failedCallback = Callback.from(task, httpChannelState._handlerInvoker);

In 12.1.x we will leverage the new cancelSend() feature automatically.

LAST_COMPLETE,

/** Failing, so last send will never happen */
FAILED
Contributor

This state is only read, never assigned.

{
assert _callbackCompleted;
_streamSendState = StreamSendState.LAST_COMPLETE;
completeStream = _handling == null;
Contributor

Suggested change
completeStream = _handled;


if (writeFailure == NOTHING_TO_SEND)
{
httpChannelState._writeInvoker.run(callback::succeeded);
Contributor

Retain the InvocationType.

Successfully merging this pull request may close these issues.

Jetty 12.0: ManagedSession issues due to recursion and/pr multiple completions of the stream.

3 participants