@jvanneman jvanneman commented Nov 25, 2025

https://issues.apache.org/jira/browse/SOLR-18002

Description

When a request to a server exceeds the idle timeout, it triggers a TimeoutException, which is currently skipped when checking whether the server should be marked as a zombie. As a result, unresponsive servers keep receiving traffic and client latencies stay high, because the idle timeout is triggered on every request to that replica.

Another option would be to track timeouts to that server over a period of time and mark it as a zombie only once they pass some threshold. This would be a little more involved but might better tolerate occasional slow requests. Other exceptions mark the server as a zombie immediately, so this PR follows that pattern.
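
For comparison, the threshold-based alternative could look roughly like the sketch below. This is not part of the PR: the class name TimeoutTracker, the method recordTimeout, and the threshold and window parameters are all hypothetical and exist only to illustrate the idea of counting timeouts per server inside a sliding time window.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the threshold alternative; not code from this PR.
final class TimeoutTracker {
  private final int threshold;      // assumed: timeouts needed before marking a zombie
  private final long windowMillis;  // assumed: length of the sliding window
  private final Map<String, Deque<Long>> timeoutsByServer = new HashMap<>();

  TimeoutTracker(int threshold, long windowMillis) {
    this.threshold = threshold;
    this.windowMillis = windowMillis;
  }

  // Records one timeout for the server and reports whether the server has now
  // reached the threshold within the window ending at nowMillis.
  synchronized boolean recordTimeout(String serverUrl, long nowMillis) {
    Deque<Long> times =
        timeoutsByServer.computeIfAbsent(serverUrl, k -> new ArrayDeque<>());
    times.addLast(nowMillis);
    // Drop timeouts that have fallen out of the window.
    while (!times.isEmpty() && nowMillis - times.peekFirst() > windowMillis) {
      times.removeFirst();
    }
    return times.size() >= threshold;
  }
}
```

With a tracker like this, a single slow request would not remove a server from rotation, at the cost of extra state and tuning parameters, which is the trade-off described above.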

Solution

This change includes TimeoutException in the list of exceptions that mark a server as a zombie.
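
The idea can be sketched as follows. This is a simplified illustration rather than the actual Solr code: the class ZombieCheck, the method shouldMarkAsZombie, and the exact set of exception types are assumptions, as is the choice of java.util.concurrent.TimeoutException as the exception raised on idle timeout.

```java
import java.net.ConnectException;
import java.net.SocketException;
import java.net.SocketTimeoutException;
import java.util.List;
import java.util.concurrent.TimeoutException;

// Hypothetical helper for illustration; not Solr's actual implementation.
final class ZombieCheck {
  // Assumed set of failure types that indicate an unhealthy server.
  // TimeoutException is the entry this kind of change would add.
  private static final List<Class<? extends Throwable>> ZOMBIE_CAUSES =
      List.of(
          ConnectException.class,
          SocketException.class,
          SocketTimeoutException.class,
          TimeoutException.class);

  // Walks the cause chain and returns true if any cause is a zombie-worthy type.
  static boolean shouldMarkAsZombie(Throwable error) {
    for (Throwable t = error; t != null; t = t.getCause()) {
      for (Class<? extends Throwable> type : ZOMBIE_CAUSES) {
        if (type.isInstance(t)) {
          return true;
        }
      }
    }
    return false;
  }
}
```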

Tests

This adds an integration test that uses ServerSocket to create a blackhole server, which lets a client connect but never responds, thus triggering the idleTimeout error condition.
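
A blackhole server of this kind can be sketched in a few lines. The version below is a minimal illustration and not the test code from this PR: the class name BlackholeServer and its methods are hypothetical. It accepts connections and holds them open without ever reading or writing, so any client waiting for a response eventually hits its own timeout.

```java
import java.io.IOException;
import java.net.ServerSocket;
import java.net.Socket;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical sketch of a blackhole server; not the test code from this PR.
final class BlackholeServer implements AutoCloseable {
  private final ServerSocket serverSocket;
  // Accepted sockets are kept so they stay open and can be closed on shutdown.
  private final List<Socket> accepted = new CopyOnWriteArrayList<>();

  BlackholeServer() throws IOException {
    serverSocket = new ServerSocket(0); // port 0 asks the OS for a free port
    Thread acceptThread = new Thread(this::acceptLoop, "blackhole-accept");
    acceptThread.setDaemon(true);
    acceptThread.start();
  }

  // Accepts connections but never reads from or writes to them.
  private void acceptLoop() {
    while (!serverSocket.isClosed()) {
      try {
        accepted.add(serverSocket.accept());
      } catch (IOException e) {
        return; // the server socket was closed, so stop accepting
      }
    }
  }

  int getPort() {
    return serverSocket.getLocalPort();
  }

  @Override
  public void close() throws IOException {
    serverSocket.close();
    for (Socket socket : accepted) {
      socket.close();
    }
  }
}
```

A test would point the client under test at the port returned by getPort(), issue a request with a short idle timeout, and then check that the server ends up on the zombie list.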

Checklist

Please review the following and check all that apply:

  • I have reviewed the guidelines for How to Contribute and my code conforms to the standards described there to the best of my ability.
  • I have created a Jira issue and added the issue ID to my pull request title.
  • I have given Solr maintainers access to contribute to my PR branch. (optional but recommended, not available for branches on forks living under an organisation)
  • I have developed this patch against the main branch.
  • I have run ./gradlew check.
  • I have added tests for my changes.
  • I have added documentation for the Reference Guide
  • I have added a changelog entry for my change

@jvanneman jvanneman changed the title idle timeouts should cause servers to be added to the zombie list SOLR-18002: idle timeouts should cause servers to be added to the zombie list Nov 25, 2025