@zhubotang-wq
Contributor

This PR is based on discussions #100229 and #100162.

It adds a description of the threading model to the "transport layer" section.

@github-actions
Contributor

github-actions bot commented Sep 18, 2025

🔍 Preview links for changed docs

@github-actions
Contributor

ℹ️ Important: Docs version tagging

👋 Thanks for updating the docs! Just a friendly reminder that our docs are now cumulative. This means all 9.x versions are documented on the same page and published off of the main branch, instead of creating separate pages for each minor version.

We use applies_to tags to mark version-specific features and changes.

When to use applies_to tags:

✅ At the page level to indicate which products/deployments the content applies to (mandatory)
✅ When features change state (e.g. preview, ga) in a specific version
✅ When availability differs across deployments and environments

What NOT to do:

❌ Don't remove or replace information that applies to an older version
❌ Don't add new information that applies to a specific version without an applies_to tag
❌ Don't forget that applies_to tags can be used at the page, section, and inline level

🤔 Need help?

Contributor

@DaveCTurner DaveCTurner left a comment

Looks good, thanks - I left a few small suggestions

Comment on lines 106 to 107
> Upon entry into the "transport layer", [NodeClient] delegates the remaining processing to its [TaskManager] thread. The [TaskManager]
> thread eventually returns control to the original [EventLoop] thread to write the response back to the Netty channel.
Contributor

There isn't a separate TaskManager thread. Some transport actions do their work directly on the Netty event loop (ok if quick, bad if not) while others will dispatch the work onto a separate thread (pool).

In practice today, because of #97916, transport actions invoked on the local node via the NodeClient always start running on the calling thread and may then have to dispatch to a separate pool manually.
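
The two cases described above can be sketched with plain JDK executors. This is only an illustration of the threading pattern, not Elasticsearch code: `Action`, `InlineAction`, `ForkingAction`, and the thread name `worker-pool-thread` are hypothetical names made up for this example.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Consumer;

final class ActionThreadingSketch {

    // Hypothetical stand-in for a transport action; not the Elasticsearch API.
    interface Action {
        void execute(String request, Consumer<String> listener);
    }

    // Quick work: completes on whichever thread called execute().
    static final class InlineAction implements Action {
        @Override
        public void execute(String request, Consumer<String> listener) {
            listener.accept(Thread.currentThread().getName());
        }
    }

    // Slow work: starts on the calling thread, then forks to a pool by hand.
    static final class ForkingAction implements Action {
        private final ExecutorService pool;

        ForkingAction(ExecutorService pool) {
            this.pool = pool;
        }

        @Override
        public void execute(String request, Consumer<String> listener) {
            pool.execute(() -> listener.accept(Thread.currentThread().getName()));
        }
    }

    // Runs the action and reports the name of the thread that completed it.
    static String threadThatCompleted(Action action) {
        CompletableFuture<String> result = new CompletableFuture<>();
        action.execute("ping", result::complete);
        return result.join();
    }

    // Returns { thread used by the inline action, thread used by the forking action }.
    static String[] demo() {
        ExecutorService pool =
            Executors.newSingleThreadExecutor(r -> new Thread(r, "worker-pool-thread"));
        try {
            return new String[] {
                threadThatCompleted(new InlineAction()),
                threadThatCompleted(new ForkingAction(pool))
            };
        } finally {
            pool.shutdown();
        }
    }
}
```

In this sketch the inline action finishes on the caller's thread, which is the situation that is fine for quick work and harmful for slow work when the caller is an event loop, while the forking action hands off to the pool explicitly.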

Contributor Author

Ack. Fixed the inaccuracy in the previous description.

> Upon entry into the "transport layer", [NodeClient] delegates the remaining processing to its [TaskManager] thread. The [TaskManager]
> thread eventually returns control to the original [EventLoop] thread to write the response back to the Netty channel.
>
> [TransportAction] can also be initiated through peer-to-peer communication between nodes. In such cases, the [InboundHandler]
Contributor

Might also be worth mentioning how the responses to these remote actions are threaded: they use the executor defined in the response handler, see `org.elasticsearch.transport.TransportResponseHandler#executor()`.

Also note that requests received remotely are always deserialized on the Netty event loop, but responses are deserialized after dispatching to the relevant executor. Outbound messages (both requests and responses) are serialized on the calling thread.
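
A rough sketch of the response path, again using only JDK classes: the network thread receives the raw bytes and hands them to the handler's executor, and deserialization happens there rather than on the network thread. The `ResponseHandler` interface here is a hypothetical simplification written for this example, and the thread names `event-loop` and `handler-pool` are made up; it is not the real `TransportResponseHandler` signature.

```java
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class ResponseThreadingSketch {

    // Hypothetical handler shape: the handler names the executor it wants to run on.
    interface ResponseHandler<T> {
        Executor executor();
        T read(byte[] bytes);
        void handleResponse(T response);
    }

    // Called on the network thread: dispatch first, deserialize afterwards.
    static <T> void onResponseBytes(byte[] bytes, ResponseHandler<T> handler) {
        handler.executor().execute(() -> handler.handleResponse(handler.read(bytes)));
    }

    // Returns a trace of which thread performed each step.
    static List<String> demo() {
        ExecutorService eventLoop =
            Executors.newSingleThreadExecutor(r -> new Thread(r, "event-loop"));
        ExecutorService handlerPool =
            Executors.newSingleThreadExecutor(r -> new Thread(r, "handler-pool"));
        List<String> trace = new CopyOnWriteArrayList<>();
        CompletableFuture<String> done = new CompletableFuture<>();

        ResponseHandler<String> handler = new ResponseHandler<String>() {
            @Override
            public Executor executor() {
                return handlerPool;
            }

            @Override
            public String read(byte[] bytes) {
                trace.add("deserialized on " + Thread.currentThread().getName());
                return new String(bytes, StandardCharsets.UTF_8);
            }

            @Override
            public void handleResponse(String response) {
                trace.add("handled on " + Thread.currentThread().getName());
                done.complete(response);
            }
        };

        try {
            byte[] wire = "pong".getBytes(StandardCharsets.UTF_8);
            // Simulate the network thread receiving the response bytes.
            eventLoop.execute(() -> {
                trace.add("received on " + Thread.currentThread().getName());
                onResponseBytes(wire, handler);
            });
            trace.add("result " + done.join());
            return trace;
        } finally {
            eventLoop.shutdown();
            handlerPool.shutdown();
        }
    }
}
```

The sketch covers only inbound responses. Inbound requests, which are deserialized on the event loop before any dispatch, and outbound serialization on the calling thread are not modelled here.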

Contributor Author

Ack. Added a reference to TransportResponseHandler and the missing details on how work is divided between the IO and logic-layer thread pools.

[TransportService]:https://github.com/elastic/elasticsearch/blob/v9.0.1/server/src/main/java/org/elasticsearch/transport/TransportService.java
[TransportSingleShardAction]:https://github.com/elastic/elasticsearch/blob/v9.0.1/server/src/main/java/org/elasticsearch/action/support/single/shard/TransportSingleShardAction.java
[Transport]:https://github.com/elastic/elasticsearch/blob/v9.0.1/server/src/main/java/org/elasticsearch/transport/Transport.java
[EventLoop]:https://github.com/netty/netty/blob/4.2/transport/src/main/java/io/netty/channel/EventLoop.java
Contributor

Contributor Author

Fixed.

@zhubotang-wq zhubotang-wq force-pushed the fix/document_general_architecture_threading_models branch from b9ecf19 to 21f95b1 Compare September 22, 2025 00:49
@DiannaHohensee
Contributor

@zhubotang-wq not a big deal, but we try to avoid force-pushes if possible, because it can erase PR history.

Contributor

@DiannaHohensee DiannaHohensee left a comment

Left some style nits

> the `Transport*Action` class that handles it.
>
> A netty [EventLoop] thread handles the initial steps of a Rest*Action request lifecycle such as decoding, validation and routing.
> Upon entry into the "transport layer", [NodeClient] delegates the decision of execution to individual [TransportAction]. Each action
Contributor

Personally, I prefer not to make every mention of a class name a code link. Maybe use a link the first time a class is mentioned, if you think it's helpful to go look at some particular piece of code for additional information.

When I see a link, I think I should click on it, and then I'm disappointed when it's redundant and I have too many tabs open (the last part is my own problem I suppose 😌).

Instead of highlighting text with a hyperlink, you can use backticks (``) to emphasize that it's a code name. I'd replace all your new text's [TransportAction] with `TransportAction`, since there's already a link reference above the new text. Same for [ActionType].

Contributor Author

Agreed. Dialed down the use of code links so as not to distract readers.

@DiannaHohensee DiannaHohensee added Team:Distributed Coordination Meta label for Distributed Coordination team :Distributed Coordination/Network Http and internode communication implementations labels Sep 22, 2025
@elasticsearchmachine elasticsearchmachine removed the needs:triage Requires assignment of a team area label label Sep 22, 2025
@elasticsearchmachine
Collaborator

Pinging @elastic/es-distributed-coordination (Team:Distributed Coordination)

@zhubotang-wq
Contributor Author

zhubotang-wq commented Sep 29, 2025

> @zhubotang-wq not a big deal, but we try to avoid force-pushes if possible, because it can erase PR history.

Ack. This means it is always preferable to use git merge rather than git rebase, which by definition changes SHAs and has to be accompanied by either --force or --force-with-lease. I looked it up: both have the effect of wiping the SHAs and the comments along with them.

So we use git merge to preserve the SHAs of commits during the code review phase, and then, when the PR is merged back, we squash the commit history into a single commit, thus leaving upstream with a clean history?

@DiannaHohensee
Contributor

> So we use git merge to preserve SHAs of commits during the code review phases, then during the PR merge back, we squash the commit history into a single commit, thus leave upstream with a clean history?

Yes, merge main into the work branch however many times you need, and then hitting the merge button on the PR will handle squashing / rebasing all the work before the commit. When you hit the merge button, you'll be asked for the final commit message.

@zhubotang-wq
Contributor Author

Could use an approval to get this doc update merged. Thanks!

Contributor

@DaveCTurner DaveCTurner left a comment

LGTM thanks @zhubotang-wq

@zhubotang-wq zhubotang-wq merged commit fb5fe5e into elastic:main Oct 2, 2025
11 checks passed

Labels

:Distributed Coordination/Network Http and internode communication implementations documentation >non-issue Team:Distributed Coordination Meta label for Distributed Coordination team v9.3.0

4 participants