- 
                Notifications
    You must be signed in to change notification settings 
- Fork 25.6k
Document Rest*Action and Transport*Action threading model #135043
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Document Rest*Action and Transport*Action threading model #135043
Conversation
| 🔍 Preview links for changed docs | 
| ℹ️ Important: Docs version tagging👋 Thanks for updating the docs! Just a friendly reminder that our docs are now cumulative. This means all 9.x versions are documented on the same page and published off of the main branch, instead of creating separate pages for each minor version. We use applies_to tags to mark version-specific features and changes. Expand for a quick overviewWhen to use applies_to tags:✅ At the page level to indicate which products/deployments the content applies to (mandatory) What NOT to do:❌ Don't remove or replace information that applies to an older version 🤔 Need help?
 | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, thanks - I left a few small suggestions
| > Upon entry into the "transport layer", [NodeClient] delegates the remaining processing to its [TaskManager] thread. The [TaskManager] | ||
| > thread eventually returns control to the original [EventLoop] thread to write the response back to the Netty channel. | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There isn't a separate TaskManager thread. Some transport actions do their work directly on the Netty event loop (ok if quick, bad if not) while others will dispatch the work onto a separate thread (pool).
In practice today because of #97916 transport actions invoked on the local node via the NodeClient will always start to run on the calling thread and may then have to dispatch to a separate pool manually.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ack. Fixed the inaccuracy in previous description.
| > Upon entry into the "transport layer", [NodeClient] delegates the remaining processing to its [TaskManager] thread. The [TaskManager] | ||
| > thread eventually returns control to the original [EventLoop] thread to write the response back to the Netty channel. | ||
| > | ||
| > [TransportAction] can also be initiated through peer-to-peer communication between nodes. In such cases, the [InboundHandler] | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might also be worth mentioning about how the responses to these remote actions are threaded, using the executor defined in the response handler, see org.elasticsearch.transport.TransportResponseHandler#executor().
Also note that requests received remotely are always deserialized on the Netty event loop, but responses are deserialized after dispatching to the relevant executor. Outbound messages (both requests and responses) are serialized on the calling thread.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ack. Added referenced to TransportResponseHandler  and missing details on division of workload between IO and logic layer thread pools.
| [TransportService]:https://github.com/elastic/elasticsearch/blob/v9.0.1/server/src/main/java/org/elasticsearch/transport/TransportService.java | ||
| [TransportSingleShardAction]:https://github.com/elastic/elasticsearch/blob/v9.0.1/server/src/main/java/org/elasticsearch/action/support/single/shard/TransportSingleShardAction.java | ||
| [Transport]:https://github.com/elastic/elasticsearch/blob/v9.0.1/server/src/main/java/org/elasticsearch/transport/Transport.java | ||
| [EventLoop]:https://github.com/netty/netty/blob/4.2/transport/src/main/java/io/netty/channel/EventLoop.java | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might (also/instead) be worth linking https://www.elastic.co/docs/reference/elasticsearch/configuration-reference/networking-settings#modules-network-threading-model
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
b9ecf19    to
    21f95b1      
    Compare
  
    | @zhubotang-wq not a big deal, but we try to avoid force-pushes if possible, because it can erase PR history. | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left some style nits
| > the `Transport*Action` class that handles it. | ||
| > | ||
| > A netty [EventLoop] thread handles the initial steps of a Rest*Action request lifecycle such as decoding, validation and routing. | ||
| > Upon entry into the "transport layer", [NodeClient] delegates the decision of execution to individual [TransportAction]. Each action | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Personally, I prefer not to make every mention of a class name a code link. Maybe use a link the first time a class is mentioned, if you think it's helpful to go look at some particular piece of code for additional information.
When I see a link, I think I should click on it, and then I'm disappointed when it's redundant and I have too many tabs open (the last part is my own problem I suppose 😌).
Instead of highlighting text with a hyperlink, you can use back ticks (``) to emphasize that it's a code name. I'd replace all your new text's [TransportAction] with TransportAction, since there's already a link reference above the new text. Same for [ActionType]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed. Dialed down the use of code links more sparingly as to not distract readers.
| Pinging @elastic/es-distributed-coordination (Team:Distributed Coordination) | 
…l_architecture_threading_models
| 
 Ack. This meant is always preferable to do  So we use git merge to preserve SHAs of commits during the code review phases, then during the PR merge back, we squash the commit history into a single commit, thus leave upstream with a clean history? | 
Co-authored-by: Dianna Hohensee <[email protected]>
Co-authored-by: Dianna Hohensee <[email protected]>
Co-authored-by: Dianna Hohensee <[email protected]>
Co-authored-by: Dianna Hohensee <[email protected]>
| 
 Yes, merge main into the work branch however many times you need, and then hitting the merge button on the PR will handle squashing / rebasing all the work before the commit. When you hit the merge button, you'll be asked for the final commit message. | 
| Could use an approval to get this doc update merged. Thanks! | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM thanks @zhubotang-wq
This PR is based on discussion #100229 and #100162 .
Additional thread model description is added to the "transport layer" section.