[DOCS] Update documentation for index sorting and routing for logsdb #120721
Pinging @elastic/es-docs (Team:Docs)
Pinging @elastic/es-storage-engine (Team:StorageEngine)
hi @kkrik-es, thanks for these additions. I need to do an edit for clarity and style before you merge this, but I think it makes sense for @martijnvg to review first. I'll do an edit as soon as I can after that.
LGTM
Please do, thanks!
OK, I added some suggestions and comments. I'm happy to commit the edits directly to your branch if that's easier.
If I've introduced inaccuracies, please correct me. 🙂 And feel free to adjust line breaks etc. 👍
NOTE: If `host.name` is injected and `subobjects` is set to `true` (default), the `host` field is mapped as an object
field named `host` with a `name` child field of type `keyword`. If `subobjects` is set to `false`, a single
`host.name` field is mapped as a `keyword` field.
If you use the restructuring I suggested in the preceding lines, delete this note. I folded it into my suggestion above.
Using a custom sort configuration is required to minimize the possibility of creating hotspots, in case of a
logging spike producing documents that all get routed to a single shard. To prevent this, and to improve storage
efficiency, it is recommended to use a few fields that have a rather low cardinality and don't co-vary
(e.g. `host.name` and `host.id` are likely a bad choice).
Logging spikes can cause hotspots by producing documents that all get routed to a single
shard. To prevent hotspots and improve storage efficiency, your configuration should use a few sort fields that have a relatively low cardinality and don't co-vary (for example, `host.name` and `host.id` are not optimal).
The logic requires a custom sort config to reduce the likelihood of hotspots, as opposed to working with the default sort config. I think the updated text (and my version, possibly) missed this part. Maybe we can clarify this better?
Ah, thanks, I see what you mean. I'd suggest losing the logging spikes sentence -- WDYT of this?
A custom sort configuration is required, to minimize hotspots and improve storage efficiency. For best results, use a few sort fields that have a relatively low cardinality and don't co-vary
(for example, `host.name` and `host.id` are not optimal).
Sounds good, though it'd be nice to explain what leads to hotspots - I don't think this is mentioned elsewhere in this page. Another possibility is to include such a note above, where we describe the option for custom sort config.
OK one more try 🙂
A custom sort configuration is required, to improve storage efficiency and to
minimize hotspots from logging spikes that route documents to a single shard.
For best results, use a few sort fields that have a relatively low cardinality and
don't co-vary (for example, `host.name` and `host.id` are not optimal).
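
For readers following the hotspot discussion, here is a toy Python sketch of the reasoning. It is not Elasticsearch's routing code: the `route` helper, the SHA-256 hash, the shard count, and the `service.name` field are all assumptions made for illustration. It only shows that if documents are routed on a hash of their sort field values, a pair of co-varying fields gives a spike from one host a single routing key, whereas a field that varies independently gives several.

```python
import hashlib
from collections import Counter

NUM_SHARDS = 4  # hypothetical shard count, chosen for illustration


def route(doc, sort_fields, num_shards=NUM_SHARDS):
    """Toy routing: hash the sort field values, then take modulo shard count.

    This is an illustrative stand-in, not the algorithm Elasticsearch uses.
    """
    key = "|".join(str(doc[field]) for field in sort_fields)
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def shard_counts(docs, sort_fields):
    """Count how many documents land on each shard."""
    return Counter(route(doc, sort_fields) for doc in docs)


# A logging spike: one host emits 1000 documents from 8 services.
spike = [
    {"host.name": "web-1", "host.id": "id-web-1", "service.name": f"svc-{i % 8}"}
    for i in range(1000)
]

# host.name and host.id co-vary, so every document has the same routing key
# and the whole spike lands on one shard.
covarying = shard_counts(spike, ["host.name", "host.id"])

# service.name varies independently of host.name, giving 8 distinct routing
# keys, so the spike can spread over more than one shard.
independent = shard_counts(spike, ["host.name", "service.name"])

print("co-varying fields:", dict(covarying))
print("independent fields:", dict(independent))
```

How evenly the second case spreads depends on the hash values, so the sketch makes no claim about the exact distribution.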
Co-authored-by: Marci W <[email protected]>
LGTM and thanks for applying my changes 👍
💚 Backport successful
…lastic#120721) * [DOCS] Update documentation for index sorting and routing for logsdb * update * Apply suggestions from code review Co-authored-by: Marci W <[email protected]> * Update logs.asciidoc * Update docs/reference/data-streams/logs.asciidoc Co-authored-by: Marci W <[email protected]> * Update logs.asciidoc --------- Co-authored-by: Marci W <[email protected]>
…120721) (#120904) * [DOCS] Update documentation for index sorting and routing for logsdb * update * Apply suggestions from code review * Update logs.asciidoc * Update docs/reference/data-streams/logs.asciidoc * Update logs.asciidoc --------- Co-authored-by: Marci W <[email protected]>
Related to #109334