Note in ECK autoscaling docs how cpu/ram is scaled. #1029
Signed-off-by: Michael Montgomery <[email protected]>
- ECK can leverage the [autoscaling API](https://www.elastic.co/docs/api/doc/elasticsearch/group/endpoint-autoscaling) introduced in Elasticsearch 7.11 to adjust automatically the number of Pods and the allocated resources in a tier. Currently, autoscaling is supported for Elasticsearch [data tiers](/manage-data/lifecycle/data-tiers.md) and machine learning nodes.
+ ECK can leverage the [autoscaling API](https://www.elastic.co/docs/api/doc/elasticsearch/group/endpoint-autoscaling) introduced in Elasticsearch 7.11 to adjust automatically the number of Pods and the allocated resources in a tier. Currently, autoscaling is supported for Elasticsearch [data tiers](/manage-data/lifecycle/data-tiers.md) and machine learning nodes. ECK scales Elasticsearch data and machine learning tiers exclusively by scaling storage. CPU and Memory are scaled *relative* to the storage resource min/max settings, and not independently.
"ECK scales Elasticsearch data and machine learning tiers exclusively by scaling storage"
I don't think this is true, at least for the ML and frozen tiers for which ES returns memory requirements: https://www.elastic.co/guide/en/elasticsearch/reference/current/autoscaling-deciders.html
Thank you @barkbay. I have read through the docs, and I have updated appropriately.
"ECK scales Elasticsearch data tiers exclusively by scaling storage."
Maybe I'm missing something, but I think this is still not true. The frozen tier is scaled based on both storage and memory requirements:
- Frozen shards decider: Estimates required memory capacity based on the number of partially mounted shards. Available for policies governing frozen data nodes.
- Frozen storage decider: Estimates required storage capacity as a percentage of the total data set of partially mounted indices. Available for policies governing frozen data nodes.
@barkbay I had noted the Frozen tier at the end of this paragraph, but I've updated this again to try to clarify. If you have a suggestion for making this more clear (maybe a table would help?), I'm up for suggestions.
- ECK can leverage the [autoscaling API](https://www.elastic.co/docs/api/doc/elasticsearch/group/endpoint-autoscaling) introduced in Elasticsearch 7.11 to adjust automatically the number of Pods and the allocated resources in a tier. Currently, autoscaling is supported for Elasticsearch [data tiers](/manage-data/lifecycle/data-tiers.md) and machine learning nodes.
+ ECK can leverage the [autoscaling API](https://www.elastic.co/docs/api/doc/elasticsearch/group/endpoint-autoscaling) introduced in Elasticsearch 7.11 to adjust automatically the number of Pods and the allocated resources in a tier. Currently, autoscaling is supported for Elasticsearch [data tiers](/manage-data/lifecycle/data-tiers.md) and machine learning nodes. ECK scales Elasticsearch data tiers (excluding frozen tiers) exclusively by scaling storage. CPU and Memory are scaled *relative* to the storage resource min/max settings, and not independently in data tiers (again excluding frozen tiers). ECK can scale memory and CPU on ML tiers if specified in the `ElasticsearchAutoscaler.spec`. On Frozen tiers ECK can scale memory if specified in the `ElasticsearchAutoscaler.cpu`, but will scale CPU in relation to the storage.
I wonder if we should consider the resource types returned for each tier as an implementation detail. It feels like we are duplicating the Elasticsearch documentation, which already explains what type of resources are estimated: https://www.elastic.co/guide/en/elasticsearch/reference/current/autoscaling-deciders.html
Instead we could explain how missing resources are calculated by the operator:
- ECK can leverage the [autoscaling API](https://www.elastic.co/docs/api/doc/elasticsearch/group/endpoint-autoscaling) introduced in Elasticsearch 7.11 to adjust automatically the number of Pods and the allocated resources in a tier. Currently, autoscaling is supported for Elasticsearch [data tiers](/manage-data/lifecycle/data-tiers.md) and machine learning nodes. ECK scales Elasticsearch data tiers (excluding frozen tiers) exclusively by scaling storage. CPU and Memory are scaled *relative* to the storage resource min/max settings, and not independently in data tiers (again excluding frozen tiers). ECK can scale memory and CPU on ML tiers if specified in the `ElasticsearchAutoscaler.spec`. On Frozen tiers ECK can scale memory if specified in the `ElasticsearchAutoscaler.cpu`, but will scale CPU in relation to the storage.
+ ECK can leverage the [autoscaling API](https://www.elastic.co/docs/api/doc/elasticsearch/group/endpoint-autoscaling) introduced in Elasticsearch 7.11 to adjust automatically the number of Pods and the allocated resources in a tier. Currently, autoscaling is supported for Elasticsearch [data tiers](/manage-data/lifecycle/data-tiers.md) and machine learning nodes. Required resources for each tier are estimated by [Elasticsearch deciders](https://www.elastic.co/guide/en/elasticsearch/reference/current/autoscaling-deciders.html). Deciders may return required CPU, memory or storage capacity. If a resource type is missing in the decider's output, it is inferred relative to the others. For example, if a decider does not return a memory requirement, then memory is calculated proportionally to the required amount of storage returned by the decider. The same goes for CPU, which is inferred from memory if it is absent from the decider's result.
I think we can do what @barkbay suggests. But I also think we should call out in very simple language what is actually supported or not supported today. I know this would be duplicating some of the content from the Elasticsearch docs on deciders but it can be a bit confusing to read the decider docs. Do all of them apply, which ones do not apply?
"ECK can scale memory and CPU on ML tiers if specified in the `ElasticsearchAutoscaler.spec`. On Frozen tiers ECK can scale memory if specified in the `ElasticsearchAutoscaler.cpu`"
I am struggling to parse this wording. What are we trying to say here? Why can ECK scale memory when you specify what? What is `ElasticsearchAutoscaler.cpu`?
This was a typo. It was intended to be `ElasticsearchAutoscaler.spec`. ECK can scale memory in frozen tiers according to what's returned by the ES deciders if specified; otherwise it will scale it in relation to storage.
This isn't the most straightforward thing to understand from a customer standpoint, as each tier has its own set of supported options. Would a table showing the available options for each tier be clearer than the wording we're suggesting, @barkbay @pebrc?
Sorry for the lag in answering, not sure myself what would be the best option. I tend to think that https://www.elastic.co/guide/en/elasticsearch/reference/current/autoscaling-deciders.html should be improved as I think most of the readers are interested in which resources are estimated for each tier, not really about a list of the available deciders which should be an implementation detail.
We can use a table, maybe something along the lines of:
| | Storage | Memory | CPU |
|---|---|---|---|
| Data Nodes (except Frozen) | Yes | Calculated proportionally to the required amount of storage | Calculated proportionally to the required amount of memory | 
| Frozen Nodes | Yes | Yes | Calculated proportionally to the required amount of memory | 
| Machine Learning | No | Yes | Calculated proportionally to the required amount of memory | 
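The "calculated proportionally" cells in the table above can be illustrated with a small sketch. It assumes linear interpolation between the min/max settings of a policy and clamps to those bounds; the operator's exact formula and rounding are not stated in this thread, and the names `Range`, `scale_proportionally`, and `fill_missing` are hypothetical, not part of ECK.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Range:
    """Min/max bounds for one resource (GiB for memory/storage, cores for CPU)."""
    min: float
    max: float


def scale_proportionally(value: float, source: Range, target: Range) -> float:
    """Map `value` from the source range onto the target range linearly.

    Assumption of this sketch: the result is clamped to the target bounds.
    """
    if source.max == source.min:
        # Degenerate source range: falling back to the target minimum is an
        # arbitrary choice made for this illustration.
        return target.min
    ratio = (value - source.min) / (source.max - source.min)
    ratio = max(0.0, min(1.0, ratio))
    return target.min + ratio * (target.max - target.min)


def fill_missing(required: dict, limits: dict) -> dict:
    """Fill in resources that a decider did not return.

    Memory is derived from storage, then CPU is derived from memory,
    mirroring the order described in the table.
    """
    result = dict(required)
    if "memory" not in result and "storage" in result:
        result["memory"] = scale_proportionally(
            result["storage"], limits["storage"], limits["memory"]
        )
    if "cpu" not in result and "memory" in result:
        result["cpu"] = scale_proportionally(
            result["memory"], limits["memory"], limits["cpu"]
        )
    return result


# Illustrative min/max settings for one autoscaling policy (made-up values).
limits = {
    "storage": Range(64.0, 512.0),
    "memory": Range(2.0, 16.0),
    "cpu": Range(2.0, 8.0),
}

# Data tier (not frozen): the decider returns storage only.
data_tier = fill_missing({"storage": 288.0}, limits)

# Machine learning tier: the decider returns memory only, so only CPU is derived.
ml_tier = fill_missing({"memory": 16.0}, limits)
```

With these made-up limits, a storage requirement halfway through its range yields memory and CPU halfway through theirs (9 GiB and 5 cores), while the ML case derives only CPU and leaves storage unset.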
As a side note, I just realized that https://www.elastic.co/docs/deploy-manage/autoscaling/autoscaling-in-ece-and-ech does not mention the frozen tier case, so maybe you were right in the beginning and it's okay not to be that specific 🤷
I personally think the table format is much clearer than a wall of text. I'll update this and we can review further. ty!
LGTM
LGTM
Co-authored-by: Peter Brachwitz <[email protected]>
LGTM

From the existing ECK docs, it sounds as though CPU and RAM can be independently scaled with a "decider" from Elasticsearch, but no such decider exists. This change makes it clear that CPU and RAM are scaled relative to the storage min/max settings.