This repository was archived by the owner on Jun 27, 2018. It is now read-only.

Fix ECS memory reservation. #78

Open

itsdalmo wants to merge 2 commits into master from ecs_memory_reservation

Conversation

@itsdalmo
Contributor

While moving a copy of the ECS modules to a new repository, I discovered that the container definitions used for ecs/service specify memory rather than memoryReservation. I believe this was a mistake from mixing up the documentation for a task definition and a container definition (which is a part of a task definition):

  • See task definition docs here and here.
  • See container definition docs here.

Currently, task_definition_ram sets memory in the container definition, which imposes a hard limit on the amount of memory a running container can consume; if the limit is exceeded, the container is killed. I don't think this is the intended behaviour for any users of this module, and this PR aims to fix that.

Having read the documentation, I believe what we want is the following:

  • memoryReservation specified in the container definition, so that we reserve some memory but allow the container to exceed the reservation.
  • memory and cpu specified in the task definition. The documentation does not give a lot of details about what this actually does (outside fargate), but I presume it is used by ECS to work out container placement in the cluster?
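A minimal sketch of the proposed layout, as a RegisterTaskDefinition JSON payload (names and values are illustrative, not taken from the PR diff; note that task-level cpu and memory are strings in that API, while the container-level memoryReservation is an integer in MiB):

```json
{
  "family": "example-service",
  "cpu": "256",
  "memory": "512",
  "containerDefinitions": [
    {
      "name": "app",
      "image": "example/app:latest",
      "memoryReservation": 256
    }
  ]
}
```

Here the container reserves 256 MiB for scheduling purposes but may exceed it, up to whatever task-level or instance-level memory is available.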

By not specifying memory on the container definition, we run the risk of having leaky applications eat up all the memory. Personally, I think it's better to deal with the underlying problem (fix the memory leak) than to have a hard limit on memory which restarts the application whenever it reaches that limit. Not sure if there are any other uses for this parameter?

PS: This should be tested before it is merged, and I was hoping that perhaps @neugeeug could test it on the NEO dev/staging environment?

@colincoleman
Contributor

I did this for the original xqb terraform scripts, and it is probably an improvement for most of the cases we see. However, both methods have their benefits. Memory reservation is much better for the use case where an executable has been wrapped in a Docker container for ease of deployment and you want the service to run as well as it can.
On the other hand, if the use case is something that runs nicely in parallel (parallel build workers etc.), then being able to specify a memory limit that is a whole fraction of a total instance gives you the best chance of maximising the use of your resources.
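As an illustration of that packing argument (hypothetical names and sizes, not from this repository), a worker container with a hard limit that divides the instance memory evenly:

```json
{
  "name": "build-worker",
  "image": "example/worker:latest",
  "memory": 2048
}
```

With a hard memory limit of 2048 MiB, roughly four such containers can be packed onto an instance with 8 GiB of RAM (slightly fewer in practice, since the ECS agent and OS reserve some memory), and no single worker can starve the others.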

@itsdalmo
Contributor Author

itsdalmo commented Jun 20, 2018

On the other hand if the use case is for something that runs nicely in parallel (parallel build workers etc) then being able to specify a memory limit that is a whole fraction of a total instance gives you the best chance of maximising the use of your resources

But the memory parameter in the container definition will outright kill the container if it exceeds the limit? If we use that parameter on all our tasks, we'll have to make sure that the memory for everything deployed to the cluster adds up to the total memory of the cluster, or we'll be enforcing under-utilisation?

Relevant quote from the documentation:

The hard limit (in MiB) of memory to present to the container. If your container attempts to exceed the memory specified here, the container is killed. This parameter maps to Memory in the Create a container section of the Docker Remote API and the --memory option to docker run.

Edit:
Another relevant quote:

For containers that will be part of a task using the EC2 launch type, you must specify a non-zero integer for one or both of memory or memoryReservation in container definitions. If you specify both, memory must be greater than memoryReservation. If you specify memoryReservation, then that value is subtracted from the available memory resources for the container instance on which the container is placed; otherwise, the value of memory is used.

With the last quote in mind, I think we can skip the task-level cpu and memory reservations. The cpu and memoryReservation of the container definition (of which there is only one) will be enough for ECS to schedule containers correctly, and both are soft limits that also try to balance usage between multiple containers on the same instance.

Last edit:
Or we can do it the other way around and only use task-level limits, since we only have a single container definition. I think those parameters equal cpu and memoryReservation in the container definition.
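A sketch of that alternative (illustrative names and values): when a task-level memory value is specified, the container-level memory and memoryReservation become optional, so the single container can be left without its own limits:

```json
{
  "family": "example-service",
  "cpu": "256",
  "memory": "512",
  "containerDefinitions": [
    {
      "name": "app",
      "image": "example/app:latest"
    }
  ]
}
```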

@colincoleman
Contributor

For Java programs you can control the total memory either in the startup parameters or in the manifest file. When set, Java regulates its garbage collection to stay under the limit you choose. (Of course, if you set this too low, the program performs terribly!)
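To illustrate the startup-parameter approach in the context of this PR (hypothetical image and values), a JVM heap cap can be paired with a soft ECS reservation so the container announces a realistic footprint without a hard kill limit:

```json
{
  "name": "java-app",
  "image": "example/java-app:latest",
  "memoryReservation": 512,
  "command": ["java", "-Xmx384m", "-jar", "/app.jar"]
}
```

The -Xmx flag caps the heap at 384 MiB, leaving headroom within the 512 MiB reservation for the JVM's non-heap memory (metaspace, threads, native buffers).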

