This repository was archived by the owner on Jun 27, 2018. It is now read-only.

Fix ECS memory reservation. #78

Open

itsdalmo wants to merge 2 commits into master from ecs_memory_reservation

Conversation

@itsdalmo
Contributor

While moving a copy of the ECS modules to a new repository, I discovered that the container definitions used for ecs/service specify memory rather than memoryReservation. I believe this was a mistake from mixing up the documentation for a task definition and a container definition (which is a part of a task definition):

  • See task definition docs here and here.
  • See container definition docs here.

Currently, task_definition_ram sets memory in the container definition, which imposes a hard limit on the amount of memory a running container can consume; if the limit is exceeded, the container is killed. I don't think this is the intended behaviour for any users of this module, and this PR aims to fix that.

Having read the documentation, I believe what we want is the following:

  • memoryReservation specified in the container definition, so that we reserve some memory but allow the container to exceed the reservation.
  • memory and cpu specified in the task definition. The documentation does not give a lot of details about what this actually does (outside fargate), but I presume it is used by ECS to work out container placement in the cluster?
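A minimal sketch of the proposed layout, as a RegisterTaskDefinition JSON payload (names and values are illustrative, not taken from the PR diff; note that task-level cpu and memory are strings in that API, while the container-level memoryReservation is an integer in MiB):

```json
{
  "family": "example-service",
  "cpu": "256",
  "memory": "512",
  "containerDefinitions": [
    {
      "name": "app",
      "image": "example/app:latest",
      "memoryReservation": 256
    }
  ]
}
```

Here the container reserves 256 MiB for scheduling purposes but may exceed it, up to whatever task-level or instance-level memory is available.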

By not specifying memory on the container definition, we run the risk of having leaky applications eat up all the memory. Personally, I think it's better to deal with the underlying problem (fix the memory leak) than to have a hard limit on memory which restarts the application whenever it reaches that limit. Not sure if there are any other uses for this parameter?

PS: This should be tested before it is merged, and I was hoping that perhaps @neugeeug could test it on the NEO dev/staging environment?

@colincoleman
Contributor

I did this for the original xqb terraform scripts, and it is probably an improvement for most of the cases we see. However, both methods have their benefits. Memory reservation is much better for the use case where an executable has been wrapped in a Docker container for ease of deployment and you want the service to run as well as it can.
On the other hand, if the use case is something that runs nicely in parallel (parallel build workers etc.), then being able to specify a memory limit that is a whole fraction of a total instance gives you the best chance of maximising the use of your resources.
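As an illustration of that packing argument (hypothetical names and sizes, not from this repository), a worker container with a hard limit that divides the instance memory evenly:

```json
{
  "name": "build-worker",
  "image": "example/worker:latest",
  "memory": 2048
}
```

With a hard memory limit of 2048 MiB, roughly four such containers can be packed onto an instance with 8 GiB of RAM (slightly fewer in practice, since the ECS agent and OS reserve some memory), and no single worker can starve the others.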

@itsdalmo
Contributor Author

itsdalmo commented Jun 20, 2018

On the other hand if the use case is for something that runs nicely in parallel (parallel build workers etc) then being able to specify a memory limit that is a whole fraction of a total instance gives you the best chance of maximising the use of your resources

But the memory parameter in the container definition will outright kill the container if it exceeds the limit? If we use that parameter on all our tasks, we'll have to make sure that the memory for everything deployed to the cluster adds up to the total memory of the cluster, or we'll be enforcing under-utilisation?

Relevant quote from the documentation:

The hard limit (in MiB) of memory to present to the container. If your container attempts to exceed the memory specified here, the container is killed. This parameter maps to Memory in the Create a container section of the Docker Remote API and the --memory option to docker run.

Edit:
Another relevant quote:

For containers that will be part of a task using the EC2 launch type, you must specify a non-zero integer for one or both of memory or memoryReservation in container definitions. If you specify both, memory must be greater than memoryReservation. If you specify memoryReservation, then that value is subtracted from the available memory resources for the container instance on which the container is placed; otherwise, the value of memory is used.

With the last quote in mind, I think we can skip the task-level cpu and memory reservations. The cpu and memoryReservation of the container definition (of which there is only one) will be enough for ECS to schedule containers correctly, and both are soft limits that also try to balance usage between multiple containers on the same instance.

Last edit:
Or we can do it the other way around and only use task-level limits, since we only have a single container definition. I think those parameters equal cpu and memoryReservation in the container definition.
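A sketch of that alternative (illustrative names and values): when a task-level memory value is specified, the container-level memory and memoryReservation become optional, so the single container can be left without its own limits:

```json
{
  "family": "example-service",
  "cpu": "256",
  "memory": "512",
  "containerDefinitions": [
    {
      "name": "app",
      "image": "example/app:latest"
    }
  ]
}
```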

@colincoleman
Contributor

For Java programs you can control the total memory either in the startup parameters or in the manifest file. When set, Java regulates its garbage collection to stay under the limit you choose. (Of course, if you set this too low, the program performs terribly!)
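To illustrate the startup-parameter approach in the context of this PR (hypothetical image and values), a JVM heap cap can be paired with a soft ECS reservation so the container announces a realistic footprint without a hard kill limit:

```json
{
  "name": "java-app",
  "image": "example/java-app:latest",
  "memoryReservation": 512,
  "command": ["java", "-Xmx384m", "-jar", "/app.jar"]
}
```

The -Xmx flag caps the heap at 384 MiB, leaving headroom within the 512 MiB reservation for the JVM's non-heap memory (metaspace, threads, native buffers).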

