Default container restart policy #9061

EbolaWare · 2022-11-02T23:01:42Z

After growing our grid to a large number of nodes, we have discovered a multitude of problems. This one (like many of our problems) stems from our salt master being extremely busy. With highstates taking excessively long to complete, it can take 2 hours before a sensor shows green again with all of its containers running. Looking for a workaround, a colleague suggested simply running docker start on the containers. That brought another thing to mind: earlier this week, I used a band-aid on some of our zeek containers, sudo docker update --restart unless-stopped <container_name>.

Well, we came up with a faster way: we update all of the containers with that setting right before a restart. Which raises the question: why isn't this setting applied by default in the SLS files' docker_container.running states? I'm updating all of them this week, but it seems like a pretty hefty oversight when a sensor is supposed to stay up. Three minutes can be enough to miss an initial infection vector, let alone however long a highstate takes. I'll be posting some additional configs, like turning swap off in the containers, and turning it off completely on all the nodes. Writing to disk sucks.
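For anyone wanting to try the same band-aid, here is a minimal sketch of applying the policy to every container on a node in one pass. The function name set_restart_policy_all is hypothetical, and the sketch assumes every container on the node should get the same policy and that the calling user can run docker (prefix with sudo otherwise).

```shell
#!/usr/bin/env bash
# Sketch only: set one restart policy on every container Docker knows about.
# Assumption: all containers on this node should share the same policy.

set_restart_policy_all() {
    # Default to the policy used in the post; pass another as $1 if needed.
    local policy="${1:-unless-stopped}"
    local name
    # `docker ps -a` includes stopped containers; --format prints names only.
    docker ps -a --format '{{.Names}}' | while read -r name; do
        [ -n "$name" ] || continue
        docker update --restart "$policy" "$name"
    done
}
```

Calling set_restart_policy_all with no argument applies unless-stopped. This changes only the running configuration; the longer-term equivalent would presumably be the restart_policy argument of Salt's docker_container.running state, which is not shown here.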

Answered by EbolaWare

Nov 8, 2022

dougburks · 2022-11-04T10:38:55Z

Maintainer

Thank you for these observations. Other large grids run without these difficulties, so we suspect that this may be unique to your environment (perhaps due to STIGs, other standards, or Salt customizations).

EbolaWare · 2022-11-08T00:58:06Z

Author

After fighting through some stubborn so-zeek container failures today, I may have found the "why" behind a restart policy not being set on all the containers. On the stubborn cases, a simple docker restart usually just puts all the zeek processes/workers back into a crashed state. In this case, I had to restart the container and then run zeekctl deploy to get them running again.

Perhaps this was the reasoning behind the decision?

I'm hoping to find an interim solution (band-aid), but I have too many options at the moment and won't be in my environment until next week...
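As one possible interim band-aid, the two manual steps described above could be wrapped into a single helper. This is a sketch under assumptions, not a tested fix: the function name restart_zeek is hypothetical, the container is assumed to be named so-zeek, and zeekctl is assumed to be on the PATH inside the container (a full path may be needed instead).

```shell
#!/usr/bin/env bash
# Sketch only: restart the zeek container, then redeploy with zeekctl.
# Assumptions: container name "so-zeek"; zeekctl is on PATH in the container.

restart_zeek() {
    local container="${1:-so-zeek}"
    # Stop here if the restart itself fails; deploy would not help then.
    docker restart "$container" || return 1
    # `zeekctl deploy` reinstalls the configuration and restarts the workers.
    docker exec "$container" zeekctl deploy
}
```

Running restart_zeek with no argument targets so-zeek. It does not address why the workers crash in the first place; it only automates the recovery that had to be done by hand.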
