fix: Ensure addons with before_compute are created before compute resources #3529

vincentmrg · 2025-09-25T10:23:02Z

Description

Fix issue described here: #3525

Motivation and Context

Addons with before_compute = true should be applied before any compute resources.

The modules module.eks_managed_node_group and module.self_managed_node_group should include a depends_on on the aws_eks_addon.before_compute resources.

Breaking Changes

No breaking change

How Has This Been Tested?

I have updated at least one of the examples/* to demonstrate and validate my change(s)
I have tested and validated these changes using one or more of the provided examples/* projects

I have executed pre-commit run -a on my pull request

…ources

bryantbiggs · 2025-09-25T12:10:21Z

This is incorrect and will cause disruptions for users

vincentmrg · 2025-09-25T15:33:32Z

I tested it myself without finding any issues.
Perhaps I missed something. Is it possible to elaborate a little on how this proposal is incorrect and could cause disruptions for users?
The issue that led to this PR remains relevant; it would be nice to be able to move forward on the issue to help continue improving this module.

bryantbiggs · 2025-09-25T15:40:41Z

I tested it myself without finding any issues.

here you said its difficult to reproduce, so how do you know this resolves anything?

Is it possible to elaborate a little on how this proposal is incorrect and could cause disruptions for users?

See hashicorp/terraform#30340

The issue that led to this PR remains relevant; it would be nice to be able to move forward on the issue to help continue improving this module.

Not saying the module is not without imperfections; there are scenarios where the Terraform language simply does not support the desired outcome. However, a module that is seeing over a million downloads a week I would expect to see more users reporting issues if there is truly an issue. I don't see that, I just see this one edge case that is not very well defined and eagerly jumping to a code solution that has known defects. and that is ultimately why the PR is not valid - it would cause more harm than good

vincentmrg · 2025-09-25T16:42:05Z

I'm fully aware that this module is widely used and that caution is obviously required.
I opened this PR to illustrate a possible solution to the issue #3525, which contains more details about the problem encountered.

The problem occurs when nodegroups are deployed before the vpc-cni cluster_addon.
When this happens, nodes from nodegroups never go into Ready, remaining in this state indefinitely:

 # CNI should be installed before worker nodes to prevent the following error on nodes:
KubeletNotReady              container runtime network not ready: NetworkReady=false reason: NetworkPluginNotReady message:Network plugin returns error: cni plugin not initialized

#3525 (comment) you said its difficult to reproduce, so how do you know this resolves anything?

This ensures that addons deployed with before_compute = true are deployed before any compute resource.

Below is a graph of the module before this commit

And here the same graph after the commit

We can clearly see from the graphs that aws_eks_addon.before_compute resources are not guaranteed to be deployed before compute resources (module.self_managed_node_group and module.eks_managed_node_group, where my proposal allows it.

bryantbiggs · 2025-09-25T17:01:54Z

This ensures that addons deployed with before_compute = true are deployed before any compute resource.

Yes, but it also guarantees that any changes to addons cause *ALL compute modules to be re-computed and cause disruptions. This is why we have taken more of a "best effort" route that is not disruptive

terraform-aws-eks/main.tf

Lines 799 to 803 in de2aa10

    
           depends_on = [ 
        
             module.fargate_profile, 
        
             module.eks_managed_node_group, 
        
             module.self_managed_node_group, 
        
           ]

If you have compute already deployed, it will not work - its not designed for this scenario. If you are creating a new cluster from scratch, it should work 99/100 times

fix: ensure addons with before_compute are created before compute res…

82c7939

…ources

vincentmrg mentioned this pull request Sep 25, 2025

cluster_addons before_compute does not guarantee addons are created before compute resources #3525

Open

1 task

vincentmrg changed the title ~~fix: ensure addons with before_compute are created before compute resources~~ fix: Ensure addons with before_compute are created before compute resources Sep 25, 2025

bryantbiggs closed this Sep 25, 2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

fix: Ensure addons with before_compute are created before compute resources #3529

fix: Ensure addons with before_compute are created before compute resources #3529

Uh oh!

vincentmrg commented Sep 25, 2025 •

edited

Loading

Uh oh!

bryantbiggs commented Sep 25, 2025

Uh oh!

vincentmrg commented Sep 25, 2025

Uh oh!

bryantbiggs commented Sep 25, 2025

Uh oh!

vincentmrg commented Sep 25, 2025

Uh oh!

bryantbiggs commented Sep 25, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Uh oh!

fix: Ensure addons with before_compute are created before compute resources #3529

fix: Ensure addons with before_compute are created before compute resources #3529

Uh oh!

Conversation

vincentmrg commented Sep 25, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Description

Motivation and Context

Breaking Changes

How Has This Been Tested?

Uh oh!

bryantbiggs commented Sep 25, 2025

Uh oh!

vincentmrg commented Sep 25, 2025

Uh oh!

bryantbiggs commented Sep 25, 2025

Uh oh!

vincentmrg commented Sep 25, 2025

Uh oh!

bryantbiggs commented Sep 25, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

vincentmrg commented Sep 25, 2025 •

edited

Loading