EIP association fails with AuthFailure #122

@josh-abram

Bug Report

The fck-nat service fails to associate an Elastic IP during instance startup with Client.AuthFailure: You do not have permission to access the specified resource. However, the exact same AWS CLI command with the same parameters succeeds when run manually minutes later.

We've been able to reproduce this on several occasions.

Note: AWS account IDs, resource IDs, and IP addresses in this report have been sanitized.

Environment

  • Module version: 1.4.0
  • AWS Region: ap-southeast-2
  • Instance type: t4g.nano
  • HA mode: false
  • Terraform AWS Provider: 6.13.0

Configuration

module "fck_nat" {
  source  = "RaJiska/fck-nat/aws"
  version = "1.4.0"

  name      = "example-fck-nat"
  vpc_id    = var.vpc_id
  subnet_id = var.public_subnet_id
  
  instance_type      = "t4g.nano"
  ha_mode            = false
  eip_allocation_ids = [aws_eip.ecs_eip.allocation_id]
  
  update_route_tables = false
}

Observed Behavior

At instance startup (2025-10-13 08:08:38Z) - from CloudTrail:

{
    "eventName": "AssociateAddress",
    "eventTime": "2025-10-13T08:08:38Z",
    "errorCode": "Client.AuthFailure",
    "errorMessage": "You do not have permission to access the specified resource.",
    "requestParameters": {
        "allocationId": "eipalloc-01234567890abcdef",
        "networkInterfaceId": "eni-01234567890abcdef",
        "allowReassociation": true
    },
    "userIdentity": {
        "type": "AssumedRole",
        "arn": "arn:aws:sts::123456789012:assumed-role/example-fck-nat/i-0123456789abcdef0"
    }
}

Service logs:

Oct 13 08:08:35 fck-nat.sh[1674]: Found eip_id configuration, associating eipalloc-01234567890abcdef...
Oct 13 08:08:38 fck-nat.sh[1685]: An error occurred (AuthFailure) when calling the AssociateAddress operation: You do not have permission to access the specified resource.

When run manually minutes later - same instance, same role, same parameters:

$ aws ec2 associate-address \
    --region ap-southeast-2 \
    --allocation-id eipalloc-01234567890abcdef \
    --network-interface-id eni-01234567890abcdef \
    --allow-reassociation

{
    "AssociationId": "eipassoc-01234567890abcdef"
}

Success - exact same parameters, exact same IAM role

IAM Policy (created by module)

{
    "Statement": [
        {
            "Sid": "ManageEIPAllocation",
            "Effect": "Allow",
            "Action": ["ec2:DisassociateAddress", "ec2:AssociateAddress"],
            "Resource": "arn:aws:ec2:ap-southeast-2:123456789012:elastic-ip/eipalloc-01234567890abcdef"
        },
        {
            "Sid": "ManageEIPNetworkInterface",
            "Effect": "Allow",
            "Action": ["ec2:DisassociateAddress", "ec2:AssociateAddress"],
            "Resource": "arn:aws:ec2:ap-southeast-2:123456789012:network-interface/*",
            "Condition": {
                "StringEquals": {
                    "ec2:ResourceTag/Name": "example-fck-nat"
                }
            }
        },
        {
            "Sid": "ManageNetworkInterface",
            "Effect": "Allow",
            "Action": ["ec2:ModifyNetworkInterfaceAttribute", "ec2:AttachNetworkInterface"],
            "Resource": "*",
            "Condition": {
                "StringEquals": {
                    "ec2:ResourceTag/Name": "example-fck-nat"
                }
            }
        }
    ]
}

What We Verified

  • ✅ IAM policy allows ec2:AssociateAddress on the specific EIP allocation
  • ✅ IAM policy allows ec2:AssociateAddress on network interfaces with tag Name=example-fck-nat
  • ✅ The ENI has the required tag Name=example-fck-nat
  • ✅ CloudTrail confirms the correct ENI and EIP allocation IDs were used
  • ✅ Same command with same IAM role succeeds when run manually after boot
  • ❌ Fails with Client.AuthFailure during fck-nat service startup (16 seconds after IAM role credentials were delivered to the instance)

Timeline

  • 08:08:22Z - IAM role credentials delivered to instance (ec2RoleDelivery: "2.0")
  • 08:08:38Z - AssociateAddress API call fails with Client.AuthFailure (16 seconds after credential delivery)
  • Later - Same API call succeeds when run manually

Expected Behavior

The fck-nat service should successfully associate the EIP during instance startup using the IAM role credentials.

Questions

  1. Why does the AssociateAddress call fail with AuthFailure at startup but succeed minutes later with identical parameters and IAM role?
  2. Is there a known issue with IAM permission evaluation during early instance boot?
  3. Should the fck-nat script include retry logic for AuthFailure errors?
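To make question 3 concrete, here is a minimal sketch of what retry logic could look like. It is not taken from fck-nat.sh: the function name retry_with_backoff and its arguments are hypothetical, and it retries on any non-zero exit status rather than matching AuthFailure specifically. Whether a retry would actually resolve the failure is untested.

```shell
# Hypothetical helper (not part of fck-nat.sh): retry a command with
# exponential backoff.
# Usage: retry_with_backoff MAX_ATTEMPTS INITIAL_DELAY_SECONDS COMMAND [ARGS...]
retry_with_backoff() {
    local max_attempts="$1"
    local delay="$2"
    shift 2
    local attempt=1

    while true; do
        # Run the command exactly as given; stop as soon as it succeeds.
        if "$@"; then
            return 0
        fi

        # Out of attempts: report and propagate the failure.
        if [ "$attempt" -ge "$max_attempts" ]; then
            echo "Giving up after ${attempt} attempts: $*" >&2
            return 1
        fi

        echo "Attempt ${attempt} failed, retrying in ${delay}s..." >&2
        sleep "$delay"
        attempt=$((attempt + 1))
        delay=$((delay * 2))   # double the wait before the next attempt
    done
}

# Example call, using the sanitized IDs from this report (commented out so
# the snippet can be sourced without AWS access):
# retry_with_backoff 5 2 aws ec2 associate-address \
#     --region ap-southeast-2 \
#     --allocation-id eipalloc-01234567890abcdef \
#     --network-interface-id eni-01234567890abcdef \
#     --allow-reassociation
```

With 5 attempts and an initial delay of 2 seconds, the waits are 2, 4, 8, and 16 seconds, so the call would be retried for about 30 seconds before giving up. Those numbers are illustrative only.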
