🐛 Feature request: improve authentication error diagnostics (IRSA vs Pod Identity) #2678

@gecube

Description

Hello team,

I encountered the following error when running the CloudWatch Logs controller:

{"level":"error","ts":"2025-11-09T13:35:17.149Z","logger":"setup","msg":"Unable to create controller manager","aws.service":"cloudwatchlogs","error":"unable to determine account ID: unable to get caller identity: operation error STS: GetCallerIdentity, get identity: get credentials: failed to refresh cached credentials, failed to load credentials, : [43250e5c-8c9c-4fe8-af41-34d48103435b]: (AccessDeniedException): Unauthorized Exception! EKS does not have permissions to assume the associated role., fault: client","stacktrace":"main.main\n\t/github.com/aws-controllers-k8s/cloudwatchlogs-controller/cmd/controller/main.go:77\nruntime.main\n\t/usr/local/go/src/runtime/proc.go:285"}

What actually happened

The error message does not indicate which authentication mechanism failed.
In my case, the ServiceAccount had an IRSA annotation, but the controller was actually using EKS Pod Identity under the hood.
The IAM role's trust policy did not include a trust relationship for Pod Identity, which resulted in the AccessDeniedException.

Because of this, I initially spent time debugging IRSA, while the actual problem was with Pod Identity being used instead.

Why this matters

Debugging such issues (IRSA vs Pod Identity) is quite painful and time-consuming.
The current error message only says “EKS does not have permissions to assume the associated role,” but doesn’t clarify which authentication mechanism was in use.

Having that context would immediately point users in the right direction.

Expected behavior

It would be great if the controller:

  1. Explicitly logged which authentication mechanism is being used (e.g. "Using Pod Identity", "Using IRSA (web identity)"), ideally at the INFO level during startup.
  2. Included hints in the error message on where to look when it receives an AccessDeniedException:
    • For Pod Identity: check ServiceAccount ↔ Pod Identity Association and IAM role trust policy.
    • For IRSA: check eks.amazonaws.com/role-arn annotation, OIDC provider, and trust policy.
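To illustrate the startup log requested above, here is a minimal sketch in Go using only the standard library. It is not the controller's actual code. It assumes the mechanism can be inferred from the environment variables that EKS injects into the pod: `AWS_ROLE_ARN` and `AWS_WEB_IDENTITY_TOKEN_FILE` for IRSA, and `AWS_CONTAINER_CREDENTIALS_FULL_URI` and `AWS_CONTAINER_AUTHORIZATION_TOKEN_FILE` for Pod Identity. The names `Mechanism`, `detectMechanisms`, and `describe` are hypothetical, and the sketch reports every mechanism it finds rather than guessing which one the SDK will pick.

```go
package main

import (
	"log"
	"os"
	"strings"
)

// Mechanism names a credential source hinted at by the pod's environment.
// Hypothetical type, introduced only for this sketch.
type Mechanism string

const (
	MechanismStaticEnv   Mechanism = "static credentials (env)"
	MechanismIRSA        Mechanism = "IRSA (web identity)"
	MechanismPodIdentity Mechanism = "EKS Pod Identity"
)

// detectMechanisms returns every mechanism whose environment variables are
// set. It is a heuristic: it does not decide which source the SDK will use,
// and the container-credentials variables are not exclusive to Pod Identity.
func detectMechanisms(getenv func(string) string) []Mechanism {
	var found []Mechanism
	if getenv("AWS_ACCESS_KEY_ID") != "" && getenv("AWS_SECRET_ACCESS_KEY") != "" {
		found = append(found, MechanismStaticEnv)
	}
	if getenv("AWS_ROLE_ARN") != "" && getenv("AWS_WEB_IDENTITY_TOKEN_FILE") != "" {
		found = append(found, MechanismIRSA)
	}
	if getenv("AWS_CONTAINER_CREDENTIALS_FULL_URI") != "" &&
		getenv("AWS_CONTAINER_AUTHORIZATION_TOKEN_FILE") != "" {
		found = append(found, MechanismPodIdentity)
	}
	return found
}

// describe renders the detection result as a single log line.
func describe(found []Mechanism) string {
	if len(found) == 0 {
		return "no credential environment variables detected; the SDK may fall back to shared config or instance metadata"
	}
	names := make([]string, len(found))
	for i, m := range found {
		names[i] = string(m)
	}
	return "credential mechanisms present in environment: " + strings.Join(names, ", ")
}

func main() {
	// The real controller uses a structured logger; the standard log
	// package stands in for it here.
	log.Println(describe(detectMechanisms(os.Getenv)))
}
```

Listing all detected mechanisms would have surfaced the situation described in this issue, where IRSA and Pod Identity configuration coexist on the same workload.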

Suggested improvements

  • Add an explicit log message showing which credential provider/mechanism is active (IRSA / Pod Identity / static / env).
  • In the STS GetCallerIdentity error handler, enrich the error text with a note about the active mechanism and recommended checks (trust policy, associations, etc.).
  • (Optional) Expose this information via metrics or /healthz endpoint for quick diagnostics.
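The error enrichment suggested above could look roughly like the following sketch. It is an illustration under assumptions, not the controller's implementation: `enrichAuthError`, `isAccessDenied`, the `hints` table, and the mechanism keys `pod-identity` and `irsa` are all hypothetical names. The access-denied check first looks for any wrapped error exposing an `ErrorCode() string` method (the shape of the `smithy.APIError` interface used by the AWS SDK for Go v2) and then falls back to matching the message text, since it is not confirmed here how the credentials error is wrapped.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// hints maps a (hypothetical) mechanism key to the checks listed in this issue.
var hints = map[string]string{
	"pod-identity": "check the ServiceAccount to Pod Identity Association and the IAM role trust policy",
	"irsa":         "check the eks.amazonaws.com/role-arn annotation, the OIDC provider, and the IAM role trust policy",
}

// Sample errors used by main for demonstration only.
var (
	errSample    = errors.New("(AccessDeniedException): Unauthorized Exception! EKS does not have permissions to assume the associated role.")
	errUnrelated = errors.New("dial tcp: i/o timeout")
)

// isAccessDenied reports whether err looks like an access-denied failure.
func isAccessDenied(err error) bool {
	// Prefer a structured error code when one is available in the chain.
	var coded interface{ ErrorCode() string }
	if errors.As(err, &coded) && strings.HasPrefix(coded.ErrorCode(), "AccessDenied") {
		return true
	}
	// Fallback: crude text match on the rendered message.
	return strings.Contains(err.Error(), "AccessDenied")
}

// enrichAuthError appends the active mechanism and a hint to access-denied
// errors. Other errors (and nil) are returned unchanged.
func enrichAuthError(err error, mechanism string) error {
	if err == nil || !isAccessDenied(err) {
		return err
	}
	hint, ok := hints[mechanism]
	if !ok {
		hint = "check the IAM role trust policy for the credential source in use"
	}
	// %w keeps the original error reachable via errors.Is / errors.As.
	return fmt.Errorf("%w [credential mechanism: %s; hint: %s]", err, mechanism, hint)
}

func main() {
	fmt.Println(enrichAuthError(errSample, "pod-identity"))
	fmt.Println(enrichAuthError(errUnrelated, "irsa"))
}
```

Wrapping with `%w` rather than replacing the error keeps the original STS failure intact for callers that inspect it, while the appended text carries the mechanism and the recommended checks.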

Rationale

This would make debugging IAM-related startup issues much faster and prevent users from “chasing” the wrong mechanism — especially when IRSA and Pod Identity configurations coexist.

Thank you!
