EKS Pod Identity broken with ipv6 AWS_CONTAINER_CREDENTIALS_FULL_URI #983

@danxmoran

Describe the question/issue

I recently created an EKS auto-mode cluster using IPv6 networking. Because auto-mode enables Pod Identity by default, I'm trying to move all workloads to that approach. Fluent Bit is spamming errors when it tries to write to Firehose, with the message:

[firehose 0] NoCredentialProviders: no valid providers in chain
caused by: EnvAccessKeyNotFound: failed to find credentials in the environment.
SharedCredsLoad: failed to load profile, .
CredentialsEndpointError: invalid endpoint host, \"fd00:ec2::23\", only loopback hosts are allowed.

When I run kubectl get -o yaml on the fluent-bit pod, I see the following in the env vars:

- name: AWS_CONTAINER_CREDENTIALS_FULL_URI
  value: http://[fd00:ec2::23]/v1/credentials
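To make the failure mode concrete, here is a minimal sketch of the kind of host check the error message implies. It is not the plugin's or the SDK's actual code: the function names are hypothetical, and the allowlist is an assumption built from the address in the pod spec plus the commonly documented IPv4 agent addresses. It only shows why a loopback-only check rejects fd00:ec2::23 while a check that also allows the agent addresses accepts it.

```python
import ipaddress
from urllib.parse import urlsplit

# Assumption: addresses of the ECS / EKS Pod Identity credential agents.
# fd00:ec2::23 comes from the pod spec in this issue; the IPv4 entries are
# the commonly documented counterparts.
AGENT_HOSTS = {
    ipaddress.ip_address("169.254.170.2"),   # ECS task credentials endpoint
    ipaddress.ip_address("169.254.170.23"),  # EKS Pod Identity agent (IPv4)
    ipaddress.ip_address("fd00:ec2::23"),    # EKS Pod Identity agent (IPv6)
}


def parse_host(full_uri):
    """Return the URI's host as an IP address, or None if it is not a literal IP.

    A real client would likely also resolve DNS names; this sketch does not.
    """
    host = urlsplit(full_uri).hostname  # brackets around IPv6 literals are stripped
    if host is None:
        return None
    try:
        return ipaddress.ip_address(host)
    except ValueError:
        return None


def loopback_only(full_uri):
    """Hypothetical strict check: accept only 127.0.0.0/8 and ::1."""
    ip = parse_host(full_uri)
    return ip is not None and ip.is_loopback


def loopback_or_agent(full_uri):
    """Hypothetical relaxed check: loopback plus the known agent addresses."""
    ip = parse_host(full_uri)
    return ip is not None and (ip.is_loopback or ip in AGENT_HOSTS)


uri = "http://[fd00:ec2::23]/v1/credentials"
print(loopback_only(uri), loopback_or_agent(uri))
```

With the URI from the pod spec, the strict check fails and the relaxed one passes, which matches the "only loopback hosts are allowed" error reported here.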

Configuration

Configuration:

[SERVICE]
    HTTP_Server  On
    HTTP_Listen  [::]
    HTTP_PORT    2020
    Health_Check On
    HC_Errors_Count 5
    HC_Retry_Failure_Count 5
    HC_Period 5
    Log_Level debug

    Parsers_File /fluent-bit/parsers/parsers.conf
[INPUT]
    Name              tail
    Tag               kube.*
    Path              /var/log/containers/*.log
    DB                /var/log/flb_kube.db
    Parser            cri
    Docker_Mode       Off
    multiline.parser  docker, cri
    Mem_Buf_Limit     5MB
    Skip_Long_Lines   On
    Refresh_Interval  10
    multiline.parser  docker, cri
    Path_Key          filename
    Read_from_Head    true

[INPUT]
    Name systemd
    Tag  journald.*

    Path /var/log/journal
    DB   /var/log/flb_journald.db

[FILTER]
    Name                kubernetes
    Match               kube.*
    Kube_URL            https://kubernetes.default.svc.cluster.local:443
    Merge_Log           On
    Merge_Log_Key       data
    Keep_Log            Off
    K8S-Logging.Parser  On
    K8S-Logging.Exclude On
    Buffer_Size         512k
[FILTER]
    Name modify
    Match *

    Add hostname ${HOSTNAME}
    Add cluster  <redacted>

[OUTPUT]
    Name            firehose
    Match           *
    region          us-east-1
    delivery_stream <redacted>
    endpoint        <redacted>

Full Pod spec (including data added by EKS):

apiVersion: v1
kind: Pod
metadata:
  annotations:
    checksum/config: 32ae1b90d34fc7ab995f272ce98e6ee5c63a49e0cb16894e9cb5c7d2dbb80500
  creationTimestamp: "2025-08-22T20:57:55Z"
  generateName: fluent-bit-
  labels:
    app.kubernetes.io/instance: fluent-bit
    app.kubernetes.io/name: aws-for-fluent-bit
    controller-revision-hash: 594487d6c5
    pod-template-generation: "1"
  name: fluent-bit-4ghrh
  namespace: fluent-bit
  ownerReferences:
  - apiVersion: apps/v1
    blockOwnerDeletion: true
    controller: true
    kind: DaemonSet
    name: fluent-bit
    uid: fca5f2bb-3832-4f4b-99bb-cc773e429733
  resourceVersion: "8306940"
  uid: 8b0f6241-2485-4a52-b454-2a6b49cc820a
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchFields:
          - key: metadata.name
            operator: In
            values:
            - i-08d140ca8ce367e38
  containers:
  - env:
    - name: AWS_STS_REGIONAL_ENDPOINTS
      value: regional
    - name: AWS_DEFAULT_REGION
      value: us-east-1
    - name: AWS_REGION
      value: us-east-1
    - name: AWS_CONTAINER_CREDENTIALS_FULL_URI
      value: http://[fd00:ec2::23]/v1/credentials
    - name: AWS_CONTAINER_AUTHORIZATION_TOKEN_FILE
      value: /var/run/secrets/pods.eks.amazonaws.com/serviceaccount/eks-pod-identity-token
    image: <cache-registry>/aws-observability/aws-for-fluent-bit:2.33.0.20250731
    imagePullPolicy: IfNotPresent
    livenessProbe:
      failureThreshold: 2
      httpGet:
        path: /api/v1/health
        port: 2020
        scheme: HTTP
      initialDelaySeconds: 30
      periodSeconds: 10
      successThreshold: 1
      timeoutSeconds: 10
    name: aws-for-fluent-bit
    resources:
      limits:
        cpu: 100m
        memory: 100Mi
      requests:
        cpu: 100m
        memory: 100Mi
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /fluent-bit/etc/
      name: fluentbit-config
    - mountPath: /var/log
      name: varlog
    - mountPath: /var/lib/docker/containers
      name: varlibdockercontainers
      readOnly: true
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: kube-api-access-4lhjx
      readOnly: true
    - mountPath: /var/run/secrets/pods.eks.amazonaws.com/serviceaccount
      name: eks-pod-identity-token
      readOnly: true
  dnsPolicy: ClusterFirst
  enableServiceLinks: true
  nodeName: i-08d140ca8ce367e38
  preemptionPolicy: PreemptLowerPriority
  priority: 2000001000
  priorityClassName: system-node-critical
  restartPolicy: Always
  schedulerName: default-scheduler
  securityContext:
    seccompProfile:
      type: RuntimeDefault
  serviceAccount: fluent-bit
  serviceAccountName: fluent-bit
  terminationGracePeriodSeconds: 30
  tolerations:
  - effect: NoExecute
    key: node.kubernetes.io/not-ready
    operator: Exists
  - effect: NoExecute
    key: node.kubernetes.io/unreachable
    operator: Exists
  - effect: NoSchedule
    key: node.kubernetes.io/disk-pressure
    operator: Exists
  - effect: NoSchedule
    key: node.kubernetes.io/memory-pressure
    operator: Exists
  - effect: NoSchedule
    key: node.kubernetes.io/pid-pressure
    operator: Exists
  - effect: NoSchedule
    key: node.kubernetes.io/unschedulable
    operator: Exists
  volumes:
  - name: eks-pod-identity-token
    projected:
      defaultMode: 420
      sources:
      - serviceAccountToken:
          audience: pods.eks.amazonaws.com
          expirationSeconds: 83908
          path: eks-pod-identity-token
  - configMap:
      defaultMode: 420
      name: fluent-bit
    name: fluentbit-config
  - hostPath:
      path: /var/log
      type: ""
    name: varlog
  - hostPath:
      path: /var/lib/docker/containers
      type: ""
    name: varlibdockercontainers
  - name: kube-api-access-4lhjx
    projected:
      defaultMode: 420
      sources:
      - serviceAccountToken:
          expirationSeconds: 3607
          path: token
      - configMap:
          items:
          - key: ca.crt
            path: ca.crt
          name: kube-root-ca.crt
      - downwardAPI:
          items:
          - fieldRef:
              apiVersion: v1
              fieldPath: metadata.namespace
            path: namespace

Fluent Bit Log Output

A lot of these:

[firehose 0] NoCredentialProviders: no valid providers in chain caused by: EnvAccessKeyNotFound: failed to find credentials in the environment. SharedCredsLoad: failed to load profile, . CredentialsEndpointError: invalid endpoint host, \"fd00:ec2::23\", only loopback hosts are allowed.
[firehose 0] PutRecordBatch failed with NoCredentialProviders: no valid providers in chain caused by: EnvAccessKeyNotFound: failed to find credentials in the environment. SharedCredsLoad: failed to load profile, . CredentialsEndpointError: invalid endpoint host, \"fd00:ec2::23\", only loopback hosts are allowed.

Fluent Bit Version Info

Running version 2.33.0.20250731

Cluster Details

EKS auto-mode, IPv6 networking enabled.

Steps to reproduce issue

The errors start as soon as I deploy fluent-bit into the cluster, as it attempts to load the EKS pod identity credentials referenced by the env vars.
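For anyone trying to confirm that the agent endpoint itself is fine, the credential lookup can be sketched outside of Fluent Bit. This assumes the usual container-credentials convention, where the client sends a GET to AWS_CONTAINER_CREDENTIALS_FULL_URI with the contents of the token file as the Authorization header; the function name is hypothetical, and the sketch only builds the request so it can be inspected before anything is sent.

```python
import os
import urllib.request


def build_credentials_request(env=os.environ):
    """Build (but do not send) the credentials request described by the env vars.

    Assumption: the Authorization header carries the raw token file contents.
    """
    uri = env["AWS_CONTAINER_CREDENTIALS_FULL_URI"]
    token_path = env["AWS_CONTAINER_AUTHORIZATION_TOKEN_FILE"]
    with open(token_path, encoding="utf-8") as token_file:
        token = token_file.read().strip()
    return urllib.request.Request(uri, headers={"Authorization": token})
```

Running this inside the pod and passing the result to urllib.request.urlopen would show whether the agent is reachable independently of the host check that Fluent Bit is failing on. Avoid printing the response body, since it would contain live credentials.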

Related Issues

This is also an issue in the newer, more performant AWS output plugins. I opened fluent/fluent-bit#10699 with proposed fixes in fluent/fluent-bit#10706 and fluent/fluent-bit#10707, but haven't received any comments in ~2 weeks.
