Description
Describe the bug
We experienced a transient connectivity issue to the CoreDNS service in one of our K8s clusters that coincided with a cloudflared pod entering its DNS resolver refresh logic.
This resulted in a traffic outage for requests routed through that connector (502 errors).
From the Cloudflare Zero Trust dashboard, the tunnel remained in a “working” state for the entire duration of the outage, despite traffic failures.
The error shown in the logs below recurred roughly every 5 minutes.
Failing traffic and the cloudflared errors stopped only after restarting the pod.
To mitigate this type of issue in the future, I tried to use the --dns-resolver-addrs flag to configure a static DNS resolver (i.e., the CoreDNS service CLUSTER_IP value) and so disable the resolver refresh behavior.
Based on this commit:
commit 398da8860f39617a2bb112f59ed3498af6a704ae
Date: Mon Jun 30 15:20:32 2025 -0700

    TUN-9473: Add --dns-resolver-addrs flag

    To help support users with environments that don't work well with the DNS local resolver's automatic resolution process for local resolver addresses, we introduce a flag to provide them statically to the runtime. When providing the resolver addresses, cloudflared will no longer lookup the DNS resolver addresses and use the user input directly. When provided with a list of DNS resolvers larger than one, the resolver service will randomly select one at random for each incoming request.

    Closes TUN-9473
However, the flag is rejected as undefined when used with cloudflared tunnel run, even though the commit suggests it should be available to disable resolver refresh behavior.
The error is "Incorrect Usage. flag provided but not defined: --dns-resolver-addrs"
The flag is also not listed in cloudflared tunnel run --help.
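For reference, the selection behavior described in the commit message (one of the configured resolvers chosen at random per request) can be sketched roughly as follows. This is only an illustration of that description, not cloudflared's actual code; the function name `pick_resolver` and the example address are hypothetical.

```python
import random


def pick_resolver(addrs, rng=random):
    """Return one resolver address chosen uniformly at random.

    With a single address this always returns that address, which is the
    static-resolver case this issue is asking for.
    """
    if not addrs:
        raise ValueError("at least one resolver address is required")
    return rng.choice(addrs)


if __name__ == "__main__":
    # Hypothetical CoreDNS ClusterIP; substitute the real value.
    print(pick_resolver(["10.96.0.10:53"]))
```

With a single configured address there is nothing to look up or refresh, which is why a static resolver seemed like a good fit for the CoreDNS case.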
To Reproduce
Steps to reproduce the behavior:
- Add the --dns-resolver-addrs flag to the cloudflared container args:
args:
- tunnel
- --dns-resolver-addrs=<CoreDNS CLUSTER_IP value>
- --config
- /etc/cloudflared/config/config.yaml
- run
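To check whether a given cloudflared build lists the flag at all, something like the following small script could be run inside the container image. It is a sketch that assumes `cloudflared` is on the PATH; the helper names `flag_listed` and `tunnel_run_help` are made up for this example.

```python
import shutil
import subprocess

FLAG = "--dns-resolver-addrs"


def flag_listed(help_text, flag=FLAG):
    """Return True if the flag name appears anywhere in the help text."""
    return flag in help_text


def tunnel_run_help():
    """Capture the output of `cloudflared tunnel run --help`."""
    result = subprocess.run(
        ["cloudflared", "tunnel", "run", "--help"],
        capture_output=True,
        text=True,
        check=False,
    )
    # Help text may be written to either stream, so combine both.
    return result.stdout + result.stderr


if __name__ == "__main__":
    if shutil.which("cloudflared") is None:
        print("cloudflared not found on PATH")
    else:
        print(f"{FLAG} listed: {flag_listed(tunnel_run_help())}")
```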
Expected behavior
Is --dns-resolver-addrs intended to be supported for cloudflared tunnel run, or is there an alternative supported mechanism to enable static DNS resolver behavior for tunnels?
In Kubernetes clusters, the CoreDNS ClusterIP is stable for the lifetime of the Service, so refreshing the resolver every 5 minutes is unnecessary and, as seen here, can leave the connector in a persistent failure state.
Environment and versions
- Architecture: Kubernetes
- Version: 2025.11.1
Logs and errors
2025-12-13T00:54:05Z ERR Failed to refresh DNS local resolver error="lookup region1.v2.argotunnel.com: i/o timeout"
2025-12-13T00:59:10Z ERR Failed to refresh DNS local resolver error="lookup region1.v2.argotunnel.com: i/o timeout"
2025-12-13T01:04:15Z ERR Failed to refresh DNS local resolver error="lookup region1.v2.argotunnel.com: i/o timeout"
2025-12-13T01:09:20Z ERR Failed to refresh DNS local resolver error="lookup region1.v2.argotunnel.com: i/o timeout"
2025-12-13T01:14:25Z ERR Failed to refresh DNS local resolver error="lookup region1.v2.argotunnel.com: i/o timeout"
2025-12-13T01:19:30Z ERR Failed to refresh DNS local resolver error="lookup region1.v2.argotunnel.com: i/o timeout"
2025-12-13T01:24:35Z ERR Failed to refresh DNS local resolver error="lookup region1.v2.argotunnel.com: i/o timeout"
2025-12-13T01:29:40Z ERR Failed to refresh DNS local resolver error="lookup region1.v2.argotunnel.com: i/o timeout"
2025-12-13T01:34:45Z ERR Failed to refresh DNS local resolver error="lookup region1.v2.argotunnel.com: i/o timeout"
2025-12-13T01:44:50Z ERR Failed to refresh DNS local resolver error="lookup region1.v2.argotunnel.com: i/o timeout"
2025-12-13T01:55:00Z ERR Failed to refresh DNS local resolver error="lookup region1.v2.argotunnel.com: i/o timeout"
2025-12-13T02:00:05Z ERR Failed to refresh DNS local resolver error="lookup region1.v2.argotunnel.com: i/o timeout"
2025-12-13T02:10:10Z ERR Failed to refresh DNS local resolver error="lookup region1.v2.argotunnel.com: i/o timeout"
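As a quick check of the cadence, the gaps between the timestamps above can be computed with a short script. This is just a convenience sketch; the helper name `gaps_in_seconds` is made up for this example.

```python
from datetime import datetime

# Timestamps copied from the "Failed to refresh DNS local resolver" lines above.
TIMESTAMPS = [
    "2025-12-13T00:54:05Z",
    "2025-12-13T00:59:10Z",
    "2025-12-13T01:04:15Z",
    "2025-12-13T01:09:20Z",
    "2025-12-13T01:14:25Z",
    "2025-12-13T01:19:30Z",
    "2025-12-13T01:24:35Z",
    "2025-12-13T01:29:40Z",
    "2025-12-13T01:34:45Z",
    "2025-12-13T01:44:50Z",
    "2025-12-13T01:55:00Z",
    "2025-12-13T02:00:05Z",
    "2025-12-13T02:10:10Z",
]


def gaps_in_seconds(timestamps):
    """Return the number of seconds between each consecutive pair of timestamps."""
    parsed = [datetime.strptime(ts, "%Y-%m-%dT%H:%M:%SZ") for ts in timestamps]
    return [int((b - a).total_seconds()) for a, b in zip(parsed, parsed[1:])]


if __name__ == "__main__":
    print(gaps_in_seconds(TIMESTAMPS))
```

Most gaps come out at 305 seconds, with a few of roughly 10 minutes (605 or 610 seconds) towards the end.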
Additional context
None