
Fixed bug where allowing Prometheus metrics scraping could block DNS resolution by preventing traffic to the DNS server#595

Merged
zohar7ch merged 2 commits into main from zohar7ch/allow-dns-traffic
Jun 3, 2025

@zohar7ch zohar7ch commented Jun 3, 2025

Description

We know that the service with the Prometheus scraping annotation might be the cluster’s DNS service - for example, the coredns workload in an EKS cluster.

If we add a network policy that only allows metrics scraping, it could unintentionally block DNS ingress traffic, effectively breaking DNS resolution within the cluster.

To avoid this, and since we can't always definitively identify which workload handles DNS, we ensure that any network policy allowing Prometheus to scrape metrics also permits traffic on port 53 (DNS) within the same policy.
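For reviewers, the shape of the resulting policy can be sketched as follows. This is a minimal illustration, not the code from this PR: the function name `build_metrics_policy`, its parameters, and the Prometheus pod labels are all hypothetical, and the sketch assumes port 53 is opened on both UDP and TCP from any source.

```python
import json

DNS_PORT = 53  # standard DNS port, used over both UDP and TCP


def build_metrics_policy(name, namespace, pod_labels, metrics_port, prometheus_labels):
    """Build a NetworkPolicy manifest (as a dict) for a scraped workload.

    Hypothetical helper for illustration only; the names and the selector
    layout are assumptions, not the implementation merged in this PR.
    """
    # Rule 1: let Prometheus pods (in any namespace) reach the metrics port.
    metrics_rule = {
        "from": [
            {
                "namespaceSelector": {},
                "podSelector": {"matchLabels": dict(prometheus_labels)},
            }
        ],
        "ports": [{"protocol": "TCP", "port": metrics_port}],
    }
    # Rule 2: always keep DNS reachable, in case the workload is the
    # cluster's DNS service. Omitting "from" allows all sources.
    dns_rule = {
        "ports": [
            {"protocol": "UDP", "port": DNS_PORT},
            {"protocol": "TCP", "port": DNS_PORT},
        ]
    }
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "podSelector": {"matchLabels": dict(pod_labels)},
            "policyTypes": ["Ingress"],
            "ingress": [metrics_rule, dns_rule],
        },
    }


if __name__ == "__main__":
    # Example values only: 9153 is the default CoreDNS metrics port;
    # the label values are placeholders.
    policy = build_metrics_policy(
        name="allow-metrics-scrape-coredns",
        namespace="kube-system",
        pod_labels={"k8s-app": "kube-dns"},
        metrics_port=9153,
        prometheus_labels={"app": "prometheus"},
    )
    print(json.dumps(policy, indent=2))
```

Without the second ingress rule, a pod selected by this policy would accept ingress only on the metrics port, so DNS queries to it would be dropped. Keeping both rules in one policy means the DNS allowance is created and removed together with the metrics allowance.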

Testing

Describe how this can be tested by reviewers. Be specific about anything not tested and reasons why. If this library has unit and/or integration testing, tests should be added for new functionality and existing tests should complete without errors.

Please include any manual steps for testing end-to-end or functionality not covered by unit/integration tests.

Also include details of the environment this PR was developed in (language/platform/browser version).

  • This change adds test coverage for new/changed/fixed functionality

Checklist

  • I have added documentation for new/changed functionality in this PR and in github.com/otterize/docs

zohar7ch added 2 commits June 3, 2025 14:01
We know that the service that has the Prometheus scraping annotation
might be the DNS service in the cluster (for example, the `coredns`
workload on an EKS cluster).
In this case, adding a network policy that allows only metrics-scraping
traffic would block the DNS ingress traffic, which would break DNS in
the cluster.
We don't want this, and we can't tell for sure which workload is the DNS
in the cluster, so whenever we create a network policy to allow
Prometheus to scrape metrics, we also allow port 53 (DNS) on the same
network policy.
@zohar7ch zohar7ch merged commit 5ec465a into main Jun 3, 2025
22 checks passed
@zohar7ch zohar7ch deleted the zohar7ch/allow-dns-traffic branch June 3, 2025 12:12
@github-actions github-actions bot locked and limited conversation to collaborators Jun 3, 2025
