
Dynamic ZoneDelegations#2102

Draft
k0da wants to merge 1 commit into k8gb-io:master from AbsaOSS:dynamic_zones

Conversation

@k0da
Collaborator

@k0da k0da commented Nov 6, 2025

..

HOW TO RUN CI ---

By default, all checks run automatically. Furthermore, when website-related files change, a preview will be generated by the netlify bot.

Heavy tests

Add the heavy-tests label on this PR if you want full-blown tests that include more than 2-cluster scenarios.

Debug tests

If the test suite is failing for you, you may want to try triggering Re-run all jobs (top right) with debug logging enabled. It will also make the print debug action more verbose.

@k0da k0da force-pushed the dynamic_zones branch 2 times, most recently from 07a1072 to 918a76e Compare November 7, 2025 09:50
kuritka
kuritka previously approved these changes Nov 7, 2025
Collaborator

@kuritka kuritka left a comment

excellent!

Member

@ytsarev ytsarev left a comment

it looks super promising but requires documentation

@k0da k0da force-pushed the dynamic_zones branch 2 times, most recently from faa9508 to c252ffb Compare November 10, 2025 18:22
@k0da
Collaborator Author

k0da commented Nov 10, 2025

@ytsarev Documentation added

@k0da k0da requested review from kuritka and ytsarev November 10, 2025 21:25
kuritka
kuritka previously approved these changes Nov 11, 2025
Contributor

@abaguas abaguas left a comment

Thank you for the work. It looks good, except for a regression I pointed out

{{ . | nindent 8 }}
{{- end }}
{{- end }}
{{- if .extraPlugins }}
Contributor

This PR introduces a regression. It is no longer possible to configure extraPlugins per DNSZone.
This is a feature that was introduced by the community in #1947.

Ideally the new CRD also supports such overrides. If not, at least it shouldn't break the existing functionality.

Collaborator Author

updated

}

zoneList := &v1beta1.ZoneDelegationList{}
err = zs.client.List(ctx, zoneList)
Contributor

Nitpick: since you also instantiate a client.Reader, shouldn't you use it also for List operations?

Collaborator Author

The Client sets up caches and initializes informers; the ApiReader doesn't do that. We want to Watch ZoneDelegations, but we don't want to watch ConfigMaps.

Collaborator

@abaguas - if you use client.Client before the manager starts, there are no caches yet, and controller-runtime gets noisy in the log messages when the client is used. ApiReader is a clean way to access resources outside the cache.
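To make the distinction concrete, here is a minimal plain-Go sketch of the two read paths (illustrative types only — `cachedReader` and `apiServer` are made up for this example and are not the actual controller-runtime types):

```go
package main

import "fmt"

// store abstracts "get a resource by name", mirroring the split between
// controller-runtime's cached client and its direct APIReader.
type store interface {
	get(name string) (string, bool)
}

// apiServer stands in for the real API server state.
type apiServer struct{ data map[string]string }

func (s *apiServer) get(name string) (string, bool) {
	v, ok := s.data[name]
	return v, ok
}

// cachedReader only sees what has been synced into its local cache.
// Before the manager (and its informers) start, the cache is empty,
// so reads miss even though the object exists on the server.
type cachedReader struct{ cache map[string]string }

func (c *cachedReader) get(name string) (string, bool) {
	v, ok := c.cache[name]
	return v, ok
}

func main() {
	api := &apiServer{data: map[string]string{"test-zone": "cloud.example.com"}}
	cached := &cachedReader{cache: map[string]string{}} // informers not started yet

	if _, ok := cached.get("test-zone"); !ok {
		fmt.Println("cached client: miss (cache not synced)")
	}
	if v, ok := api.get("test-zone"); ok {
		fmt.Println("direct reader: hit:", v)
	}
}
```

This is why a direct reader is the safer choice for reads performed before the manager has started its informers.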

Contributor

@donovanmuller donovanmuller left a comment

@k0da Is it worth adding to the docs what problem this solves or how this improves things, using ZoneDelegation?

app.kubernetes.io/managed-by: Helm
name: coredns-dynamic
```
This configmap is mointed into CoreDNS pod and imported with import plugin:
Contributor

Suggested change
This configmap is mointed into CoreDNS pod and imported with import plugin:
This configmap is mounted into CoreDNS pod and imported with import plugin:

Member

@ytsarev ytsarev left a comment

@k0da, thanks a lot for adding user-facing documentation. Please also extend the PR description or additional documentation with the developer-facing design decisions. It's a serious change introducing new CRD and we need to be crystal clear why we are adding more complexity.

In line with what @donovanmuller suggested above

@k0da
Collaborator Author

k0da commented Nov 14, 2025

@donovanmuller Updated documentation with pursued use case

@k0da
Collaborator Author

k0da commented Nov 14, 2025

@ytsarev it is backed by a feature flag and enabled only on explicit request. It is not meant to change default k8gb behavior.

@ytsarev
Member

ytsarev commented Nov 14, 2025

@k0da feature flag is great, we still need more detailed documentation for substantial changes in the codebase. We can work on the docs together when I return from Atlanta 👍

Member

@ytsarev ytsarev left a comment

@k0da awesome job, thank you!

I did e2e tests locally. The main functionality works, but I noticed some things worth discussing.

On a fresh install, after ZoneDelegation creation, the existing Gslbs do not pick up geotags/config until the main k8gb pod restarts:

 k get gslb -A --context=k3d-test-gslb1
NAMESPACE         NAME                 STRATEGY     GEOTAG
test-gslb-istio   failover-istio       failover
test-gslb-istio   notfound-istio       roundRobin
test-gslb-istio   roundrobin-istio     roundRobin
test-gslb-istio   unhealthy-istio      roundRobin
test-gslb         failover-ingress     failover
test-gslb         roundrobin-ingress   roundRobin
➜  k8gb git:(dynamic_zones) ✗ k -n k8gb delete pod k8gb-6586fb7754-nn8rc --context=k3d-test-gslb1
pod "k8gb-6586fb7754-nn8rc" deleted
➜  k8gb git:(dynamic_zones) ✗ k get gslb -A --context=k3d-test-gslb1
NAMESPACE         NAME                 STRATEGY     GEOTAG
test-gslb-istio   failover-istio       failover     eu
test-gslb-istio   notfound-istio       roundRobin   eu
test-gslb-istio   roundrobin-istio     roundRobin   eu
test-gslb-istio   unhealthy-istio      roundRobin   eu
test-gslb         failover-ingress     failover     eu
test-gslb         roundrobin-ingress   roundRobin   eu

I do not see anything reported in the status of ZoneDelegation:

k get zonedelegations.k8gb.absa.oss test-zone -o yaml
apiVersion: k8gb.absa.oss/v1beta1
kind: ZoneDelegation
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"k8gb.absa.oss/v1beta1","kind":"ZoneDelegation","metadata":{"annotations":{},"name":"test-zone"},"spec":{"dnsZoneNegTTL":30,"loadBalancedZone":"cloud.example.com","parentZone":"example.com"}}
  creationTimestamp: "2025-11-19T07:27:53Z"
  generation: 1
  name: test-zone
  resourceVersion: "1049"
  uid: e437966d-f83c-492d-9607-cc20bb7fff63
spec:
  dnsZoneNegTTL: 30
  loadBalancedZone: cloud.example.com
  parentZone: example.com

Please also check the inline questions/suggestions.

Thank you!

Member

Do we cover this logic with unit tests?

Member

Which section of mkdocs.yml will it show up in? We need to group it somewhere. The Advanced section?

Collaborator Author

Sounds like a good idea

Comment on lines +1 to +74
# Dynamic Zones (Feature)

This feature introduces a new type of `Cluster Scoped` resource, `ZoneDelegation`, which carries DNS zone information such as `parentZone`, `loadBalancedZone` and `dnsZoneNegTTL`.

There are cases where clusters are anonymous or shared across multiple tenants. When the controller starts, we don't know which zones it will potentially participate in.

If we try to guess all potential loadBalancedZones in advance, before an application is deployed into such a cluster, k8gb's CoreDNS becomes authoritative for those zones and takes part in their queries. This leads to `NXDOMAIN` responses, since the application is not yet deployed there.

The `ZoneDelegation` resource lets us align application deployment (or `Team` onboarding) with delegated zone creation: the zone is shipped together with the `Application`, which prevents `NXDOMAIN` responses, and it is reconciled once added.

For example:
```yaml
apiVersion: k8gb.absa.oss/v1beta1
kind: ZoneDelegation
metadata:
  generation: 1
  name: test-zone
spec:
  dnsZoneNegTTL: 30
  loadBalancedZone: test-zone.cloud.example.com
  parentZone: cloud.example.com
```

Setup:
Set `k8gb.feature.dynamicZones` to true via Helm chart values. This adds a new empty ConfigMap named `dynamic-zones`. On every `ZoneDelegation` reconcile loop, `k8gb` lists all ZoneDelegations and creates a ConfigMap key per `ZoneDelegation`, like:
```yaml
apiVersion: v1
data:
  test-zone-cloud-example-com.conf: |2-
    test-zone.cloud.example.com:5353 {
        import k8gbplugins
    }
kind: ConfigMap
metadata:
  annotations:
    meta.helm.sh/release-name: k8gb
    meta.helm.sh/release-namespace: k8gb
  labels:
    app.kubernetes.io/managed-by: Helm
  name: coredns-dynamic
```
This ConfigMap is mounted into the CoreDNS pod and imported with the import plugin:
```yaml
apiVersion: v1
data:
  Corefile: |-
    (k8gbplugins) {
        errors
        health
        reload 30s 15s
        ready
        prometheus 0.0.0.0:9153
        forward . /etc/resolv.conf
        k8s_crd {
            filter k8gb.absa.oss/dnstype=local
            negttl 30
            loadbalance weight
        }
    }
    static-zone.cloud.example.com:5353 {
        import k8gbplugins
    }
    import ../dynamic/*.conf
```

# ZoneDelegation Status
Holds information about all DNS servers participating in the zone delegation.

# ZoneDelegation Cleanup (WIP)
`ZoneDelegation` is protected by a finalizer: on object removal, the controller is responsible for cleaning up its own reference in the zone delegation, and for deleting the delegation completely if the current member is the last one standing.
Member

Suggested change
# Dynamic Zones
## High-Level Summary
Dynamic Zones allow k8gb to serve DNS zones only when applications are deployed, solving the problem of premature `NXDOMAIN` responses in anonymous, multi-tenant, or shared clusters.
By introducing a new cluster-scoped resource, `ZoneDelegation`, k8gb dynamically detects which zones a cluster should serve and configures CoreDNS automatically.
This provides:
- Accurate DNS behavior (no premature NXDOMAIN)
- Declarative onboarding
- Automatic CoreDNS reconfiguration without restarts
- Safe cleanup when zones are removed
---
# Dynamic Zones (Feature)
The feature introduces a cluster-scoped resource `ZoneDelegation` that contains:
- `loadBalancedZone`
- `parentZone`
- `dnsZoneNegTTL`
## Why Dynamic Zones?
Pre-configuring zones in anonymous or shared clusters makes CoreDNS authoritative too early, causing `NXDOMAIN` responses. `ZoneDelegation` lets applications ship their own zone definitions, enabling dynamic activation.
## Example
```yaml
apiVersion: k8gb.absa.oss/v1beta1
kind: ZoneDelegation
metadata:
  name: test-zone
spec:
  loadBalancedZone: test-zone.cloud.example.com
  parentZone: cloud.example.com
  dnsZoneNegTTL: 30
```

## Setup

Enable via Helm:

```yaml
k8gb:
  feature:
    dynamicZones: true
```

Example generated ConfigMap entry:

```yaml
apiVersion: v1
data:
  test-zone-cloud-example-com.conf: |-
    test-zone.cloud.example.com:5353 {
      import k8gbplugins
    }
kind: ConfigMap
metadata:
  name: coredns-dynamic
```

CoreDNS loads dynamic zones via:

```
import ../dynamic/*.conf
```

Full Corefile snippet:

```yaml
Corefile: |-
  (k8gbplugins) {
      errors
      health
      reload 30s 15s
      ready
      prometheus 0.0.0.0:9153
      forward . /etc/resolv.conf
      k8s_crd {
          filter k8gb.absa.oss/dnstype=local
          negttl 30
          loadbalance weight
      }
  }

  static-zone.cloud.example.com:5353 {
      import k8gbplugins
  }

  import ../dynamic/*.conf
```

## ZoneDelegation Status

The status field lists all DNS servers currently participating in the delegation.

## ZoneDelegation Cleanup (WIP)

On deletion, the finalizer:

1. Removes this cluster from the delegation
2. Deletes the delegation if this was the last participant
3. Removes related ConfigMap entries
4. Lets CoreDNS reload automatically

This ensures no abandoned zone configuration remains.
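The last-participant decision in the cleanup steps above can be sketched as a small Go helper (hypothetical names — this is not the actual k8gb implementation):

```go
package main

import "fmt"

// cleanupDelegation removes this cluster's reference from a delegation's
// member list and reports whether the delegation itself should be deleted,
// i.e. whether this member was the last one standing.
func cleanupDelegation(members []string, self string) (remaining []string, deleteAll bool) {
	for _, m := range members {
		if m != self {
			remaining = append(remaining, m)
		}
	}
	return remaining, len(remaining) == 0
}

func main() {
	// Two clusters participate; eu leaves first, us leaves last.
	rest, del := cleanupDelegation([]string{"eu", "us"}, "eu")
	fmt.Println(rest, del) // [us] false
	rest, del = cleanupDelegation(rest, "us")
	fmt.Println(rest, del) // [] true
}
```

Only when the second return value is true would the controller tear down the delegation entirely.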

coreDNSZones.Data[zoneKey] = getCoreDNSData(zoneInfo.LoadBalancedZone)
list = append(list, zoneInfo)
}
err = zs.client.Update(ctx, coreDNSZones)
Member

The unconditional update might lead to resource update conflicts

Collaborator Author

This is a ConfigMap update. We want to ensure all ZoneDelegation objects are reflected in the CoreDNS zone file.

Member

I think it's good as it is for now, I just expect issues with the concurrent updates in the future.
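For reference, a minimal sketch of the conflict-retry pattern being discussed, in the spirit of client-go's `retry.RetryOnConflict` (the names here are illustrative, not the k8gb code):

```go
package main

import (
	"errors"
	"fmt"
)

var errConflict = errors.New("resource version conflict")

// updateWithRetry retries an update that can fail with a version conflict.
// In a real controller, each attempt would re-read the object to pick up
// the latest resourceVersion before writing again.
func updateWithRetry(attempts int, update func() error) error {
	var err error
	for i := 0; i < attempts; i++ {
		if err = update(); err == nil || !errors.Is(err, errConflict) {
			return err
		}
	}
	return err
}

func main() {
	failures := 2 // simulate two concurrent writers winning first
	err := updateWithRetry(5, func() error {
		if failures > 0 {
			failures--
			return errConflict
		}
		return nil
	})
	fmt.Println("update succeeded:", err == nil)
}
```

Wrapping the ConfigMap update this way would absorb occasional conflicts from concurrent reconcile loops without changing the desired end state.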

# -- whether it should also deploy the gslb and dnsendpoints CRDs
deployCrds: true
# -- whether it should also deploy the service account, cluster role and cluster role binding
feature:
Member

Why do we need an additional feature hierarchy for this one?
Why not just dynamicZones: false instead of

feature:
  dynamicZones: false

?

Collaborator Author

This is the first conditional feature we have. I thought it would be nice to introduce such things via a feature flag, as they are optional.

Member

I think it's a noble goal to reflect that, but in practice, when this functionality moves closer to being the default, it will lead to migration pain for the users.

I suggest using a single level and reflecting maturity in a comment. Something like
dynamicZones: false # Alpha feature

Comment on lines +501 to +503
name: zonedelegations.k8gb.absa.oss
spec:
group: k8gb.absa.oss
Member

Suggested change
name: zonedelegations.k8gb.absa.oss
spec:
group: k8gb.absa.oss
name: zonedelegations.k8gb.io
spec:
group: k8gb.io

We might consider using the new upcoming group for the new resource to avoid migrations.

# -- whether it should also deploy the service account, cluster role and cluster role binding
feature:
# -- whether to use Dynamic ZoneDelegation feature.
# If enabled, an extra volume and VolumeMounts must be specified
Member

Does the user need to make any modifications to the extraVolumes configuration below?

Collaborator Author

Good catch, I have to drop it.

@sudhamshk
Contributor

sudhamshk commented Nov 20, 2025

Hi @k0da

I tested the Dynamic Zones feature with the following setup:

Clusters: cluster1, cluster2, cluster3

ZoneDelegation:

parentZone: app.example.com
loadBalancedZone: gslb.app.example.com

Application placement:
app1 → cluster1 and cluster2
app2 → cluster2 and cluster3

With Dynamic Zones enabled, my expectation was that each cluster would only serve the zones it participates in.

Observed behaviour:

app1.gslb.app.example.com

  • resolves correctly when NS points to cluster1 or cluster2
  • returns NXDOMAIN when NS points to cluster3, since cluster3 does not host the Gslb for app1

app2.gslb.app.example.com

  • resolves correctly when NS points to cluster2 or cluster3
  • returns NXDOMAIN when NS points to cluster1, since cluster1 does not host the Gslb for app2

I think to match how Dynamic Zones is designed, each app (or service) needs its own delegated sub-zone. That way, only the clusters hosting that app receive NS delegation, and clusters that don’t host it won’t return NXDOMAIN.

@k0da
Collaborator Author

k0da commented Nov 20, 2025

> (quoting @sudhamshk's test report above)

Yes, that's the intent. A team that deploys an app to a particular cluster brings the delegation along with the app. So a cluster that has a ZoneDelegation also has the app; healthy or unhealthy, it will have a Gslb resource to participate in the app's queries.

Our internal setup works in a way where a team owns a subdomain exclusively; it is not shared with other teams.

@ytsarev ytsarev marked this pull request as draft December 12, 2025 22:55
@k0da k0da force-pushed the dynamic_zones branch 3 times, most recently from 0039a06 to 0353052 Compare December 30, 2025 15:56
@netlify

netlify bot commented Dec 30, 2025

Deploy Preview for k8gb-preview ready!

Name Link
🔨 Latest commit a61a93a
🔍 Latest deploy log https://app.netlify.com/projects/k8gb-preview/deploys/6953f922a9602b0008316cbe
😎 Deploy Preview https://deploy-preview-2102--k8gb-preview.netlify.app

Signed-off-by: Dinar Valeev <k0da@opensuse.org>
kuritka added a commit that referenced this pull request Feb 26, 2026
Refactors Helm chart to properly support dynamicZones introduced in #2102.

- add k8gb.dynamicZones boolean
- add configurable extraVolumes and extraVolumeMounts
- extend values.schema.json
- update CoreDNS ConfigMap template

Dynamic zones can now be enabled declaratively through values.yaml.

Signed-off-by: Michal K <kuritka@gmail.com>
kuritka added two further commits referencing this pull request on Feb 27, 2026 with the same change description.