Description
When deploying an OpenShift cluster with platform: baremetal, you cannot use the same VIP address for both API and Ingress, even when setting loadBalancer: type: UserManaged.
For example, the following snippet in install-config.yaml:
platform:
  baremetal:
    apiVIPs:
      - 10.0.10.100
    ingressVIPs:
      - 10.0.10.100
will produce a validation error:
$ openshift-install-linux agent create image
INFO Configuration has 3 master replicas, 0 arbiter replicas, and 4 worker replicas
ERROR failed to write asset (Agent Installer ISO) to disk: cannot generate ISO image due to configuration errors
FATAL failed to fetch Agent Installer ISO: failed to load asset "Install Config": invalid install-config configuration: platform.baremetal.apiVIPs: Invalid value: "10.0.10.100": VIP for API must not be one of the Ingress VIPs
This is expected. But upon further investigation of the installer code:
https://github.com/openshift/installer/blob/release-4.20/pkg/types/validation/installconfig.go#L971
If you update the snippet in install-config.yaml to:
platform:
  baremetal:
    loadBalancer:
      type: UserManaged
    apiVIPs:
      - 10.0.10.100
    ingressVIPs:
      - 10.0.10.100
Then the installer successfully generates the agent.x86_64.iso file, which you can boot on the node(s).
$ openshift-install-linux agent create image
INFO Configuration has 3 master replicas, 0 arbiter replicas, and 4 worker replicas
INFO The rendezvous host IP (node0 IP) is 10.0.10.10
INFO Extracting base ISO from release payload
INFO Verifying cached file
INFO Using cached Base ISO /root/.cache/agent/image_cache/coreos-x86_64.iso
INFO Consuming Install Config from target directory
INFO Consuming Agent Config from target directory
INFO Generated ISO at agent.x86_64.iso.
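The installer-side behavior seen in the two runs above can be sketched roughly as follows. This is only an illustration of the observed outcome, not the installer's actual code: the type LoadBalancerType, its constants, and the function validateVIPs are hypothetical stand-ins, and the assumption is that the duplicate-VIP check is simply skipped for a user-managed load balancer.

```go
package main

import "fmt"

// LoadBalancerType is a hypothetical stand-in for the installer's
// load balancer type; the real type and constant names may differ.
type LoadBalancerType string

const (
	OpenShiftManagedDefault LoadBalancerType = "OpenShiftManagedDefault"
	UserManaged             LoadBalancerType = "UserManaged"
)

// validateVIPs rejects an API VIP that also appears among the Ingress VIPs,
// unless the load balancer is user-managed (assumed behavior, based on the
// two installer runs shown in this issue).
func validateVIPs(apiVIPs, ingressVIPs []string, lb LoadBalancerType) error {
	if lb == UserManaged {
		return nil
	}
	ingress := make(map[string]struct{}, len(ingressVIPs))
	for _, vip := range ingressVIPs {
		ingress[vip] = struct{}{}
	}
	for _, vip := range apiVIPs {
		if _, dup := ingress[vip]; dup {
			return fmt.Errorf("Invalid value: %q: VIP for API must not be one of the Ingress VIPs", vip)
		}
	}
	return nil
}

func main() {
	vips := []string{"10.0.10.100"}
	// Default load balancer: the shared VIP is rejected.
	fmt.Println(validateVIPs(vips, vips, OpenShiftManagedDefault))
	// User-managed load balancer: the shared VIP is accepted (prints <nil>).
	fmt.Println(validateVIPs(vips, vips, UserManaged))
}
```

With this model, the first install-config.yaml fails validation and the second passes, which matches the installer output in both cases.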
But when that ISO image is booted on the node, the assisted-service reports the following validation violation:
- The IP address "10.0.10.100" appears both in apiVIPs and ingressVIPs
Upon further digging through the assisted-service codebase, there are code paths for a UserManaged load balancer that do not enforce different VIPs for API and Ingress:
https://github.com/openshift/assisted-service/blob/c616cdc/internal/cluster/validations/validations.go#L525
It checks whether the file /etc/assisted/manifests/agent-cluster-install.yaml on the node contains the following stanza:
apiVersion: extensions.hive.openshift.io/v1beta1
kind: AgentClusterInstall
spec:
  ...
  platformType: BareMetal
  loadBalancer:
    type: UserManaged
However, that stanza is missing from the file even though it was defined in the initial install-config.yaml (it is probably not copied when the installer renders the manifests during ISO image creation).
It seems that the load balancer definition is not copied into the AgentClusterInstall manifest at this point:
https://github.com/openshift/installer/blob/release-4.20/pkg/asset/agent/manifests/agentclusterinstall.go#L307
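A minimal sketch of the kind of change this would involve, assuming the fix is to carry the loadBalancer stanza from the install-config platform section into the rendered manifest spec. All types and functions here (LoadBalancer, BareMetalPlatform, AgentClusterInstallSpec, buildSpec, sharedVIPAllowed) are simplified, hypothetical stand-ins and not the real installer or hive API definitions; the copyLoadBalancer flag exists only so both outcomes can be compared.

```go
package main

import "fmt"

// Hypothetical, simplified stand-ins for the install-config and
// AgentClusterInstall types. The real structs have many more fields.
type LoadBalancer struct {
	Type string // e.g. "UserManaged"
}

type BareMetalPlatform struct {
	APIVIPs      []string
	IngressVIPs  []string
	LoadBalancer *LoadBalancer
}

type AgentClusterInstallSpec struct {
	APIVIPs      []string
	IngressVIPs  []string
	PlatformType string
	LoadBalancer *LoadBalancer
}

// buildSpec renders the manifest spec from the install-config platform
// section. copyLoadBalancer toggles the proposed behavior.
func buildSpec(p BareMetalPlatform, copyLoadBalancer bool) AgentClusterInstallSpec {
	spec := AgentClusterInstallSpec{
		APIVIPs:      p.APIVIPs,
		IngressVIPs:  p.IngressVIPs,
		PlatformType: "BareMetal",
	}
	if copyLoadBalancer && p.LoadBalancer != nil {
		// Proposed addition: propagate the load balancer definition.
		spec.LoadBalancer = &LoadBalancer{Type: p.LoadBalancer.Type}
	}
	return spec
}

// sharedVIPAllowed mimics the service-side decision described above: a shared
// VIP is accepted only when the manifest declares a user-managed load balancer.
func sharedVIPAllowed(spec AgentClusterInstallSpec) bool {
	return spec.LoadBalancer != nil && spec.LoadBalancer.Type == "UserManaged"
}

func main() {
	platform := BareMetalPlatform{
		APIVIPs:      []string{"10.0.10.100"},
		IngressVIPs:  []string{"10.0.10.100"},
		LoadBalancer: &LoadBalancer{Type: "UserManaged"},
	}
	fmt.Println(sharedVIPAllowed(buildSpec(platform, false))) // false: stanza missing, as observed
	fmt.Println(sharedVIPAllowed(buildSpec(platform, true)))  // true: stanza propagated
}
```

In this simplified model, the first call reproduces the situation reported here (the manifest lacks the stanza, so the shared VIP is rejected), and the second shows the outcome once the stanza is propagated.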
Version of openshift-installer:
openshift-install-linux 4.20.1
built from commit e23807689ec464da30e771dda70fd8989680a011
release image quay.io/openshift-release-dev/ocp-release@sha256:cbde13fe6ed4db88796be201fbdb2bbb63df5763ae038a9eb20bc793d5740416
release architecture amd64
Version of assisted-service:
openshift/assisted-service@c616cdc
Also, I'm willing to create a PR for this if it is accepted as a bug or improvement.