@davidesalerno
Contributor

… IngressController

This PR fixes an issue in the ingress operator: after the operator pod restarts, the ingress_controller_conditions metrics for existing ingress controllers are not exposed until the status of those ingress controllers changes.

The fix is simply to move the SetIngressControllerConditionsMetric call out of the if block that checks whether the ingress controller status has changed, so that it runs whether or not the status changed.
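A minimal sketch of the idea, not the operator's actual code: the `status` and `gauges` types and the `setConditionsMetric`, `syncBefore`, and `syncAfter` functions are hypothetical stand-ins, with `setConditionsMetric` playing the role of SetIngressControllerConditionsMetric. The gauge map starts empty, as in-memory metrics would after a pod restart.

```go
package main

import "fmt"

// status is a simplified, hypothetical stand-in for an ingress controller's status.
type status struct {
	Available bool
	Degraded  bool
}

// gauges is a hypothetical stand-in for the in-memory metric store,
// which is empty whenever the operator process starts.
type gauges map[string]float64

func boolToFloat(b bool) float64 {
	if b {
		return 1
	}
	return 0
}

// setConditionsMetric stands in for SetIngressControllerConditionsMetric
// (the real signature is not shown in this PR conversation).
func setConditionsMetric(g gauges, name string, s status) {
	g[name+"/Available"] = boolToFloat(s.Available)
	g[name+"/Degraded"] = boolToFloat(s.Degraded)
}

// syncBefore sets the metric only when the status changed.
func syncBefore(g gauges, name string, current, updated status) (statusWritten bool) {
	if current != updated {
		statusWritten = true // the status update would be written here
		setConditionsMetric(g, name, updated)
	}
	return statusWritten
}

// syncAfter sets the metric regardless of whether the status changed.
func syncAfter(g gauges, name string, current, updated status) (statusWritten bool) {
	if current != updated {
		statusWritten = true // the status update would be written here
	}
	setConditionsMetric(g, name, updated)
	return statusWritten
}

func main() {
	// After a restart the status is unchanged, so current == updated.
	steady := status{Available: true}
	before, after := gauges{}, gauges{}
	syncBefore(before, "default", steady, steady)
	syncAfter(after, "default", steady, steady)
	fmt.Println(len(before), len(after)) // prints: 0 2
}
```

With an unchanged status, the first variant leaves the gauge map empty, while the second repopulates it on the first sync after the restart.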

Added an e2e test that verifies that the metrics are present before and after the operator pod restarts. The e2e test cannot run in parallel with other tests because it restarts the ingress operator pod, which could affect their results unpredictably.
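As a rough illustration of the kind of check such a test could make (this is not the e2e test from the PR), the sketch below scans Prometheus text-format output for a conditions sample of a given ingress controller. The helper name `hasConditionMetric` is hypothetical, and the label order (`condition` before `name`) is assumed from the metrics output quoted later in this conversation.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// hasConditionMetric reports whether body (Prometheus text exposition format)
// contains an ingress_controller_conditions sample for the given ingress
// controller name and condition. It assumes labels appear as condition, name.
func hasConditionMetric(body, name, condition string) bool {
	want := fmt.Sprintf(`ingress_controller_conditions{condition=%q,name=%q}`, condition, name)
	sc := bufio.NewScanner(strings.NewReader(body))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if strings.HasPrefix(line, "#") {
			continue // skip HELP and TYPE comment lines
		}
		if strings.HasPrefix(line, want+" ") {
			return true
		}
	}
	return false
}

func main() {
	sample := `# TYPE ingress_controller_conditions gauge
ingress_controller_conditions{condition="Available",name="int1"} 1
ingress_controller_conditions{condition="Degraded",name="int1"} 0
`
	fmt.Println(hasConditionMetric(sample, "int1", "Available"))    // prints: true
	fmt.Println(hasConditionMetric(sample, "default", "Available")) // prints: false
}
```

A test along these lines would run the same check against the metrics endpoint once before deleting the operator pod and again after the new pod is ready.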

@openshift-ci-robot added the jira/severity-moderate (Referenced Jira bug's severity is moderate for the branch this PR is targeting.) and jira/valid-reference (Indicates that this PR references a valid Jira ticket of any type.) labels Oct 6, 2025
@openshift-ci-robot
Contributor

@davidesalerno: This pull request references Jira Issue OCPBUGS-61508, which is invalid:

  • expected the bug to target the "4.21.0" version, but no target version was set

Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.

The bug has been updated to refer to the pull request using the external bug tracker.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci-robot added the jira/invalid-bug (Indicates that a referenced Jira bug is invalid for the branch this PR is targeting.) label Oct 6, 2025
@openshift-ci bot requested review from Thealisyed and candita October 6, 2025 09:57
@openshift-ci-robot
Contributor

@davidesalerno: This pull request references Jira Issue OCPBUGS-61508, which is invalid:

  • expected the bug to target the "4.21.0" version, but no target version was set

Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.

In response to this:

/jira refresh


@davidesalerno
Contributor Author

/retest

@davidesalerno
Contributor Author

/retest

@candita
Contributor

candita commented Oct 9, 2025

/assign @alebedev87
/asssign @grzpiotrowski

@candita
Contributor

candita commented Oct 9, 2025

/assign @grzpiotrowski

@davidesalerno
Contributor Author

/jira refresh

@openshift-ci-robot
Contributor

@davidesalerno: This pull request references Jira Issue OCPBUGS-61508, which is invalid:

  • expected the bug to target the "4.21.0" version, but no target version was set

Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.

In response to this:

/jira refresh


@davidesalerno
Contributor Author

/test okd-scos-e2e-aws-ovn

@davidesalerno
Contributor Author

/jira refresh

@openshift-ci-robot added jira/valid-bug (Indicates that a referenced Jira bug is valid for the branch this PR is targeting.) and removed jira/invalid-bug (Indicates that a referenced Jira bug is invalid for the branch this PR is targeting.) labels Oct 10, 2025
@openshift-ci-robot
Contributor

@davidesalerno: This pull request references Jira Issue OCPBUGS-61508, which is valid.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (4.21.0) matches configured target version for branch (4.21.0)
  • bug is in the state POST, which is one of the valid states (NEW, ASSIGNED, POST)

Requesting review from QA contact:
/cc @anuragthehatter

In response to this:

/jira refresh


Contributor

@alebedev87 left a comment

The code change looks fine. Some comments about the testing.

Contributor

@alebedev87 left a comment

Second look.

@davidesalerno force-pushed the ocpbugs61508onmaster branch 3 times, most recently from b90a8c4 to 3896e07 October 21, 2025 10:15
@openshift-ci
Contributor

openshift-ci bot commented Nov 4, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: alebedev87

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci bot added the approved (Indicates a PR has been approved by an approver from all required OWNERS files.) label Nov 4, 2025
@lihongan
Contributor

lihongan commented Nov 5, 2025

/jira refresh

@openshift-ci-robot
Contributor

@lihongan: This pull request references Jira Issue OCPBUGS-61508, which is valid.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (4.21.0) matches configured target version for branch (4.21.0)
  • bug is in the state POST, which is one of the valid states (NEW, ASSIGNED, POST)

Requesting review from QA contact:
/cc @ShudiLi

In response to this:

/jira refresh


@openshift-ci bot requested a review from ShudiLi November 5, 2025 02:01
@davidesalerno
Contributor Author

/retest

@ShudiLi
Member

ShudiLi commented Nov 7, 2025

Tested it with 4.21.0-0-2025-11-07-061132-test-ci-ln-z5rxwg2-latest.

1.
% oc get clusterversion
NAME      VERSION                                                AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.21.0-0-2025-11-07-061132-test-ci-ln-z5rxwg2-latest   True        False         7m      Cluster version is 4.21.0-0-2025-11-07-061132-test-ci-ln-z5rxwg2-latest

2. The ingresscontroller int33 had the same domain as int22.
 % oc -n openshift-ingress-operator get ingresscontroller                

NAME      AGE
default   4m6s
int1      34m
int22     33m
int33     20m

3. Deleted the ingresscontroller pod; the metrics for default, int1, and int22 were available on the new ingresscontroller pod.
4. Deleted the default ingresscontroller.
5. 
sh-5.1$ curl -s localhost:60000/metrics | grep -v controller_runtime | grep -v workqueue | grep -v go_ | grep controller
# HELP ingress_controller_aws_nlb_active Report the number of active NLBs on AWS clusters.
# TYPE ingress_controller_aws_nlb_active gauge
ingress_controller_aws_nlb_active{name="int1"} 0
ingress_controller_aws_nlb_active{name="int22"} 0
# HELP ingress_controller_conditions Report the conditions for ingress controllers. 0 is False and 1 is True.
# TYPE ingress_controller_conditions gauge
ingress_controller_conditions{condition="Available",name="int1"} 1
ingress_controller_conditions{condition="Available",name="int22"} 1
ingress_controller_conditions{condition="Degraded",name="int1"} 0
ingress_controller_conditions{condition="Degraded",name="int22"} 0
# HELP route_metrics_controller_routes_per_shard Report the number of routes for shards (ingress controllers).
# TYPE route_metrics_controller_routes_per_shard gauge
route_metrics_controller_routes_per_shard{shard_name="int1"} 8
route_metrics_controller_routes_per_shard{shard_name="int22"} 8
sh-5.1$ 
sh-5.1$ curl -s localhost:60000/metrics | grep -v controller_runtime | grep -v workqueue | grep -v go_ | grep controller
# HELP ingress_controller_aws_nlb_active Report the number of active NLBs on AWS clusters.
# TYPE ingress_controller_aws_nlb_active gauge
ingress_controller_aws_nlb_active{name="default"} 0
ingress_controller_aws_nlb_active{name="int1"} 0
ingress_controller_aws_nlb_active{name="int22"} 0
# HELP ingress_controller_conditions Report the conditions for ingress controllers. 0 is False and 1 is True.
# TYPE ingress_controller_conditions gauge
ingress_controller_conditions{condition="Available",name="default"} 0
ingress_controller_conditions{condition="Available",name="int1"} 1
ingress_controller_conditions{condition="Available",name="int22"} 1
ingress_controller_conditions{condition="Degraded",name="default"} 0
ingress_controller_conditions{condition="Degraded",name="int1"} 0
ingress_controller_conditions{condition="Degraded",name="int22"} 0
# HELP route_metrics_controller_routes_per_shard Report the number of routes for shards (ingress controllers).
# TYPE route_metrics_controller_routes_per_shard gauge
route_metrics_controller_routes_per_shard{shard_name="default"} 8
route_metrics_controller_routes_per_shard{shard_name="int1"} 8
route_metrics_controller_routes_per_shard{shard_name="int22"} 8
sh-5.1$ curl -s localhost:60000/metrics | grep -v controller_runtime | grep -v workqueue | grep -v go_ | grep controller
# HELP ingress_controller_aws_nlb_active Report the number of active NLBs on AWS clusters.
# TYPE ingress_controller_aws_nlb_active gauge
ingress_controller_aws_nlb_active{name="default"} 0
ingress_controller_aws_nlb_active{name="int1"} 0
ingress_controller_aws_nlb_active{name="int22"} 0
# HELP ingress_controller_conditions Report the conditions for ingress controllers. 0 is False and 1 is True.
# TYPE ingress_controller_conditions gauge
ingress_controller_conditions{condition="Available",name="default"} 1
ingress_controller_conditions{condition="Available",name="int1"} 1
ingress_controller_conditions{condition="Available",name="int22"} 1
ingress_controller_conditions{condition="Degraded",name="default"} 0
ingress_controller_conditions{condition="Degraded",name="int1"} 0
ingress_controller_conditions{condition="Degraded",name="int22"} 0
# HELP route_metrics_controller_routes_per_shard Report the number of routes for shards (ingress controllers).
# TYPE route_metrics_controller_routes_per_shard gauge
route_metrics_controller_routes_per_shard{shard_name="default"} 8
route_metrics_controller_routes_per_shard{shard_name="int1"} 8
route_metrics_controller_routes_per_shard{shard_name="int22"} 8
sh-5.1$ %   

@ShudiLi
Member

ShudiLi commented Nov 7, 2025

/label qe-approved
thanks

@openshift-ci bot added the qe-approved (Signifies that QE has signed off on this PR) label Nov 7, 2025
@grzpiotrowski
Contributor

/retest

@davidesalerno
Contributor Author

/test okd-scos-e2e-aws-ovn

@davidesalerno
Contributor Author

/test e2e-aws-pre-release-ossm

@rhamini3
Contributor

/verified by @ShudiLi comment

@openshift-ci-robot added the verified (Signifies that the PR passed pre-merge verification criteria) label Nov 13, 2025
@openshift-ci-robot
Contributor

@rhamini3: This PR has been marked as verified by @ShudiLi [comment](https://github.com/openshift/cluster-ingress-operator/pull/1290#issuecomment-3501165490).

In response to this:

/verified by @ShudiLi comment


@openshift-ci-robot
Contributor

/retest-required

Remaining retests: 0 against base HEAD 072c1cd and 2 for PR HEAD 00810f9 in total

@davidesalerno
Contributor Author

/test e2e-aws-ovn-serial-1of2

@davidesalerno
Contributor Author

/test e2e-aws-ovn

@rhamini3
Contributor

/test e2e-gcp-operator

@openshift-ci
Contributor

openshift-ci bot commented Nov 14, 2025

@davidesalerno: all tests passed!

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@openshift-merge-bot bot merged commit 8773f8f into openshift:master Nov 14, 2025
18 checks passed
@openshift-ci-robot
Contributor

@davidesalerno: Jira Issue Verification Checks: Jira Issue OCPBUGS-61508
✔️ This pull request was pre-merge verified.
✔️ All associated pull requests have merged.
✔️ All associated, merged pull requests were pre-merge verified.

Jira Issue OCPBUGS-61508 has been moved to the MODIFIED state and will move to the VERIFIED state when the change is available in an accepted nightly payload. 🕓


@davidesalerno
Contributor Author

/cherry-pick release-4.20

@openshift-cherrypick-robot

@davidesalerno: new pull request created: #1305

In response to this:

/cherry-pick release-4.20


@davidesalerno
Contributor Author

/cherry-pick release-4.19

@openshift-cherrypick-robot

@davidesalerno: new pull request created: #1308

In response to this:

/cherry-pick release-4.19


@openshift-merge-robot
Contributor

Fix included in accepted release 4.21.0-0.nightly-2025-11-15-144034

Labels

approved: Indicates a PR has been approved by an approver from all required OWNERS files.
jira/severity-moderate: Referenced Jira bug's severity is moderate for the branch this PR is targeting.
jira/valid-bug: Indicates that a referenced Jira bug is valid for the branch this PR is targeting.
jira/valid-reference: Indicates that this PR references a valid Jira ticket of any type.
lgtm: Indicates that a PR is ready to be merged.
qe-approved: Signifies that QE has signed off on this PR
verified: Signifies that the PR passed pre-merge verification criteria
