fix: reset ResourceGroup metrics on deletion #1930

tiffanny29631 · 2025-11-10T22:51:29Z

ResourceGroup metrics were showing stale values from deleted ResourceGroups because the controller never received deletion events. Since the pod persists, metrics continue to be emitted but with incorrect values.

Example: When a RootSync with 30 resources is deleted, ready_resource_count will still show 30 even though the ResourceGroup no longer exists. Resetting the resource metrics to zero ensures the aggregate metric values across all ResourceGroups remain correct.
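
The arithmetic behind this example can be sketched in a few lines of Go. This is only an illustration under stated assumptions: `gaugeStore`, `Set`, and `Sum` are hypothetical names for a last-value gauge keyed by ResourceGroup, not the metrics library or code used in this PR, and the second ResourceGroup with 5 resources is made up for the demonstration.

```go
package main

import "fmt"

// gaugeStore is a hypothetical stand-in for a last-value gauge keyed by
// ResourceGroup name. It is not the metrics library used by this PR.
type gaugeStore struct {
	values map[string]int64
}

func newGaugeStore() *gaugeStore {
	return &gaugeStore{values: make(map[string]int64)}
}

// Set records the latest value reported for one ResourceGroup.
func (g *gaugeStore) Set(name string, v int64) {
	g.values[name] = v
}

// Sum returns the aggregate across every ResourceGroup ever recorded,
// including ones that no longer exist in the cluster.
func (g *gaugeStore) Sum() int64 {
	var total int64
	for _, v := range g.values {
		total += v
	}
	return total
}

func main() {
	ready := newGaugeStore()
	ready.Set("root-sync", 30) // the RootSync from the example above
	ready.Set("repo-sync", 5)  // a second, made-up ResourceGroup

	// The RootSync's ResourceGroup is deleted, but nothing updates the gauge,
	// so the aggregate still counts its 30 resources.
	fmt.Println("stale aggregate:", ready.Sum())

	// Resetting the deleted ResourceGroup's value to zero corrects the aggregate.
	ready.Set("root-sync", 0)
	fmt.Println("after reset:", ready.Sum())
}
```

With these made-up numbers the aggregate drops from 35 to 5 once the deleted ResourceGroup's value is zeroed.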

Root controller now sends deletion events, and ResourceGroup controller resets resource metrics to zero for deleted ResourceGroups, ensuring aggregate metric values remain correct.

google-oss-prow · 2025-11-10T22:51:44Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please ask for approval from tiffanny29631. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

gemini-code-assist · 2025-11-10T22:51:45Z

Summary of Changes

Hello @tiffanny29631, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request resolves an issue where ResourceGroup metrics would display stale values after a ResourceGroup was deleted. The changes ensure that upon deletion, the root controller propagates a deletion event, prompting the ResourceGroup controller to reset all relevant metrics to zero. This guarantees that aggregate metric values remain accurate and reflect the current state of active ResourceGroups.

Highlights

Deletion Event Propagation: The root controller now explicitly sends deletion events for ResourceGroups, ensuring that downstream controllers are notified when a resource is removed from the cluster.
Metric Reset on Deletion: The ResourceGroup controller has been updated to reset all associated resource metrics to zero when a ResourceGroup is deleted or not found, preventing stale metric values from persisting.
New Metric Reset Function: A new helper function, resetResourceMetrics, was introduced to encapsulate the logic for zeroing out various resource-related metrics upon ResourceGroup deletion.

Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution. ↩

gemini-code-assist

Code Review

This pull request correctly addresses the issue of stale metrics for deleted ResourceGroups. The changes ensure that metrics are reset to zero upon ResourceGroup deletion by introducing a resetResourceMetrics function and having the root controller send deletion events. The logic is sound and the test updates reflect the new behavior. I've identified an opportunity for refactoring to reduce code duplication in pkg/resourcegroup/controllers/root/root_controller.go, which would improve maintainability. Overall, this is a good fix.

pkg/resourcegroup/controllers/root/root_controller.go

Copilot

Pull Request Overview

This PR fixes a bug where ResourceGroup metrics displayed stale values after ResourceGroups were deleted. The root cause was that the Root controller never sent deletion events to the ResourceGroup controller, so metrics continued to show outdated values even after deletion. The fix ensures deletion events are sent and metrics are reset to zero when ResourceGroups are deleted.

Key changes:

Root controller now sends deletion events to the ResourceGroup controller for both deletion scenarios (NotFound and DeletionTimestamp)
ResourceGroup controller resets all resource metrics to zero when handling deleted ResourceGroups
Test expectations updated to verify deletion events are properly sent
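
The two deletion scenarios named above can be pictured with a minimal Go sketch. Everything here is an assumption made for illustration: `errNotFound`, `resourceGroup`, `deletionEvent`, `lookup`, and `reconcile` are hypothetical stand-ins, not the types or signatures in `root_controller.go`. The sketch only shows the shape of the logic, where both "object already gone" and "object marked for deletion" lead to one event on the channel.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errNotFound stands in for the API server's NotFound error (hypothetical).
var errNotFound = errors.New("resourcegroup not found")

// resourceGroup carries only the field this sketch needs.
type resourceGroup struct {
	DeletionTimestamp *time.Time
}

// deletionEvent is a hypothetical event type; the PR's real type may differ.
type deletionEvent struct {
	Name string
}

// lookup fetches a ResourceGroup by name (hypothetical signature).
type lookup func(name string) (*resourceGroup, error)

// reconcile sends one deletion event when the object is already gone
// (NotFound) or is marked for deletion (DeletionTimestamp set).
func reconcile(get lookup, name string, events chan<- deletionEvent) error {
	rg, err := get(name)
	if err != nil && !errors.Is(err, errNotFound) {
		return err // unrelated failure: report it, send nothing
	}
	// At this point err is either nil or NotFound; rg is only read when err is nil.
	deleted := err != nil || rg.DeletionTimestamp != nil
	if deleted {
		events <- deletionEvent{Name: name}
	}
	return nil
}

// Three fake lookups covering the cases of interest.
func gone(string) (*resourceGroup, error) { return nil, errNotFound }

func terminating(string) (*resourceGroup, error) {
	now := time.Now()
	return &resourceGroup{DeletionTimestamp: &now}, nil
}

func live(string) (*resourceGroup, error) { return &resourceGroup{}, nil }

func main() {
	events := make(chan deletionEvent, 3) // buffered so the sketch never blocks

	_ = reconcile(gone, "rg-a", events)
	_ = reconcile(terminating, "rg-b", events)
	_ = reconcile(live, "rg-c", events) // still exists: no event

	close(events)
	for ev := range events {
		fmt.Println("deletion event for", ev.Name)
	}
}
```

Folding both conditions into a single `deleted` check keeps the event-sending code in one place, which is in the spirit of the deduplication suggested in the Gemini review.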

Reviewed Changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| pkg/resourcegroup/controllers/root/root_controller.go | Adds deletion event sending for both NotFound and DeletionTimestamp scenarios, ensuring the ResourceGroup controller receives notifications when ResourceGroups are deleted |
| pkg/resourcegroup/controllers/resourcegroup/resourcegroup_controller.go | Introduces `resetResourceMetrics` function and calls it when ResourceGroups are deleted, resetting all metrics to zero to prevent stale values |
| pkg/resourcegroup/controllers/root/root_controller_test.go | Updates test to validate that deletion events are sent to the channel instead of verifying no events are sent |
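
The change to the test expectation (an event must now arrive on the channel, where previously none was expected) follows a common Go pattern: wait on the channel with a timeout. The helper below is a rough sketch of that pattern only. `expectEvent` and the string event type are hypothetical and are not taken from `root_controller_test.go`.

```go
package main

import (
	"fmt"
	"time"
)

// expectEvent waits up to timeout for one value on events.
// It returns the value and true, or the zero value and false on timeout.
func expectEvent(events <-chan string, timeout time.Duration) (string, bool) {
	select {
	case ev := <-events:
		return ev, true
	case <-time.After(timeout):
		return "", false
	}
}

func main() {
	events := make(chan string, 1)
	events <- "rg-a" // what the controller under test would send on deletion

	if ev, ok := expectEvent(events, time.Second); ok {
		fmt.Println("got deletion event for", ev)
	} else {
		fmt.Println("no deletion event received")
	}
}
```

A test asserting the old behavior would expect the timeout branch; one asserting the new behavior would expect the event branch.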

pkg/resourcegroup/controllers/resourcegroup/resourcegroup_controller.go

sdowell · 2025-12-04T23:37:39Z

e2e/testcases/reconciler_finalizer_test.go

+// TestReconcilerFinalizer_ResourceGroupMetricsReset tests that when a ResourceGroup
+// is deleted, all resource-related metrics are reset to 0.
+func TestReconcilerFinalizer_ResourceGroupMetricsReset(t *testing.T) {
+	nt := nomostest.New(t, nomostesting.MultiRepos)


nit: suggest using an Unstructured repository for testing instead of a hierarchical one

sdowell · 2025-12-04T23:39:46Z

e2e/testcases/reconciler_finalizer_test.go

+	defer cancel()
+	go nomostest.TailReconcilerLogs(ctx, nt, nomostest.RootReconcilerObjectKey(rootSyncKey.Name))
+
+	// Enable deletion propagation


Is deletion propagation relevant to this test case? I believe the ResourceGroup should get cleaned up regardless of the deletion propagation policy.

google-oss-prow bot requested review from Camila-B and mikebz November 10, 2025 22:51

google-oss-prow bot added the size/M label Nov 10, 2025

gemini-code-assist bot reviewed Nov 10, 2025

pkg/resourcegroup/controllers/root/root_controller.go (outdated, resolved)

mikebz requested a review from Copilot November 10, 2025 23:34

Copilot started reviewing on behalf of mikebz November 10, 2025 23:35 View session

Copilot finished reviewing on behalf of mikebz November 10, 2025 23:40

Copilot AI reviewed Nov 10, 2025

tiffanny29631 requested a review from sdowell November 11, 2025 19:31

refactor: consolidate duplicate deletion event code

33fec9e

sdowell reviewed Dec 2, 2025

pkg/resourcegroup/controllers/resourcegroup/resourcegroup_controller.go (resolved)

google-oss-prow bot added size/L and removed size/M labels Dec 3, 2025

tiffanny29631 force-pushed the rg-metric branch from 55ace24 to 5cd18a2 Compare December 3, 2025 22:58

sdowell reviewed Dec 4, 2025

add e2e for resetting resource related metrics upon deletion

1d4ae20

tiffanny29631 force-pushed the rg-metric branch from 5cd18a2 to 1d4ae20 Compare December 6, 2025 00:23
