@LukeAVanDrie LukeAVanDrie commented Feb 2, 2026

What type of PR is this?

/kind flake

What this PR does / why we need it:

Fixes two test flakes in the Flow Control layer (TestFlowController_EnqueueAndWait and TestFlowRegistry_GarbageCollection).

  1. Controller Flake: Logic Race with RealClock
  • OnReqCtxTimeoutAfterDistribution used a frozen FakeClock for the request timestamp but context.WithDeadline (which uses the system wall clock) for the deadline. CI/CD latency or slow startup (simulated locally with a 35ms sleep) caused the deadline to expire immediately relative to wall time, failing the test.
  • This PR refactors newUnitHarness to accept options and injects clock.RealClock{} for this specific test case. This ensures the request timestamp and the deadline check rely on the same time source.
  2. Registry Flake: Background GC Race
  • TestFlowRegistry_GarbageCollection/ShouldCollectIdleFlow failed intermittently because the background GC loop (enabled by default) raced with the test’s manual GC trigger. This sometimes caused the flow to be collected before the test asserted its state.
  • This PR sets manualGC: true for this test case, preventing the background loop from interfering with deterministic test steps.

Reproduced flakes and verified fixes by injecting time.Sleep(35 * time.Millisecond) into newUnitHarness (for the controller test) and running:

go test -race -count=5000 -v -failfast -test.fullpath=true \
  -run '^TestFlowController_EnqueueAndWait$/^Lifecycle$/^OnReqCtxTimeoutAfterDistribution$' \
  sigs.k8s.io/gateway-api-inference-extension/pkg/epp/flowcontrol/controller

go test -race -count=5000 -v -failfast -test.fullpath=true \
  -run '^TestFlowRegistry_GarbageCollection$/^ShouldCollectIdleFlow$' \
  sigs.k8s.io/gateway-api-inference-extension/pkg/epp/flowcontrol/registry

Which issue(s) this PR fixes:

Fixes #2248
Fixes #2249

Does this PR introduce a user-facing change?:

NONE

Fixes two critical test flakes:
1. Controller: Mismatch between FakeClock and RealClock in
   `OnReqCtxTimeoutAfterDistribution`.
2. Registry: Race condition between background GC and manual GC in
   `ShouldCollectIdleFlow`.
@k8s-ci-robot

@LukeAVanDrie: The label(s) kind/test cannot be applied, because the repository doesn't have them.



netlify bot commented Feb 2, 2026

Deploy Preview for gateway-api-inference-extension ready!

Name Link
🔨 Latest commit c629a46
🔍 Latest deploy log https://app.netlify.com/projects/gateway-api-inference-extension/deploys/6980fd9edeb95a0008f060c2
😎 Deploy Preview https://deploy-preview-2250--gateway-api-inference-extension.netlify.app

@k8s-ci-robot k8s-ci-robot requested a review from liu-cong February 2, 2026 19:40
@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Feb 2, 2026
@k8s-ci-robot k8s-ci-robot added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label Feb 2, 2026
ahg-g commented Feb 2, 2026

/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Feb 2, 2026
@k8s-ci-robot

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: ahg-g, LukeAVanDrie


@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Feb 2, 2026
ahg-g commented Feb 2, 2026

/retest

