Skip to content

Conversation

@perdasilva
Copy link
Collaborator

@perdasilva perdasilva commented Dec 12, 2024

Description of the change:

Motivation for the change:

Architectural changes:

Testing remarks:

Reviewer Checklist

  • Implementation matches the proposed design, or proposal is updated to match implementation
  • Sufficient unit test coverage
  • Sufficient end-to-end test coverage
  • Bug fixes are accompanied by regression test(s)
  • e2e tests and flake fixes are accompanied evidence of flake testing, e.g. executing the test 100(0) times
  • tech debt/todo is accompanied by issue link(s) in comments in the surrounding code
  • Tests are comprehensible, e.g. Ginkgo DSL is being used appropriately
  • Docs updated or added to /doc
  • Commit messages sensible and descriptive
  • Tests marked as [FLAKE] are truly flaky and have an issue
  • Code is properly formatted

@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Dec 12, 2024
Signed-off-by: Per Goncalves da Silva <[email protected]>
Signed-off-by: Per Goncalves da Silva <[email protected]>
})
}

func (o *operator) ResyncInformers() error {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No client using LIST + WATCH should ever have to do this, FWIW.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, watch events are at least once right?
This mishmash of cached and live requests is biting us in the butt....

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In case you're curious: the issue is we have a situation in which the namespaces gets nuked and immediately recreated with the subscription, and we're ending up with unsat...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you using UID checks to detect deletion + recreation? Do you care that it deleted & recreated, or no?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in the end the RC was: the typed queues were of any type and we were enqueuing two different types of objects (string and ResourceEvent). Two semantically equal events come in and get sync'ed concurrently because we break dedupe =P

@stevekuznetsov
Copy link
Contributor

Using a Deployment or anything other than a Pod is likely the better approach that won't cause grief in the future.

Signed-off-by: Per Goncalves da Silva <[email protected]>
@perdasilva
Copy link
Collaborator Author

Using a Deployment or anything other than a Pod is likely the better approach that won't cause grief in the future.

Sorry - the PR description was misleading because I fumbled the commit (--amend). What I'm trying to understand is this: #3010

@perdasilva perdasilva closed this Dec 19, 2024
@perdasilva perdasilva deleted the perdasilva/fix-cache branch December 19, 2024 13:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants