- 
                Notifications
    You must be signed in to change notification settings 
- Fork 836
Bug fix: Avoid race condition by deduplicating entries in bucket stores user scan #6863
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…rite Signed-off-by: Justin Jung <[email protected]>
Signed-off-by: Justin Jung <[email protected]>
| Thanks for catching! | 
| Umm, I think this becomes unnecessary now after refactoring the user scanner code. Before the new user scanner, in Store Gateway we simply gets all users by listing all prefixes in object store. There is no deduplication logic in the  After #6780, we introduced maps to deduplicate users from object store List API response so seems it is impossible to happen anymore. https://github.com/cortexproject/cortex/blob/master/pkg/storage/tsdb/users/scanner.go#L62 | 
| 
 We saw this before #6780, when in certain edge case  | 
| 
 Makes sense, but is there a way to force  | 
| I guess another option is to add a dedup logic here as well, such that any future scanner implementation that does not guarantee unique elements also doesn't cause store gateway crashes. @yeya24 what do you think? | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
Signed-off-by: Justin Jung <[email protected]>
| Added deduplication instead of returning error | 
Signed-off-by: Justin Jung <[email protected]>
What this PR does:
We saw store gateways terminating due to map concurrent write:
I was able to reproduce the race condition by forcing scanUsers to return duplicate user IDs in the list:
To avoid such fatal error, I've added deduplication logic for users.
Which issue(s) this PR fixes:
n/a
Checklist
CHANGELOG.mdupdated - the order of entries should be[CHANGE],[FEATURE],[ENHANCEMENT],[BUGFIX]