Skip to content

Conversation

@dao-jun
Copy link
Member

@dao-jun dao-jun commented Apr 26, 2024

Descriptions of the changes in this PR:

Motivation

Fix ConcurrentLongHashMap can be ArrayIndexOutBound when call get

Changes

(Describe: what changes you have made)

Master Issue: #4316


In order to uphold a high standard for quality for code contributions, Apache BookKeeper runs various precommit
checks for pull requests. A pull request can only be merged when it passes precommit checks.


Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

If this PR is a BookKeeper Proposal (BP):

  • Make sure the PR title is formatted like:
    <BP-#>: Description of bookkeeper proposal
    e.g. BP-1: 64 bits ledger is support
  • Attach the master issue link in the description of this PR.
  • Attach the google doc link if the BP is written in Google Doc.

Otherwise:

  • Make sure the PR title is formatted like:
    <Issue #>: Description of pull request
    e.g. Issue 123: Description ...
  • Make sure tests pass via mvn clean apache-rat:check install spotbugs:check.
  • Replace <Issue #> in the title with the actual Issue number.

@dao-jun
Copy link
Member Author

dao-jun commented Apr 26, 2024

@merlimat @dlg99 @lhotari PTAL

@dao-jun dao-jun changed the title [fix] Fix ConcurrentLongHashMap can be ArrayIndexOutBound when call get [fix] Fix ConcurrentLongHashMap concurrency issue Apr 26, 2024
@dao-jun
Copy link
Member Author

dao-jun commented Apr 26, 2024

It looks other Concurrent containers has the same issue, I'll create PRs to fix them one by one.

@dao-jun
Copy link
Member Author

dao-jun commented Apr 27, 2024

I created an issue for tracking it #4318

@dao-jun
Copy link
Member Author

dao-jun commented Apr 27, 2024

BTW, the key point is

                    int capacity = this.capacity;
                    bucket = signSafeMod(bucket, capacity);

and

                    // First try optimistic locking
                    long storedKey = keys[bucket];
                    V storedValue = values[bucket];

not an atomic operation.

If rehash triggered in an async thread after calculating bucket and finished before read entry from the array, and table shrink to a smaller array, it will lead to ArrayIndexOutOfBoundsExeception

Copy link
Member

@lhotari lhotari left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for reporting the issue and starting the work to fix this issue.

Catching the exception would be the preferred solution since there's no point of using a StampedLock unless optimistic locking is used.

I think that we had this same bug discussed years ago, but for some reason we didn't fix this.

@lhotari
Copy link
Member

lhotari commented Apr 27, 2024

Found the discussion from the past: apache/pulsar#18390 (comment) . @thetumbled had originally made a PR to catch IndexOutOfBoundsException, which I think is the correct solution.

@thetumbled
Copy link
Member

I have created this pr to fix the issue in bookkeeper: #4066, can we move forward with it? @lhotari @hangc0276 @eolivelli @wenbingshen @zymap @shoothzj @horizonzy

@lhotari
Copy link
Member

lhotari commented Apr 27, 2024

Let's close this PR and continue with #4066

@dao-jun
Copy link
Member Author

dao-jun commented Apr 27, 2024

Good! Let's continue with #4066

@dao-jun dao-jun closed this Apr 27, 2024
@dao-jun dao-jun deleted the dev/fix_concurrent_long_hashmap branch April 27, 2024 07:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants