Skip to content

Conversation

@zhaizhibo
Copy link
Contributor

Descriptions of the changes in this PR:

Motivation

Fix #4238

Changes

Add a boolean status check before KeyValueStorageRocksDB.count()

…t() (possibly triggered by Prometheus) after RocksDB has been closed(apache#4243)
@zhaizhibo zhaizhibo marked this pull request as ready for review April 16, 2025 05:53
@lhotari
Copy link
Member

lhotari commented Apr 16, 2025

There's already #4243 to handle this issue.

@zhaizhibo
Copy link
Contributor Author

zhaizhibo commented Apr 16, 2025

@AnonHxy @lhotari To fix this issue, it is unnecessary to add locks in get/put operations, as it would be too heavy.
#4243

@lhotari
Copy link
Member

lhotari commented Apr 16, 2025

@AnonHxy @lhotari To fix this issue, it is unnecessary to add locks in get/put operations, as it would be too heavy. #4238

@zhaizhibo With this PR there could also be a race such as the count operation is in progress while the database gets closed. That would most likely result in a core dump as well. There's a need for a lock.
Another question is whether all operations are protected or whether it's just the count method that is addressed.

Copy link
Member

@lhotari lhotari left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In addition to the closed flag, there's a need to have an exclusive lock on close and count methods so that the database doesn't get closed while the count method is executing. One possible solution is a StampedLock since a re-entrant lock isn't needed. The count method could acquire a read lock and close would acquire a write lock.

Copy link
Member

@lhotari lhotari left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@lhotari
Copy link
Member

lhotari commented Apr 16, 2025

@zhaizhibo Please fix the checkstyle issues.

@zhaizhibo
Copy link
Contributor Author

@lhotari fixed.

@lhotari lhotari merged commit 2831ed3 into apache:master Apr 17, 2025
23 checks passed
@StevenLuMT
Copy link
Member

+1

lhotari pushed a commit that referenced this pull request Apr 17, 2025
…t after rocksdb has been closed (#4581)

* Fix the coredump that occurs when calling KeyValueStorageRocksDB.count() (possibly triggered by Prometheus) after RocksDB has been closed(#4243)

* fix race when count op in process and db gets closed.

---------

Co-authored-by: zhaizhibo <[email protected]>
(cherry picked from commit 2831ed3)
lhotari pushed a commit that referenced this pull request Apr 17, 2025
…t after rocksdb has been closed (#4581)

* Fix the coredump that occurs when calling KeyValueStorageRocksDB.count() (possibly triggered by Prometheus) after RocksDB has been closed(#4243)

* fix race when count op in process and db gets closed.

---------

Co-authored-by: zhaizhibo <[email protected]>
(cherry picked from commit 2831ed3)
lhotari pushed a commit that referenced this pull request Apr 17, 2025
…t after rocksdb has been closed (#4581)

* Fix the coredump that occurs when calling KeyValueStorageRocksDB.count() (possibly triggered by Prometheus) after RocksDB has been closed(#4243)

* fix race when count op in process and db gets closed.

---------

Co-authored-by: zhaizhibo <[email protected]>
(cherry picked from commit 2831ed3)
@zhaizhibo zhaizhibo deleted the fix_coredump_when_rocksdb_shutdown branch April 17, 2025 17:57
priyanshu-ctds pushed a commit to datastax/bookkeeper that referenced this pull request Jul 11, 2025
…t after rocksdb has been closed (apache#4581)

* Fix the coredump that occurs when calling KeyValueStorageRocksDB.count() (possibly triggered by Prometheus) after RocksDB has been closed(apache#4243)

* fix race when count op in process and db gets closed.

---------

Co-authored-by: zhaizhibo <[email protected]>
(cherry picked from commit 2831ed3)
(cherry picked from commit 9d067f4)
sandeep-ctds pushed a commit to datastax/bookkeeper that referenced this pull request Jul 22, 2025
…t after rocksdb has been closed (apache#4581)

* Fix the coredump that occurs when calling KeyValueStorageRocksDB.count() (possibly triggered by Prometheus) after RocksDB has been closed(apache#4243)

* fix race when count op in process and db gets closed.

---------

Co-authored-by: zhaizhibo <[email protected]>
(cherry picked from commit 2831ed3)
(cherry picked from commit 9d067f4)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Bookie could coredump when it is shutting down

4 participants