[fix](core)be core when BeConfDataDirReader::get_data_dir_by_file_path #59204
run performance
TPC-H: Total hot run time: 35023 ms
TPC-DS: Total hot run time: 179761 ms
ClickBench: Total hot run time: 27.22 s
PR approved by at least one committer and no changes requested.
PR approved by anyone and no changes requested.
freemandealer
left a comment
LGTM, nice work!
run buildall
TPC-H: Total hot run time: 34965 ms
TPC-DS: Total hot run time: 179596 ms
ClickBench: Total hot run time: 27.01 s
run nonConcurrent
#59204)

During the execution of `init_file_cache_factory`, the following call path is triggered:

```txt
init_file_cache_factory -> FileCacheFactory::create_file_cache -> cache->initialize() -> initialize_unlocked -> _storage->init(this) -> FSFileCacheStorage::init()
```

(At this point, a thread named `_cache_background_load_thread` is created, and the remaining operations run within this thread)

`-> upgrade_cache_dir_if_necessary -> read_file_cache_version -> FileSystem::open_file -> open_file_impl -> LocalFileReader::LocalFileReader -> BeConfDataDirReader::get_data_dir_by_file_path`

After `FSFileCacheStorage::init` completes (spawning the `_cache_background_load_thread`), `ExecEnv::_init` continues to execute `doris::io::BeConfDataDirReader::init_be_conf_data_dir`. This function performs push operations on `be_config_data_dir_list`.

Simultaneously, `BeConfDataDirReader::get_data_dir_by_file_path` (running in the background thread) iterates over this same `be_config_data_dir_list`. This leads to a race condition: if `doris::io::BeConfDataDirReader::init_be_conf_data_dir` is inserting data while the vector is being read, two issues arise:

1. Modifying `be_config_data_dir_list` while iterating over it via a range-based for loop results in **Undefined Behavior (UB)**.
2. If `be_config_data_dir_list` triggers a reallocation (expansion) during the insertion, concurrent read operations on its elements will access dangling references, triggering a **heap-use-after-free** error.

Since `init_be_conf_data_dir` depends on `cache_paths` derived from `init_file_cache_factory`, we must carefully manage the synchronization sequence to prevent these errors.
…_by_file_path #59204 (#59472) Cherry-picked from #59204 Co-authored-by: koarz <[email protected]>
During the execution of `init_file_cache_factory`, the following call path is triggered:

`init_file_cache_factory -> FileCacheFactory::create_file_cache -> cache->initialize() -> initialize_unlocked -> _storage->init(this) -> FSFileCacheStorage::init()`

(At this point, a thread named `_cache_background_load_thread` is created, and the remaining operations run within this thread)

`-> upgrade_cache_dir_if_necessary -> read_file_cache_version -> FileSystem::open_file -> open_file_impl -> LocalFileReader::LocalFileReader -> BeConfDataDirReader::get_data_dir_by_file_path`

After `FSFileCacheStorage::init` completes (spawning the `_cache_background_load_thread`), `ExecEnv::_init` continues to execute `doris::io::BeConfDataDirReader::init_be_conf_data_dir`. This function performs push operations on `be_config_data_dir_list`.

Simultaneously, `BeConfDataDirReader::get_data_dir_by_file_path` (running in the background thread) iterates over this same `be_config_data_dir_list`. This leads to a race condition: if `doris::io::BeConfDataDirReader::init_be_conf_data_dir` is inserting data while the vector is being read, two issues arise:

1. Modifying `be_config_data_dir_list` while iterating over it via a range-based for loop results in Undefined Behavior (UB).
2. If `be_config_data_dir_list` triggers a reallocation (expansion) during the insertion, concurrent read operations on its elements will access dangling references, triggering a heap-use-after-free error.

Since `init_be_conf_data_dir` depends on `cache_paths` derived from `init_file_cache_factory`, we must carefully manage the synchronization sequence to prevent these errors.