Skip to content

Commit 1effb36

Browse files
koarzYour Name
authored andcommitted
[fix](core)be core when BeConfDataDirReader::get_data_dir_by_file_path (#59204)
During the execution of `init_file_cache_factory`, the following call path is triggered: ```txt init_file_cache_factory -> FileCacheFactory::create_file_cache -> cache->initialize() -> initialize_unlocked -> _storage->init(this) -> FSFileCacheStorage::init() ``` (At this point, a thread named `_cache_background_load_thread` is created, and the remaining operations run within this thread) `-> upgrade_cache_dir_if_necessary -> read_file_cache_version -> FileSystem::open_file -> open_file_impl -> LocalFileReader::LocalFileReader -> BeConfDataDirReader::get_data_dir_by_file_path` After `FSFileCacheStorage::init` completes (spawning the `_cache_background_load_thread`), `ExecEnv::_init` continues to execute `doris::io::BeConfDataDirReader::init_be_conf_data_dir`. This function performs push operations on `be_config_data_dir_list`. Simultaneously, `BeConfDataDirReader::get_data_dir_by_file_path` (running in the background thread) iterates over this same `be_config_data_dir_list`. This leads to a race condition: if `doris::io::BeConfDataDirReader::init_be_conf_data_dir` is inserting data while the vector is being read, two issues arise: 1. Modifying `be_config_data_dir_list` while iterating over it via a range-based for loop results in **Undefined Behavior (UB)**. 2. If `be_config_data_dir_list` triggers a reallocation (expansion) during the insertion, concurrent read operations on its elements will access dangling references, triggering a **heap-use-after-free** error. Since `init_be_conf_data_dir` depends on `cache_paths` derived from `init_file_cache_factory`, we must carefully manage the synchronization sequence to prevent these errors.
1 parent 9243365 commit 1effb36

File tree

2 files changed

+16
-0
lines changed

2 files changed

+16
-0
lines changed

be/src/io/fs/local_file_reader.cpp

Lines changed: 15 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -41,15 +41,25 @@
4141
#include "runtime/workload_management/io_throttle.h"
4242
#include "util/async_io.h"
4343
#include "util/debug_points.h"
44+
#include "util/defer_op.h"
4445
#include "util/doris_metrics.h"
4546

4647
namespace doris {
4748
namespace io {
49+
// 1: initing 2: inited 0: before init
50+
std::atomic_int BeConfDataDirReader::be_config_data_dir_list_state = 0;
4851

4952
std::vector<doris::DataDirInfo> BeConfDataDirReader::be_config_data_dir_list;
5053

5154
void BeConfDataDirReader::get_data_dir_by_file_path(io::Path* file_path,
5255
std::string* data_dir_arg) {
56+
int state = be_config_data_dir_list_state.load(std::memory_order_acquire);
57+
if (state == 0) [[unlikely]] {
58+
return;
59+
} else if (state == 1) [[unlikely]] {
60+
be_config_data_dir_list_state.wait(1);
61+
}
62+
5363
for (const auto& data_dir_info : be_config_data_dir_list) {
5464
if (data_dir_info.path.size() >= file_path->string().size()) {
5565
continue;
@@ -65,6 +75,11 @@ void BeConfDataDirReader::init_be_conf_data_dir(
6575
const std::vector<doris::StorePath>& store_paths,
6676
const std::vector<doris::StorePath>& spill_store_paths,
6777
const std::vector<doris::CachePath>& cache_paths) {
78+
be_config_data_dir_list_state.store(1, std::memory_order_release);
79+
Defer defer {[]() {
80+
be_config_data_dir_list_state.store(2, std::memory_order_release);
81+
be_config_data_dir_list_state.notify_all();
82+
}};
6883
for (int i = 0; i < store_paths.size(); i++) {
6984
DataDirInfo data_dir_info;
7085
data_dir_info.path = store_paths[i].path;

be/src/io/fs/local_file_reader.h

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -35,6 +35,7 @@ struct CachePath;
3535
namespace doris::io {
3636

3737
struct BeConfDataDirReader {
38+
static std::atomic_int be_config_data_dir_list_state;
3839
static std::vector<doris::DataDirInfo> be_config_data_dir_list;
3940

4041
static void get_data_dir_by_file_path(Path* file_path, std::string* data_dir_arg);

0 commit comments

Comments
 (0)