-
Notifications
You must be signed in to change notification settings - Fork 99
MINIFICPP-2342 Do not overwrite config files during an upgrade #2069
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
MINIFICPP-2342 Do not overwrite config files during an upgrade #2069
Conversation
ebca3b0 to
604baba
Compare
| #include "utils/Environment.h" | ||
|
|
||
| namespace { | ||
| bool fileContentsMatch(const std::filesystem::path& file_name, const std::unordered_set<std::string>& expected_contents) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The name of this function is a bit misleading, it could be something line propertiesMatchInFile to indicate that this is not a generic file content match check. Also this might be a bit shorter like this:
for (std::string line; std::getline(file, line); ) {
actual_contents.insert(std::move(line));
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have renamed the function to settingsInFileAreAsExpected() and changed the while loop to a for loop in fb3cdef
I don't think the move would be correct here, since we would reuse line after we moved from it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(thread reply) I think move is valid: the moved-from state is messy in general, some people say invalid, the standard says unspecified but valid state. But if it's reassigned (by getfile), then it probably doesn't matter. I would be a proponent of a destructive move / relocate, which would prevent reuse like this, but we don't have it yet.
I think it's fine either way.
| const auto extra_properties_files_dir = extra_properties_files_dir_name(); | ||
| std::vector<std::filesystem::path> extra_properties_file_names; | ||
| if (utils::file::exists(extra_properties_files_dir) && utils::file::is_directory(extra_properties_files_dir)) { | ||
| utils::file::list_dir(extra_properties_files_dir, [&](const std::filesystem::path&, const std::filesystem::path& file_name) { | ||
| if (!file_name.string().ends_with(".bak")) { | ||
| extra_properties_file_names.push_back(file_name); | ||
| } | ||
| return true; | ||
| }, logger_, /* recursive = */ false); | ||
| } | ||
| std::ranges::sort(extra_properties_file_names); | ||
| for (const auto& file_name : extra_properties_file_names) { | ||
| properties_files_.push_back(extra_properties_files_dir / file_name); | ||
| } | ||
|
|
||
| std::ifstream file(properties_file_, std::ifstream::in); | ||
| if (!file.good()) { | ||
| logger_->log_error("load configure file failed {}", properties_file_); | ||
| return; | ||
| logger_->log_info("Using configuration file to load configuration for {} from {} (located at {})", | ||
| getName().c_str(), configuration_file.string(), base_properties_file_.string()); | ||
| if (!extra_properties_file_names.empty()) { | ||
| auto list_of_files = utils::string::join(", ", extra_properties_file_names, [](const auto& path) { return path.string(); }); | ||
| logger_->log_info("Also reading configuration from files {} in {}", list_of_files, extra_properties_files_dir.string()); | ||
| } | ||
|
|
||
| properties_.clear(); | ||
| dirty_ = false; | ||
| for (const auto& line : PropertiesFile{file}) { | ||
| auto key = line.getKey(); | ||
| auto persisted_value = line.getValue(); | ||
| auto value = utils::string::replaceEnvironmentVariables(persisted_value); | ||
| bool need_to_persist_new_value = false; | ||
| fixValidatedProperty(std::string(prefix) + key, persisted_value, value, need_to_persist_new_value, *logger_); | ||
| dirty_ = dirty_ || need_to_persist_new_value; | ||
| properties_[key] = {persisted_value, value, need_to_persist_new_value}; | ||
| for (const auto& properties_file : properties_files_) { | ||
| std::ifstream file(properties_file, std::ifstream::in); | ||
| if (!file.good()) { | ||
| logger_->log_error("load configure file failed {}", properties_file); | ||
| continue; | ||
| } | ||
| for (const auto& line : PropertiesFile{file}) { | ||
| auto key = line.getKey(); | ||
| auto persisted_value = line.getValue(); | ||
| auto value = utils::string::replaceEnvironmentVariables(persisted_value); | ||
| bool need_to_persist_new_value = false; | ||
| fixValidatedProperty(std::string(prefix) + key, persisted_value, value, need_to_persist_new_value, *logger_); | ||
| dirty_ = dirty_ || need_to_persist_new_value; | ||
| properties_[key] = {persisted_value, value, need_to_persist_new_value}; | ||
| } | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think there are 2 separate sections here, doing 2 separate things that can be encapsulated, I would extract them to 2 separate functions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have refactored loadConfigureFile by pulling out two self-contained sections: edf88de
| const auto output_file = (persist_to_ == PersistTo::SingleFile ? base_properties_file_ : extra_properties_files_dir_name() / C2PropertiesFileName); | ||
| if (!std::filesystem::exists(output_file)) { | ||
| logger_->log_debug("Configuration file {} does not exist yet, creating it", output_file); | ||
| utils::file::create_dir(output_file.parent_path(), /* recursive = */ true); | ||
| std::ofstream file{output_file}; | ||
| } | ||
|
|
||
| std::ifstream file(output_file, std::ifstream::in); | ||
| if (!file) { | ||
| logger_->log_error("load configure file failed {}", properties_file_); | ||
| logger_->log_error("Failed to load configuration file {}", output_file); | ||
| return false; | ||
| } | ||
|
|
||
| auto new_file = properties_file_; | ||
| new_file += ".new"; | ||
|
|
||
| PropertiesFile current_content{file}; | ||
| for (const auto& prop : properties_) { | ||
| if (!prop.second.need_to_persist_new_value) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would extract reading the PropertiesFile content to a separate function
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pulling out the file-reading part is not so easy because of the return false in line 219/257.
I have pulled out the next section, which sets the changed properties in current_content: 9d355e8
12e6d40 to
5894882
Compare
| #include "utils/Environment.h" | ||
|
|
||
| namespace { | ||
| bool fileContentsMatch(const std::filesystem::path& file_name, const std::unordered_set<std::string>& expected_contents) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(thread reply) I think move is valid: the moved-from state is messy in general, some people say invalid, the standard says unspecified but valid state. But if it's reassigned (by getfile), then it probably doesn't matter. I would be a proponent of a destructive move / relocate, which would prevent reuse like this, but we don't have it yet.
I think it's fine either way.
Currently, MiNiFi modifies the installed config files (minifi.properties and config.yml) while it runs. This confuses the Windows installer, so changes to these config files get lost when MiNiFi is upgraded. After this change, the minifi.properties, minifi-log.properties and minifi-uid.properties files are no longer modified by MiNiFi at runtime, so they can be safely replaced by new versions during upgrade. All changes to the settings should be put into new files in the minifi.properties.d (minifi-log.properties.d, minifi-uid.properties.d) directory; these new files will not be touched by the upgrade. The config.yml file will no longer be installed as part of MiNiFi. If C2 is enabled, config.yml will be fetched from the C2 server; otherwise, MiNiFi will create a new file with an empty flow, and the user can edit this. Either way, config.yml will not be touched by an upgrade.
... and use a for loop instead of while
Co-authored-by: Márton Szász <[email protected]>
af0b755 to
be4f9d1
Compare
| return true; | ||
| } | ||
| std::ifstream file(properties_file_, std::ifstream::in); | ||
| const auto output_file = (persist_to_ == PersistTo::SingleFile ? base_properties_file_ : extra_properties_files_dir_name() / C2PropertiesFileName); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure what would be the best way to decide which file to write updates to, but it's a bit strange that we hardcode a C2-related filename for (I'm assuming) property overrides through C2 property update, in a function whose signature doesn't suggest any inherent relation to C2. Could we maybe pass in the target filename instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have looked into it, but it would not be easy to do, because of the complex inheritance structure:

We would have to modify all these classes (maybe more), and also change 3 places in C2Agent where commitChanges() is called.
A simpler solution could be to find a more neutral name for this file, eg. 90_runtime_updates.properties. Would that be OK?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd rather keep the functionality and keep the door open for future refactors cleaning it up.
https://issues.apache.org/jira/browse/MINIFICPP-2342
Currently, MiNiFi modifies the installed config files (minifi.properties and config.yml) while it runs. This confuses the Windows installer, so changes to these config files get lost when MiNiFi is upgraded.
After this change, the minifi.properties, minifi-log.properties and minifi-uid.properties files are no longer modified by MiNiFi at runtime, so they can be safely replaced by new versions during upgrade. All changes to the settings should be put into new files in the minifi.properties.d (minifi-log.properties.d, minifi-uid.properties.d) directory; these new files will not be touched by the upgrade.
The config.yml file will no longer be installed as part of MiNiFi. If C2 is enabled, config.yml will be fetched from the C2 server; otherwise, MiNiFi will create a new file with an empty flow, and the user can edit this. Either way, config.yml will not be touched by an upgrade.
Thank you for submitting a contribution to Apache NiFi - MiNiFi C++.
In order to streamline the review of the contribution we ask you
to ensure the following steps have been taken:
For all changes:
Is there a JIRA ticket associated with this PR? Is it referenced in the commit message?
Does your PR title start with MINIFICPP-XXXX where XXXX is the JIRA number you are trying to resolve? Pay particular attention to the hyphen "-" character.
Has your PR been rebased against the latest commit within the target branch (typically main)?
Is your initial contribution a single, squashed commit?
For code changes:
For documentation related changes:
Note:
Please ensure that once the PR is submitted, you check GitHub Actions CI results for build issues and submit an update to your PR as soon as possible.