Skip to content

Commit 42335ee

Browse files
iSignalpao214
authored andcommitted
[#28944] YSQL: Set the default value of auto analyze flag based on user set cbo flags
Summary: Users currently have to manage cbo and auto analyze flags separately and can get into bad combinations. We do not want to enable auto analyze when cbo is in legacy mode (or anything other than on) because ANALYZE can update the `reltuples` field in pg_class for relations and cause unpredictable behavior. We intend to have the ysql_yb_enable_cbo flag default to false in code and be explicitly enabled by YBA/yugabyted for new clusters in 2025.2. This diff (intended for backport to 2025.2) changes the auto analyze flag defaults to match this situation. 1.0 If the user explicits sets ysql_enable_auto_analyze, that setting is always used directly. 2.0 The default value of the flag ysql_enable_auto_analyze in the code is `false`, however it can be automatically turned on to true in the following cases: 2.1 ysql_yb_enable_cbo PG runtime gflag is set to `on` (currently this flag defaults to `legacy_mode` in code) 2.2 ysql_pg_conf_csv contains `yb_enable_cbo=on` (currently this GUC defaults to `legacy_mode` in code) 2.3 (logic prior to this change): both the deprecated auto analyze flags are set explictly by the user `--ysql_enable_auto_analyze_service=true` && `--ysql_enable_table_mutation_counter=true` In all other cases, the default value of the flag (false) is used unchanged. For special cases, including precedence order when both ysql_yb_enable_cbo and ysql_pg_conf_csv are set, see Notes below. Notes 1. When we change the default GUC yb_enable_cbo or PG flag ysql_yb_enable_cbo to "on" in code in the future, we will also need to update ysql_enable_auto_analyze default in the code to true and possibly invert this logic if we want to disable it for older clusters with cbo in legacy mode. Added a DCHECK to catch this case. 2. `yb_enable_cbo` can be configured on a per-db/role basis as any other GUC. However, that case isn't handled here. The user can set the GUC `yb_disable_auto_analyze` to match this GUC on the corresponding db/roles manually in such cases. 3. GUCs are case insensitive both in name and value. 4. Precedence order and repeated values 4.1 If ysql_yb_enable_cbo is set explicitly, its value is the one that is used in postgres config, so that is the value used to determine the auto analyze setting (set to true if cbo is on/true/yes/1, false otherwise). In this case, any setting for yb_enable_cbo in pg_conf_csv is not considered. 4.2 Repeated values for ysql_yb_enable_cbo like `ysql_yb_enable_cbo=on,ysql_yb_enable_cbo=off` are parsed by the gflags library to prefer the last setting. 4.3 ysql_pg_conf_csv can technically include multiple settings for `ysql_yb_enable_cbo` like `ysql_pg_conf_csv={yb_enaBle_cbo=off,yb_enAble_cbo=oN}`. In this case, the last value wins. Jira: DB-18673 Test Plan: Jenkins ``` ./yb_build.sh --java-test TestPgRegressThirdPartyExtensionsHypopg ``` Backport-through: 2025.2 Tested the following combinations manually using yugabyted. | **tserver flags** | **ysql_enable_auto_analyze** | **show yb_enable_cbo** | | --tserver_flags not set | false (default) | legacy_mode | | --tserver_flags=ysql_yb_enable_cbo=on | true | on | | --tserver_flags=ysql_yb_enable_cbo=legacy_mode/off | false (default) | legacy_mode/off | | --tserver_flags="ysql_pg_conf_csv={yb_enable_cbo=on/yes/true/1}" | true | on | | --tserver_flags="ysql_pg_conf_csv={yb_enaBle_cbo=off,yb_enAble_cbo=oN}" | true | on | | --tserver_flags="ysql_pg_conf_csv={yb_enable_cbo=legacy_mode}" | false (default) | legacy_mode | | --tserver_flags="ysql_yb_enable_cbo=on,ysql_pg_conf_csv={yb_enable_cbo=off}" | true | on | | --tserver_flags="ysql_yb_enable_cbo=off,ysql_pg_conf_csv={yb_enable_cbo=on}" | false (default) | off | | --tserver_flags='"ysql_yb_enable_cbo=on,ysql_yb_enable_cbo=legacy_bnl_mode,ysql_pg_conf_csv={yb_enable_cbo=off,yb_enable_cbo=on}" | false (default) | legacy_bnl_mode Test script ``` rm /tmp/out; cat /tmp/examples | while read flags; do echo $flags >> /tmp/out; bin/yugabyted stop && sleep 1 && bin/yugabyted start --listen 127.0.0.3 --tserver_flags="$flags" && grep 'server_main_util.*enable_auto_analyze' ~/var/logs/tserver/yb-tserver.INFO | tee -a /tmp/out && sleep 15 && bin/ysqlsh -h 127.0.0.3 -c "show yb_enable_cbo" | tee -a /tmp/out ; echo "---" >> /tmp/out; done ``` ``` $ cat /tmp/examples ysql_yb_enable_cbo=on ysql_yb_enable_cbo=legacy_mode ysql_pg_conf_csv={yb_enable_cbo=TRue} ysql_pg_conf_csv={yb_enable_cbo=on} ysql_pg_conf_csv={yb_enaBle_cbo=off,yb_enAble_cbo=oN} ysql_pg_conf_csv={yb_enable_cbo=legacy_mode} ysql_yb_enable_cbo=on,ysql_pg_conf_csv={yb_enable_cbo=off} ysql_yb_enable_cbo=off,ysql_pg_conf_csv={yb_enable_cbo=on} ysql_yb_enable_cbo=on,ysql_yb_enable_cbo=legacy_bnl_mode,ysql_pg_conf_csv={yb_enable_cbo=off,yb_enable_cbo=on} ``` Close: #28944 Reviewers: yguan, mtakahara, gkukreja, sanketh Reviewed By: yguan, mtakahara Subscribers: jason, hsunder, yql, ybase Differential Revision: https://phorge.dev.yugabyte.com/D47676
1 parent be597c3 commit 42335ee

File tree

6 files changed

+158
-61
lines changed

6 files changed

+158
-61
lines changed

java/yb-pgsql/src/test/java/org/yb/pgsql/TestPgRegressThirdPartyExtensionsHypopg.java

Lines changed: 16 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -22,6 +22,8 @@
2222
import java.util.Collections;
2323
import java.util.Map;
2424

25+
import static org.yb.AssertionWrappers.*;
26+
2527
@RunWith(value=YBTestRunner.class)
2628
public class TestPgRegressThirdPartyExtensionsHypopg extends BasePgRegressTest {
2729
@Override
@@ -40,6 +42,13 @@ public void schedule_old_cost_model() throws Exception {
4042
Map<String, String> flagMap = super.getTServerFlags();
4143
flagMap.put("ysql_pg_conf_csv", "yb_enable_cbo=OFF");
4244
restartClusterWithFlags(Collections.emptyMap(), flagMap);
45+
46+
String enable_auto_analyze = miniCluster.getClient().getFlag(
47+
miniCluster.getTabletServers().keySet().iterator().next(),
48+
"ysql_enable_auto_analyze");
49+
assertTrue("ysql_enable_auto_analyze should remain at default value false",
50+
enable_auto_analyze.equals("false"));
51+
4352
runPgRegressTest(new File(TestUtils.getBuildRootDir(),
4453
"postgres_build/third-party-extensions/hypopg"),
4554
"yb_old_cost_model_schedule");
@@ -50,6 +59,13 @@ public void schedule() throws Exception {
5059
Map<String, String> flagMap = super.getTServerFlags();
5160
flagMap.put("ysql_pg_conf_csv", "yb_enable_cbo=ON");
5261
restartClusterWithFlags(Collections.emptyMap(), flagMap);
62+
63+
String enable_auto_analyze = miniCluster.getClient().getFlag(
64+
miniCluster.getTabletServers().keySet().iterator().next(),
65+
"ysql_enable_auto_analyze");
66+
assertTrue("ysql_enable_auto_analyze should be auto set to true",
67+
enable_auto_analyze.equals("true"));
68+
5369
runPgRegressTest(new File(TestUtils.getBuildRootDir(),
5470
"postgres_build/third-party-extensions/hypopg"),
5571
"yb_schedule");

src/yb/tserver/server_main_util.cc

Lines changed: 98 additions & 26 deletions
Original file line numberDiff line numberDiff line change
@@ -13,7 +13,10 @@
1313

1414
#include "yb/tserver/server_main_util.h"
1515

16+
#include <algorithm>
1617
#include <iostream>
18+
#include <boost/algorithm/string/trim.hpp>
19+
#include "yb/util/string_case.h"
1720

1821
#if YB_ABSL_ENABLED
1922
#include "absl/debugging/symbolize.h"
@@ -31,6 +34,7 @@
3134
#include "yb/util/debug/trace_event.h"
3235
#include "yb/util/flags.h"
3336
#include "yb/util/mem_tracker.h"
37+
#include "yb/util/pg_util.h"
3438
#include "yb/util/size_literals.h"
3539
#include "yb/util/status.h"
3640

@@ -49,6 +53,8 @@ DECLARE_int64(memory_limit_hard_bytes);
4953
DECLARE_bool(ysql_enable_auto_analyze);
5054
DECLARE_bool(ysql_enable_auto_analyze_service);
5155
DECLARE_bool(ysql_enable_table_mutation_counter);
56+
DECLARE_string(ysql_pg_conf_csv);
57+
DECLARE_string(ysql_yb_enable_cbo);
5258

5359
namespace yb {
5460

@@ -125,40 +131,106 @@ void AdjustMemoryLimitsIfNeeded(bool is_master) {
125131
AdjustMemoryLimits(values);
126132
}
127133

128-
// If auto analyze service is used based on old deprecated flags,
129-
// adjust new flags' values based on the deprecated flags to make auto analyze
130-
// service behaves consistently.
131-
// When using deprecated flags, only setting both FLAGS_ysql_enable_auto_analyze
132-
// and FLAGS_ysql_enable_table_mutation_counter makes auto analyze service runable,
133-
// so ysql_enable_auto_analyze is set in the following precedence order
134-
// 1. Any explicit user set value for ysql_enable_auto_analyze.
135-
// 2. If not set explicitly,
136-
// 2.1 If ysql_enable_auto_analyze_service and ysql_enable_table_mutation_counter are set
137-
// explicitly. Use their values to set value for ysql_enable_auto_analyze.
138-
// ysql_enable_auto_analyze is true only if both ysql_enable_auto_analyze_service and
139-
// ysql_enable_table_mutation_counter are true.
140-
// 2.2 If not both of ysql_enable_auto_analyze_service and
141-
// ysql_enable_table_mutation_counter are set explicity,
142-
// use ysql_enable_auto_analyze default value.
134+
// Return the lower case value of a GUC set in a CSV string (FLAGS_ysql_pg_conf_csv),
135+
// accounting for the following
136+
// 1. GUC keys and values are case insensitive
137+
// 2. A GUC can be repeated multiple times, the final value is what matters
138+
// (consistent with WritePostgresConfig)
139+
std::optional<std::string> FindGUCValue(
140+
const std::string& guc_key, const std::vector<std::string>& user_configs) {
141+
std::optional<std::string> result;
142+
for (const auto& config : user_configs) {
143+
const auto lower_case_config = yb::ToLowerCase(config);
144+
if (lower_case_config.find(guc_key) == std::string::npos) {
145+
continue;
146+
}
147+
const auto pos = lower_case_config.find('=');
148+
if (pos != std::string::npos) {
149+
std::string config_key = boost::trim_copy(lower_case_config.substr(0, pos));
150+
std::string config_value = boost::trim_copy(lower_case_config.substr(pos + 1));
151+
if (config_key == guc_key) {
152+
result = config_value;
153+
}
154+
}
155+
}
156+
return result;
157+
}
158+
159+
// Auto analyze flag can have different defaults based on other flags like
160+
// older (deprecated) auto analyze flags or the cbo flags. In any case, an explict
161+
// user set flag value should be respected as-is.
143162
void AdjustAutoAnalyzeFlagIfNeeded() {
144-
// Check if the flag is explictly set.
145-
if (!google::GetCommandLineFlagInfoOrDie("ysql_enable_auto_analyze").is_default)
163+
// If the auto analyze flag is explictly set, do not change it.
164+
if (!google::GetCommandLineFlagInfoOrDie("ysql_enable_auto_analyze").is_default) {
165+
LOG(INFO) << "Flag: ysql_enable_auto_analyze has been set to " << FLAGS_ysql_enable_auto_analyze
166+
<< " by explicit config in command line";
146167
return;
147-
bool old_value_of_new_flag = FLAGS_ysql_enable_auto_analyze;
148-
// check if the old flags are enabled.
149-
bool auto_analyze_enabled_using_old_flags
150-
= FLAGS_ysql_enable_auto_analyze_service && FLAGS_ysql_enable_table_mutation_counter;
168+
}
169+
170+
// When we change ysql_yb_enable_cbo to default on, we need to update the logic here, so adding
171+
// a DCHECK to catch this situation.
172+
DCHECK(
173+
!(google::GetCommandLineFlagInfoOrDie("ysql_yb_enable_cbo").is_default &&
174+
FLAGS_ysql_yb_enable_cbo == "on"));
175+
176+
// If ysql_yb_enable_cbo was explicitly set by user, then ysql_enable_auto_analyze flag is set to
177+
// true whenever it is on. The reasoning is that CBO in legacy mode
178+
// can have more unpredictable behavior when ANALYZE is run, so we don't want to run ANALYZE
179+
// automatically.
180+
bool cbo_flag_on = false;
181+
std::string cbo_reason = "";
182+
const std::vector<std::string> cbo_on_values = {"on", "true", "1", "yes"};
183+
if (!google::GetCommandLineFlagInfoOrDie("ysql_yb_enable_cbo").is_default) {
184+
if (std::find(
185+
cbo_on_values.begin(), cbo_on_values.end(),
186+
yb::ToLowerCase(FLAGS_ysql_yb_enable_cbo)) != cbo_on_values.end()) {
187+
cbo_flag_on = true;
188+
cbo_reason = "ysql_yb_enable_cbo flag was explicitly set by user to '" +
189+
FLAGS_ysql_yb_enable_cbo + "'";
190+
}
191+
} else {
192+
// If ysql_pg_conf_csv contains yb_enable_cbo=on, then ysql_enable_auto_analyze is set to true.
193+
// Note that ysql_pg_conf_csv has lower precedence than user-set ysql_yb_enable_cbo.
194+
std::vector<std::string> user_configs;
195+
std::optional<std::string> cbo_guc_value;
196+
if (!FLAGS_ysql_pg_conf_csv.empty() &&
197+
ReadCSVValues(FLAGS_ysql_pg_conf_csv, &user_configs).ok() &&
198+
(cbo_guc_value = FindGUCValue("yb_enable_cbo", user_configs)).has_value() &&
199+
std::find(cbo_on_values.begin(), cbo_on_values.end(), cbo_guc_value.value()) !=
200+
cbo_on_values.end()) {
201+
cbo_flag_on = true;
202+
cbo_reason = "yb_enable_cbo was set to " + cbo_guc_value.value() + " in ysql_pg_conf_csv";
203+
}
204+
}
205+
206+
if (cbo_flag_on) {
207+
google::SetCommandLineOptionWithMode(
208+
"ysql_enable_auto_analyze", "true", google::SET_FLAG_IF_DEFAULT);
209+
LOG(INFO) << "Flag: ysql_yb_enable_auto_analyze was auto-set to "
210+
<< FLAGS_ysql_enable_auto_analyze << " because " << cbo_reason;
211+
return;
212+
}
213+
214+
// If ysql_enable_auto_analyze_service and ysql_enable_table_mutation_counter are set
215+
// explicitly, use their values to set value for ysql_enable_auto_analyze.
216+
// ysql_enable_auto_analyze is true only if both ysql_enable_auto_analyze_service and
217+
// ysql_enable_table_mutation_counter are true.
218+
const bool old_value_of_new_flag = FLAGS_ysql_enable_auto_analyze;
219+
const bool auto_analyze_enabled_using_old_flags =
220+
FLAGS_ysql_enable_auto_analyze_service && FLAGS_ysql_enable_table_mutation_counter;
151221
if (auto_analyze_enabled_using_old_flags) {
152-
// set new flag default.
153222
google::SetCommandLineOptionWithMode("ysql_enable_auto_analyze", "true",
154223
google::SET_FLAG_IF_DEFAULT);
155224

156-
LOG(INFO) << "Flag: ysql_enable_auto_analyze was auto-set to "
157-
<< FLAGS_ysql_enable_auto_analyze
225+
LOG(INFO) << "Flag: ysql_enable_auto_analyze was auto-set to " << FLAGS_ysql_enable_auto_analyze
158226
<< " from its default value of " << old_value_of_new_flag
159-
<< ". Flags: ysql_enable_auto_analyze_service and "
160-
<< "ysql_enable_table_mutation_counter are deprecated.";
227+
<< "because it was enabled by the following deprecated flags"
228+
<< " - ysql_enable_auto_analyze_service and ysql_enable_table_mutation_counter.";
229+
return;
161230
}
231+
232+
LOG(INFO) << "Not changing ysql_enable_auto_analyze from its default value "
233+
<< FLAGS_ysql_enable_auto_analyze;
162234
}
163235

164236
} // anonymous namespace

src/yb/tserver/tserver_flags.cc

Lines changed: 2 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -27,6 +27,7 @@ DEFINE_UNKNOWN_uint64(tserver_master_replication_factor, 0,
2727
"name and port through tserver_master_addrs for masters auto-discovery when running on "
2828
"Kubernetes.");
2929

30-
DEFINE_RUNTIME_bool(ysql_enable_auto_analyze, true,
30+
DEFINE_RUNTIME_bool(
31+
ysql_enable_auto_analyze, false,
3132
"Enable Auto Analyze to automatically trigger ANALYZE for updating table statistics of tables "
3233
"which have changed more than a configurable threshold.");

src/yb/util/pg_util.cc

Lines changed: 37 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -14,7 +14,10 @@
1414
#include "yb/util/pg_util.h"
1515

1616
#include <algorithm>
17+
18+
#include <regex>
1719
#include <string>
20+
#include <boost/algorithm/string/replace.hpp>
1821

1922
#include "yb/util/logging.h"
2023

@@ -111,4 +114,38 @@ std::string PgDeriveSocketLockFile(const HostPort& host_port) {
111114
PgDeriveSocketDir(host_port), Format(".s.PGSQL.$0.lock", host_port.port()));
112115
}
113116

117+
Status ReadCSVValues(const std::string& csv, std::vector<std::string>* lines) {
118+
// Function reads CSV string in the following format:
119+
// - fields are divided with comma (,)
120+
// - fields with comma (,) or double-quote (") are quoted with double-quote (")
121+
// - pair of double-quote ("") in quoted field represents single double-quote (")
122+
//
123+
// Examples:
124+
// 1,"two, 2","three ""3""", four , -> ['1', 'two, 2', 'three "3"', ' four ', '']
125+
// 1,"two -> Malformed CSV (quoted field 'two' is not closed)
126+
// 1, "two" -> Malformed CSV (quoted field 'two' has leading spaces)
127+
// 1,two "2" -> Malformed CSV (field with " must be quoted)
128+
// 1,"tw"o" -> Malformed CSV (no separator after quoted field 'tw')
129+
130+
const std::regex exp(R"(^(?:([^,"]+)|(?:"((?:[^"]|(?:""))*)\"))(?:(?:,)|(?:$)))");
131+
auto i = csv.begin();
132+
const auto end = csv.end();
133+
std::smatch match;
134+
while (i != end && std::regex_search(i, end, match, exp)) {
135+
// Replace pair of double-quote ("") with single double-quote (") in quoted field.
136+
if (match[2].length() > 0) {
137+
lines->emplace_back(match[2].first, match[2].second);
138+
boost::algorithm::replace_all(lines->back(), "\"\"", "\"");
139+
} else {
140+
lines->emplace_back(match[1].first, match[1].second);
141+
}
142+
i += match.length();
143+
}
144+
SCHECK(i == end, InvalidArgument, Format("Malformed CSV '$0'", csv));
145+
if (!csv.empty() && csv.back() == ',') {
146+
lines->emplace_back();
147+
}
148+
return Status::OK();
149+
}
150+
114151
} // namespace yb

src/yb/util/pg_util.h

Lines changed: 5 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -14,13 +14,18 @@
1414
#pragma once
1515

1616
#include <string>
17+
#include <vector>
1718

1819
#include "yb/util/net/net_util.h"
1920

2021
namespace yb {
2122

23+
class Status;
24+
2225
std::string PgDeriveSocketDir(const HostPort& host_port);
2326

2427
std::string PgDeriveSocketLockFile(const HostPort& host_port);
2528

29+
Status ReadCSVValues(const std::string& csv, std::vector<std::string>* lines);
30+
2631
} // namespace yb

src/yb/yql/pgwrapper/pg_wrapper.cc

Lines changed: 0 additions & 34 deletions
Original file line numberDiff line numberDiff line change
@@ -475,40 +475,6 @@ void MergeSharedPreloadLibraries(const string& src, vector<string>* defaults) {
475475
defaults->insert(defaults->end(), new_items.begin(), new_items.end());
476476
}
477477

478-
Status ReadCSVValues(const string& csv, vector<string>* lines) {
479-
// Function reads CSV string in the following format:
480-
// - fields are divided with comma (,)
481-
// - fields with comma (,) or double-quote (") are quoted with double-quote (")
482-
// - pair of double-quote ("") in quoted field represents single double-quote (")
483-
//
484-
// Examples:
485-
// 1,"two, 2","three ""3""", four , -> ['1', 'two, 2', 'three "3"', ' four ', '']
486-
// 1,"two -> Malformed CSV (quoted field 'two' is not closed)
487-
// 1, "two" -> Malformed CSV (quoted field 'two' has leading spaces)
488-
// 1,two "2" -> Malformed CSV (field with " must be quoted)
489-
// 1,"tw"o" -> Malformed CSV (no separator after quoted field 'tw')
490-
491-
const std::regex exp(R"(^(?:([^,"]+)|(?:"((?:[^"]|(?:""))*)\"))(?:(?:,)|(?:$)))");
492-
auto i = csv.begin();
493-
const auto end = csv.end();
494-
std::smatch match;
495-
while (i != end && std::regex_search(i, end, match, exp)) {
496-
// Replace pair of double-quote ("") with single double-quote (") in quoted field.
497-
if (match[2].length() > 0) {
498-
lines->emplace_back(match[2].first, match[2].second);
499-
boost::algorithm::replace_all(lines->back(), "\"\"", "\"");
500-
} else {
501-
lines->emplace_back(match[1].first, match[1].second);
502-
}
503-
i += match.length();
504-
}
505-
SCHECK(i == end, InvalidArgument, Format("Malformed CSV '$0'", csv));
506-
if (!csv.empty() && csv.back() == ',') {
507-
lines->emplace_back();
508-
}
509-
return Status::OK();
510-
}
511-
512478
// Make sure that the parameter values do not contain '\n' since each line is a separate parameter.
513479
Status ValidateConfValuesBasic(const vector<string>& lines) {
514480
for (const string& parameter : lines) {

0 commit comments

Comments
 (0)