Commit 31fe915

mon/MgrStatMonitor: ignore duration for which feature is off
When the availability tracking feature is disabled, we should not update the score. We should start recalculating the score when the user enables the feature again. Essentially, for the purpose of calculating the score, we need to ignore the duration for which the feature was turned off.

The score is calculated from the uptime and downtime durations recorded in the `pool_availability` object. These durations are updated in `calc_pool_availability` by adding the difference between last_uptime/last_downtime and now.

To discard the duration for which the feature was turned off, we need to offset the uptime/downtime by this duration. A simple way to do this is to set last_uptime and last_downtime to the timestamp at which the feature is toggled on again.

To implement this, we record the time at which the feature is toggled from off to on. When `calc_pool_availability` is invoked, if a reset is required, it resets last_uptime and last_downtime before proceeding with the availability calculations.

We only care about the transition from off to on. All other changes to the config option have no effect on the score.

Fixes: https://tracker.ceph.com/issues/71494
Signed-off-by: Shraddha Agrawal <[email protected]>
(cherry picked from commit d81d2af)
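The arithmetic behind the reset can be illustrated with a small standalone model. This is only a sketch under simplifying assumptions: `PoolAvail`, `accumulate`, and `reset_on_enable` are hypothetical names, times are plain integer seconds rather than `utime_t`, and both timestamps are advanced on every sample, which may differ from what `calc_pool_availability` actually does.

```cpp
#include <cassert>
#include <cstdint>

// Simplified stand-in for one pool's availability record
// (hypothetical; not Ceph's actual struct).
struct PoolAvail {
  std::int64_t uptime = 0;
  std::int64_t downtime = 0;
  std::int64_t last_uptime = 0;
  std::int64_t last_downtime = 0;
};

// Add the time elapsed since the previous sample to uptime or downtime,
// then advance both timestamps to `now`.
inline void accumulate(PoolAvail& p, bool available, std::int64_t now) {
  assert(now >= p.last_uptime && now >= p.last_downtime);
  if (available) {
    p.uptime += now - p.last_uptime;
  } else {
    p.downtime += now - p.last_downtime;
  }
  p.last_uptime = now;
  p.last_downtime = now;
}

// Move both timestamps to the moment tracking was re-enabled, so the
// interval spent with the feature off is never added to either counter.
inline void reset_on_enable(PoolAvail& p, std::int64_t enabled_at) {
  p.last_uptime = enabled_at;
  p.last_downtime = enabled_at;
}

// Feature is on for [0, 100], off for (100, 400), on again from 400.
inline std::int64_t uptime_with_reset() {
  PoolAvail p;
  accumulate(p, true, 100);   // uptime = 100
  reset_on_enable(p, 400);    // discard the 300 s spent with tracking off
  accumulate(p, true, 450);   // uptime = 100 + 50
  return p.uptime;
}

// Same timeline without the reset: the off interval leaks into uptime.
inline std::int64_t uptime_without_reset() {
  PoolAvail p;
  accumulate(p, true, 100);   // uptime = 100
  accumulate(p, true, 450);   // uptime = 100 + 350
  return p.uptime;
}
```

In this model the reset yields 150 seconds of uptime, while skipping it yields 450 seconds, 300 of which were never observed.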
1 parent ea06d79 commit 31fe915

2 files changed: 39 additions, 4 deletions

src/mon/MgrStatMonitor.cc

Lines changed: 34 additions & 3 deletions
@@ -71,8 +71,24 @@ void MgrStatMonitor::handle_conf_change(
     const ConfigProxy& conf,
     const std::set<std::string>& changed)
 {
-  // implement changes here
-  dout(10) << __func__ << " enable_availability_tracking config option is changed." << dendl;
+  if (changed.count("enable_availability_tracking")) {
+    std::scoped_lock l(lock);
+    bool oldval = enable_availability_tracking;
+    bool newval = g_conf().get_val<bool>("enable_availability_tracking");
+    dout(10) << __func__ << " enable_availability_tracking config option is changed from "
+             << oldval << " to " << newval
+             << dendl;
+
+    // if feature is toggled from off to on,
+    // store the new value of last_uptime and last_downtime
+    // (to be updated in calc_pool_availability)
+    if (newval > oldval) {
+      reset_availability_last_uptime_downtime_val = ceph_clock_now();
+      dout(10) << __func__ << " reset_availability_last_uptime_downtime_val "
+               << reset_availability_last_uptime_downtime_val << dendl;
+    }
+    enable_availability_tracking = newval;
+  }
 }
 
 void MgrStatMonitor::create_initial()
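
The `newval > oldval` test in the hunk above relies on `bool` promoting to `int` (false is 0, true is 1), so it holds only for the off-to-on transition. A minimal sketch of that equivalence follows; `toggled_on`, `toggled_on_explicit`, and `check_forms_agree` are hypothetical helper names, not part of the commit.

```cpp
#include <cassert>
#include <initializer_list>

// The comparison used in the diff: true only when going from off to on.
constexpr bool toggled_on(bool oldval, bool newval) {
  return newval > oldval;
}

// The same condition spelled out explicitly.
constexpr bool toggled_on_explicit(bool oldval, bool newval) {
  return newval && !oldval;
}

static_assert(toggled_on(false, true), "off -> on is the only toggle that counts");
static_assert(!toggled_on(true, false), "on -> off is ignored");
static_assert(!toggled_on(true, true), "on -> on is ignored");
static_assert(!toggled_on(false, false), "off -> off is ignored");

// Exhaustively check that both spellings agree on all four combinations.
inline void check_forms_agree() {
  for (bool oldval : {false, true}) {
    for (bool newval : {false, true}) {
      assert(toggled_on(oldval, newval) == toggled_on_explicit(oldval, newval));
    }
  }
}
```

Either spelling behaves the same; the explicit form may be easier to read at a glance.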
@@ -88,14 +104,29 @@ void MgrStatMonitor::create_initial()
 void MgrStatMonitor::calc_pool_availability()
 {
   dout(20) << __func__ << dendl;
+  std::scoped_lock l(lock);
 
   // if feature is disabled by user, do not update the uptime
   // and downtime, exit early
-  if (!g_conf().get_val<bool>("enable_availability_tracking")) {
+  if (!enable_availability_tracking) {
     dout(20) << __func__ << " tracking availability score is disabled" << dendl;
     return;
   }
 
+  // if reset_availability_last_uptime_downtime_val holds a value,
+  // update last_uptime and last_downtime for all pools to the
+  // recorded value
+  if (reset_availability_last_uptime_downtime_val.has_value()) {
+    for (const auto& i : pool_availability) {
+      const auto& poolid = i.first;
+      pool_availability[poolid].last_downtime = reset_availability_last_uptime_downtime_val.value();
+      pool_availability[poolid].last_uptime = reset_availability_last_uptime_downtime_val.value();
+    }
+    dout(20) << __func__ << " reset last_uptime and last_downtime to "
+             << reset_availability_last_uptime_downtime_val << dendl;
+    reset_availability_last_uptime_downtime_val.reset();
+  }
+
   auto pool_avail_end = pool_availability.end();
   for (const auto& i : digest.pool_pg_unavailable_map) {
     const auto& poolid = i.first;
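
The interplay between the config observer and the calculation can be sketched outside Ceph as a one-shot pending reset held in a `std::optional` and guarded by a mutex. The class below is an illustrative model only: `Tracker`, `Stamps`, and all member names are hypothetical, `std::mutex` stands in for `ceph::mutex`, and integer seconds stand in for `utime_t`.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <mutex>
#include <optional>

// Hypothetical per-pool timestamps (integer seconds).
struct Stamps {
  std::int64_t last_uptime = 0;
  std::int64_t last_downtime = 0;
};

class Tracker {
public:
  explicit Tracker(bool enabled) : enabled_(enabled) {}

  void add_pool(int id) {
    std::scoped_lock l(lock_);
    pools_[id] = Stamps{};
  }

  // Config observer: remember the enable time only on an off -> on toggle.
  void on_config_change(bool newval, std::int64_t now) {
    std::scoped_lock l(lock_);
    if (newval && !enabled_) {
      pending_reset_ = now;
    }
    enabled_ = newval;
  }

  // Periodic calculation: exit early while disabled, otherwise apply a
  // pending reset exactly once. Returns true if the calculation ran.
  bool calc(std::int64_t now) {
    std::scoped_lock l(lock_);
    if (!enabled_) {
      return false;
    }
    if (pending_reset_.has_value()) {
      assert(*pending_reset_ <= now);
      for (auto& kv : pools_) {
        kv.second.last_uptime = *pending_reset_;
        kv.second.last_downtime = *pending_reset_;
      }
      pending_reset_.reset();  // consume the request
    }
    // ... the real availability accounting would continue here ...
    return true;
  }

  bool has_pending_reset() const {
    std::scoped_lock l(lock_);
    return pending_reset_.has_value();
  }

  Stamps pool(int id) const {
    std::scoped_lock l(lock_);
    return pools_.at(id);
  }

private:
  mutable std::mutex lock_;
  bool enabled_;
  std::optional<std::int64_t> pending_reset_;
  std::map<int, Stamps> pools_;
};
```

Using an empty `std::optional` to mean "no reset requested" avoids reserving a sentinel timestamp, and taking the same lock in both methods keeps the flag and the recorded time consistent when the observer and the calculation run on different threads.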

src/mon/MgrStatMonitor.h

Lines changed: 5 additions & 1 deletion
@@ -28,6 +28,8 @@
   MgrStatMonitor(Monitor &mn, Paxos &p, const std::string& service_name);
   ~MgrStatMonitor() override;
 
+  ceph::mutex lock = ceph::make_mutex("MgrStatMonitor::lock");
+
   void init() override {}
   void on_shutdown() override {}
 
@@ -53,7 +55,9 @@
   bool preprocess_statfs(MonOpRequestRef op);
 
   void calc_pool_availability();
-
+  bool enable_availability_tracking = g_conf().get_val<bool>("enable_availability_tracking"); ///< tracking availability score feature
+  std::optional<utime_t> reset_availability_last_uptime_downtime_val;
+
   void check_sub(Subscription *sub);
   void check_subs();
   void send_digests();
