Commit 645cdf9

bill-scales authored and aainscow committed
osd: Optimized EC don't try to trim past crt
If there is an exceptionally long sequence of partial writes that did not update a shard, followed by a full write, then the log trim point can be ahead of the previous write to the shard (and hence ahead of crt, the can-rollback-to version). We cannot trim beyond crt. In this scenario it's fine to limit the trim to crt, because the shard doesn't have any of the log entries for the partial writes, so there is nothing more to trim.

Signed-off-by: Bill Scales <[email protected]>
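The rule described in the commit message amounts to "never trim past crt when EC optimizations are enabled". A minimal sketch of that rule follows. `Version` is a simplified, hypothetical stand-in for Ceph's `eversion_t`, and `clamp_trim_to` is an illustrative helper name that does not appear in the commit.

```cpp
#include <cassert>
#include <cstdint>
#include <tuple>

// Hypothetical stand-in for eversion_t: ordered by epoch, then version.
struct Version {
  std::uint32_t epoch;
  std::uint64_t version;
};

inline bool operator<(const Version& a, const Version& b) {
  return std::tie(a.epoch, a.version) < std::tie(b.epoch, b.version);
}

inline bool operator==(const Version& a, const Version& b) {
  return std::tie(a.epoch, a.version) == std::tie(b.epoch, b.version);
}

// Illustrative helper (not part of the commit): limit the requested trim
// point to crt, but only when the pool allows EC optimizations.
inline Version clamp_trim_to(Version trim_to, Version crt,
                             bool ec_optimizations) {
  if (ec_optimizations && crt < trim_to) {
    return crt;  // the requested trim point is ahead of crt: stop at crt
  }
  return trim_to;  // otherwise the request is used unchanged
}
```

With crt at version 10 and a requested trim point of 15, the helper returns 10 when the optimization flag is set and 15 when it is not. A request at or below crt is passed through unchanged in both cases.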
1 parent d19f3a3 commit 645cdf9

1 file changed: 23 additions, 2 deletions

src/osd/PeeringState.cc

@@ -4549,8 +4549,16 @@ bool PeeringState::append_log_entries_update_missing(
 
   psdout(20) << "trim_to bool = " << bool(trim_to)
              << " trim_to = " << (trim_to ? *trim_to : eversion_t()) << dendl;
-  if (trim_to)
-    pg_log.trim(*trim_to, info);
+  if (trim_to) {
+    eversion_t trim = *trim_to;
+    if (pool.info.allows_ecoptimizations() &&
+        (trim > pg_log.get_can_rollback_to())) {
+      // An exceptionally long sequence of partial writes followed by a full
+      // write can result in trim_to being ahead of crt
+      trim = pg_log.get_can_rollback_to();
+    }
+    pg_log.trim(trim, info);
+  }
   dirty_info = true;
   write_if_dirty(t);
   return invalidate_stats;
@@ -4722,6 +4730,12 @@ void PeeringState::append_log(
   if (!transaction_applied || async)
     psdout(10) << pg_whoami
                << " is async_recovery or backfill target" << dendl;
+  if (pool.info.allows_ecoptimizations() &&
+      (trim_to > pg_log.get_can_rollback_to())) {
+    // An exceptionally long sequence of partial writes followed by a full
+    // write can result in trim_to being ahead of crt
+    trim_to = pg_log.get_can_rollback_to();
+  }
   pg_log.trim(trim_to, info, transaction_applied, async);
 
   // update the local pg, pg log
@@ -7028,6 +7042,13 @@ boost::statechart::result PeeringState::ReplicaActive::react(const MTrim& trim)
 {
   DECLARE_LOCALS;
   // primary is instructing us to trim
+  eversion_t trim_to = trim.trim_to;
+  if (ps->pool.info.allows_ecoptimizations() &&
+      (trim_to > ps->pg_log.get_can_rollback_to())) {
+    // An exceptionally long sequence of partial writes followed by a full
+    // write can result in trim_to being ahead of crt
+    trim_to = ps->pg_log.get_can_rollback_to();
+  }
   ps->pg_log.trim(trim.trim_to, ps->info);
   ps->dirty_info = true;
   return discard_event();
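
In the last hunk, the unchanged context line still passes `trim.trim_to` to `pg_log.trim()` rather than the clamped local `trim_to`. Whether that is intentional cannot be determined from this page. The sketch below shows the pattern the first two hunks follow, where the clamped value is the one handed to the trim call. `ToyLog`, `ToyTrimMessage` and `handle_trim` are hypothetical stand-ins, not Ceph types, and the assertion inside `ToyLog::trim` is an assumption that models "we cannot trim beyond crt".

```cpp
#include <cassert>
#include <cstdint>

using Version = std::uint64_t;  // simplified stand-in for eversion_t

// Toy log: remembers crt and how far it has been trimmed.
struct ToyLog {
  Version can_rollback_to = 0;
  Version tail = 0;

  void trim(Version to) {
    // Assumption for illustration: trimming past crt is a contract violation.
    assert(to <= can_rollback_to);
    if (to > tail) {
      tail = to;
    }
  }
};

// Toy version of the primary's trim instruction.
struct ToyTrimMessage {
  Version trim_to = 0;
};

// Clamp the requested trim point to crt, then trim with the clamped value.
inline void handle_trim(ToyLog& log, const ToyTrimMessage& msg,
                        bool ec_optimizations) {
  Version trim_to = msg.trim_to;
  if (ec_optimizations && trim_to > log.can_rollback_to) {
    trim_to = log.can_rollback_to;
  }
  log.trim(trim_to);  // pass the clamped local, not msg.trim_to
}
```

In this toy model, a message asking to trim to 15 on a log whose crt is 10 leaves the tail at 10. Passing `msg.trim_to` directly would trip the assertion instead.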
