Skip to content

Commit f234df0

Browse files
pks-tgitster
authored andcommitted
reftable/stack: handle locked tables during auto-compaction
When compacting tables, it may happen that we want to compact a set of tables which are already locked by a concurrent process that compacts them. In the case where we wanted to perform a full compaction of all tables it is sensible to bail out in this case, as we cannot fulfill the requested action. But when performing auto-compaction it isn't necessarily in our best interest of us to abort the whole operation. For example, due to the geometric compacting schema that we use, it may be that process A takes a lot of time to compact the bulk of all tables whereas process B appends a bunch of new tables to the stack. B would in this case also notice that it has to compact the tables that process A is compacting already and thus also try to compact the same range, probably including the new tables it has appended. But because those tables are locked already, it will fail and thus abort the complete auto-compaction. The consequence is that the stack will grow longer and longer while A isn't yet done with compaction, which will lead to a growing performance impact. Instead of aborting auto-compaction altogether, let's gracefully handle this situation by instead compacting tables which aren't locked. To do so, instead of locking from the beginning of the slice-to-be-compacted, we start locking tables from the end of the slice. Once we hit the first table that is locked already, we abort. If we succeeded to lock two or more tables, then we simply reduce the slice of tables that we're about to compact to those which we managed to lock. This ensures that we can at least make some progress for compaction in said scenario. It also helps in other scenarios, like for example when a process died and left a stale lockfile behind. In such a case we can at least ensure some compaction on a best-effort basis. Signed-off-by: Patrick Steinhardt <[email protected]> Signed-off-by: Junio C Hamano <[email protected]>
1 parent ed1ad6b commit f234df0

File tree

3 files changed

+70
-22
lines changed

3 files changed

+70
-22
lines changed

reftable/stack.c

Lines changed: 49 additions & 10 deletions
Original file line numberDiff line numberDiff line change
@@ -999,6 +999,15 @@ static int stack_write_compact(struct reftable_stack *st,
999999
return err;
10001000
}
10011001

1002+
enum stack_compact_range_flags {
1003+
/*
1004+
* Perform a best-effort compaction. That is, even if we cannot lock
1005+
* all tables in the specified range, we will try to compact the
1006+
* remaining slice.
1007+
*/
1008+
STACK_COMPACT_RANGE_BEST_EFFORT = (1 << 0),
1009+
};
1010+
10021011
/*
10031012
* Compact all tables in the range `[first, last)` into a single new table.
10041013
*
@@ -1010,7 +1019,8 @@ static int stack_write_compact(struct reftable_stack *st,
10101019
*/
10111020
static int stack_compact_range(struct reftable_stack *st,
10121021
size_t first, size_t last,
1013-
struct reftable_log_expiry_config *expiry)
1022+
struct reftable_log_expiry_config *expiry,
1023+
unsigned int flags)
10141024
{
10151025
struct strbuf tables_list_buf = STRBUF_INIT;
10161026
struct strbuf new_table_name = STRBUF_INIT;
@@ -1052,19 +1062,47 @@ static int stack_compact_range(struct reftable_stack *st,
10521062
/*
10531063
* Lock all tables in the user-provided range. This is the slice of our
10541064
* stack which we'll compact.
1065+
*
1066+
* Note that we lock tables in reverse order from last to first. The
1067+
* intent behind this is to allow a newer process to perform best
1068+
* effort compaction of tables that it has added in the case where an
1069+
* older process is still busy compacting tables which are preexisting
1070+
* from the point of view of the newer process.
10551071
*/
10561072
REFTABLE_CALLOC_ARRAY(table_locks, last - first + 1);
1057-
for (i = first; i <= last; i++) {
1058-
stack_filename(&table_name, st, reader_name(st->readers[i]));
1073+
for (i = last + 1; i > first; i--) {
1074+
stack_filename(&table_name, st, reader_name(st->readers[i - 1]));
10591075

10601076
err = hold_lock_file_for_update(&table_locks[nlocks],
10611077
table_name.buf, LOCK_NO_DEREF);
10621078
if (err < 0) {
1063-
if (errno == EEXIST)
1079+
/*
1080+
* When the table is locked already we may do a
1081+
* best-effort compaction and compact only the tables
1082+
* that we have managed to lock so far. This of course
1083+
* requires that we have been able to lock at least two
1084+
* tables, otherwise there would be nothing to compact.
1085+
* In that case, we return a lock error to our caller.
1086+
*/
1087+
if (errno == EEXIST && last - (i - 1) >= 2 &&
1088+
flags & STACK_COMPACT_RANGE_BEST_EFFORT) {
1089+
err = 0;
1090+
/*
1091+
* The subtraction is to offset the index, the
1092+
* addition is to only compact up to the table
1093+
* of the preceding iteration. They obviously
1094+
* cancel each other out, but that may be
1095+
* non-obvious when it was omitted.
1096+
*/
1097+
first = (i - 1) + 1;
1098+
break;
1099+
} else if (errno == EEXIST) {
10641100
err = REFTABLE_LOCK_ERROR;
1065-
else
1101+
goto done;
1102+
} else {
10661103
err = REFTABLE_IO_ERROR;
1067-
goto done;
1104+
goto done;
1105+
}
10681106
}
10691107

10701108
/*
@@ -1308,9 +1346,10 @@ static int stack_compact_range(struct reftable_stack *st,
13081346

13091347
static int stack_compact_range_stats(struct reftable_stack *st,
13101348
size_t first, size_t last,
1311-
struct reftable_log_expiry_config *config)
1349+
struct reftable_log_expiry_config *config,
1350+
unsigned int flags)
13121351
{
1313-
int err = stack_compact_range(st, first, last, config);
1352+
int err = stack_compact_range(st, first, last, config, flags);
13141353
if (err == REFTABLE_LOCK_ERROR)
13151354
st->stats.failures++;
13161355
return err;
@@ -1320,7 +1359,7 @@ int reftable_stack_compact_all(struct reftable_stack *st,
13201359
struct reftable_log_expiry_config *config)
13211360
{
13221361
size_t last = st->merged->stack_len ? st->merged->stack_len - 1 : 0;
1323-
return stack_compact_range_stats(st, 0, last, config);
1362+
return stack_compact_range_stats(st, 0, last, config, 0);
13241363
}
13251364

13261365
static int segment_size(struct segment *s)
@@ -1427,7 +1466,7 @@ int reftable_stack_auto_compact(struct reftable_stack *st)
14271466
reftable_free(sizes);
14281467
if (segment_size(&seg) > 0)
14291468
return stack_compact_range_stats(st, seg.start, seg.end - 1,
1430-
NULL);
1469+
NULL, STACK_COMPACT_RANGE_BEST_EFFORT);
14311470

14321471
return 0;
14331472
}

reftable/stack_test.c

Lines changed: 7 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -917,13 +917,15 @@ static void test_reftable_stack_auto_compaction_with_locked_tables(void)
917917
write_file_buf(buf.buf, "", 0);
918918

919919
/*
920-
* Ideally, we'd handle the situation where any of the tables is locked
921-
* gracefully. We don't (yet) do this though and thus fail.
920+
* When parts of the stack are locked, then auto-compaction does a best
921+
* effort compaction of those tables which aren't locked. So while this
922+
* would in theory compact all tables, due to the preexisting lock we
923+
* only compact the newest two tables.
922924
*/
923925
err = reftable_stack_auto_compact(st);
924-
EXPECT(err == REFTABLE_LOCK_ERROR);
925-
EXPECT(st->stats.failures == 1);
926-
EXPECT(st->merged->stack_len == 5);
926+
EXPECT_ERR(err);
927+
EXPECT(st->stats.failures == 0);
928+
EXPECT(st->merged->stack_len == 4);
927929

928930
reftable_stack_destroy(st);
929931
strbuf_release(&buf);

t/t0610-reftable-basics.sh

Lines changed: 14 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -478,19 +478,26 @@ test_expect_success "$command: auto compaction" '
478478
479479
test_oid blob17_2 | git hash-object -w --stdin &&
480480
481-
# Lock all tables write some refs. Auto-compaction will be
482-
# unable to compact tables and thus fails gracefully, leaving
483-
# the stack in a sub-optimal state.
484-
ls .git/reftable/*.ref |
481+
# Lock all tables, write some refs. Auto-compaction will be
482+
# unable to compact tables and thus fails gracefully,
483+
# compacting only those tables which are not locked.
484+
ls .git/reftable/*.ref | sort |
485485
while read table
486486
do
487-
touch "$table.lock" || exit 1
487+
touch "$table.lock" &&
488+
basename "$table" >>tables.expect || exit 1
488489
done &&
490+
test_line_count = 2 .git/reftable/tables.list &&
489491
git branch B &&
490492
git branch C &&
491-
rm .git/reftable/*.lock &&
492-
test_line_count = 4 .git/reftable/tables.list &&
493493
494+
# The new tables are auto-compacted, but the locked tables are
495+
# left intact.
496+
test_line_count = 3 .git/reftable/tables.list &&
497+
head -n 2 .git/reftable/tables.list >tables.head &&
498+
test_cmp tables.expect tables.head &&
499+
500+
rm .git/reftable/*.lock &&
494501
git $command --auto &&
495502
test_line_count = 1 .git/reftable/tables.list
496503
)

0 commit comments

Comments
 (0)