Commit be02aa6
PS-9703 "Upstream 8.0.41 release does not fully fix PS-9144"
https://perconadev.atlassian.net/browse/PS-9703
Problem:
--------
ALTER TABLE which rebuilds InnoDB table using INPLACE algorithm might sometimes
lead to row loss if concurrent purge happens on the table being ALTERed.
Analysis:
---------
This issue was introduced in Upstream version 8.0.41 as unwanted side-effect
of fixes for bug#115608 (PS-9144), in which similar problem is observed but
in a different scenario, and bug#115511 (PS-9214). It was propagated to
Percona Server 8.0.41-32, in which we opted for reverting our versions
of fixes for PS-9144 and PS-9214 in favour of Upstream ones.
New implementation of parallel ALTER TABLE INPLACE in InnoDB was introduced in
MySQL 8.0.27. Its code is used for online table rebuild even in a single-thread
case.
This implementation iterates over all the rows in the table, in general case,
handling different subtrees of a B-tree in different threads. This iteration
over table rows needs to be paused, from time to time, to commit InnoDB MTR/
release page latches it holds. This is necessary to make way for concurrent
actions on the B-tree being scanned (this happens when switching to the next
page) or before flushing rows of the new version of the table from the
in-memory buffer to the B-tree. To resume iteration after such a pause, the
persistent cursor position saved before the pause is restored.
The problem described above occurs when we try to save and then restore
position of cursor pointing to page supremum, before switching to the next
page. In post-8.0.41 code this is done by simply calling
btr_pcur_t::store_position()/restore_position() methods for a cursor that
points to supremum. In 8.0.42-based code this is done in
PCursor::save_previous_user_record_as_last_processed() and
PCursor::restore_to_first_unprocessed() pair of methods.
However, this doesn't work correctly in a scenario where, after we have
saved the cursor position and then committed the mini-transaction/released
latches on the current page, the next page is merged into the current one
(after purge removes some records from it). In this case the cursor position
is still restored as pointing to page supremum, and thus rows which were
moved over by the merge are erroneously skipped.
***
Let us take a look at an example. Let us assume that we have two pages
p1 : [inf, a, b, c, sup] and the next one p2 : [inf, d, e, f, sup].
Our thread, which is one of the parallel ALTER TABLE worker threads,
has completed the scan of p1, so its cursor is positioned on the p1 : 'sup'
record. Now it needs to switch to page p2, but also make way for threads
concurrently updating the table. So it needs to make a cursor savepoint,
commit the mini-transaction and release the latches.
In post-8.0.41 code we simply do btr_pcur_t::store_position()/
restore_position() with the cursor positioned on the p1 : 'sup' record.
Then the following might happen: concurrent purge on page p2 might
delete some record from it (e.g. 'f') and decide to merge this page
into page p1.
If this happens while latches are released, this merge goes through
and results in page p1 with the following contents:
p1 : [inf, a, b, c, d, e, sup]. The savepoint for p1 : 'sup' won't
be invalidated (one can say that savepoints for sup and inf are not
safe against concurrent merges in this respect), and after restoration
of the cursor the iteration will continue on the next page, skipping
records 'd' and 'e'.
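
The scenario above can be imitated with a small toy model. This is only an
illustrative sketch and not InnoDB code: Page, Tree, example_leaves(),
purge_and_merge() and scan_with_supremum_savepoint() are hypothetical names,
pages hold only user records (inf/sup are implicit), and the "savepoint" is
reduced to remembering the page index.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

using Page = std::vector<std::string>;  // user records only, kept sorted
using Tree = std::vector<Page>;         // leaf pages, left to right

// Hypothetical helper: p1 = [a, b, c], p2 = [d, e, f] as in the example.
Tree example_leaves() {
  Tree leaves;
  leaves.push_back(Page{"a", "b", "c"});
  leaves.push_back(Page{"d", "e", "f"});
  return leaves;
}

// Toy purge: delete `victim` from page idx + 1, then merge the rest of that
// page into page idx and drop the emptied page from the leaf level.
void purge_and_merge(Tree &leaves, std::size_t idx, const std::string &victim) {
  Page &right = leaves[idx + 1];
  right.erase(std::remove(right.begin(), right.end(), victim), right.end());
  Page &left = leaves[idx];
  left.insert(left.end(), right.begin(), right.end());
  leaves.erase(leaves.begin() + static_cast<std::ptrdiff_t>(idx) + 1);
}

// Scan that saves "supremum of page 0" as its position before the pause.
std::vector<std::string> scan_with_supremum_savepoint(Tree leaves,
                                                      bool merge_happens) {
  std::vector<std::string> seen;
  std::size_t page = 0;
  for (const auto &rec : leaves[page]) seen.push_back(rec);

  // Cursor is on p1 : 'sup'. Latches are released here, so purge may run.
  if (merge_happens) purge_and_merge(leaves, page, "f");

  // Restore still yields "supremum of page 0", so we go to the next page.
  for (++page; page < leaves.size(); ++page)
    for (const auto &rec : leaves[page]) seen.push_back(rec);
  return seen;
}
```

Calling scan_with_supremum_savepoint(example_leaves(), true) in this model
returns only a, b and c: records 'd' and 'e' now live on p1 behind the
restored position and are never visited.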
***
Fix:
----
This patch solves the problem by working around the issue with saving/
restoring cursor pointing to supremum. Instead of storing position of
supremum record PCursor::save_previous_user_record_as_last_processed()
now stores the position of record that precedes it.
And then PCursor::restore_to_first_unprocessed() does the restore in two
steps: 1) it restores the position of this preceding record (or its closest
predecessor if it was purged meanwhile) and then 2) moves one step
forward, assuming that this will get to the supremum record at which the
cursor pointed originally. If this is not true, i.e. there is a user record
added to the page by the merge (or a simple concurrent insert), we
assume that this and the following records are unprocessed. The caller
of PCursor::restore_to_first_unprocessed() detects this situation
by checking whether the cursor is positioned on supremum and, if it is not,
handles it by resuming processing from the record under the cursor.
***
Let us return to the above example to explain how the fix works.
PCursor::save_previous_user_record_as_last_processed() does a step back
before calling btr_pcur_t::store_position(), so for a cursor positioned
on p1 : 'sup' it is actually the position corresponding to p1 : 'c' that
is saved. If the merge happens when latches are released, we still
get p1 : [inf, a, b, c, d, e, sup] and the savepoint is not invalidated.
PCursor::restore_to_first_unprocessed() calls btr_pcur_t::restore_position(),
gets the cursor pointing to p1 : 'c' as a result, and then it tries to
compensate for the step back and moves the cursor one step forward, making it
point to p1 : 'd'. The code which does the scanning detects that
saving/restoring resulted in a jump from the supremum record to a user record
and resumes iteration from p1 : 'd' without skipping any records.
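
The same idea can be shown on a toy model. This is a sketch under stated
assumptions, not the actual PCursor code: Page, Tree, example_leaves(),
purge_and_merge() and scan_with_previous_record_savepoint() are hypothetical
names, pages hold only sorted user records, the scanned page is assumed to be
non-empty, and the saved record is assumed to stay on the same page.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

using Page = std::vector<std::string>;  // user records only, kept sorted
using Tree = std::vector<Page>;         // leaf pages, left to right

// Hypothetical helper: p1 = [a, b, c], p2 = [d, e, f] as in the example.
Tree example_leaves() {
  Tree leaves;
  leaves.push_back(Page{"a", "b", "c"});
  leaves.push_back(Page{"d", "e", "f"});
  return leaves;
}

// Toy purge: delete `victim` from page idx + 1, then merge the rest of that
// page into page idx and drop the emptied page from the leaf level.
void purge_and_merge(Tree &leaves, std::size_t idx, const std::string &victim) {
  Page &right = leaves[idx + 1];
  right.erase(std::remove(right.begin(), right.end(), victim), right.end());
  Page &left = leaves[idx];
  left.insert(left.end(), right.begin(), right.end());
  leaves.erase(leaves.begin() + static_cast<std::ptrdiff_t>(idx) + 1);
}

// Scan that saves the last processed user record instead of the supremum.
std::vector<std::string> scan_with_previous_record_savepoint(
    Tree leaves, bool merge_happens) {
  std::vector<std::string> seen;
  std::size_t page = 0;
  for (const auto &rec : leaves[page]) seen.push_back(rec);

  // Step back from 'sup' and remember the record that precedes it ('c').
  const std::string saved = leaves[page].back();

  // Latches are released here, so purge may run and merge the next page.
  if (merge_happens) purge_and_merge(leaves, page, "f");

  // Restore to the saved record (or its closest predecessor) and move one
  // step forward: upper_bound yields the first record greater than `saved`.
  const Page &cur = leaves[page];
  std::size_t pos = static_cast<std::size_t>(
      std::upper_bound(cur.begin(), cur.end(), saved) - cur.begin());

  // pos == cur.size() means we are back on supremum; otherwise the records
  // from pos onwards are unprocessed and are handled before moving on.
  for (; pos < cur.size(); ++pos) seen.push_back(cur[pos]);

  for (++page; page < leaves.size(); ++page)
    for (const auto &rec : leaves[page]) seen.push_back(rec);
  return seen;
}
```

With the merge, scan_with_previous_record_savepoint(example_leaves(), true)
visits a, b, c, d and e in this model ('f' was purged), and without the merge
it visits all six records, so nothing is skipped in either case.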
***
Thanks to Bytedance team for bringing this issue to our attention!
The test case for this bug is based on the one that they reported
to the Upstream and later pointed to us.
1 parent 512384a commit be02aa6
File tree
3 files changed, +222 -10 lines changed, under:
- mysql-test/suite/innodb/r
- mysql-test/suite/innodb/t
- storage/innobase/row