-
Notifications
You must be signed in to change notification settings - Fork 111
Resolves #1076: Incorrect continuation when scanning by EndpointType.PREFIX_STRING #3523
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -198,6 +198,77 @@ public void inclusiveRange() { | |
}); | ||
} | ||
|
||
@Test | ||
public void prefixString() { | ||
fdb.run(context -> { | ||
fdb.database().run(tr -> { | ||
for (int i = 0; i < 5; i++) { | ||
tr.set(subspace.pack(Tuple.from("apple", i)), Tuple.from("apple", i).pack()); | ||
} | ||
return null; | ||
}); | ||
Comment on lines
+204
to
+209
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This loop should probably be brought outside the |
||
TupleRange range = new TupleRange( | ||
Tuple.from("a"), | ||
Tuple.from("a"), | ||
EndpointType.PREFIX_STRING, | ||
EndpointType.PREFIX_STRING | ||
Comment on lines
+211
to
+214
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It would probably be good to have some more tests. I can think of a couple like:
You might need to restructure this to be a parameterized test that takes a prefix string and an expected set of results, and maybe page size and reverse-ness, and then it executes the scan across continuations and validates the results. |
||
); | ||
|
||
KeyValueCursor cursor = KeyValueCursor.Builder.withSubspace(subspace) | ||
.setContext(context) | ||
.setRange(range) | ||
.setContinuation(null) | ||
.setScanProperties(ScanProperties.FORWARD_SCAN) | ||
.build(); | ||
for (int j = 0; j < 5; j++) { | ||
KeyValue kv = cursor.getNext().get(); | ||
assertArrayEquals(subspace.pack(Tuple.from("apple", j)), kv.getKey()); | ||
assertArrayEquals(Tuple.from("apple", j).pack(), kv.getValue()); | ||
} | ||
assertThat(cursor.getNext().hasNext(), is(false)); | ||
|
||
cursor = KeyValueCursor.Builder.withSubspace(subspace) | ||
.setContext(context) | ||
.setRange(range) | ||
.setContinuation(null) | ||
.setScanProperties(new ScanProperties(ExecuteProperties.newBuilder().setReturnedRowLimit(2).build())) | ||
.build(); | ||
for (int j = 0; j < 2; j++) { | ||
KeyValue kv = cursor.getNext().get(); | ||
assertArrayEquals(subspace.pack(Tuple.from("apple", j)), kv.getKey()); | ||
assertArrayEquals(Tuple.from("apple", j).pack(), kv.getValue()); | ||
} | ||
|
||
cursor = KeyValueCursor.Builder.withSubspace(subspace) | ||
.setContext(context) | ||
.setRange(range) | ||
.setContinuation(cursor.getNext().getContinuation().toBytes()) | ||
.setScanProperties(new ScanProperties(ExecuteProperties.newBuilder().setReturnedRowLimit(2).build())) | ||
.build(); | ||
|
||
for (int j = 2; j < 4; j++) { | ||
KeyValue kv = cursor.getNext().get(); | ||
assertArrayEquals(subspace.pack(Tuple.from("apple", j)), kv.getKey()); | ||
assertArrayEquals(Tuple.from("apple", j).pack(), kv.getValue()); | ||
} | ||
|
||
cursor = KeyValueCursor.Builder.withSubspace(subspace) | ||
.setContext(context) | ||
.setRange(range) | ||
.setContinuation(cursor.getNext().getContinuation().toBytes()) | ||
.setScanProperties(new ScanProperties(ExecuteProperties.newBuilder().setReturnedRowLimit(1).build())) | ||
.build(); | ||
|
||
for (int j = 4; j < 5; j++) { | ||
KeyValue kv = cursor.getNext().get(); | ||
assertArrayEquals(subspace.pack(Tuple.from("apple", j)), kv.getKey()); | ||
assertArrayEquals(Tuple.from("apple", j).pack(), kv.getValue()); | ||
} | ||
|
||
return null; | ||
}); | ||
} | ||
|
||
@Test | ||
public void exclusiveRange() { | ||
fdb.run(context -> { | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! This looks right to me, in principle. We could probably use a bit more testing to make sure it handles some edge cases properly. I have a few suggestions on additional tests in the other file.
There's a second concern, and it has to do with cross-version compatibility. (This is, for what it's worth, what ended up being the problem with the previous attempt to fix this, #1106.) We expect the user to be able to take the continuation that is generated from one run and then hand it to another instance of the Record Layer and resume from where it has been left off. We expect this to work even if the software versions of the two Record Layer instances are different. (This is because we expect people to deploy fleets of different instances of the Record Layer that all access the same data, and then load balance requests across them. When they then perform a rolling upgrade of the fleet, then they might pick up a continuation generated from an older instance and hand it to a newer one, or vice versa. Once the upgrade is completed, this shouldn't be an issue.)
The concern here is that because we're implicitly changing the format here. In particular, if you had such a scan over a
PREFIX_STRING
range, then when the continuation is transferred between versions, then it will incorrectly interpret the prefix length.The counter argument to caring about this (in this case) is that the prefix was already being wrongly handled, and so it was already an error to try and continue a prefix string search. So if the worst that can happen is that we now fail in a different way (or even the same way), maybe that's fine.
We do have tests that run multiple versions of the Record Layer and then validate the behavior across versions, including handling continuations. They use the YAML tests framework, which are set up to run a series of SQL queries against a schema template definition, and then it validates the expected results. It also runs each query in a mode where it stops after each row and resumes the query with a continuation, and in multi-server mode, it alternates between different server versions. I've added some tests of almost that behavior in this PR: #3536. The problem is that the SQL planner isn't currently smart enough to translate expressions like
abc%
into[PREFIX_STRING 'abc']
scan comparisons, so it doesn't actually exercise this codepath.We have a few options here:
PREFIX_STRING
ranges. It would be nice if we had that anyway, as it would make the queries more efficient. It would be a more involved change though as the SQL stuff has a sharper learning curve, and we'd also have to worry about how this interacted with the "plan cache". (In order to avoid costly re-planning, we have a cache we maintain of the plan associated with a given query. We also strip literals from the plan before caching it so that we can re-use plans for queries that are the same but with different constants. However, you wouldn't want to reuse a plan forSELECT * FROM T WHERE s LIKE 'abc%'
, which can use aPREFIX_STRING
, with one forSELECT * FROM T WHERE s LIKE 'abc%d'
, which cannot without additional compensation. So whatever solution we come up with would need to take that into account.)