Commit 91c3f2c
157790: backupresolver: avoid fetching all descriptors r=rafiss a=rafiss
```
goos: darwin
goarch: arm64
cpu: Apple M1 Pro
BenchmarkResolveTargets
BenchmarkResolveTargets/OldImplementation 91 18089891 ns/op 17230891 B/op 109217 allocs/op
BenchmarkResolveTargets/NewImplementation 1232 846169 ns/op 282877 B/op 2225 allocs/op
```
Please review individual commits.
---
### backupresolver: avoid fetching all descriptors
This adds a new API for use by BACKUP code to resolve TablePatterns to
descriptors that need to be backed up.
Unlike the existing API, this avoids fetching all descriptors that exist
in the database. Instead, the returned list will only contain the subgraph
of all descriptors involved in the backup:
- Table backup: Includes the table descriptors, their parent schema descriptor,
and parent database descriptor. Also includes any immutable descriptors
that the table depends on, such as UDTs.
- Database backup: Includes the database descriptor and all of its children (tables,
schemas, sequences, UDFs, etc.).
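To illustrate the table-backup case, here is a minimal sketch of collecting only the reachable subgraph instead of scanning every descriptor. All names here (`Descriptor`, `Catalog`, `ResolveTableSubgraph`) are hypothetical and simplified; this is not the actual backupresolver API.

```go
package main

import "sort"

// Descriptor is a hypothetical, simplified stand-in for a catalog descriptor.
// An ID of 0 means "no such parent".
type Descriptor struct {
	ID             int
	Name           string
	ParentSchemaID int
	ParentDBID     int
	DependsOn      []int // e.g. user-defined types referenced by a table
}

// Catalog is a hypothetical descriptor store that counts lookups so the
// number of fetched descriptors can be observed.
type Catalog struct {
	descs   map[int]Descriptor
	Fetches int
}

// NewCatalog builds a catalog from the given descriptors.
func NewCatalog(descs ...Descriptor) *Catalog {
	c := &Catalog{descs: make(map[int]Descriptor)}
	for _, d := range descs {
		c.descs[d.ID] = d
	}
	return c
}

// Get fetches a single descriptor by ID.
func (c *Catalog) Get(id int) (Descriptor, bool) {
	c.Fetches++
	d, ok := c.descs[id]
	return d, ok
}

// ResolveTableSubgraph walks outward from the targeted tables and returns the
// sorted IDs of the tables, their parent schemas and databases, and anything
// they depend on. Descriptors outside that subgraph are never fetched.
func ResolveTableSubgraph(c *Catalog, tableIDs []int) []int {
	seen := make(map[int]bool)
	queue := append([]int(nil), tableIDs...)
	for len(queue) > 0 {
		id := queue[0]
		queue = queue[1:]
		if id == 0 || seen[id] {
			continue
		}
		d, ok := c.Get(id)
		if !ok {
			continue
		}
		seen[id] = true
		queue = append(queue, d.ParentSchemaID, d.ParentDBID)
		queue = append(queue, d.DependsOn...)
	}
	ids := make([]int, 0, len(seen))
	for id := range seen {
		ids = append(ids, id)
	}
	sort.Ints(ids)
	return ids
}
```

In this sketch, the number of lookups grows with the size of the targeted subgraph rather than with the total number of descriptors in the catalog.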
### backup,changefeedccl: use new ResolveTargets API
This updates the production code paths to use the new
function that was introduced in the previous commit.
As part of this, the function was enhanced to match the extra return
parameters of ResolveTargetsToDescriptors.
### backupresolver: remove ResolveTargetsToDescriptors
### backupresolver: fix lookup logic for descriptors
The new implementation of ResolveTargets had an overly aggressive
filter. We need to include these descriptors in order for backups taken
during a schema change to include the correct descriptors. This was needed
to make TestBackupSuccess_base_create_index_create_schema_separate_statements
pass reliably.
This also fixes the error handling when looking up schemas to handle
synthetic public schemas correctly. In addition, the resolution logic
now skips temporary objects.
Lastly, the name resolver logic previously was looking up types. In the
context of resolving BACKUP targets, we only ever need to resolve tables
by name, and the type fallback logic was making the wrong object get
resolved if the table name shared a name with a type.
### backupresolver: add all wildcard expansions to expandedDBs
Previously, ResolveTargets only added database IDs to expandedDBs for
database-level wildcards (db.*), but not for schema-level wildcards
(db.schema.* or schema.*). This didn't match the behavior of
DescriptorsMatchingTargets, which adds to ExpandedDB for any wildcard
pattern in the AllTablesSelector case (targets.go:543-546).
This inconsistency matters because expandedDBs is used downstream to
determine which databases had wildcard expansion applied, affecting
incremental backup behavior and descriptor change tracking.
This commit updates ResolveTargets to add to expandedDBs for all
wildcard patterns, matching the original DescriptorsMatchingTargets
behavior.
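The rule can be sketched as follows. The `Pattern` type and `ExpandedDBs` function are hypothetical simplifications, and resolving an unqualified `schema.*` against the current database is an assumption of this sketch rather than something stated above.

```go
package main

import "sort"

// Pattern is a hypothetical, simplified form of a backup target pattern.
type Pattern struct {
	Database string // empty means "use the current database" (assumption)
	Schema   string // empty for db.*; set for db.schema.* and schema.*
	Table    string // set only for non-wildcard targets
	Wildcard bool   // true for db.*, db.schema.*, and schema.*
}

// ExpandedDBs returns the sorted, de-duplicated IDs of every database that
// had any wildcard pattern applied to it. Recording only patterns with an
// empty Schema would reproduce the earlier, database-level-only behavior.
func ExpandedDBs(patterns []Pattern, currentDB string, dbIDs map[string]int) []int {
	seen := make(map[int]bool)
	for _, p := range patterns {
		if !p.Wildcard {
			continue
		}
		db := p.Database
		if db == "" {
			db = currentDB
		}
		if id, ok := dbIDs[db]; ok {
			seen[id] = true
		}
	}
	ids := make([]int, 0, len(seen))
	for id := range seen {
		ids = append(ids, id)
	}
	sort.Ints(ids)
	return ids
}
```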
### backupresolver: adjust test message expectations
This adjusts tests to check for the appropriate errors that are returned
by the new ResolveTargets implementation.
- The old code delegated to DescriptorsMatchingTargets, which was returning
an incorrectly specific error message when a database did not exist.
- To handle wildcard expansion, the previous implementation was calling
ResolveObjectPrefix, which mutates the object name in place to add the
`public` schema to the pattern if it was unspecified. This was not
correct, as the db.* pattern causes all tables in the database to be
backed up, not just the ones in the public schema.
---
fixes: #146803
Release note (performance improvement): Database- and table-level backups no longer fetch all object descriptors from disk in order to resolve the backup targets. Now only the objects that are referenced by the targeted objects will be fetched. This improves performance when there are many tables in the cluster.
159993: sql: fix rare race around concurrent remote flows setup r=yuzefovich a=yuzefovich
**physicalplan: harden PhysicalInfrastructure.Release**
We sync-pool `PhysicalInfrastructure` objects and previously we wouldn't
explicitly unset elements of `Processors` slice. This was done since
it's a slice of values not pointers. However, those values themselves
embed protobuf ProcessorSpec which contains more messages (among other
things might have RenderExprs) which we do want to lose the references
to, so this commit fixes that oversight.
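As a rough illustration of the hardening described above, the sketch below zeroes every element of a value slice before returning the object to a `sync.Pool`. The types (`spec`, `processor`, `infra`) are hypothetical stand-ins, not the real `PhysicalInfrastructure` or `ProcessorSpec` definitions.

```go
package main

import "sync"

// spec is a hypothetical stand-in for a protobuf message that holds
// references to other heap-allocated data.
type spec struct {
	RenderExprs []string
}

// processor is stored by value, but it embeds a spec that carries references.
type processor struct {
	Spec spec
}

// infra is a hypothetical pooled object holding a slice of processor values.
type infra struct {
	Processors []processor
}

var infraPool = sync.Pool{New: func() any { return &infra{} }}

func newInfra() *infra {
	return infraPool.Get().(*infra)
}

// reset zeroes every element before truncating. Truncating alone
// (p.Processors[:0]) would keep the old values, and everything they
// reference, alive in the backing array.
func (p *infra) reset() {
	for i := range p.Processors {
		p.Processors[i] = processor{}
	}
	p.Processors = p.Processors[:0]
}

// Release clears the object and returns it to the pool. The caller must not
// use p afterwards.
func (p *infra) Release() {
	p.reset()
	infraPool.Put(p)
}
```

The backing array is kept for reuse, so the pooling benefit is preserved while the stale references are dropped.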
**sql: fix rare race around concurrent remote flows setup**
A few years ago in 0c1095e we changed
the way we set up distributed query plans. Namely, we now start by
setting up the gateway (i.e. local) flow first, and then we'll issue
SetupFlowRequest RPCs concurrently to set up remote flows without
actually blocking on the gateway until the setup is complete.
We have seen about 5 occurrences where the protobuf marshaling code
crashed when handling those concurrent RPCs. My hypothesis is that
this is due to the main goroutine of the gateway flow not waiting until
after RPCs are done. In particular, we put `PhysicalInfrastructure`
objects into `sync.Pool` and they are released by executing
`PlanningCtx.getCleanupFunc` function. That function is executed in
a defer after `Run`ning the local flow completes. However, it's possible
that it'll be executed _before_ concurrent SetupFlowRequest RPCs
(evaluated via the distsql worker goroutines) are performed, and I'm
guessing the flow specs might get corrupted because of that.
In order to prevent this race, we now block execution of the
`Flow.Cleanup` function of the gateway flow until all concurrent RPCs
are done.
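The ordering guarantee can be sketched with a `sync.WaitGroup`. The `gatewayFlow` type and its methods are hypothetical; the real change lives in the flow setup and cleanup code and may be structured differently.

```go
package main

import "sync"

// gatewayFlow is a hypothetical stand-in for the gateway (local) flow.
type gatewayFlow struct {
	pendingRPCs sync.WaitGroup

	mu     sync.Mutex
	events []string
}

func (f *gatewayFlow) record(e string) {
	f.mu.Lock()
	defer f.mu.Unlock()
	f.events = append(f.events, e)
}

// Events returns a copy of the recorded event log.
func (f *gatewayFlow) Events() []string {
	f.mu.Lock()
	defer f.mu.Unlock()
	return append([]string(nil), f.events...)
}

// SetupRemoteFlows issues one setup call per node concurrently and returns
// without waiting for them, mirroring the non-blocking setup described above.
func (f *gatewayFlow) SetupRemoteFlows(nodes []string, send func(node string)) {
	for _, n := range nodes {
		f.pendingRPCs.Add(1)
		go func(n string) {
			defer f.pendingRPCs.Done()
			send(n) // may still be reading the shared flow specs
			f.record("rpc:" + n)
		}(n)
	}
}

// Cleanup blocks until every outstanding setup call has finished, and only
// then releases shared resources (represented here by a log entry).
func (f *gatewayFlow) Cleanup() {
	f.pendingRPCs.Wait()
	f.record("cleanup")
}
```

With this ordering, the cleanup step cannot return pooled objects while a setup goroutine is still marshaling them.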
I tried injecting the sleep right before executing the concurrent RPCs
but still was unable to reproduce the problem on the gceworker. Given
that we've only seen this a handful of times, I decided to omit the release
note.
Fixes: #159569
Release note: None
Co-authored-by: Rafi Shamim <[email protected]>
Co-authored-by: Yahor Yuzefovich <[email protected]>
File tree
14 files changed (+811, -329 lines)
- pkg
- backup
- backupresolver
- testdata/backup-restore
- ccl/changefeedccl
- sql
- physicalplan