Skip to content

Commit 91c3f2c

Browse files
craig[bot]rafissyuzefovich
committed
157790: backupresolver: avoid fetching all descriptors r=rafiss a=rafiss ``` goos: darwin goarch: arm64 cpu: Apple M1 Pro BenchmarkResolveTargets BenchmarkResolveTargets/OldImplementation 91 18089891 ns/op 17230891 B/op 109217 allocs/op BenchmarkResolveTargets/NewImplementation 1232 846169 ns/op 282877 B/op 2225 allocs/op ``` Please review individual commits. --- ### backupresolver: avoid fetching all descriptors This adds a new API for use by BACKUP code to resolve TablePatterns to descriptors that need to be backed up. Unlike the existing API, this avoids fetching all descriptors that exist in the database. Instead, the returned list will only contain the subgraph of all descriptors involved in the backup: - Table backup: Includes the table descriptors, their parent schema descriptor, and parent database descriptor. Also includes any immutable descriptors that the table depends like UDTs. - Database backup: Includes the database descriptor, and all children (table, schema, sequences, udfs, etc) ### backup,changefeedccl: use new ResolveTargets API This updates the production code paths to use the new function that was introduced in the previous commit. As part of this, the function was enhanced to match the extra return parameters of ResolveTargetsToDescriptors. ### backupresolver: remove ResolveTargetsToDescriptors ### backupresolver: fix lookup logic for descriptors The new implementation of ResolveTargets had an overly aggressive filter. We need to include these descriptors in order for backups taken during a schema change to include the correct descriptors. This was need to make TestBackupSuccess_base_create_index_create_schema_separate_statements pass reliably. This also fixes the error handling when looking up schemas to handle synthetic public schemas correctly. In addition, the resolution logic now skips temporary objects. Lastly, the name resolver logic previously was looking up types. In the context of resolving BACKUP targets, we only ever need to resolve tables by name, and the type fallback logic was making the wrong object get resolved if the table name shared a name with a type. ### backupresolver: add all wildcard expansions to expandedDBs Previously, ResolveTargets only added database IDs to expandedDBs for database-level wildcards (db.*), but not for schema-level wildcards (db.schema.* or schema.*). This didn't match the behavior of DescriptorsMatchingTargets, which adds to ExpandedDB for any wildcard pattern in the AllTablesSelector case (targets.go:543-546). This inconsistency matters because expandedDBs is used downstream to determine which databases had wildcard expansion applied, affecting incremental backup behavior and descriptor change tracking. This commit updates ResolveTargets to add to expandedDBs for all wildcard patterns, matching the original DescriptorsMatchingTargets behavior. ### backupresolver: adjust test message expectations This adjusts tests to check for the appropriate errors that are returned by the new ResolveTargets implementation. - The old code delegated to DescriptorsMatchingTargets, which was returning an incorrectly specific error message when a database did not exist. - To handle wildcard expansion, The previous implementation was calling ResolveObjectPrefix, which mutates the object name in place to add the `public` schema to the pattern if it was unspecified. This was not correct, as the db.* pattern causes all tables in the database to be backed up, not just the ones in the public schema. --- fixes: #146803 Release note (performance improvement): Database- and table-level backups no longer fetch all object descriptors from disk in order to resolve the backup targets. Now only the objects that are referenced by the targeted objects will be fetched. This improves performance when there are many tables in the cluster. 159993: sql: fix rare race around concurrent remote flows setup r=yuzefovich a=yuzefovich **physicalplan: harden PhysicalInfrastructure.Release** We sync-pool `PhysicalInfrastructure` objects and previously we wouldn't explicitly unset elements of `Processors` slice. This was done since it's a slice of values not pointers. However, those values themselves embed protobuf ProcessorSpec which contains more messages (among other things might have RenderExprs) which we do want to lose the references to, so this commit fixes that oversight. **sql: fix rare race around concurrent remote flows setup** A few years ago in 0c1095e we changed the way we set up distributed query plans. Namely, we now start by setting up the gateway (i.e. local) flow first, and then we'll issue SetupFlowRequest RPCs concurrently to set up remote flows without actually blocking on the gateway until the setup is complete. We have seen about 5 occurrences where the protobuf marshaling code crashed when handling those concurrent RPCs. I have a hypothesis is that this is due to the main goroutine of the gateway flow not waiting until after RPCs are done. In particular, we put `PhysicalInfrastructure` objects into `sync.Pool` and they are released by executing `PlanningCtx.getCleanupFunc` function. That function is executed in a defer after `Run`ning the local flow completes. However, it's possible that it'll be executed _before_ concurrent SetupFlowRequest RPCs (evaluated via the distsql worker goroutines) are performed, and I'm guessing the flow specs might get corrupted because of that. In order to prevent this race, we now will block execution of `Flow.Cleanup` function of the gateway flow until all concurrent RPCs are done. I tried injecting the sleep right before executing the concurrent RPCs but still was unable to reproduce the problem on the gceworker. Given that we've only seen this a handful of times, I decided to omit the release note. Fixes: #159569 Release note: None Co-authored-by: Rafi Shamim <[email protected]> Co-authored-by: Yahor Yuzefovich <[email protected]>
3 parents 47465f9 + a5801ff + 7029127 commit 91c3f2c

14 files changed

+811
-329
lines changed

pkg/backup/backup_planning.go

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -530,7 +530,7 @@ func backupPlanHook(
530530
switch backupStmt.Coverage() {
531531
case tree.RequestedDescriptors:
532532
var err error
533-
targetDescs, completeDBs, requestedDBs, descsByTablePattern, err = backupresolver.ResolveTargetsToDescriptors(ctx, p, endTime, backupStmt.Targets)
533+
targetDescs, completeDBs, requestedDBs, descsByTablePattern, err = backupresolver.ResolveTargets(ctx, p, endTime, backupStmt.Targets)
534534
if err != nil {
535535
return errors.Wrap(err, "failed to resolve targets specified in the BACKUP stmt")
536536
}

pkg/backup/backupresolver/BUILD.bazel

Lines changed: 4 additions & 12 deletions
Original file line numberDiff line numberDiff line change
@@ -7,6 +7,7 @@ go_library(
77
visibility = ["//visibility:public"],
88
deps = [
99
"//pkg/keys",
10+
"//pkg/kv",
1011
"//pkg/sql",
1112
"//pkg/sql/catalog",
1213
"//pkg/sql/catalog/descpb",
@@ -33,29 +34,20 @@ go_test(
3334
deps = [
3435
"//pkg/base",
3536
"//pkg/ccl/storageccl",
36-
"//pkg/keys",
3737
"//pkg/security/securityassets",
3838
"//pkg/security/securitytest",
3939
"//pkg/security/username",
4040
"//pkg/server",
41+
"//pkg/sql",
4142
"//pkg/sql/catalog",
42-
"//pkg/sql/catalog/dbdesc",
43-
"//pkg/sql/catalog/descbuilder",
44-
"//pkg/sql/catalog/descpb",
45-
"//pkg/sql/catalog/schemadesc",
46-
"//pkg/sql/catalog/tabledesc",
47-
"//pkg/sql/catalog/typedesc",
4843
"//pkg/sql/parser",
49-
"//pkg/sql/sem/catid",
5044
"//pkg/sql/sem/tree",
51-
"//pkg/sql/sessiondata",
52-
"//pkg/sql/types",
53-
"//pkg/testutils",
5445
"//pkg/testutils/serverutils",
46+
"//pkg/testutils/sqlutils",
5547
"//pkg/testutils/testcluster",
56-
"//pkg/util/hlc",
5748
"//pkg/util/leaktest",
5849
"//pkg/util/log",
5950
"//pkg/util/randutil",
51+
"@com_github_stretchr_testify//require",
6052
],
6153
)

0 commit comments

Comments
 (0)