Skip to content

Commit 9c110b4

Browse files
craig[bot]msbutleryuzefovichjbowens
committed
150641: backup: remove useless fields in telemetry events r=kev-cao a=msbutler Epic: none Release note: none 151849: plpgsql: fix handling of annotations for DO blocks r=yuzefovich a=yuzefovich **plpgsql: fix handling of annotations for DO blocks** This commit should fix for good annotations handling for DO blocks that was recently attempted to be fixed in 3b8ab38. (That commit is partially reversed too). The general issue is that in the PLpgSQL parser we use an ephemeral SQL parser instance to process SQL statements. However, when processing the DO block stmts we need to preserve the annotations state between each stmt. This is now achieved by extending the SQL parser infrastructure to give an option to override the initial annotations index as well as keep on reusing the counter across multiple stmts. This allows us to correctly count the number of annotations that we need for the whole DO block during the initial parsing, and then we use the allocated annotations container to set the items in the optbuild. Fixes: #151848. Release note (bug fix): Previously, CockroachDB node could crash when executing DO stmts when they contain user-defined types (possibly non-existing) in non-default configuration (additional logging like the one controlled via `sql.log.all_statements.enabled` cluster setting needed to be enabled). This bug was introduced in 25.1 release and is now fixed. **tree: avoid index-of-bounds panics in GetAnnotation in production** We've recently seen two panics when trying to access the Annotations container and hitting the out-of-bounds condition. The annotations are used to provide additional information about unresolved object names, so if we simply return `nil` on an invalid access, we'll keep on using the unresolved object name, which seems like a better option than process crash. In test builds we'll keep on crashing. Release note: None 153734: builtins: deflake TestGetSSTableMetricsSingleNode r=xinhaoz a=jbowens Refactor and deflake TestGetSSTableMetricsSingleNode. Fixes: #151742 Epic: none Release note: none 153815: roachtest: delake pcr roachtests that offset reader tenant system tables r=jeffswenson a=msbutler In this patch, we restart the reader tenant after the poller job bootstraps the reader catalog, to refresh the namespace cache. This is necessary because the reader tenant namespace cache believes the the priv table has id 100052 instead of the newly ingested id 52 from the source. Fixes #153555 Epic: none Release note: none Co-authored-by: Michael Butler <[email protected]> Co-authored-by: Yahor Yuzefovich <[email protected]> Co-authored-by: Jackson Owens <[email protected]>
5 parents 004984b + 150308a + fe61fd1 + 8b7a553 + 767fcb6 commit 9c110b4

File tree

23 files changed

+389
-372
lines changed

23 files changed

+389
-372
lines changed

docs/generated/eventlog.md

Lines changed: 0 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -3421,25 +3421,20 @@ logged whenever a BACKUP and RESTORE job completes or fails.
34213421
| `TargetScope` | TargetScope is the largest scope of the targets that the user is backing up or restoring based on the following order: table < schema < database < full cluster. | no |
34223422
| `IsMultiregionTarget` | IsMultiregionTarget is true if any of the targets contain objects with multi-region primitives. | no |
34233423
| `TargetCount` | TargetCount is the number of targets the in the BACKUP/RESTORE. | no |
3424-
| `DestinationSubdirType` | DestinationSubdirType is - latest: if using the latest subdir - standard: if using a date-based subdir - custom: if using a custom subdir that's not date-based | no |
34253424
| `DestinationStorageTypes` | DestinationStorageTypes are the types of storage that the user is backing up to or restoring from. | no |
34263425
| `DestinationAuthTypes` | DestinationAuthTypes are the types of authentication methods that the user is using to access the destination storage. | no |
34273426
| `IsLocalityAware` | IsLocalityAware indicates if the BACKUP or RESTORE is locality aware. | no |
3428-
| `AsOfInterval` | AsOfInterval is the time interval in nanoseconds between the statement timestamp and the timestamp resolved by the AS OF SYSTEM TIME expression. The interval is expressed in nanoseconds. | no |
34293427
| `WithRevisionHistory` | WithRevisionHistory is true if the BACKUP includes revision history. | no |
34303428
| `HasEncryptionPassphrase` | HasEncryptionPassphrase is true if the user provided an encryption passphrase to encrypt/decrypt their backup. | no |
34313429
| `KMSType` | KMSType is the type of KMS the user is using to encrypt/decrypt their backup. | no |
34323430
| `KMSCount` | KMSCount is the number of KMS the user is using. | no |
34333431
| `Options` | Options contain all the names of the options specified by the user in the BACKUP or RESTORE statement. For options that are accompanied by a value, only those with non-empty values will be present.<br><br>It's important to note that there are no option values anywhere in the event payload. Future changes to telemetry should refrain from adding values to the payload unless they are properly redacted. | no |
3434-
| `DebugPauseOn` | DebugPauseOn is the type of event that the restore should pause on for debugging purposes. Currently only "error" is supported. | no |
34353432
| `JobID` | JobID is the ID of the BACKUP/RESTORE job. | no |
34363433
| `ResultStatus` | ResultStatus indicates whether the job succeeded or failed. | no |
34373434
| `ErrorText` | ErrorText is the text of the error that caused the job to fail. | partially |
34383435
| `RecurringCron` | RecurringCron is the crontab for the incremental backup. | no |
34393436
| `FullBackupCron` | FullBackupCron is the crontab for the full backup. | no |
34403437
| `CustomFirstRunTime` | CustomFirstRunTime is the timestamp for the user configured first run time. Expressed as nanoseconds since the Unix epoch. | no |
3441-
| `OnExecutionFailure` | OnExecutionFailure describes the desired behavior if the schedule fails to execute. | no |
3442-
| `OnPreviousRunning` | OnPreviousRunning describes the desired behavior if the previously scheduled BACKUP is still running. | no |
34433438
| `IgnoreExistingBackup` | IgnoreExistingBackup is true iff the BACKUP schedule should still be created even if a backup is already present in its destination. | no |
34443439
| `ApplicationName` | The application name for the session where recovery event was created. | no |
34453440
| `NumRows` | NumRows is the number of rows successfully imported, backed up or restored. | no |

pkg/backup/backup_job.go

Lines changed: 1 addition & 37 deletions
Original file line numberDiff line numberDiff line change
@@ -666,7 +666,7 @@ func (b *backupResumer) Resume(ctx context.Context, execCtx interface{}) error {
666666
}
667667

668668
// Collect telemetry, once per backup after resolving its destination.
669-
collectTelemetry(ctx, backupManifest, initialDetails, details, true, b.job.ID())
669+
collectTelemetry(ctx, backupManifest, initialDetails, details, b.job.ID())
670670
}
671671

672672
// For all backups, partitioned or not, the main BACKUP manifest is stored at
@@ -984,7 +984,6 @@ func collectTelemetry(
984984
ctx context.Context,
985985
backupManifest *backuppb.BackupManifest,
986986
initialDetails, backupDetails jobspb.BackupDetails,
987-
licensed bool,
988987
jobID jobspb.JobID,
989988
) {
990989
// sourceSuffix specifies if this schedule was created by a schedule.
@@ -1000,14 +999,6 @@ func collectTelemetry(
1000999
}
10011000

10021001
countSource("backup.total.started")
1003-
if backupManifest.IsIncremental() || backupDetails.EncryptionOptions != nil {
1004-
countSource("backup.using-enterprise-features")
1005-
}
1006-
if licensed {
1007-
countSource("backup.licensed")
1008-
} else {
1009-
countSource("backup.free")
1010-
}
10111002
if backupDetails.StartTime.IsEmpty() {
10121003
countSource("backup.span.full")
10131004
} else {
@@ -1031,33 +1022,6 @@ func collectTelemetry(
10311022
countSource("backup.encryption.kms")
10321023
}
10331024
}
1034-
if backupDetails.CollectionURI != "" {
1035-
countSource("backup.nested")
1036-
timeBaseSubdir := true
1037-
if _, err := time.Parse(backupbase.DateBasedIntoFolderName,
1038-
initialDetails.Destination.Subdir); err != nil {
1039-
timeBaseSubdir = false
1040-
}
1041-
if backupDetails.StartTime.IsEmpty() {
1042-
if !timeBaseSubdir {
1043-
countSource("backup.deprecated-full-nontime-subdir")
1044-
} else if initialDetails.Destination.Exists {
1045-
countSource("backup.deprecated-full-time-subdir")
1046-
} else {
1047-
countSource("backup.full-no-subdir")
1048-
}
1049-
} else {
1050-
if initialDetails.Destination.Subdir == backupbase.LatestFileName {
1051-
countSource("backup.incremental-latest-subdir")
1052-
} else if !timeBaseSubdir {
1053-
countSource("backup.deprecated-incremental-nontime-subdir")
1054-
} else {
1055-
countSource("backup.incremental-explicit-subdir")
1056-
}
1057-
}
1058-
} else {
1059-
countSource("backup.deprecated-non-collection")
1060-
}
10611025
if backupManifest.DescriptorCoverage == tree.AllDescriptors {
10621026
countSource("backup.targets.full_cluster")
10631027
}

pkg/backup/backup_telemetry.go

Lines changed: 0 additions & 47 deletions
Original file line numberDiff line numberDiff line change
@@ -9,10 +9,8 @@ import (
99
"context"
1010
"net/url"
1111
"sort"
12-
"strings"
1312
"time"
1413

15-
"github.com/cockroachdb/cockroach/pkg/backup/backupbase"
1614
"github.com/cockroachdb/cockroach/pkg/backup/backupdest"
1715
"github.com/cockroachdb/cockroach/pkg/cloud"
1816
"github.com/cockroachdb/cockroach/pkg/jobs/jobspb"
@@ -47,10 +45,6 @@ const (
4745
scheduledBackupJobEventType eventpb.RecoveryEventType = "scheduled_backup_job"
4846
restoreJobEventType eventpb.RecoveryEventType = "restore_job"
4947

50-
latestSubdirType = "latest"
51-
standardSubdirType = "standard"
52-
customSubdirType = "custom"
53-
5448
// Currently the telemetry event payload only contains keys of options. Future
5549
// changes to telemetry should refrain from adding values to the payload
5650
// unless they are properly redacted.
@@ -100,21 +94,6 @@ func createBackupRecoveryEvent(
10094
}
10195
}
10296

103-
timeBaseSubdir := true
104-
var subdirType string
105-
if _, err := time.Parse(backupbase.DateBasedIntoFolderName,
106-
initialDetails.Destination.Subdir); err != nil {
107-
timeBaseSubdir = false
108-
}
109-
110-
if strings.EqualFold(initialDetails.Destination.Subdir, backupbase.LatestFileName) {
111-
subdirType = latestSubdirType
112-
} else if !timeBaseSubdir {
113-
subdirType = customSubdirType
114-
} else {
115-
subdirType = standardSubdirType
116-
}
117-
11897
authTypes := make(map[string]struct{})
11998
storageTypes := make(map[string]struct{})
12099
defaultURI, urisByLocalityKV, err := backupdest.GetURIsByLocalityKV(initialDetails.Destination.To, "")
@@ -156,14 +135,12 @@ func createBackupRecoveryEvent(
156135
TargetScope: largestScope.String(),
157136
IsMultiregionTarget: multiRegion,
158137
TargetCount: uint32(targetCount),
159-
DestinationSubdirType: subdirType,
160138
IsLocalityAware: isLocalityAware,
161139
WithRevisionHistory: initialDetails.RevisionHistory,
162140
HasEncryptionPassphrase: passphrase,
163141
KMSType: kms,
164142
KMSCount: uint32(kmsCount),
165143
JobID: uint64(jobID),
166-
AsOfInterval: initialDetails.AsOfInterval,
167144
Options: options,
168145
ApplicationName: initialDetails.ApplicationName,
169146
}
@@ -245,25 +222,6 @@ func parseStorageAndAuth(uri string) (string, string, error) {
245222
return storageType, authType, nil
246223
}
247224

248-
func loggedSubdirType(subdir string) string {
249-
timeBaseSubdir := true
250-
var subdirType string
251-
if _, err := time.Parse(backupbase.DateBasedIntoFolderName,
252-
subdir); err != nil {
253-
timeBaseSubdir = false
254-
}
255-
256-
if strings.EqualFold(subdir, backupbase.LatestFileName) {
257-
subdirType = latestSubdirType
258-
} else if !timeBaseSubdir {
259-
subdirType = customSubdirType
260-
} else {
261-
subdirType = standardSubdirType
262-
}
263-
264-
return subdirType
265-
}
266-
267225
// logCreateScheduleTelemetry publishes an eventpb.RecoveryEvent about a created
268226
// backup schedule.
269227
func logCreateScheduleTelemetry(
@@ -296,8 +254,6 @@ func logCreateScheduleTelemetry(
296254
backupEvent.RecurringCron = recurringCron
297255
backupEvent.FullBackupCron = fullCron
298256
backupEvent.CustomFirstRunTime = firstRunNanos
299-
backupEvent.OnPreviousRunning = jobspb.ScheduleDetails_WaitBehavior_name[int32(details.Wait)]
300-
backupEvent.OnExecutionFailure = jobspb.ScheduleDetails_ErrorHandlingBehavior_name[int32(details.OnError)]
301257
backupEvent.IgnoreExistingBackup = ignoreExisting
302258

303259
log.StructuredEvent(ctx, severity.INFO, &backupEvent)
@@ -312,7 +268,6 @@ func logRestoreTelemetry(
312268
intoDB string,
313269
newDBName string,
314270
subdir string,
315-
asOfInterval int64,
316271
opts tree.RestoreOptions,
317272
descsByTablePattern map[tree.TablePattern]catalog.Descriptor,
318273
restoreDBs []catalog.DatabaseDescriptor,
@@ -410,9 +365,7 @@ func logRestoreTelemetry(
410365
TargetScope: largestScope.String(),
411366
TargetCount: uint32(targetCount),
412367
IsMultiregionTarget: multiRegion,
413-
DestinationSubdirType: loggedSubdirType(subdir),
414368
IsLocalityAware: localityAware,
415-
AsOfInterval: asOfInterval,
416369
HasEncryptionPassphrase: passphrase,
417370
KMSType: kmsType,
418371
KMSCount: uint32(kmsCount),

pkg/backup/backup_test.go

Lines changed: 0 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -10297,11 +10297,9 @@ func TestBackupRestoreTelemetryEvents(t *testing.T) {
1029710297
RecoveryType: backupEventType,
1029810298
TargetScope: databaseScope.String(),
1029910299
TargetCount: 2,
10300-
DestinationSubdirType: standardSubdirType,
1030110300
DestinationStorageTypes: []string{"nodelocal", "userfile"},
1030210301
DestinationAuthTypes: []string{"specified"},
1030310302
IsLocalityAware: true,
10304-
AsOfInterval: -1 * time.Millisecond.Nanoseconds(),
1030510303
WithRevisionHistory: true,
1030610304
ApplicationName: "backup_test",
1030710305
}
@@ -10332,11 +10330,9 @@ func TestBackupRestoreTelemetryEvents(t *testing.T) {
1033210330
RecoveryType: restoreEventType,
1033310331
TargetScope: tableScope.String(),
1033410332
TargetCount: 1,
10335-
DestinationSubdirType: latestSubdirType,
1033610333
DestinationStorageTypes: []string{"nodelocal", "userfile"},
1033710334
DestinationAuthTypes: []string{"specified"},
1033810335
IsLocalityAware: true,
10339-
AsOfInterval: 0,
1034010336
Options: []string{telemetryOptionIntoDB, telemetryOptionSkipMissingFK},
1034110337
ApplicationName: "backup_test",
1034210338
}

pkg/backup/create_scheduled_backup_test.go

Lines changed: 0 additions & 8 deletions
Original file line numberDiff line numberDiff line change
@@ -1432,15 +1432,13 @@ func TestCreateScheduledBackupTelemetry(t *testing.T) {
14321432

14331433
th, cleanup := newTestHelper(t)
14341434
defer cleanup()
1435-
var asOfInterval int64
14361435

14371436
// We'll be manipulating schedule time via th.env, but we can't fool actual backup
14381437
// when it comes to AsOf time. So, override AsOf backup clause to be the current time.
14391438
useRealTimeAOST := func() func() {
14401439
knobs := th.cfg.TestingKnobs.(*jobs.TestingKnobs)
14411440
knobs.OverrideAsOfClause = func(clause *tree.AsOfClause, stmtTimestamp time.Time) {
14421441
expr, err := tree.MakeDTimestampTZ(th.cfg.DB.KV().Clock().PhysicalTime(), time.Microsecond)
1443-
asOfInterval = expr.Time.UnixNano() - stmtTimestamp.UnixNano()
14441442
require.NoError(t, err)
14451443
clause.Expr = expr
14461444
}
@@ -1472,15 +1470,11 @@ WITH SCHEDULE OPTIONS on_execution_failure = 'pause', ignore_existing_backups, f
14721470
RecoveryType: createdScheduleEventType,
14731471
TargetScope: clusterScope.String(),
14741472
TargetCount: 1,
1475-
DestinationSubdirType: standardSubdirType,
14761473
DestinationStorageTypes: []string{"userfile"},
14771474
DestinationAuthTypes: []string{"specified"},
1478-
AsOfInterval: asOfInterval,
14791475
Options: []string{telemetryOptionDetached},
14801476
RecurringCron: "@hourly",
14811477
FullBackupCron: "@daily",
1482-
OnExecutionFailure: "PAUSE_SCHED",
1483-
OnPreviousRunning: "WAIT",
14841478
IgnoreExistingBackup: true,
14851479
CustomFirstRunTime: firstRun.UnixNano(),
14861480
ApplicationName: "backup_test",
@@ -1501,10 +1495,8 @@ WITH SCHEDULE OPTIONS on_execution_failure = 'pause', ignore_existing_backups, f
15011495
RecoveryType: scheduledBackupEventType,
15021496
TargetScope: clusterScope.String(),
15031497
TargetCount: 1,
1504-
DestinationSubdirType: standardSubdirType,
15051498
DestinationStorageTypes: []string{"userfile"},
15061499
DestinationAuthTypes: []string{"specified"},
1507-
AsOfInterval: asOfInterval,
15081500
Options: []string{telemetryOptionDetached},
15091501
}
15101502
requireRecoveryEvent(t, beforeBackup.UnixNano(), scheduledBackupEventType, expectedScheduledBackup)

pkg/backup/restore_planning.go

Lines changed: 3 additions & 15 deletions
Original file line numberDiff line numberDiff line change
@@ -1953,11 +1953,6 @@ func doRestorePlan(
19531953
}
19541954
}
19551955

1956-
var asOfInterval int64
1957-
if !endTime.IsEmpty() {
1958-
asOfInterval = endTime.WallTime - p.ExtendedEvalContext().StmtTimestamp.UnixNano()
1959-
}
1960-
19611956
filteredTablesByID, err := maybeFilterMissingViews(
19621957
tablesByID,
19631958
typesByID,
@@ -2133,7 +2128,7 @@ func doRestorePlan(
21332128
}
21342129
resultsCh <- tree.Datums{tree.NewDInt(tree.DInt(jobID))}
21352130
collectRestoreTelemetry(ctx, jobID, restoreDetails, intoDB, newDBName, subdir, restoreStmt,
2136-
descsByTablePattern, restoreDBs, asOfInterval, p.SessionData().ApplicationName)
2131+
descsByTablePattern, restoreDBs, p.SessionData().ApplicationName)
21372132
return nil
21382133
}
21392134

@@ -2180,7 +2175,7 @@ func doRestorePlan(
21802175
// execution.
21812176
p.InternalSQLTxn().Descriptors().ReleaseAll(ctx)
21822177
collectRestoreTelemetry(ctx, sj.ID(), restoreDetails, intoDB, newDBName, subdir, restoreStmt,
2183-
descsByTablePattern, restoreDBs, asOfInterval, p.SessionData().ApplicationName)
2178+
descsByTablePattern, restoreDBs, p.SessionData().ApplicationName)
21842179
if err := sj.Start(ctx); err != nil {
21852180
return err
21862181
}
@@ -2200,20 +2195,13 @@ func collectRestoreTelemetry(
22002195
restoreStmt *tree.Restore,
22012196
descsByTablePattern map[tree.TablePattern]catalog.Descriptor,
22022197
restoreDBs []catalog.DatabaseDescriptor,
2203-
asOfInterval int64,
22042198
applicationName string,
22052199
) {
22062200
telemetry.Count("restore.total.started")
22072201
if restoreStmt.DescriptorCoverage == tree.AllDescriptors {
22082202
telemetry.Count("restore.full-cluster")
22092203
}
2210-
if restoreStmt.Subdir == nil {
2211-
telemetry.Count("restore.deprecated-subdir-syntax")
2212-
} else {
2213-
telemetry.Count("restore.collection")
2214-
}
2215-
2216-
logRestoreTelemetry(ctx, jobID, details, intoDB, newDBName, subdir, asOfInterval, restoreStmt.Options,
2204+
logRestoreTelemetry(ctx, jobID, details, intoDB, newDBName, subdir, restoreStmt.Options,
22172205
descsByTablePattern, restoreDBs, applicationName)
22182206
}
22192207

0 commit comments

Comments
 (0)