Pull request overview
This PR enhances the change listener to skip processing documents without xattrs, improving performance by avoiding unnecessary unmarshalling operations. The change cache previously attempted to process documents lacking _sync xattrs (e.g., un-imported documents or those rejected by import filters), which was only necessary in pre-xattr mode.
Changes:
- Add early return when documents lack xattrs, preventing unnecessary processing
- Reorder validation checks to filter out documents without xattrs before attempting unmarshalling
- Remove obsolete error logging for binary documents that are now filtered earlier
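The early return described above can be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: `feedEvent`, `hasXattrs`, `processEvent`, and the `dataType*` constants are hypothetical stand-ins, and the flag values are assumed to follow the memcached datatype bit layout.

```go
package main

import "fmt"

// Hypothetical stand-ins for the feed datatype flags. The values are an
// assumption (memcached datatype bit layout), not copied from Sync Gateway.
const (
	dataTypeRaw   uint8 = 0
	dataTypeJSON  uint8 = 1 << 0
	dataTypeXattr uint8 = 1 << 2
)

// feedEvent is a hypothetical, trimmed-down mutation event.
type feedEvent struct {
	Key      string
	DataType uint8
}

// hasXattrs reports whether the event carries any xattrs at all.
func hasXattrs(e feedEvent) bool {
	return e.DataType&dataTypeXattr != 0
}

// processEvent returns early for events without xattrs, so the (comparatively
// expensive) unmarshal callback only runs for documents that might carry
// sync metadata. The bool reports whether unmarshalling was attempted.
func processEvent(e feedEvent, unmarshal func(feedEvent) error) (bool, error) {
	if !hasXattrs(e) {
		return false, nil
	}
	return true, unmarshal(e)
}

func main() {
	noop := func(feedEvent) error { return nil }
	events := []feedEvent{
		{Key: "binary", DataType: dataTypeRaw},
		{Key: "unimported", DataType: dataTypeJSON},
		{Key: "imported", DataType: dataTypeJSON | dataTypeXattr},
	}
	for _, e := range events {
		processed, err := processEvent(e, noop)
		fmt.Println(e.Key, processed, err)
	}
}
```

In this sketch only the "imported" event reaches the unmarshal callback; raw binary and un-imported JSON documents are dropped by the same check.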
if event.DataType != base.MemcachedDataTypeRaw {
	base.DebugfCtx(ctx, base.KeyCache, "Unable to unmarshal sync metadata for feed document %q. Will not be included in channel cache. Error: %v", base.UD(docID), err)
}
if errors.Is(err, sgbucket.ErrEmptyMetadata) {
With the new early return for documents without xattrs (lines 368-370), this warning about empty metadata may no longer be reachable or relevant. Consider adding a comment explaining when this condition could still occur, or verify whether this error handling is still necessary given the earlier xattr check.
Suggested change:
- if errors.Is(err, sgbucket.ErrEmptyMetadata) {
+ if errors.Is(err, sgbucket.ErrEmptyMetadata) {
+ 	// At this point we know the document has *some* xattrs (see early return above),
+ 	// but Sync Gateway metadata may still be missing or empty. This can happen if the
+ 	// document only has non-SG xattrs, or if the SG metadata xattr is present but
+ 	// logically empty/corrupt. Log a warning so these unexpected cases are visible.
	base.DebugfCtx(ctx, base.KeyCache, "Unable to unmarshal sync metadata for feed document %q. Will not be included in channel cache. Error: %v", base.UD(docID), err)
}
if errors.Is(err, sgbucket.ErrEmptyMetadata) {
	base.WarnfCtx(ctx, "Unexpected empty metadata when processing feed event. docid: %s opcode: %v datatype:%v", base.UD(event.Key), event.Opcode, event.DataType)
The removed error logging provided useful debugging information when unmarshalling failed. Now that binary documents and documents without xattrs are filtered earlier, any unmarshalling errors that do occur are unexpected and warrant logging. Consider adding back a debug log for the general error case to aid troubleshooting, since these errors would now indicate genuinely unexpected conditions.
Suggested change:
- 	base.WarnfCtx(ctx, "Unexpected empty metadata when processing feed event. docid: %s opcode: %v datatype:%v", base.UD(event.Key), event.Opcode, event.DataType)
+ 	base.WarnfCtx(ctx, "Unexpected empty metadata when processing feed event. docid: %s opcode: %v datatype:%v", base.UD(event.Key), event.Opcode, event.DataType)
+ } else {
+ 	base.DebugfCtx(ctx, base.KeyCache, "Error unmarshalling sync data from feed for %s (opcode:%v datatype:%v): %v", base.UD(event.Key), event.Opcode, event.DataType, err)
CBG-5170 change listener: don't process docs without xattr
This avoids unmarshalling documents without a _sync xattr, which was only necessary in pre-xattr mode. The change cache could see documents without _sync xattrs if they were not yet imported or were rejected by the import filter.
If this were to get backported to 3.3, you would add a check for db.UseXattrs() around this section.
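A backport guard along those lines might be sketched as below. Only the `UseXattrs()` name comes from the description above; the `database` type and `skipWithoutXattrs` helper are hypothetical and exist only to show the shape of the check.

```go
package main

import "fmt"

// database is a hypothetical stand-in for the real database context.
type database struct {
	useXattrs bool
}

// UseXattrs mirrors the method name mentioned in the PR description.
func (db database) UseXattrs() bool { return db.useXattrs }

// skipWithoutXattrs reports whether a feed event can be dropped before
// unmarshalling. In pre-xattr mode sync metadata lives in the document body,
// so events without xattrs must still be processed.
func skipWithoutXattrs(db database, eventHasXattrs bool) bool {
	return db.UseXattrs() && !eventHasXattrs
}

func main() {
	fmt.Println(skipWithoutXattrs(database{useXattrs: true}, false))  // prints "true"
	fmt.Println(skipWithoutXattrs(database{useXattrs: false}, false)) // prints "false"
}
```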
Pre-review checklist
- fmt.Print, log.Print, ...
- base.UD(docID), base.MD(dbName)
- docs/api
Integration Tests
- GSI=true,xattrs=true https://jenkins.sgwdev.com/job/SyncGatewayIntegration/273/