-
Notifications
You must be signed in to change notification settings - Fork 46
Unicron fix backfill script #4875
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Signed-off-by: Lukasz Gryglicki <[email protected]> Assisted by [OpenAI](https://platform.openai.com/) Assisted by [GitHub Copilot](https://github.com/features/copilot)
Signed-off-by: Lukasz Gryglicki <[email protected]> Assisted by [OpenAI](https://platform.openai.com/) Assisted by [GitHub Copilot](https://github.com/features/copilot)
|
Note Other AI code review bot(s) detectedCodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review. WalkthroughRefactors the signatures timestamp backfill tool to build DynamoDB condition expressions from only defined attribute names, pass the expression into CLI and UpdateItem calls, and conditionally handle approx_date_created/approx_date_modified; adds a signature date stats shell script. Changes
Sequence Diagram(s)sequenceDiagram
actor Scanner as Scan Operation
participant Main as Backfill Main
participant CondBuilder as buildConditionExpression()
participant CLIBuilder as buildAwsCliUpdate()
participant DynamoDB as DynamoDB UpdateItem
Scanner->>Main: identify attributes to update
Main->>CondBuilder: pass `names` map (only defined attrs)
CondBuilder-->>Main: return `condExpr` (OR of attribute_not_exists/empty or default)
Main->>CLIBuilder: pass region, table, sigID, updateExpr, names, values, condExpr
CLIBuilder-->>Main: return AWS CLI update command
Main->>DynamoDB: call UpdateItem with UpdateExpression, Names, Values, ConditionExpression=condExpr
alt Condition satisfied
DynamoDB-->>Main: update success
else Condition not met
DynamoDB-->>Main: ConditionalCheckFailedException
end
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25–30 minutes
Pre-merge checks and finishing touches❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (1 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR fixes a critical bug in the DynamoDB backfill script where the condition expression referenced attribute names that weren't always defined in the ExpressionAttributeNames map. The issue only manifested during actual AWS updates (not in dry-run mode), causing validation errors when only one of the date fields needed updating.
Key changes:
- Introduced dynamic condition expression building based on actually defined attribute names
- Changed from static to dynamic population of the
namesmap to only include attributes being updated - Applied the fix consistently across both
firstPassScanAndUpdateandsnowflakeFixfunctions
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
cla-backend-go/cmd/signatures_timestamp_backfill/main.go (1)
605-639: Snowflake path condition building is sound; consider avoiding duplicate recomputationWiring
namesfor#date_*vs#approx_date_*based on Fivetran vs non-Fivetran sources and then usingcondExpr := buildConditionExpression(names)for both the CLI and SDK update keeps conditions consistent and prevents mismatches when only approx fields are touched. The second callcondExpr = buildConditionExpression(names)just beforeUpdateItemis redundant becausenamesis unchanged, so you could reuse the earliercondExprto simplify the flow.Also applies to: 661-663, 676-678, 685-685
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
Disabled knowledge base sources:
- Jira integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (1)
cla-backend-go/cmd/signatures_timestamp_backfill/main.go(10 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: CodeQL analysis (go)
- GitHub Check: Agent
- GitHub Check: cypress-functional
- GitHub Check: build-test-lint
🔇 Additional comments (3)
cla-backend-go/cmd/signatures_timestamp_backfill/main.go (3)
409-472: Dynamic per-field condition expression in first pass looks correctBuilding
namesfrom the actually-updated attributes and derivingcondExpr := buildConditionExpression(names)before both the CLI andUpdateItemWithContextcalls keeps the condition aligned with the update expression and avoids DynamoDB validation errors from referencing unmapped#-placeholders. The semantics (“only update when the specific target date fields are missing/empty”) remain consistent with the previouscondAnyMissingintent, just restricted to the fields you’re writing.Please run a small integration check against a test table to confirm that:
- Updates succeed when exactly one of
date_created/date_modifiedis missing or empty.- Updates are rejected (ConditionalCheckFailed) if another process fills the same field between scan and update.
1065-1096:buildConditionExpressionmatches prior semantics and safely extends them to approx fieldsThe helper correctly:
- Adds per-field clauses only for keys present in
names, so DDB never sees undefined#...placeholders.- Keeps the “any of the relevant date fields is missing/empty” behavior via OR-ing the per-field conditions.
- Falls back to
attribute_exists(signature_id)when no date fields are referenced, which is a safe always-on guard that doesn’t require extraExpressionAttributeNamesorValues.It would be good to verify against DynamoDB docs or a quick live test that:
- Extra entries in
ExpressionAttributeNamesnot referenced bycondExprare ignored.- The fallback
attribute_exists(signature_id)behaves as expected when the item is missing (i.e., fails the conditional update).
1102-1140: PlumbingcondExprthroughbuildAwsCliUpdatealigns dry-run CLI with live UpdateItem behaviorExtending
buildAwsCliUpdateto acceptcondExprand injecting it as--condition-expression '%s'ensures the generated AWS CLI commands now use the same condition as the SDK path, fixing the dry-run/real-run mismatch. The JSON flattening ofnames/valuesstill only includes attributes actually in play, and the new condition string doesn’t introduce shell-quoting issues since it contains no single quotes.After deploying, please spot-check a few generated CLI lines (for both regular and Fivetran/approx cases) by running them manually to confirm the condition expressions and attribute maps match what DynamoDB accepts.
Signed-off-by: Lukasz Gryglicki <[email protected]> Assisted by [OpenAI](https://platform.openai.com/) Assisted by [GitHub Copilot](https://github.com/features/copilot)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
utils/signature_dates_stats.sh (1)
1-5: Add error handling and inline comments for clarity.This utility script executes against production credentials and table without any error handling or documentation. Consider the following improvements:
- Add
set -eto exit on errors- Add inline comments above each scan explaining what's being counted
- Validate the AWS profile is available before execution
Here's a refactored version with error handling and documentation:
#!/bin/bash set -e PROFILE="lfproduct-prod" TABLE="cla-prod-signatures" # Validate AWS profile is configured if ! aws configure list --profile "$PROFILE" >/dev/null 2>&1; then echo "Error: AWS profile '$PROFILE' not found" >&2 exit 1 fi # Count signatures with missing or NULL date_created echo "Counting signatures with missing/NULL date_created..." aws --profile "$PROFILE" dynamodb scan --table-name "$TABLE" \ --filter-expression "attribute_not_exists(#a) OR #a = :nullval" \ --expression-attribute-names '{"#a":"date_created"}' \ --expression-attribute-values '{":nullval":{"NULL":true}}' \ --select "COUNT" # Count signatures with existing approx_date_created echo "Counting signatures with existing approx_date_created..." aws --profile "$PROFILE" dynamodb scan --table-name "$TABLE" \ --filter-expression "attribute_exists(#a)" \ --expression-attribute-names '{"#a":"approx_date_created"}' \ --select "COUNT" # Count signatures with missing or NULL date_modified echo "Counting signatures with missing/NULL date_modified..." aws --profile "$PROFILE" dynamodb scan --table-name "$TABLE" \ --filter-expression "attribute_not_exists(#a) OR #a = :nullval" \ --expression-attribute-names '{"#a":"date_modified"}' \ --expression-attribute-values '{":nullval":{"NULL":true}}' \ --select "COUNT" # Count signatures with existing approx_date_modified echo "Counting signatures with existing approx_date_modified..." aws --profile "$PROFILE" dynamodb scan --table-name "$TABLE" \ --filter-expression "attribute_exists(#a)" \ --expression-attribute-names '{"#a":"approx_date_modified"}' \ --select "COUNT"This improves reliability by catching errors early, validates credentials before execution, extracts variables for maintainability, breaks long lines for readability, and documents each scan's purpose.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
Disabled knowledge base sources:
- Jira integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (1)
utils/signature_dates_stats.sh(1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: build-test-lint
- GitHub Check: cypress-functional
Signed-off-by: Lukasz Gryglicki <[email protected]> Assisted by [OpenAI](https://platform.openai.com/) Assisted by [GitHub Copilot](https://github.com/features/copilot)
cc @mlehotskylf @ahmedomosanya - fix the issue with actual AWS update that wasn't detected in dry-run mode. No logic updates.
Signed-off-by: Lukasz Gryglicki [email protected]
Assisted by OpenAI
Assisted by GitHub Copilot