Fix order-sensitive DynamoDB set comparison in validator #283

Open
pradhankukiran wants to merge 4 commits into scylladb:master from pradhankukiran:fix/dynamodb-set-comparison

Conversation

@pradhankukiran

Summary

  • Sort NS values before zip-comparing to handle different ordering
  • Add explicit SS and BS cases that compare as Sets instead of falling through to Seq equality
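The two changes above can be sketched roughly as follows. This is a minimal, standalone illustration, not the migrator's actual `areDifferent` code: the object name, the function names, and the explicit size check are assumptions made for the example, and set members are modeled as plain strings.

```scala
// Sketch only: names and signatures here are hypothetical, not the migrator's API.
object SetComparisonSketch {

  /** True when two number sets differ by more than `tolerance`, ignoring order. */
  def numberSetsDiffer(l: Seq[String], r: Seq[String], tolerance: BigDecimal): Boolean = {
    // Sorting both sides lines up corresponding members before zipping.
    val xs = l.map(BigDecimal(_)).sorted
    val ys = r.map(BigDecimal(_)).sorted
    // zip truncates to the shorter side, so sizes are compared explicitly.
    xs.size != ys.size ||
      xs.zip(ys).exists { case (x, y) => (x - y).abs > tolerance }
  }

  /** String sets compare by membership, so element order is irrelevant. */
  def stringSetsDiffer(l: Seq[String], r: Seq[String]): Boolean =
    l.toSet != r.toSet
}
```

With an assumed tolerance of 0.01, `numberSetsDiffer(Seq("1.001", "2.002"), Seq("2.003", "1.002"), BigDecimal("0.01"))` returns `false`, because after sorting each pair differs by only 0.001.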

Test plan

  • Added 3 new unit tests in DynamoDBRowComparisonTest covering SS, NS, and NS-with-tolerance ordering

Fixes #282

- val xs = l.map(BigDecimal(_))
- val ys = r.map(BigDecimal(_))
+ val xs = l.map(BigDecimal(_)).sorted
+ val ys = r.map(BigDecimal(_)).sorted
Contributor

Wondering whether this will be lazy, since ideally the sort should run only after the other fail-fast checks (move the sort to after the size checks?)

Contributor

Wait, I see: you need it for zipping.

@tarzanek
Contributor

Let's see whether the tests pass and Copilot has no comments; if so, we can merge. (I am still thinking about how to make the sorting lazy, ideally done after the size comparison.)
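One possible way to defer the sort until after the size comparison is sketched below. It is a standalone illustration under assumptions: the object and function names are hypothetical, and where the real validator performs its size check is not shown in this thread.

```scala
// Sketch only: hypothetical names, not the migrator's actual code.
object DeferredSortSketch {

  def numberSetsDiffer(l: Seq[String], r: Seq[String], tolerance: BigDecimal): Boolean =
    // Fail fast: when the sizes differ, nothing is parsed or sorted.
    l.size != r.size || {
      // Reached only for equally sized inputs, so the sort cost is paid
      // only when a member-by-member comparison is actually needed.
      val xs = l.map(BigDecimal(_)).sorted
      val ys = r.map(BigDecimal(_)).sorted
      xs.zip(ys).exists { case (x, y) => (x - y).abs > tolerance }
    }
}
```

Because `||` short-circuits, the block on the right-hand side runs only when the sizes match, which gives the "sort after the size check" ordering without needing `lazy val`.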

Contributor

Copilot AI left a comment

Pull request overview

This PR fixes a bug in the DynamoDB validator where set comparisons (SS, NS, BS) were incorrectly order-sensitive, causing false mismatch reports when sets contained the same elements in different orders. The fix makes set comparisons order-insensitive as required by DynamoDB's unordered set semantics.

Changes:

  • Modified NS (Number Set) comparison to sort values before comparing, while still applying floating-point tolerance
  • Added explicit SS (String Set) comparison using set equality instead of sequence equality
  • Added explicit BS (Binary Set) comparison using set equality instead of sequence equality

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| migrator/src/main/scala/com/scylladb/migrator/validation/RowComparisonFailure.scala | Updated areDifferent method to sort NS values before comparison and added explicit SS/BS cases using toSet for unordered comparison |
| tests/src/test/scala/com/scylladb/migrator/validation/DynamoDBRowComparisonTest.scala | Added 3 tests covering SS and NS with different orderings, including NS with tolerance |

Comment on lines +92 to +126
test("String sets with different order are equal") {
  val source = Map("foo" -> DdbValue.Ss(Seq("a", "b", "c")))
  val target = Map("foo" -> DdbValue.Ss(Seq("c", "a", "b")))
  val result = RowComparisonFailure.compareDynamoDBRows(
    source,
    Some(target),
    sameColumns,
    floatingPointTolerance
  )
  assertEquals(result, None)
}

test("Number sets with different order are equal") {
  val source = Map("foo" -> DdbValue.Ns(Seq("1", "2", "3")))
  val target = Map("foo" -> DdbValue.Ns(Seq("3", "1", "2")))
  val result = RowComparisonFailure.compareDynamoDBRows(
    source,
    Some(target),
    sameColumns,
    floatingPointTolerance
  )
  assertEquals(result, None)
}

test("Number sets with different order are equal within tolerance") {
  val source = Map("foo" -> DdbValue.Ns(Seq("1.001", "2.002")))
  val target = Map("foo" -> DdbValue.Ns(Seq("2.003", "1.002")))
  val result = RowComparisonFailure.compareDynamoDBRows(
    source,
    Some(target),
    sameColumns,
    floatingPointTolerance
  )
  assertEquals(result, None)
}

Copilot AI Feb 23, 2026

The PR adds explicit cases for SS and BS set comparisons using toSet, but only adds tests for SS (String sets). There is no test for BS (Binary sets) to verify that the new comparison logic works correctly for binary sets with different ordering. Consider adding a test similar to the SS test that verifies BS with different ordering are considered equal.
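A BS test is worth having partly because set equality over binary values depends on how the bytes are represented. The sketch below assumes, purely for illustration, that each binary value is an `Array[Byte]`; the payload type actually carried by `DdbValue.Bs` is not shown in this PR, and the object and function names are hypothetical. Scala arrays use reference equality, so a bare `toSet` on arrays would not compare contents.

```scala
// Sketch only: assumes binary values are Array[Byte], which may not match
// the real DdbValue.Bs payload type.
object BinarySetSketch {

  def binarySetsDiffer(l: Seq[Array[Byte]], r: Seq[Array[Byte]]): Boolean =
    // Convert each array to a Seq[Byte] first: Seq compares element by
    // element, whereas Array compares by reference.
    l.map(_.toSeq).toSet != r.map(_.toSeq).toSet
}
```

If the real payload type already has structural equality, the plain `toSet` comparison is sufficient and the conversion step is unnecessary.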

Comment on lines +92 to +126

Copilot AI Feb 23, 2026

The test suite only includes positive tests (sets with same elements in different order should be equal), but lacks negative tests to verify that sets with actually different elements are still correctly identified as different. Consider adding tests that verify, for example, that SS with ["a", "b", "c"] and ["a", "b", "d"] are correctly reported as different, and similarly for NS and BS.
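The negative cases suggested here could look something like the following. This is a self-contained sketch with hypothetical stand-in functions (`ssDiffer`, `nsDiffer`) rather than the validator's `compareDynamoDBRows`, and the tolerance value passed in is an assumption.

```scala
// Sketch only: stand-in functions, not the migrator's API.
object NegativeCasesSketch {

  def ssDiffer(l: Seq[String], r: Seq[String]): Boolean = l.toSet != r.toSet

  def nsDiffer(l: Seq[String], r: Seq[String], tol: BigDecimal): Boolean = {
    val xs = l.map(BigDecimal(_)).sorted
    val ys = r.map(BigDecimal(_)).sorted
    xs.size != ys.size || xs.zip(ys).exists { case (x, y) => (x - y).abs > tol }
  }

  // Each pair is a source and target that must be reported as different.
  val ssCases: Seq[(Seq[String], Seq[String])] = Seq(
    Seq("a", "b", "c") -> Seq("a", "b", "d"), // one member replaced
    Seq("a", "b") -> Seq("a", "b", "c")       // extra member in target
  )

  val nsCases: Seq[(Seq[String], Seq[String])] = Seq(
    Seq("1", "2", "3") -> Seq("1", "2", "4"), // one member outside tolerance
    Seq("1", "2") -> Seq("1", "2", "3")       // extra member in target
  )

  /** True when every negative case is flagged as a mismatch. */
  def allDetected(tol: BigDecimal): Boolean =
    ssCases.forall { case (l, r) => ssDiffer(l, r) } &&
      nsCases.forall { case (l, r) => nsDiffer(l, r, tol) }
}
```

The "extra member" cases matter for the sort-and-zip approach in particular: `zip` stops at the shorter sequence, so without a size check a trailing extra member would go unnoticed.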

@pradhankukiran
Author

@tarzanek addressed both Copilot review comments

@tarzanek
Contributor

tarzanek commented Mar 4, 2026

@pradhankukiran can you resolve conflicts please? they look trivial

Development

Successfully merging this pull request may close these issues.

DynamoDB set comparison in validator is order-sensitive (false mismatches)

3 participants