Fix order-sensitive DynamoDB set comparison in validator#283
Fix order-sensitive DynamoDB set comparison in validator#283pradhankukiran wants to merge 4 commits intoscylladb:masterfrom
Conversation
| val xs = l.map(BigDecimal(_)) | ||
| val ys = r.map(BigDecimal(_)) | ||
| val xs = l.map(BigDecimal(_)).sorted | ||
| val ys = r.map(BigDecimal(_)).sorted |
There was a problem hiding this comment.
wondering if this will be lazy, since you ideally want to run sorting only after other fail fast tests (move sort to after the size checks?)
There was a problem hiding this comment.
wait, I see, you need it for zipping
|
let's see if tests pass and copilot won't have any comments if we can merge (I am still thinking about how to make the sorting lazy and done ideally after size comparison) |
There was a problem hiding this comment.
Pull request overview
This PR fixes a bug in the DynamoDB validator where set comparisons (SS, NS, BS) were incorrectly order-sensitive, causing false mismatch reports when sets contained the same elements in different orders. The fix makes set comparisons order-insensitive as required by DynamoDB's unordered set semantics.
Changes:
- Modified NS (Number Set) comparison to sort values before comparing, while still applying floating-point tolerance
- Added explicit SS (String Set) comparison using set equality instead of sequence equality
- Added explicit BS (Binary Set) comparison using set equality instead of sequence equality
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| migrator/src/main/scala/com/scylladb/migrator/validation/RowComparisonFailure.scala | Updated areDifferent method to sort NS values before comparison and added explicit SS/BS cases using toSet for unordered comparison |
| tests/src/test/scala/com/scylladb/migrator/validation/DynamoDBRowComparisonTest.scala | Added 3 tests covering SS and NS with different orderings, including NS with tolerance |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| test("String sets with different order are equal") { | ||
| val source = Map("foo" -> DdbValue.Ss(Seq("a", "b", "c"))) | ||
| val target = Map("foo" -> DdbValue.Ss(Seq("c", "a", "b"))) | ||
| val result = RowComparisonFailure.compareDynamoDBRows( | ||
| source, | ||
| Some(target), | ||
| sameColumns, | ||
| floatingPointTolerance | ||
| ) | ||
| assertEquals(result, None) | ||
| } | ||
|
|
||
| test("Number sets with different order are equal") { | ||
| val source = Map("foo" -> DdbValue.Ns(Seq("1", "2", "3"))) | ||
| val target = Map("foo" -> DdbValue.Ns(Seq("3", "1", "2"))) | ||
| val result = RowComparisonFailure.compareDynamoDBRows( | ||
| source, | ||
| Some(target), | ||
| sameColumns, | ||
| floatingPointTolerance | ||
| ) | ||
| assertEquals(result, None) | ||
| } | ||
|
|
||
| test("Number sets with different order are equal within tolerance") { | ||
| val source = Map("foo" -> DdbValue.Ns(Seq("1.001", "2.002"))) | ||
| val target = Map("foo" -> DdbValue.Ns(Seq("2.003", "1.002"))) | ||
| val result = RowComparisonFailure.compareDynamoDBRows( | ||
| source, | ||
| Some(target), | ||
| sameColumns, | ||
| floatingPointTolerance | ||
| ) | ||
| assertEquals(result, None) | ||
| } |
There was a problem hiding this comment.
The PR adds explicit cases for SS and BS set comparisons using toSet, but only adds tests for SS (String sets). There is no test for BS (Binary sets) to verify that the new comparison logic works correctly for binary sets with different ordering. Consider adding a test similar to the SS test that verifies BS with different ordering are considered equal.
| test("String sets with different order are equal") { | ||
| val source = Map("foo" -> DdbValue.Ss(Seq("a", "b", "c"))) | ||
| val target = Map("foo" -> DdbValue.Ss(Seq("c", "a", "b"))) | ||
| val result = RowComparisonFailure.compareDynamoDBRows( | ||
| source, | ||
| Some(target), | ||
| sameColumns, | ||
| floatingPointTolerance | ||
| ) | ||
| assertEquals(result, None) | ||
| } | ||
|
|
||
| test("Number sets with different order are equal") { | ||
| val source = Map("foo" -> DdbValue.Ns(Seq("1", "2", "3"))) | ||
| val target = Map("foo" -> DdbValue.Ns(Seq("3", "1", "2"))) | ||
| val result = RowComparisonFailure.compareDynamoDBRows( | ||
| source, | ||
| Some(target), | ||
| sameColumns, | ||
| floatingPointTolerance | ||
| ) | ||
| assertEquals(result, None) | ||
| } | ||
|
|
||
| test("Number sets with different order are equal within tolerance") { | ||
| val source = Map("foo" -> DdbValue.Ns(Seq("1.001", "2.002"))) | ||
| val target = Map("foo" -> DdbValue.Ns(Seq("2.003", "1.002"))) | ||
| val result = RowComparisonFailure.compareDynamoDBRows( | ||
| source, | ||
| Some(target), | ||
| sameColumns, | ||
| floatingPointTolerance | ||
| ) | ||
| assertEquals(result, None) | ||
| } |
There was a problem hiding this comment.
The test suite only includes positive tests (sets with same elements in different order should be equal), but lacks negative tests to verify that sets with actually different elements are still correctly identified as different. Consider adding tests that verify, for example, that SS with ["a", "b", "c"] and ["a", "b", "d"] are correctly reported as different, and similarly for NS and BS.
|
@tarzanek addressed both Copilot review comments |
|
@pradhankukiran can you resolve conflicts please? they look trivial |
Summary
Test plan
Fixes #282