Commit 7894842

pravinbhat and msmygit authored
Fix auto column mapping bug when target has more columns than origin (#375)
* Convert nulls on origin to unset at target by a new property
* Undo the option to allow null values as they will be eliminated during compaction. It provides no value other than slowing migration & target cluster. Also, doing a strict check for String (which includes all C* text types) types with empty values
* Added release notes and updated readme
* Added missing license
* Defensive code to avoid UNSET exceptions
* Handle bind counters correctly when target has more columns than origin
* Added tests for auto column mapping test & DiffData bug-fix
* Added release notes

---------

Co-authored-by: Madhavan Sridharan <[email protected]>
1 parent f12c122 commit 7894842

17 files changed, +180 -19 lines changed

RELEASE.md

Lines changed: 3 additions & 0 deletions
@@ -1,5 +1,8 @@
 # Release Notes

+## [5.4.1] - 2025-06-26
+- Bug fix: Fixed auto column mapping bug when `target` table has more columns than `origin`.
+
 ## [5.4.0] - 2025-06-16
 - Use `UNSET` value for null fields (including empty texts) to avoid creating (or carrying forward) tombstones during row creation.

Lines changed: 23 additions & 0 deletions
@@ -0,0 +1,23 @@
+/*
+Licensed under the Apache License, Version 2.0 (the "License"); you
+may not use this file except in compliance with the License.
+You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+*/
+
+DELETE FROM target.map_columns WHERE key_a=2;
+UPDATE target.map_columns SET val_a='valueD' WHERE key_a=3;
+
+INSERT INTO origin.map_columns(key_a, key_b, val_a, val_b) VALUES (1, 'key1','valueA', 21);
+INSERT INTO origin.map_columns(key_a, key_b, val_a, val_b) VALUES (2, 'key2','valueB', 22);
+INSERT INTO origin.map_columns(key_a, key_b, val_a, val_b) VALUES (3, 'key3','valueC', 23);
+
+SELECT * FROM target.map_columns;
+
Lines changed: 8 additions & 0 deletions
@@ -0,0 +1,8 @@
+Read Record Count: 3
+Mismatch Record Count: 1
+Corrected Mismatch Record Count: 1
+Missing Record Count: 1
+Corrected Missing Record Count: 1
+Valid Record Count: 1
+Skipped Record Count: 0
+Error Record Count: 0
Lines changed: 4 additions & 0 deletions
@@ -0,0 +1,4 @@
+Read Record Count: 3
+Write Record Count: 3
+Skipped Record Count: 0
+Error Record Count: 0
Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
+migrateData com.datastax.cdm.job.Migrate migrate.properties
+validateData com.datastax.cdm.job.DiffData migrate.properties
+fixData com.datastax.cdm.job.DiffData fix.properties
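
The three-line listing above reads as a scenario table: a scenario name, the job class to run, and the properties file to use. As a minimal sketch of how such a table could be queried (the function name `lookup_scenario` is hypothetical, and this is not how `/local/cdm.sh` is known to parse the file), a bash helper might look like this:

```shell
#!/bin/bash
# Hypothetical helper, not part of the commit: prints "<job class> <properties file>"
# for the given scenario name and returns non-zero if the scenario is absent.
lookup_scenario() {
  local file="$1" scenario="$2"
  awk -v s="$scenario" '
    $1 == s { print $2, $3; found = 1 }   # whitespace-separated: name, class, properties
    END     { exit (found ? 0 : 1) }
  ' "$file"
}
```

Called as `lookup_scenario cdm.txt fixData`, it prints the class and properties file from the matching line, assuming the scenario file keeps one whitespace-separated entry per line.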
Lines changed: 8 additions & 0 deletions
@@ -0,0 +1,8 @@
+Read Record Count: 3
+Mismatch Record Count: 0
+Corrected Mismatch Record Count: 0
+Missing Record Count: 0
+Corrected Missing Record Count: 0
+Valid Record Count: 3
+Skipped Record Count: 0
+Error Record Count: 0
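
In both eight-line count listings in this diff, the Mismatch, Missing, Valid, Skipped, and Error counts add up to the Read count (1 + 1 + 1 + 0 + 0 = 3 and 0 + 0 + 3 + 0 + 0 = 3). That is an observation about these fixtures, not a documented guarantee of the tool. A minimal bash sketch for sanity-checking a new assert file on that assumption follows; the function name `check_counts` is hypothetical and is not part of the commit:

```shell
#!/bin/bash
# Hypothetical helper, not part of the commit: returns 0 when the outcome counts
# in a DiffData-style count file sum to its "Read Record Count".
# The "Corrected ..." lines are ignored because they restate records already counted.
check_counts() {
  awk -F': ' '
    $1 == "Read Record Count" { total = $2; next }
    $1 == "Mismatch Record Count" || $1 == "Missing Record Count" ||
    $1 == "Valid Record Count"    || $1 == "Skipped Record Count" ||
    $1 == "Error Record Count" { sum += $2 }
    END { exit (total == sum ? 0 : 1) }
  ' "$1"
}
```

The check does not apply to the four-line migrate listing, which reports a Write count instead of the Mismatch, Missing, and Valid breakdown.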
Lines changed: 29 additions & 0 deletions
@@ -0,0 +1,29 @@
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+#!/bin/bash -e
+
+workingDir="$1"
+cd "$workingDir"
+
+/local/cdm.sh -f cdm.txt -s migrateData -d "$workingDir" > cdm.migrateData.out 2>cdm.migrateData.err
+/local/cdm-assert.sh -f cdm.migrateData.out -a cdm.migrateData.assert -d "$workingDir"
+
+/local/cdm.sh -f cdm.txt -s validateData -d "$workingDir" > cdm.validateData.out 2>cdm.validateData.err
+/local/cdm-assert.sh -f cdm.validateData.out -a cdm.validateData.assert -d "$workingDir"
+
+cqlsh -u $CASS_USERNAME -p $CASS_PASSWORD $CASS_CLUSTER -f $workingDir/breakData.cql > $workingDir/breakData.out 2> $workingDir/breakData.err
+
+/local/cdm.sh -f cdm.txt -s fixData -d "$workingDir" > cdm.fixData.out 2>cdm.fixData.err
+/local/cdm-assert.sh -f cdm.fixData.out -a cdm.fixData.assert -d "$workingDir"
Lines changed: 15 additions & 0 deletions
@@ -0,0 +1,15 @@
+/*
+Licensed under the Apache License, Version 2.0 (the "License"); you
+may not use this file except in compliance with the License.
+You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+*/
+
+SELECT * FROM target.map_columns;
Lines changed: 8 additions & 0 deletions
@@ -0,0 +1,8 @@
+
+key_a | val_a | val_c
+-------+--------+-------
+1 | valueA | null
+2 | valueB | null
+3 | valueC | null
+
+(3 rows)
Lines changed: 23 additions & 0 deletions
@@ -0,0 +1,23 @@
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+spark.cdm.connect.origin.host cdm-sit-cass
+spark.cdm.connect.target.host cdm-sit-cass
+
+spark.cdm.schema.origin.keyspaceTable origin.map_columns
+spark.cdm.schema.target.keyspaceTable target.map_columns
+spark.cdm.perfops.numParts 1
+
+spark.cdm.autocorrect.missing true
+spark.cdm.autocorrect.mismatch true
