docs/en/cdc/resume.md (0 additions & 1 deletion)

When starting a task, you can, based on the configuration, retrieve the position reached by the previous run and continue from there instead of starting from scratch.

docs/en/snapshot/check.md (42 additions & 30 deletions)

# Data Check
After data migration, you may want to compare the source and target data row by row and column by column. If the data volume is too large, you can perform a sampled check. Please ensure that the tables to be checked have primary keys/unique keys.
Supports comparison for MySQL, PostgreSQL, and MongoDB.
Data check can be used with both snapshot and CDC tasks. For CDC tasks, keep `[checker]` enabled and set `extract_type=cdc`; the checker validates applied changes after they are written to the target.
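
A minimal sketch of such a task configuration might look like the following (assuming `extract_type` belongs to the `[extractor]` section, consistent with the `[extractor]` examples below; the concrete `[checker]` options for your task come from [config.md](../config.md)):

```
[extractor]
# extract ongoing changes instead of a snapshot
extract_type=cdc

[checker]
# keep the checker enabled so changes are validated after being written to the target
```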
## Example: MySQL -> MySQL
Refer to [task templates](../../templates/mysql_to_mysql.md) and [tutorial](../tutorial/mysql_to_mysql.md)
### Sampling Check
In the full check configuration, add `sample_interval` to the `[extractor]` section. For example, setting `sample_interval=3` checks one record out of every 3.
```
[extractor]
sample_interval=3
```
## Limitations
- Data check is source-driven (validates Source ∈ Target) and cannot detect extra rows that exist only in the target. To catch such cases, consider setting up a [Reverse Check](#reverse-check) by swapping extractor and checker configurations.
- In CDC + Check scenarios, the checker validates DELETE events: it queries the target...
# Check Results
The check results are written to the log in JSON format, including diff.log, miss.log, sql.log, and summary.log. The logs are stored in the `log/check` subdirectory.
## Difference Log (diff.log)
Difference logs include database (schema), table (tb), primary key/unique key (id_col_values)...
When the source and target types are different (such as Int32 vs Int64, or None vs Short), `src_type`/`dst_type` will appear under the corresponding column, clearly marking the type inconsistency. MongoDB also applies this rule, and the difference log will output the BSON type name.
Only when the router renames the schema or table will the log include `target_schema`/`target_tb` to identify the real destination table. `schema` and `tb` still represent the source, facilitating troubleshooting.
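
For illustration only, a hypothetical diff.log entry (the schema, table, and column names are made up, and the exact field layout may differ from real output) could look like:

```json
{"schema":"test_db","tb":"tb_1","id_col_values":{"id":"1"},"diff_col_values":{"value":{"src":"1","dst":"2","src_type":"Int32","dst_type":"Int64"}}}
```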
## Missing Log (miss.log)
Missing logs include database (schema), table (tb), and primary/unique key (id_col_values). Since missing records do not have difference columns, `diff_col_values` will not be output.
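
A hypothetical miss.log entry (illustrative values only) would then carry just the key fields:

```json
{"schema":"test_db","tb":"tb_1","id_col_values":{"id":"5"}}
```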
## Output Full Row
When you need full row content for troubleshooting, you can enable full row logging in `[checker]`:
```
[checker]
output_full_row=true
```
After enabling, all diff.log entries will append `src_row` and `dst_row`, and miss.log entries will append `src_row` (currently only supports MySQL/PG/Mongo; Redis is not supported yet). Example:
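
(The entry below is illustrative; the schema, table, and values are assumptions rather than output from a real run.)

```json
{
  "schema": "test_db",
  "tb": "tb_1",
  "id_col_values": {"id": "1"},
  "diff_col_values": {"value": {"src": "1", "dst": "2"}},
  "src_row": {"id": "1", "value": "1"},
  "dst_row": {"id": "1", "value": "2"}
}
```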
## Output Revise SQL
If you need to manually repair inconsistent data, you can enable SQL output in `[checker]`:
```
[checker]
output_revise_sql=true
revise_match_full_row=true
```
After enabling, `INSERT` statements for missing records and `UPDATE` statements for differing records will be written to `sql.log`.
When `revise_match_full_row=true`, the entire row data is used to generate the WHERE condition even if the table has a primary key, so that the target row is located by matching all column values.
If the router does not rename the schema or table, `target_schema`/`target_tb` will not appear in the log. These two fields are only needed to determine the destination table when routing renames are configured.
The generated SQL uses the real destination schema/table and can be executed directly at the target. When routing renames are configured, refer to `target_schema`/`target_tb` to determine the final target object.
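
As a rough illustration (table and column names are assumptions, and real sql.log entries may be formatted differently), the revise SQL could look like:

```sql
-- INSERT for a record missing in the target
INSERT INTO `test_db`.`tb_1` (`id`, `value`) VALUES (5, 5);
-- UPDATE for a differing record; with revise_match_full_row=true the WHERE clause matches the full row
UPDATE `test_db`.`tb_1` SET `value` = 1 WHERE `id` = 1 AND `value` = 2;
```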
# Reverse Check
Data check is source-driven and only verifies that source rows exist in the target. To detect extra rows in the target that do not exist in the source, set up a reverse check by swapping the `[extractor]` and `[checker]` target configurations:
```
# Original: source=A, target=B
# Reverse: source=B, target=A
[extractor]
url=<original checker url>
[checker]
url=<original extractor url>
```
# Configuration
See [config.md](../config.md) for the full `[checker]` configuration list and target selection rules.
When `max_retries > 0`, the checker automatically retries on inconsistency:
- Detailed miss/diff logs are only written on the final check
- Useful when target data synchronization is not yet complete
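
For example, to allow a few automatic retries while the target catches up (assuming `max_retries` is a `[checker]` option; the value here is illustrative):

```
[checker]
max_retries=3
```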
## Router
Supports the `[router]` configuration section. Refer to [config details](../config.md) for details.
## Integration Test References
Refer to `task_config.ini` of each type of integration test:

- dt-tests/tests/mysql_to_mysql/check
- dt-tests/tests/pg_to_pg/check
- dt-tests/tests/mongo_to_mongo/check

docs/en/structure/check.md (12 additions & 12 deletions)

# Structure Check
After structure migration, you can choose from two verification methods. One is the built-in checker provided by ape-dts, and the other is an open-source tool called [Liquibase](./check_by_liquibase.md). This document focuses on the built-in checker.
Structure check is independent of CDC. "CDC + checker" refers to row-level data check (see [data check docs](../snapshot/check.md)).
## Example: MySQL -> MySQL
Refer to [task templates](../../templates/mysql_to_mysql.md)
# Results
Based on the source structures, the check results include **miss**, **diff**, and **summary**, all presented in JSON format.
`miss.log` and `diff.log` use the same JSON structure (`StructCheckLog`):
```json
{
"key": "type.schema.table", // e.g., table.db_name.tb_name or index.db.tb.idx
"src_sql": "CREATE TABLE `table_name` (id INT PRIMARY KEY)", // appears in miss/diff
"dst_sql": "CREATE TABLE `table_name` (id INT PRIMARY KEY)"// appears in diff only
}
```
- `miss.log` (present in source but missing in target)
```json
{"key":"table.struct_check_test_1.not_match_miss","src_sql":"CREATE TABLE IF NOT EXISTS `not_match_miss` (`id` int NOT NULL PRIMARY KEY)"}
{"key":"index.struct_check_test_1.not_match_index.i6_miss","src_sql":"CREATE INDEX `i6_miss` ON `not_match_index` (`col6`)"}
```
- `diff.log` (present in both but different; contains both src_sql and dst_sql)
```json
{"key":"index.struct_check_test_1.not_match_index","src_sql":"ALTER TABLE `not_match_index` ADD INDEX `idx_v1` (`col1`)","dst_sql":"ALTER TABLE `not_match_index` ADD INDEX `idx_v2` (`col1`)"}
```