cloud: Premium supports data migration #22821

alastori wants to merge 9 commits into pingcap:release-8.5
Conversation
Add a new Public Preview guide for using the Data Migration feature on TiDB Cloud Premium, plus the corresponding entry in the Premium TOC. Mirrors the structure of premium-export.md.
- Update wizard structure to 4 steps (add Precheck as Step 3).
- Tighten Job Name constraints language to match wizard helper text.
- Note that Private Link is in development and not yet generally available.

Verified against the Premium DM proto enums and the dev wizard text; the prod release tag does not yet include the Private Link backend support, so the doc deliberately documents Public-only connectivity.
The 60-second safe-mode behavior is implemented in the legacy DM stack (used by Dedicated and Essential) and does not apply to the Premium DM service. Verified via dataflow-service-ng/app/models/premium_dm/, which contains no safe-mode references.
Verified the complete wizard flow against the dev environment with a real MySQL source connection. Several corrections:

- Step 2 has two controls under Migration Type: "Migration process" (Full + Incremental / Incremental only) and "Existing data migration mode" (Logical default / Physical). Document both.
- Object selection is an All / Customize toggle, with Customize revealing a transfer-list pattern between source and selected.
- Step 3 is named "Pre-check" (hyphenated) in the UI; "Check Again" re-runs; warnings can be ignored via a confirmation dialog.
- Mode label is "Incremental only", not "Incremental Data Only".
- Step 4 review shows three sections: Job Configuration, Source Connection Profile, Target Connection Profile.
- PROCESS privilege is also recommended; pre-check warns when missing.
Safe mode is implemented in the tiflow DM kernel (used by Premium DM via the agent layer), not in the cloud control plane. The earlier removal was based on a search of the dataflow-service repo only, which is incomplete. Restoring the 60-second safe-mode note so the Premium doc matches the underlying replication engine behavior.
Customers reading the new Premium DM guide cross-reference the canonical Cloud DM doc for binary-log setup, privileges, and limitations. Without Premium variants in the canonical doc, those links would either render Dedicated-default content or leave tier placeholders blank. Changes:

- TOC-tidb-cloud-premium.md: add the canonical and incremental-only Cloud DM docs as siblings of premium-data-migration.md so Premium customers can navigate to them.
- tidb-cloud/migrate-from-mysql-using-data-migration.md: add Premium tier to all inline tier-name placeholders, plus three new Premium variant blocks: Public Preview note, supported sources matrix, and the Physical / Logical mode discussion (including PITR / changefeed and concurrent-job caveats for physical mode).
- tidb-cloud/migrate-incremental-data-from-mysql-using-data-migration.md: add Premium tier to all inline tier-name placeholders.
- tidb-cloud/premium/premium-data-migration.md: add the two physical-mode caveats (PITR / changefeed; concurrent-job limit) inline so they are visible in the Premium-tier overview without requiring readers to click through.

The Dedicated and Essential renderings of all three docs are unchanged.
[APPROVALNOTIFIER] This PR is NOT APPROVED.
cc @Oreoxmt
The canonical Cloud DM doc anchors are:

- "grant-required-privileges-to-the-migration-user-in-the-source-mysql-database" (note "source-mysql", not just "source")
- "grant-required-privileges-for-migration" (the parent ### section; the target-side #### heading uses CustomContent variants and the rendered anchor is not stable, so link to the parent instead)

Detected by the internal-links-anchors CI job on PR pingcap#22821.
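The anchor strings above follow the usual GitHub-style heading slugification (lowercase, punctuation stripped, whitespace collapsed to hyphens). A minimal sketch of that convention, assuming the docs site uses a comparable slugger (the exact algorithm of the rendering pipeline is not shown in this PR):

```python
import re

def slugify(heading: str) -> str:
    """Approximate GitHub-style anchor generation for a Markdown heading."""
    s = heading.strip().lower()
    s = re.sub(r"[^\w\s-]", "", s)  # drop punctuation, keep word chars/spaces/hyphens
    s = re.sub(r"\s+", "-", s)      # collapse whitespace runs into single hyphens
    return s

print(slugify("Grant required privileges to the migration user in the source MySQL database"))
# grant-required-privileges-to-the-migration-user-in-the-source-mysql-database
```

This is why the anchor keeps "source-mysql": the hyphen comes from the space between "source" and "MySQL" in the heading text.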
Code Review
This pull request introduces documentation for migrating data to TiDB Cloud Premium using the Data Migration feature, including a new guide and updates to existing migration docs to incorporate Premium-specific details like logical and physical migration modes. The review feedback focuses on style guide adherence, specifically recommending the removal of passive voice, ensuring consistent terminology, using backticks for SQL keywords, and correcting minor grammatical and tense issues.
<CustomContent plan="premium">

- For {{{ .premium }}}, both logical mode (default) and physical mode are supported. Logical mode exports rows as SQL statements and replays them on the target instance, consuming Request Capacity Units (RCUs) on the target during the load. Physical mode uses `IMPORT INTO` on the target instance and is recommended for large datasets where load throughput and cost are priorities.
- When you use physical mode and the migration job has started, do **NOT** enable PITR (Point-in-time Recovery) or have any changefeed on the {{{ .premium }}} instance. Otherwise, the migration job will be stuck. If you need to enable PITR or have any changefeed, use logical mode instead to migrate data.
Avoid using passive voice. State the subject clearly.
Suggested change:

Before: - When you use physical mode and the migration job has started, do **NOT** enable PITR (Point-in-time Recovery) or have any changefeed on the {{{ .premium }}} instance. Otherwise, the migration job will be stuck. If you need to enable PITR or have any changefeed, use logical mode instead to migrate data.
After: - When you use physical mode and the migration job has started, do **NOT** enable PITR (Point-in-time Recovery) or have any changefeed on the {{{ .premium }}} instance. Otherwise, the migration job stops. If you need to enable PITR or have any changefeed, use logical mode instead to migrate data.
References
- Avoid passive voice overuse. (link)
> **Note:**
>
> The Data Migration feature for {{{ .premium }}} is currently in Public Preview. During Public Preview, the source database must be reachable over a public network endpoint, and the source connection cannot be reused across migration jobs. For details, see [Limitations](#limitations).
Avoid using passive voice. State the subject clearly.
Suggested change:

Before: > The Data Migration feature for {{{ .premium }}} is currently in Public Preview. During Public Preview, the source database must be reachable over a public network endpoint, and the source connection cannot be reused across migration jobs. For details, see [Limitations](#limitations).
After: > The Data Migration feature for {{{ .premium }}} is currently in Public Preview. During Public Preview, the source database must be reachable over a public network endpoint, and you cannot reuse the source connection across migration jobs. For details, see [Limitations](#limitations).
References
- Avoid passive voice overuse. (link)
When you use physical mode, the following limitations apply:

- After the migration job has started, do **NOT** enable PITR (Point-in-time Recovery) or have any changefeed on the {{{ .premium }}} instance. Otherwise, the migration job will be stuck. If you need to enable PITR or have any changefeed, use logical mode instead.
Avoid using passive voice. State the subject clearly.
Suggested change:

Before: - After the migration job has started, do **NOT** enable PITR (Point-in-time Recovery) or have any changefeed on the {{{ .premium }}} instance. Otherwise, the migration job will be stuck. If you need to enable PITR or have any changefeed, use logical mode instead.
After: - After the migration job has started, do **NOT** enable PITR (Point-in-time Recovery) or have any changefeed on the {{{ .premium }}} instance. Otherwise, the migration job stops. If you need to enable PITR or have any changefeed, use logical mode instead.
References
- Avoid passive voice overuse. (link)
### General limitations

- The system databases `mysql`, `information_schema`, `performance_schema`, and `sys` are filtered out and not migrated, even if you select all databases.
- During existing data migration, if the target database already contains the table to be migrated and there are duplicate keys, the rows with duplicate keys are replaced.
Avoid using passive voice. State the subject clearly.
Suggested change:

Before: - During existing data migration, if the target database already contains the table to be migrated and there are duplicate keys, the rows with duplicate keys are replaced.
After: - During existing data migration, if the target database already contains the table to be migrated and there are duplicate keys, TiDB Cloud replaces the rows with duplicate keys.
References
- Avoid passive voice overuse. (link)
- The system databases `mysql`, `information_schema`, `performance_schema`, and `sys` are filtered out and not migrated, even if you select all databases.
- During existing data migration, if the target database already contains the table to be migrated and there are duplicate keys, the rows with duplicate keys are replaced.
- During incremental data migration, if a migration job recovers from an abrupt error, it might enter safe mode for 60 seconds. During safe mode, `INSERT` statements are migrated as `REPLACE`, and `UPDATE` statements as `DELETE` and `REPLACE`. For source tables without primary keys or non-null unique indexes, this can result in duplicated rows in the target instance.
Use backticks for SQL keywords and avoid passive voice.
Suggested change:

Before: - During incremental data migration, if a migration job recovers from an abrupt error, it might enter safe mode for 60 seconds. During safe mode, `INSERT` statements are migrated as `REPLACE`, and `UPDATE` statements as `DELETE` and `REPLACE`. For source tables without primary keys or non-null unique indexes, this can result in duplicated rows in the target instance.
After: - During incremental data migration, if a migration job recovers from an abrupt error, it might enter safe mode for 60 seconds. During safe mode, TiDB Cloud migrates `INSERT` statements as `REPLACE`, and `UPDATE` statements as `DELETE` and `REPLACE`. For source tables without primary keys or non-null unique indexes, this can result in duplicated rows in the target instance.
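The safe-mode rewriting described in this limitation can be sketched as a small statement transformer. This is an illustration of the documented behavior only, not the tiflow DM implementation; the event shape and function name are hypothetical:

```python
def rewrite_in_safe_mode(event: dict) -> list[str]:
    """Sketch of DM safe-mode rewriting: an INSERT becomes a REPLACE, and
    an UPDATE becomes a DELETE of the old row plus a REPLACE of the new row."""
    table = event["table"]
    if event["type"] == "insert":
        cols = ", ".join(event["row"])
        vals = ", ".join(repr(v) for v in event["row"].values())
        return [f"REPLACE INTO {table} ({cols}) VALUES ({vals});"]
    if event["type"] == "update":
        where = " AND ".join(f"{c} = {v!r}" for c, v in event["old"].items())
        cols = ", ".join(event["new"])
        vals = ", ".join(repr(v) for v in event["new"].values())
        return [
            f"DELETE FROM {table} WHERE {where};",
            f"REPLACE INTO {table} ({cols}) VALUES ({vals});",
        ]
    return []

print(rewrite_in_safe_mode({"type": "insert", "table": "t", "row": {"id": 1, "v": "a"}}))
# ["REPLACE INTO t (id, v) VALUES (1, 'a');"]
```

Because `REPLACE` deduplicates only through a primary key or unique index, replaying these rewritten statements against a table that has neither can leave duplicated rows, which is exactly the caveat the limitation calls out.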
4. On the **Configure source and target connection** step, enter the following information:

    - **Job Name**: a name for the migration job. The default value is `migration_job_{timestamp}`. The name must start with a letter, can contain letters, numbers, underscores (`_`), and hyphens (`-`), and must be less than 60 characters.
Use 'fewer' for countable items and prefer present tense.
Suggested change:

Before: - **Job Name**: a name for the migration job. The default value is `migration_job_{timestamp}`. The name must start with a letter, can contain letters, numbers, underscores (`_`), and hyphens (`-`), and must be less than 60 characters.
After: - **Job Name**: a name for the migration job. The default value is `migration_job_{timestamp}`. The name must start with a letter, contains letters, numbers, underscores (`_`), and hyphens (`-`), and must be fewer than 60 characters.
References
- Prefer present tense unless describing historical behavior. (link)
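The Job Name constraint quoted in this thread maps naturally onto a single pattern check. A sketch under stated assumptions: the wizard's actual validation code is not shown in this PR, so the regex below is inferred from the helper text ("starts with a letter; letters, digits, underscores, hyphens; less than 60 characters"):

```python
import re

# Assumed constraint: first char is a letter, then up to 58 more characters
# drawn from letters, digits, underscore, and hyphen (total length < 60).
JOB_NAME_RE = re.compile(r"^[A-Za-z][A-Za-z0-9_-]{0,58}$")

def is_valid_job_name(name: str) -> bool:
    """Check a migration job name against the inferred wizard constraint."""
    return JOB_NAME_RE.fullmatch(name) is not None

print(is_valid_job_name("migration_job_20240101"))  # True
print(is_valid_job_name("1-bad-start"))             # False: must start with a letter
```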
In the **Select Objects to Migrate** section, choose:

- **All** (default): migrate every database and table on the source. The system databases (`mysql`, `information_schema`, `performance_schema`, `sys`) are excluded automatically.
Avoid using passive voice. State the subject clearly.
Suggested change:

Before: - **All** (default): migrate every database and table on the source. The system databases (`mysql`, `information_schema`, `performance_schema`, `sys`) are excluded automatically.
After: - **All** (default): migrate every database and table on the source. TiDB Cloud automatically excludes the system databases (`mysql`, `information_schema`, `performance_schema`, `sys`).
References
- Avoid passive voice overuse. (link)
### Step 3: Pre-check

The console runs the pre-check against the source database, network connectivity, and the target {{{ .premium }}} instance. The progress bar shows **Running {percentage}%** while checks execute, and **Finished 100%** when complete. The summary line reports total items, completed, passed, with warning, and failed.
Grammar correction: 'with warnings' instead of 'with warning'.
Suggested change:

Before: The console runs the pre-check against the source database, network connectivity, and the target {{{ .premium }}} instance. The progress bar shows **Running {percentage}%** while checks execute, and **Finished 100%** when complete. The summary line reports total items, completed, passed, with warning, and failed.
After: The console runs the pre-check against the source database, network connectivity, and the target {{{ .premium }}} instance. The progress bar shows **Running {percentage}%** while checks execute, and **Finished 100%** when complete. The summary line reports the total number of items, including those that are completed, passed, with warnings, or failed.
References
- Correct English grammar, spelling, and punctuation mistakes, if any. (link)
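The summary line under discussion amounts to simple tallies over the pre-check items. A hypothetical sketch, using the 11-item pre-check mentioned in this PR's verification notes; the item states and function are assumptions, not the console's actual data model:

```python
from collections import Counter

def precheck_summary(items: list[str]) -> dict[str, int]:
    """Tally hypothetical pre-check item states ('passed', 'warning',
    'failed', 'running') into the counts shown on the summary line."""
    counts = Counter(items)
    completed = sum(n for state, n in counts.items() if state != "running")
    return {
        "total": len(items),
        "completed": completed,
        "passed": counts["passed"],
        "with_warnings": counts["warning"],
        "failed": counts["failed"],
    }

print(precheck_summary(["passed"] * 9 + ["warning", "failed"]))
# {'total': 11, 'completed': 11, 'passed': 9, 'with_warnings': 1, 'failed': 1}
```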
The review page shows three sections summarizing the migration job:

- **Job Configuration**: job name and migration type.
- **Source Connection Profile**: data source, host, port, connectivity method, username, SSL/TLS status, selected objects, and import mode.
Terminology consistency: use 'existing data migration mode' as defined earlier in the document.
Suggested change:

Before: - **Source Connection Profile**: data source, host, port, connectivity method, username, SSL/TLS status, selected objects, and import mode.
After: - **Source Connection Profile**: data source, host, port, connectivity method, username, SSL/TLS status, selected objects, and existing data migration mode.
References
- Use consistent terminology. (link)
Apply 7 of 9 Gemini suggestions on PR pingcap#22821, all marked low priority and aligned with the pingcap/docs style guide:

- Active voice: replace "the source connection cannot be reused" with "you cannot reuse the source connection".
- Active voice: replace "rows ... are replaced" with "TiDB Cloud replaces the rows" in the existing-data limitation.
- Active voice and subject clarity: replace "INSERT statements are migrated as ..." with "TiDB Cloud migrates INSERT statements as ...".
- Active voice: replace "the migration job will be stuck" with "the migration job stops" (Premium DM doc and canonical Cloud DM doc).
- Active voice and subject clarity: replace "system databases ... are excluded automatically" with "TiDB Cloud automatically excludes the system databases".
- Grammar: "with warning" -> "with warnings"; rephrase the pre-check summary line for clarity.
- Terminology consistency: in the Step 4 review section, replace "import mode" with "the existing data migration mode (shown as Import Mode on the review page)" to bridge the wizard's two labels for the same concept.

Skipped: the suggestion to use "fewer than 60 characters" / "contains letters" instead of "less than 60 characters" / "can contain letters" is intentionally rejected; the current wording mirrors the wizard's helper text verbatim.
End-to-end wizard verification on the dev cluster created a real migration job (id dmtskc3frek3p5fhy7ixu6wpj7cy2r4) and inspected the post-creation experience:

- The Job Detail page does not expose action buttons (just Summary and Progress panels).
- The list-page actions menu (the "..." button at the end of each row) shows different items based on job status. While the job is in the Creating state, only View and Delete are visible. Pause and Resume become available once the job reaches a running or paused state.

The doc previously implied Pause/Resume/Delete were always available from the detail page or the list. Replaced with status-aware phrasing and noted the Creating-state subset explicitly. The dev cluster job remained in Creating for 9+ minutes without transitioning, matching the March AS-IS report KI-5 (a dev infrastructure issue, not a feature gap), so Pause/Resume behavior was confirmed via the API surface (PausePremiumMigration / ResumePremiumMigration RPCs in the proto) rather than the UI.
/cc @Oreoxmt

/assign
@alastori I suggest removing tidb-cloud/premium/premium-data-migration.md, or reducing it to a short overview page only. Detailed supported source databases, prerequisites, and migration steps are already covered in tidb-cloud/migrate-from-mysql-using-data-migration.md and tidb-cloud/migrate-incremental-data-from-mysql-using-data-migration.md.
What is changed, added or deleted?
This PR adds documentation for the Data Migration feature on TiDB Cloud Premium, which is launching in Public Preview, and extends the canonical Cloud DM docs to render correctly for Premium readers.
New file:

- tidb-cloud/premium/premium-data-migration.md: Premium-tier overview that mirrors the structure of premium-export.md. Covers the 4-step wizard (Configure source and target connection, Choose objects to be migrated, Pre-check, Review and start migration), Public Preview limitations, the Logical / Physical existing-data-migration mode choice (with PITR / changefeed and concurrent-job caveats for physical mode), supported source databases, prerequisites, privileges (including PROCESS), and post-creation job management (Pause / Resume / Delete).

Modified files:

- TOC-tidb-cloud-premium.md: adds the new Premium-tier overview, plus the canonical Cloud DM doc and the incremental-only Cloud DM doc as siblings, so Premium customers can reach the detailed reference content from the Premium navigation.
- tidb-cloud/migrate-from-mysql-using-data-migration.md: adds Premium tier rendering: 16 inline tier-name placeholders gain a Premium variant, plus three new Premium-variant content blocks (Public Preview note, supported sources matrix, and the Physical / Logical existing-data-migration mode discussion including the physical-mode caveats).
- tidb-cloud/migrate-incremental-data-from-mysql-using-data-migration.md: adds Premium tier-name substitutions so the incremental-only guide renders cleanly when accessed from the Premium TOC.

The Dedicated and Essential renderings of all three docs are unchanged; the Premium additions are purely additive `<CustomContent plan="premium">` blocks alongside the existing Dedicated and Essential blocks.

Verification
The Premium-tier overview was verified end-to-end against the dev environment with a real MySQL source connection, walking all four wizard steps (Configure → Choose objects → Pre-check → Review), confirming wizard text labels, dropdown options, and behavioral details (such as the 11-item pre-check, the Pre-check warnings dialog, and the Logical-default existing-data-migration mode). Wizard-text drift between Premium and the canonical doc was reconciled in this PR (for example: "Pre-check" hyphenated, "Check Again" instead of "Recheck", "Incremental only" instead of "Incremental Data Only").
Which TiDB version(s) do your changes apply to?
What is the related PR or file when changing an API or RFC?
N/A — documentation only.
Do your changes match any of the following descriptions?
Adds `<CustomContent plan="premium">` blocks in existing docs.