@@ -74,6 +74,9 @@ See [contributing instructions](CONTRIBUTING.md) to help improve this project.
7474- Databricks Workspace has to have network access to [ pypi.org] ( https://pypi.org ) to download ` databricks-sdk ` and ` pyyaml ` packages.
7575- A PRO or Serverless SQL Warehouse to render the [ report] ( docs/assessment.md ) for the [ assessment workflow] ( #assessment-workflow ) .
7676
77+ Once you [ install UCX] ( #install-ucx ) , you can proceed to the [ assessment workflow] ( #assessment-workflow ) to ensure
78+ the compatibility of your workspace with Unity Catalog.
79+
7780[[ back to top] ( #databricks-labs-ucx )]
7881
7982## Authenticate Databricks CLI
@@ -103,8 +106,9 @@ Install UCX via Databricks CLI:
103106databricks labs install ucx
104107```
105108
106- You'll be prompted to select a [ configuration profile] ( https://docs.databricks.com/en/dev-tools/auth.html#databricks-client-unified-authentication )
107- created by ` databricks auth login ` command.
109+ You'll be prompted to select a [ configuration profile] ( https://docs.databricks.com/en/dev-tools/auth.html#databricks-client-unified-authentication ) created by the ` databricks auth login ` command.
110+
111+ Once you install UCX, proceed to the [ assessment workflow] ( #assessment-workflow ) to ensure the compatibility of your workspace with Unity Catalog.
108112
109113The ` WorkspaceInstaller ` class is used to create a new configuration for Unity Catalog migration in a Databricks workspace.
110114It guides the user through a series of prompts to gather necessary information, such as selecting an inventory database, choosing
@@ -142,14 +146,14 @@ If there is an existing global installation of UCX, you can force a user install
142146At this moment there is no global override over a user installation of UCX, as this requires migration and can break existing installations.
143147
144148
145- | global | user | expected install location | install_folder | mode |
146- | --- | --- | --- | --- | --- |
147- | no | no | default | ` /Applications/ucx ` | install |
148- | yes | no | default | ` /Applications/ucx ` | upgrade |
149- | no | yes | default | ` /Users/X/.ucx ` | upgrade (existing installations must not break) |
150- | yes | yes | default | ` /Users/X/.ucx ` | upgrade |
151- | yes | no | ** USER** | ` /Users/X/.ucx ` | install (show prompt) |
152- | no | yes | ** GLOBAL** | ... | migrate |
149+ | global | user | expected install location | install_folder | mode |
150+ | -------- | ------ | --------------------------- | --------------------- | ------------------------------------------------- |
151+ | no | no | default | ` /Applications/ucx ` | install |
152+ | yes | no | default | ` /Applications/ucx ` | upgrade |
153+ | no | yes | default | ` /Users/X/.ucx ` | upgrade (existing installations must not break) |
154+ | yes | yes | default | ` /Users/X/.ucx ` | upgrade |
155+ | yes | no | ** USER** | ` /Users/X/.ucx ` | install (show prompt) |
156+ | no | yes | ** GLOBAL** | ... | migrate |
153157
154158
155159* ` UCX_FORCE_INSTALL=user databricks labs install ucx ` - will force the installation to be for user only
@@ -199,7 +203,9 @@ Databricks CLI will confirm a few options:
199203
200204# Migration process
201205
202- On the high level, the steps in migration process can be described as:
206+ At a high level, the steps in the migration process start with the [ assessment workflow] ( #assessment-workflow ) ,
207+ followed by [ group migration] ( #group-migration-workflow ) and [ table migration] ( #table-migration-commands ) ,
208+ and are finalised with the [ code migration] ( #code-migration-commands ) . It can be described as:
203209
204210``` mermaid
205211flowchart TD
@@ -259,6 +265,9 @@ databricks labs ucx ensure-assessment-run
259265
260266![ ucx_assessment_workflow] ( docs/ucx_assessment_workflow.png )
261267
268+ Once you finish the assessment, proceed to the [ group migration workflow] ( #group-migration-workflow ) .
269+ See the [ migration process diagram] ( #migration-process ) to understand the role of the assessment workflow in the migration process.
270+
262271The assessment workflow is designed to assess the compatibility of various entities in the current workspace with Unity Catalog.
263272It identifies incompatible entities and provides information necessary for planning the migration to UC. The tasks in
264273the assessment workflow can be executed in parallel or sequentially, depending on the dependencies specified in the ` @task ` decorators.
@@ -286,12 +295,17 @@ After UCX assessment workflow is executed, the assessment dashboard will be popu
286295
287296## Group migration workflow
288297
298+ You are required to complete the [ assessment workflow] ( #assessment-workflow ) before starting the group migration workflow.
299+ See the [ migration process diagram] ( #migration-process ) to understand the role of the group migration workflow in the migration process.
300+
289301See the [ detailed design] ( docs/local-group-migration.md ) of this workflow. It helps you to upgrade all Databricks workspace assets:
290302Legacy Table ACLs, Entitlements, AWS instance profiles, Clusters, Cluster policies, Instance Pools,
291303Databricks SQL warehouses, Delta Live Tables, Jobs, MLflow experiments, MLflow registry, SQL Dashboards & Queries,
292304SQL Alerts, Token and Password usage permissions that are set on the workspace level, Secret scopes, Notebooks,
293305Directories, Repos, and Files.
294306
307+ Once done with the group migration, proceed to [ table migration] ( #table-migration-commands ) .
308+
295309Use [ ` validate-groups-membership ` command] ( #validate-groups-membership-command ) for extra confidence.
296310If you don't have matching account groups, please run [ ` create-account-groups ` command] ( #create-account-groups-command ) .
297311
@@ -358,6 +372,8 @@ status can be listed with the [`workflows` command](#workflows-command).
358372
359373## ` workflows ` command
360374
375+ See the [ migration process diagram] ( #migration-process ) to understand the role of each workflow in the migration process.
376+
361377``` text
362378$ databricks labs ucx workflows
363379Step State Started
@@ -435,7 +451,32 @@ related to multiple installations of `ucx` on the same workspace.
435451
436452# Table migration commands
437453
438- These commands are vital part of [ table migration] ( docs/table_upgrade.md ) process.
454+ These commands are a vital part of the [ table migration] ( docs/table_upgrade.md ) process and require
455+ the [ assessment workflow] ( #assessment-workflow ) and
456+ [ group migration workflow] ( #group-migration-workflow ) to be completed.
457+ See the [ migration process diagram] ( #migration-process ) to understand the role of the table migration commands in
458+ the migration process.
459+
460+ The first step is to run the [ ` principal-prefix-access ` command] ( #principal-prefix-access-command ) to identify all
461+ the storage accounts used by tables in the workspace and their permissions on each storage account.
462+
463+ If you don't have any storage credentials and external locations configured, you'll need to run
464+ the [ ` migrate-credentials ` command] ( #migrate-credentials-command ) to migrate the service principals
465+ and [ ` migrate-locations ` command] ( #migrate-locations-command ) to create the external locations.
466+ If some of the external locations already exist, you should run
467+ the [ ` validate-external-locations ` command] ( #validate-external-locations-command ) .
468+ You'll need to create the [ uber principal] ( #create-uber-principal-command ) with the _ ** access to all storage** _ used by tables in
469+ the workspace, so that you can migrate all the tables. If you already have the principal, you can skip this step.
470+
471+ Ask your Databricks Account admin to run the [ ` sync-workspace-info ` command] ( #sync-workspace-info-command ) to sync the
472+ workspace information with the UCX installations. Once the workspace information is synced, you can run the
473+ [ ` create-table-mapping ` command] ( #create-table-mapping-command ) to align your tables with the Unity Catalog,
474+ [ create catalogs and schemas] ( #create-catalogs-schemas-command ) and start the migration. During multiple runs of
475+ the table migration workflow, you can use the [ ` revert-migrated-tables ` command] ( #revert-migrated-tables-command ) to
476+ revert the tables that were migrated in the previous run. You can also skip the tables that you don't want to migrate
477+ using the [ ` skip ` command] ( #skip-command ) .
478+
479+ Once you're done with the table migration, proceed to the [ code migration] ( #code-migration-commands ) .
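
The order described above can be sketched as a dry-run shell script. This is a minimal sketch, not part of UCX: `run_step`, `table_migration_prerequisites`, and the `DRY_RUN` variable are hypothetical helpers, the `databricks labs ucx <command>` form is assumed for every command based on the invocations shown elsewhere in this README, and flags such as `--aws-profile` or `--subscription-id` are omitted. Steps that may be skipped or that need an account admin are marked in comments.

```shell
#!/usr/bin/env bash
# Dry-run sketch of the table migration preparation steps, in the order the
# command sections chain to each other. Hypothetical helper script, not part of UCX.
set -euo pipefail

# DRY_RUN=1 (default) only prints each command; DRY_RUN=0 would execute it.
DRY_RUN="${DRY_RUN:-1}"

run_step() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "databricks labs ucx $*"
  else
    databricks labs ucx "$@"
  fi
}

table_migration_prerequisites() {
  run_step principal-prefix-access       # needs AWS CLI or Azure CLI to be authenticated
  run_step migrate-credentials           # only if storage credentials are not configured yet
  run_step validate-external-locations   # if some external locations already exist
  run_step migrate-locations             # creates the missing external locations
  run_step create-uber-principal         # skip if the principal already exists
  run_step sync-workspace-info           # to be run by a Databricks Account admin
  run_step create-table-mapping          # review the generated mapping afterwards
  run_step create-catalogs-schemas       # run before migrating tables to UC
}

table_migration_prerequisites
```

With `DRY_RUN=1` the script only prints the commands, so you can review the order before running anything against a workspace.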
439480
440481[[ back to top] ( #databricks-labs-ucx )]
441482
@@ -445,10 +486,10 @@ These commands are vital part of [table migration](docs/table_upgrade.md) proces
445486databricks labs ucx principal-prefix-access [--subscription-id <Azure Subscription ID>] [--aws-profile <AWS CLI profile>]
446487```
447488
448- This command depends on results from the [ assessment workflow] ( #assessment-workflow ) and requires AWS CLI or Azure CLI
449- to be installed and authenticated for the given machine. This command identifies all the storage accounts used by tables
450- in the workspace and their permissions on each storage account. Required to be run before
451- the [ ` migrate-credentials ` command] ( #migrate-credentials-command ) .
489+ This command depends on results from the [ assessment workflow] ( #assessment-workflow ) and requires [ AWS CLI] ( #access-for-aws-s3-buckets )
490+ or [ Azure CLI ] ( #access-for-azure-storage-accounts ) to be installed and authenticated for the given machine. This command
491+ identifies all the storage accounts used by tables in the workspace and their permissions on each storage account.
492+ Once you're done running this command, proceed to the [ ` migrate-credentials ` command] ( #migrate-credentials-command ) .
452493
453494[[ back to top] ( #databricks-labs-ucx )]
454495
@@ -460,7 +501,9 @@ databricks labs ucx principal-prefix-access --aws-profile test-profile
460501
461502Use to identify all instance profiles in the workspace, and map their access to S3 buckets.
462503Also captures the IAM roles which have the UC ARN listed, and maps their access to S3 buckets.
463- This requires ` aws ` CLI to be installed and configured.
504+ This requires ` aws ` CLI to be installed and configured.
505+
506+ Once done, proceed to the [ ` migrate-credentials ` command] ( #migrate-credentials-command ) .
464507
465508[[ back to top] ( #databricks-labs-ucx )]
466509
@@ -473,6 +516,8 @@ databricks labs ucx principal-prefix-access --subscription-id test-subscription-
473516Use to identify all storage accounts used by tables, identify the relevant Azure service principals and their permissions
474517on each storage account. This requires Azure CLI to be installed and configured via ` az login ` .
475518
519+ Once done, proceed to the [ ` migrate-credentials ` command] ( #migrate-credentials-command ) .
520+
476521[[ back to top] ( #databricks-labs-ucx )]
477522
478523## ` create-uber-principal ` command
@@ -487,6 +532,10 @@ workspace and configure the [UCX Cluster Policy](#installation) with the details
487532service principal should be unprovisioned. On Azure, it creates a principal with ` Storage Blob Data Reader ` role
488533assignment on every storage account using Azure Resource Manager APIs.
489534
535+ Once done, proceed to launching the [ table migration workflow] ( #table-migration-commands ) .
536+
537+ [[ back to top] ( #databricks-labs-ucx )]
538+
490539## ` migrate-credentials ` command
491540
492541``` commandline
@@ -501,7 +550,7 @@ by [`principal-prefix-access` command](#principal-prefix-access-command).
501550Please review the file and delete the Service Principals you do not want to be migrated.
502551The command will only migrate the Service Principals that have client secret stored in Databricks Secret.
503552
504- Run [ ` validate-external-locations ` command] ( #validate-external-locations-command ) after this one.
553+ Once you're done with this command, run the [ ` validate-external-locations ` command] ( #validate-external-locations-command ) .
505554
506555[[ back to top] ( #databricks-labs-ucx )]
507556
@@ -516,8 +565,9 @@ run this command to validate and report the missing Unity Catalog external locat
516565
517566This command validates and provides a mapping of external tables to external locations, also as Terraform configurations.
518567
519- [[ back to top ] ( #databricks-labs-ucx )]
568+ Once you're done with this command, proceed to the [ migrate locations ] ( #migrate-locations-command ) command.
520569
570+ [[ back to top] ( #databricks-labs-ucx )]
521571
522572## ` migrate-locations ` command
523573
@@ -529,6 +579,8 @@ Once the [`assessment` workflow](#assessment-workflow) finished successfully, an
529579run this command to have Unity Catalog external locations created. The candidate locations to be created are extracted from the guess_external_locations
530580task in the assessment job. You can run the [ validate-external-locations] ( #validate-external-locations-command ) command to check the candidate locations.
531581
582+ Once you're done with this command, proceed to the [ create-table-mapping] ( #create-table-mapping-command ) command.
583+
532584[[ back to top] ( #databricks-labs-ucx )]
533585
534586## ` create-table-mapping ` command
@@ -549,6 +601,13 @@ labs-azure,labs_azure,default,default,ucx_tybzs,ucx_tybzs
549601You are supposed to review this mapping and adjust it if necessary. This file is in CSV format, so that you can edit it
550602more easily in your favorite spreadsheet application.
551603
604+ Once you're done with this command, [ create catalogs and schemas] ( #create-catalogs-schemas-command ) . During
605+ multiple runs of the table migration workflow, you can use the [ ` revert-migrated-tables ` command] ( #revert-migrated-tables-command )
606+ to revert the tables that were migrated in the previous run. You can also skip the tables that you don't want to migrate
607+ using the [ ` skip ` command] ( #skip-command ) .
608+
609+ Once you're done with table migration, proceed to the [ code migration] ( #code-migration-commands ) .
610+
552611[[ back to top] ( #databricks-labs-ucx )]
553612
554613## ` skip ` command
@@ -564,6 +623,8 @@ The command takes `--schema` and optionally `--table` flags to specify the schem
564623is provided, all tables in the specified HMS database are skipped.
565624This command is useful to temporarily disable migration on a particular schema or table.
566625
626+ Once you're done with table migration, proceed to the [ code migration] ( #code-migration-commands ) .
627+
567628[[ back to top] ( #databricks-labs-ucx )]
568629
569630## ` revert-migrated-tables ` command
@@ -578,6 +639,8 @@ This command removes the `upgraded_from` property on a migrated table for re-mig
578639This command is useful for developers and administrators who want to revert the migration of a table. It can also be used
579640to debug issues related to table migration.
580641
642+ Go back to the [ create-table-mapping] ( #create-table-mapping-command ) command after you're done with this command.
643+
581644[[ back to top] ( #databricks-labs-ucx )]
582645
583646## ` create-catalogs-schemas ` command
@@ -588,6 +651,8 @@ databricks labs ucx create-catalogs-schemas
588651After [ ` create-table-mapping ` command] ( #create-table-mapping-command ) is executed, you can run this command to have the required UC catalogs and schemas created.
589652This command is supposed to be run before migrating tables to UC.
590653
654+ Once you're done with this command, proceed to the [ table migration workflow] ( #table-migration-commands ) .
655+
591656[[ back to top] ( #databricks-labs-ucx )]
592657
593658## ` move ` command
@@ -608,8 +673,6 @@ This command moves different table types differently:
608673This is due to Unity Catalog not supporting multiple tables with overlapping paths
609674- ` VIEW ` are recreated using the same view definition.
610675
611- [[ back to top] ( #databricks-labs-ucx )]
612-
613676This command supports moving multiple tables at once, by specifying ` * ` as the table name.
614677
615678[[ back to top] ( #databricks-labs-ucx )]
@@ -629,6 +692,13 @@ It can also be used to debug issues related to table aliasing.
629692
630693# Code migration commands
631694
695+ See the [ migration process diagram] ( #migration-process ) to understand the role of the code migration commands in the migration process.
696+
697+ After you're done with the [ table migration] ( #table-migration-commands ) , you can proceed to the code migration.
698+
699+ Once you're done with the code migration, you can run the [ ` cluster-remap ` command] ( #cluster-remap-command ) to remap the
700+ clusters to be UC compatible.
701+
632702[[ back to top] ( #databricks-labs-ucx )]
633703
634704## ` migrate-local-code ` command
@@ -653,6 +723,12 @@ authenticate your machine with:
653723* ` databricks auth login --host https://accounts.cloud.databricks.com/ ` (AWS)
654724* ` databricks auth login --host https://accounts.azuredatabricks.net/ ` (Azure)
655725
726+ Ask your Databricks Account admin to run the [ ` sync-workspace-info ` command] ( #sync-workspace-info-command ) to sync the
727+ workspace information with the UCX installations. Once the workspace information is synced, you can run the
728+ [ ` create-table-mapping ` command] ( #create-table-mapping-command ) to align your tables with the Unity Catalog.
729+
730+ [[ back to top] ( #databricks-labs-ucx )]
731+
656732## ` sync-workspace-info ` command
657733
658734``` text
@@ -718,6 +794,8 @@ The following scenarios are supported, if a group X:
718794
719795This command is useful for the setups, that don't have SCIM provisioning in place.
720796
797+ Once you're done with this command, proceed to the [ group migration workflow] ( #group-migration-workflow ) .
798+
721799[[ back to top] ( #databricks-labs-ucx )]
722800
723801## ` validate-groups-membership ` command
@@ -736,6 +814,10 @@ This command is useful for administrators who want to ensure that the groups hav
736814used to debug issues related to group membership. See [ group migration] ( docs/local-group-migration.md ) and
737815[ group migration] ( #group-migration-workflow ) for more details.
738816
817+ Once you're done with this command, proceed to the [ table migration] ( #table-migration-commands ) .
818+
819+ [[ back to top] ( #databricks-labs-ucx )]
820+
739821## ` cluster-remap ` command
740822
741823``` text
@@ -749,16 +831,18 @@ Shared Autoscaling Americas cluster 0329-145545-rugby794
749831Please provide the cluster id's as comma separated value from the above list (default: <ALL>):
750832```
751833
752- This command will remap the cluster to uc enabled one.When we run this command it will list all the clusters
834+ Once you're done with the [ code migration] ( #code-migration-commands ) , you can run this command to remap the clusters to be UC enabled.
835+
836+ This command remaps the clusters to UC-enabled ones. When you run it, it lists all the clusters
753837and their IDs and asks you to provide the IDs of the clusters to remap as a comma-separated value; by default, it takes all cluster IDs.
754838Once you provide the cluster IDs, it updates these clusters to be UC enabled. A backup of the existing cluster
755839config is stored as a JSON file in the backup folder inside the installed location (backup/clusters/cluster_id.json). This helps
756840to revert the cluster remapping.
757841
842+ You can revert the cluster remapping using the [ ` revert-cluster-remap ` command] ( #revert-cluster-remap-command ) .
758843
759844[[ back to top] ( #databricks-labs-ucx )]
760845
761-
762846## ` revert-cluster-remap ` command
763847
764848``` text
@@ -769,12 +853,10 @@ $ databricks labs ucx revert-cluster-remap
769853Please provide the cluster id's as comma separated value from the above list (default: <ALL>):
770854```
771855
772- If a customer want's to revert the cluster remap done using the ` cluster-remap ` command they can use this command to revert
856+ If you want to revert the cluster remap done using the [ ` cluster-remap ` command] ( #cluster-remap-command ) , you can use this command to revert
773857the cluster configuration from UC to the original one. It iterates through the list of clusters in the backup folder and reverts the
774858cluster configurations to the original ones. It also prompts you to provide the list of clusters to revert.
775- By default it will revert all the clusters present in the back up folder
776-
777-
859+ By default, it will revert all the clusters present in the backup folder.
778860
779861[[ back to top] ( #databricks-labs-ucx )]
780862