Conversation

@asnare
Contributor

@asnare asnare commented Oct 22, 2024

Changes

This PR follows on from #2743 by extending the set of updates that we capture to include updated DirectFsAccess snapshots for dashboards and jobs.

Linked issues

Follows #2743.

Functionality

  • modified existing workflow: migration-progress-experimental

Tests

  • added unit tests
  • existing integration tests

This field is encoded as a Spark SQL LONG, which is a signed 64-bit integer.
The history will be maintained adjacent to the crawler framework.
…ion, updated to use the Historical record type.
@github-actions

github-actions bot commented Oct 22, 2024

❌ 49/50 passed, 1 failed, 2 skipped, 1h7m16s total

❌ test_running_real_migration_progress_job: AssertionError: Workflow failed: migration-progress-experimental (31m40.612s)
AssertionError: Workflow failed: migration-progress-experimental
assert False
 +  where False = validate_step('migration-progress-experimental')
 +    where validate_step = <databricks.labs.ucx.installer.workflows.DeployedWorkflows object at 0x7f7e5b6aeb60>.validate_step
 +      where <databricks.labs.ucx.installer.workflows.DeployedWorkflows object at 0x7f7e5b6aeb60> = <tests.integration.conftest.MockInstallationContext object at 0x7f7e5b61c610>.deployed_workflows
[gw1] linux -- Python 3.10.15 /home/runner/work/ucx/ucx/.venv/bin/python
08:29 INFO [tests.integration.conftest] Dashboard Created ucx_DwHym_ra78a55a0a: https://DATABRICKS_HOST/sql/dashboards/c71e0587-e22c-40e8-9dd5-fb014b2d4981
08:29 INFO [tests.integration.conftest] Dashboard Created ucx_DPzY4_ra78a55a0a: https://DATABRICKS_HOST/sql/dashboards/49c8d7f5-d2fa-4435-ac3e-1d0207534e78
08:29 DEBUG [databricks.labs.ucx.install] Cannot find previous installation: Path (/Users/0a330eb5-dd51-4d97-b6e4-c474356b1d5d/.Mq6p/config.yml) doesn't exist.
08:29 INFO [databricks.labs.ucx.install] Please answer a couple of questions to configure Unity Catalog migration
08:29 INFO [databricks.labs.ucx.installer.hms_lineage] HMS Lineage feature creates one system table named system.hms_to_uc_migration.table_access and helps in your migration process from HMS to UC by allowing you to programmatically query HMS lineage data.
08:29 INFO [databricks.labs.ucx.install] Fetching installations...
08:29 INFO [databricks.labs.ucx.installer.policy] Creating UCX cluster policy.
08:29 DEBUG [tests.integration.conftest] Waiting for clusters to start...
08:29 DEBUG [tests.integration.conftest] Waiting for clusters to start...
08:29 INFO [databricks.labs.ucx.install] Installing UCX v0.47.1+11320241024082916
08:29 INFO [databricks.labs.ucx.install] Creating ucx schemas...
08:29 INFO [databricks.labs.ucx.installer.workflows] Creating new job configuration for step=migrate-tables
08:29 INFO [databricks.labs.ucx.installer.workflows] Creating new job configuration for step=migration-progress-experimental
08:29 INFO [databricks.labs.ucx.installer.workflows] Creating new job configuration for step=migrate-groups
08:29 INFO [databricks.labs.ucx.installer.workflows] Creating new job configuration for step=migrate-tables-in-mounts-experimental
08:29 INFO [databricks.labs.ucx.installer.workflows] Creating new job configuration for step=migrate-groups-experimental
08:29 INFO [databricks.labs.ucx.installer.workflows] Creating new job configuration for step=migrate-external-hiveserde-tables-in-place-experimental
08:29 INFO [databricks.labs.ucx.installer.workflows] Creating new job configuration for step=migrate-data-reconciliation
08:29 INFO [databricks.labs.ucx.installer.workflows] Creating new job configuration for step=migrate-external-tables-ctas
08:29 INFO [databricks.labs.ucx.installer.workflows] Creating new job configuration for step=failing
08:29 INFO [databricks.labs.ucx.installer.workflows] Creating new job configuration for step=validate-groups-permissions
08:29 INFO [databricks.labs.ucx.installer.workflows] Creating new job configuration for step=remove-workspace-local-backup-groups
08:29 INFO [databricks.labs.ucx.installer.workflows] Creating new job configuration for step=assessment
08:29 INFO [databricks.labs.ucx.installer.workflows] Creating new job configuration for step=scan-tables-in-mounts-experimental
08:29 INFO [databricks.labs.ucx.install] Creating dashboards...
08:29 DEBUG [databricks.labs.ucx.install] Reading step folder /home/runner/work/ucx/ucx/src/databricks/labs/ucx/queries/views...
08:29 DEBUG [databricks.labs.ucx.install] Reading step folder /home/runner/work/ucx/ucx/src/databricks/labs/ucx/queries/assessment...
08:29 DEBUG [databricks.labs.ucx.install] Reading step folder /home/runner/work/ucx/ucx/src/databricks/labs/ucx/queries/migration...
08:29 INFO [databricks.labs.ucx.install] Creating dashboard in /home/runner/work/ucx/ucx/src/databricks/labs/ucx/queries/assessment/interactive...
08:29 INFO [databricks.labs.ucx.install] Creating dashboard in /home/runner/work/ucx/ucx/src/databricks/labs/ucx/queries/assessment/estimates...
08:29 INFO [databricks.labs.ucx.install] Creating dashboard in /home/runner/work/ucx/ucx/src/databricks/labs/ucx/queries/assessment/main...
08:29 INFO [databricks.labs.ucx.install] Creating dashboard in /home/runner/work/ucx/ucx/src/databricks/labs/ucx/queries/assessment/CLOUD_ENV...
08:29 INFO [databricks.labs.ucx.install] Creating dashboard in /home/runner/work/ucx/ucx/src/databricks/labs/ucx/queries/migration/groups...
08:29 INFO [databricks.labs.ucx.install] Creating dashboard in /home/runner/work/ucx/ucx/src/databricks/labs/ucx/queries/migration/main...
08:29 INFO [databricks.labs.ucx.installer.mixins] Fetching warehouse_id from a config
08:29 INFO [databricks.labs.ucx.installer.mixins] Fetching warehouse_id from a config
08:29 INFO [databricks.labs.ucx.installer.mixins] Fetching warehouse_id from a config
08:29 INFO [databricks.labs.ucx.installer.mixins] Fetching warehouse_id from a config
08:29 INFO [databricks.labs.ucx.installer.mixins] Fetching warehouse_id from a config
08:29 INFO [databricks.labs.ucx.installer.mixins] Fetching warehouse_id from a config
08:29 INFO [databricks.labs.ucx.install] Installation completed successfully! Please refer to the https://DATABRICKS_HOST/#workspace/Users/0a330eb5-dd51-4d97-b6e4-c474356b1d5d/.Mq6p/README for the next steps.
08:29 DEBUG [databricks.labs.ucx.installer.workflows] starting assessment job: https://DATABRICKS_HOST#job/420836374142604
08:29 INFO [databricks.labs.ucx.installer.workflows] Started assessment job: https://DATABRICKS_HOST#job/420836374142604/runs/491727566136780
08:29 DEBUG [databricks.labs.ucx.installer.workflows] Waiting for completion of assessment job: https://DATABRICKS_HOST#job/420836374142604/runs/491727566136780
08:43 INFO [databricks.labs.ucx.installer.workflows] Completed assessment job run 491727566136780 with state: RunResultState.SUCCESS
08:43 INFO [databricks.labs.ucx.installer.workflows] Completed assessment job run 491727566136780 duration: 0:13:21.825000 (2024-10-24 08:29:39.747000+00:00 thru 2024-10-24 08:43:01.572000+00:00)
08:43 DEBUG [databricks.labs.ucx.installer.workflows] Validating assessment workflow: https://DATABRICKS_HOST#job/420836374142604
08:43 INFO [databricks.labs.ucx.progress.install] Installation completed successfully!
08:43 DEBUG [databricks.labs.ucx.installer.workflows] starting migration-progress-experimental job: https://DATABRICKS_HOST#job/728082396829929
08:43 INFO [databricks.labs.ucx.installer.workflows] Started migration-progress-experimental job: https://DATABRICKS_HOST#job/728082396829929/runs/1014998267246221
08:43 DEBUG [databricks.labs.ucx.installer.workflows] Waiting for completion of migration-progress-experimental job: https://DATABRICKS_HOST#job/728082396829929/runs/1014998267246221
08:59 INFO [databricks.labs.ucx.installer.workflows] Completed migration-progress-experimental job run 1014998267246221 with state: RunResultState.SUCCESS_WITH_FAILURES (The job run succeeded with 11 failed tasks)
08:59 INFO [databricks.labs.ucx.installer.workflows] Completed migration-progress-experimental job run 1014998267246221 duration: 0:15:41.134000 (2024-10-24 08:43:14.549000+00:00 thru 2024-10-24 08:58:55.683000+00:00)
08:59 DEBUG [databricks.labs.ucx.installer.workflows] Validating migration-progress-experimental workflow: https://DATABRICKS_HOST#job/728082396829929
08:59 INFO [databricks.labs.ucx.install] Deleting UCX v0.47.1+11320241024082916 from https://DATABRICKS_HOST
08:59 INFO [databricks.labs.ucx.install] Deleting inventory database dummy_sjsac
08:59 INFO [databricks.labs.ucx.installer.workflows] Removing job_id=799380929741655, as it is no longer needed
08:59 INFO [databricks.labs.ucx.installer.workflows] Removing job_id=728082396829929, as it is no longer needed
08:59 INFO [databricks.labs.ucx.installer.workflows] Removing job_id=381121446150907, as it is no longer needed
08:59 INFO [databricks.labs.ucx.installer.workflows] Removing job_id=984284884208057, as it is no longer needed
08:59 INFO [databricks.labs.ucx.installer.workflows] Removing job_id=148293406372514, as it is no longer needed
08:59 INFO [databricks.labs.ucx.installer.workflows] Removing job_id=733738560341341, as it is no longer needed
08:59 INFO [databricks.labs.ucx.installer.workflows] Removing job_id=835646867814657, as it is no longer needed
08:59 INFO [databricks.labs.ucx.installer.workflows] Removing job_id=228396573518795, as it is no longer needed
08:59 INFO [databricks.labs.ucx.installer.workflows] Removing job_id=921159314539989, as it is no longer needed
08:59 INFO [databricks.labs.ucx.installer.workflows] Removing job_id=301197383412333, as it is no longer needed
08:59 INFO [databricks.labs.ucx.installer.workflows] Removing job_id=296918394603234, as it is no longer needed
08:59 INFO [databricks.labs.ucx.installer.workflows] Removing job_id=420836374142604, as it is no longer needed
08:59 INFO [databricks.labs.ucx.installer.workflows] Removing job_id=604828094927882, as it is no longer needed
08:59 INFO [databricks.labs.ucx.install] Deleting cluster policy
08:59 INFO [databricks.labs.ucx.install] Deleting secret scope
08:59 INFO [databricks.labs.ucx.install] UnInstalling UCX complete

Running from acceptance #7013

@asnare
Contributor Author

asnare commented Oct 23, 2024

❌ test_running_real_migration_progress_job: AssertionError: Workflow failed: migration-progress-experimental (25m43.878s)

This is currently failing due to a bug in the crawlers that means the snapshots cannot be loaded when the Spark-based runtime is being used; fixed in #3046.

Base automatically changed from crawler-snapshot-history to main October 23, 2024 12:27
nfx pushed a commit that referenced this pull request Oct 23, 2024
## Changes

This PR fixes an issue with the DFSA and used-table crawlers that could
prevent loading of the snapshots. When loading, they convert the rows to
dictionaries using `.as_dict()`, which isn't available on rows provided
by the Spark-based lsql backend. Instead, `.asDict()` needs to be used.
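
As a rough illustration of the mismatch described above, the sketch below converts a row to a dictionary through whichever method the row exposes. The helper `row_to_dict` and the two stand-in row classes are hypothetical and exist only for this example; they are not the crawler code from this PR, which simply calls `.asDict()`.

```python
from typing import Any


class SparkStyleRow:
    """Hypothetical stand-in for a row that only offers `.asDict()`."""

    def __init__(self, values: dict[str, Any]) -> None:
        self._values = values

    def asDict(self) -> dict[str, Any]:
        return dict(self._values)


class SnakeStyleRow:
    """Hypothetical stand-in for a row that only offers `.as_dict()`."""

    def __init__(self, values: dict[str, Any]) -> None:
        self._values = values

    def as_dict(self) -> dict[str, Any]:
        return dict(self._values)


def row_to_dict(row: Any) -> dict[str, Any]:
    """Convert a row to a dict, preferring `.asDict()` and falling back to `.as_dict()`."""
    for method_name in ("asDict", "as_dict"):
        method = getattr(row, method_name, None)
        if callable(method):
            return method()
    raise TypeError(f"Cannot convert {type(row).__name__} to a dictionary")
```

Calling `row_to_dict(SparkStyleRow({"path": "dbfs:/tmp/a"}))` goes through `.asDict()`, while a row that only has `.as_dict()` takes the fallback.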

Incidental changes:
- An existing integration test was updated to also test snapshot loading
  for these crawlers.
- Another test was renamed to fix a typo in the name.

### Linked issues

Relates to #3036, #3039.

### Tests

- existing unit tests
- existing integration tests
Collaborator

@nfx nfx left a comment

lgtm

@asnare
Contributor Author

asnare commented Oct 24, 2024

Following a discussion, we've decided not to include DFSA records in their current form in the history table. Each DFSA record corresponds to a problem with another resource (e.g. a notebook or a job). As such, the intent is to aggregate these records and include them in the list of failures on the resource-specific record.
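
A minimal sketch of what that aggregation could look like, assuming each DFSA record carries an identifier for the resource it was found in. The `DfsaRecord` class, its fields, the message format and `failures_by_resource` are all hypothetical names invented for this illustration, not the UCX implementation.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class DfsaRecord:
    """Hypothetical, simplified DFSA record: one direct filesystem access in one resource."""

    source_id: str  # assumed identifier of the owning resource (notebook, job, ...)
    path: str  # the filesystem path that was accessed directly


def failures_by_resource(records: list[DfsaRecord]) -> dict[str, list[str]]:
    """Group DFSA records by owning resource, producing one failure message per record."""
    failures: dict[str, list[str]] = defaultdict(list)
    for record in records:
        failures[record.source_id].append(f"direct filesystem access: {record.path}")
    return dict(failures)
```

Each resource-specific history record could then take its list of failures from the entry keyed by its own identifier.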

nfx pushed a commit that referenced this pull request Oct 24, 2024
…ons in addition to the normal type-based ones (#3068)

## Changes

This PR cherry-picks some changes from #3039 that updated the
`HistoryEncoder` to work correctly with dataclasses that are declared with
`from __future__ import annotations` in effect.

When this import is in effect, Python converts all type hints during
import/declaration into strings and then performs deferred resolution at
a later stage. (This is why forward references work.) Unfortunately the
dataclass mechanism captures field types prior to deferred resolution.
This PR ensures that our type checking works anyway.
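
The effect can be reproduced with a small sketch. To keep the snippet independent of its position in a file, the annotations are written as explicit string literals, which is what the `__future__` import does to every hint in a module. The `Record` class is hypothetical, and using `typing.get_type_hints()` is just one way to resolve the strings; it is not necessarily what `HistoryEncoder` does.

```python
import dataclasses
import typing


@dataclasses.dataclass
class Record:
    """Hypothetical record; string hints mimic `from __future__ import annotations`."""

    object_id: "int"
    failures: "list[str]"


# The dataclass machinery stores the hints exactly as written: plain strings.
raw_types = {field.name: field.type for field in dataclasses.fields(Record)}

# Deferred resolution turns the strings back into real type objects.
resolved_types = typing.get_type_hints(Record)
```

Here `raw_types["object_id"]` is the string `"int"`, whereas `resolved_types["object_id"]` is the `int` class itself, so a type check written against `field.type` needs to handle both forms.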

### Linked issues

Cherry-picks from #3039.

### Tests

- updated unit tests

Labels

  • enhancement: New feature or request
  • pr/do-not-merge: this pull request is not ready to merge

3 participants