v0.53.0
- Added dashboard crawlers (#3397). The open-source library has been updated with new dashboard crawlers for the assessment workflow, Redash migration, and `QueryLinter`. These crawlers are responsible for crawling and persisting dashboards, as well as migrating or reverting them during Redash migration; they also lint the queries of the crawled dashboards using `QueryLinter`. This change resolves issues #3366 and #3367, and progresses #2854. The `databricks labs ucx {migrate-dbsql-dashboards|revert-dbsql-dashboards}` commands and the assessment workflow have been modified to incorporate these new features, and unit and integration tests have been added to verify the new dashboard crawlers. Additionally, two new tables, `$inventory.redash_dashboards` and `$inventory.lakeview_dashboards`, hold the list of all Redash or Lakeview dashboards and are used by the `QueryLinter` and the Redash migration; a query sketch follows below. These changes improve the assessment, migration, and linting processes for dashboards in the library.
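A minimal sketch of inspecting the two new inventory tables, assuming a Databricks cluster context where the `$inventory` placeholder resolves to `hive_metastore.ucx` (adjust the schema name to your installation):

```python
from pyspark.sql import SparkSession

# On Databricks this returns the active session; the schema name is an assumption.
spark = SparkSession.builder.getOrCreate()
redash = spark.table("hive_metastore.ucx.redash_dashboards")
lakeview = spark.table("hive_metastore.ucx.lakeview_dashboards")
print(f"{redash.count()} Redash and {lakeview.count()} Lakeview dashboards crawled")
```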
- DBFS Root Support for HMS Federation (#3425). This commit introduces changes to support the DBFS root location for HMS federation. A new method, `external_locations_with_root`, is added to the `ExternalLocations` class to return a list of external locations that includes the DBFS root. This method is used in various functions and test cases, such as `test_create_uber_principal_no_storage`, `test_create_uc_role_multiple_raises_error`, `test_create_uc_no_roles`, `test_save_spn_permissions`, and `test_create_access_connectors_for_storage_accounts`, to ensure that the DBFS root location is correctly identified and tested in different scenarios. Additionally, `external_locations.snapshot.return_value` is changed to `external_locations.external_locations_with_root.return_value` in the test functions `test_create_federated_catalog` and `test_already_existing_connection` so that they retrieve a list of external locations that includes the DBFS root. This commit closes issue #3406. Overall, these changes improve the handling and testing of the DBFS root location in HMS federation; the idea is sketched below.
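An illustrative sketch of the idea behind `external_locations_with_root`, using simplified stand-ins rather than the actual UCX classes:

```python
from dataclasses import dataclass

@dataclass
class ExternalLocation:  # simplified stand-in for UCX's record
    location: str
    table_count: int = 0

def external_locations_with_root(snapshot: list[ExternalLocation]) -> list[ExternalLocation]:
    # Append the DBFS root so federation also covers tables stored under it.
    return [*snapshot, ExternalLocation("dbfs:/")]

for loc in external_locations_with_root([ExternalLocation("s3://bucket/landing", 12)]):
    print(loc.location)
```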
- Log message as error when legacy permissions API is enabled/disabled depending on the workflow run (#3443). In this release, logging behavior has been updated in several methods in `workflows.py`. When the `use_legacy_permission_migration` configuration is set to False and specific conditions are met, error messages are now logged instead of info messages in the methods `verify_metastore_attached`, `rename_workspace_local_groups`, `reflect_account_groups_on_workspace`, `apply_permissions_to_account_groups`, `apply_permissions`, and `validate_groups_permissions`. This change addresses issue #3388 and provides clearer guidance when the legacy permissions API is not behaving as expected: users now see an error message advising them to run the `migrate-groups` job or set `use_legacy_permission_migration` to True in the config.yml file. These updates help ensure smoother workflow runs and more accurate logging for troubleshooting; the pattern is sketched below.
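A minimal sketch of the logging pattern, assuming a boolean config flag; the names are illustrative, not the exact UCX internals:

```python
import logging

logger = logging.getLogger(__name__)

def verify_metastore_attached(use_legacy_permission_migration: bool) -> None:
    if not use_legacy_permission_migration:
        # Logged at error (not info) level so the failed precondition stands out.
        logger.error(
            "Legacy permission migration is disabled; run the `migrate-groups` job "
            "or set `use_legacy_permission_migration: true` in config.yml."
        )
        return
    # ... actual verification logic would follow here ...
```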
- MySQL External HMS Support for HMS Federation (#3385). This commit adds support for MySQL-based Hive Metastore (HMS) in HMS federation, enhances the CLI for creating a federated catalog, and improves external HMS functionality. It introduces a new parameter, `enable_hms_federation`, in the `Locations` class constructor, allowing users to enable or disable MySQL-based HMS federation, and the `external_locations` method in `application.py` now accepts `enable_hms_federation` as a parameter, enabling more granular control of the federation feature. Additionally, the CLI for creating a federated catalog has been updated to accept a `prompts` parameter, providing more flexibility. The commit also introduces a new dataclass, `ExternalHmsInfo`, for external HMS connection information, and updates the `HiveMetastoreFederationEnabler` and `HiveMetastoreFederation` classes to support non-Glue external metastores. Furthermore, it adds methods to handle the creation of a federated catalog from the command-line interface, split JDBC URLs, and manage external connections and permissions; a sketch of the JDBC URL splitting follows.
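An illustrative, simplified take on an `ExternalHmsInfo`-style dataclass and JDBC URL splitting; the field names and parsing rules here are assumptions, not UCX's exact implementation:

```python
from dataclasses import dataclass
from urllib.parse import urlparse

@dataclass
class ExternalHmsInfo:  # simplified stand-in
    database_type: str
    host: str
    port: int | None
    database: str

def split_jdbc_url(jdbc_url: str) -> ExternalHmsInfo:
    # jdbc:mysql://host:3306/metastore -> drop the `jdbc:` prefix, parse the rest.
    parsed = urlparse(jdbc_url.removeprefix("jdbc:"))
    return ExternalHmsInfo(parsed.scheme, parsed.hostname or "", parsed.port, parsed.path.lstrip("/"))

print(split_jdbc_url("jdbc:mysql://hms.example.com:3306/metastore"))
```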
- Skip listing built-in catalogs to update table migration process (#3464). In this release, the migration process for updating tables in the Hive Metastore has been optimized with the introduction of the `TableMigrationStatusRefresher` class, which inherits from `CrawlerBase`. Its `_iter_schemas` method filters out built-in catalogs and schemas when listing them, skipping unnecessary processing during the table migration process. Additionally, the `get_seen_tables` method now checks `schema.name` and `schema.catalog_name`, and the `_crawl` and `_try_fetch` methods have been modified to reflect changes in the `TableMigrationStatus` constructor. The existing `migrate-tables` workflow has been modified accordingly, and new unit tests demonstrate the exclusion of built-in catalogs during the migration-status update: the test case uses the `CatalogInfoSecurableKind` enumeration to specify the kind of catalog and verifies that the seen tables include only non-built-in catalogs. Skipping built-in catalogs and schemas should improve the efficiency and performance of the migration process; the filtering idea is sketched below.
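A sketch of the filtering idea using the real databricks-sdk client; the particular set of "built-in" securable kinds shown here is an assumption, and UCX's actual list may differ:

```python
from databricks.sdk import WorkspaceClient
from databricks.sdk.service.catalog import CatalogInfoSecurableKind

# Assumed set of kinds to treat as built-in.
BUILTIN_KINDS = {
    CatalogInfoSecurableKind.CATALOG_INTERNAL,
    CatalogInfoSecurableKind.CATALOG_SYSTEM,
}

w = WorkspaceClient()
for catalog in w.catalogs.list():
    if catalog.securable_kind in BUILTIN_KINDS:
        continue  # skip built-in catalogs during the migration-status refresh
    print("would scan:", catalog.name)
```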
- Updated databricks-sdk requirement from <0.39,>=0.38 to >=0.39,<0.40 (#3434). In this release, the requirement for the `databricks-sdk` package in the pyproject.toml file has been updated to be greater than or equal to 0.39 and less than 0.40, allowing the latest 0.39.x release while preventing versions 0.40 and above. This change is based on the release notes and changelog for version 0.39 of the package, which includes bug fixes, internal changes, and API changes such as the addition of the `cleanrooms` package, a delete() method for workspace-level services, and new fields on various request and response objects. The commit history for the package is also provided. Dependabot has been configured to resolve any conflicts with this PR and can be manually triggered to perform various actions as needed; it can also be instructed to ignore specific dependency versions or to close the PR.
- Updated databricks-sdk requirement from <0.40,>=0.39 to >=0.39,<0.41 (#3456). In this pull request, the version range of the `databricks-sdk` dependency has been updated from `<0.40,>=0.39` to `>=0.39,<0.41`, allowing the latest version of the `databricks-sdk` while ensuring that it stays below 0.41. The pull request also includes release notes detailing the API changes in version 0.40.0, such as the addition of new fields to various compute, dashboard, job, and pipeline services. A changelog outlines the bug fixes, internal changes, new features, and improvements in versions 0.38.0, 0.39.0, and 0.40.0, and a list of commits shows the development progress of these versions.
- Use LTS Databricks runtime version (#3459). This release switches the Databricks runtime to a Long-Term Support (LTS) release to address issues encountered during the migration to external tables. The previous runtime version caused the `convert to external table` migration strategy to fail, and this change serves as a temporary solution. The `migrate-tables` workflow has been modified, and existing integration tests have been reused to ensure functionality. The `test_job_cluster_policy` function now uses the LTS version instead of the latest version, ensuring a specified Spark version for the cluster policy; the function also checks for a matching node type ID, Spark version, and necessary resources. However, users may still encounter problems with the latest UCX release. The `_convert_hms_table_to_external` method in the `table_migrate.py` file has been updated to return a boolean value, with a new TODO comment about a possible failure on Databricks Runtime 16.0 due to a JDK update. A sketch of pinning an LTS runtime follows below.
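A hedged sketch of how a test or cluster policy might pin an LTS runtime rather than the newest one, using the databricks-sdk helper (not the project's exact `test_job_cluster_policy` code):

```python
from databricks.sdk import WorkspaceClient

w = WorkspaceClient()
# Selects the most recent runtime version that is marked long-term support.
lts_version = w.clusters.select_spark_version(latest=True, long_term_support=True)
print("pinning cluster policy to LTS runtime:", lts_version)
```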
- Use `CREATE_FOREIGN_SECURABLE` instead of `CREATE_FOREIGN_CATALOG` with HMS federation enablement commands (#3309). A change has been made to update the `databricks-sdk` dependency version from `>=0.38,<0.39` to `>=0.39` in the `pyproject.toml` file, which may affect the project's functionality related to the `databricks-sdk` library. In the Hive Metastore federation codebase, `CREATE_FOREIGN_SECURABLE` is now used instead of `CREATE_FOREIGN_CATALOG` for HMS federation enablement commands, in line with issue #3308. The `_add_missing_permissions_if_needed` method has been updated to check for `CREATE_FOREIGN_SECURABLE` instead of `CREATE_FOREIGN_CATALOG` when granting permissions. Additionally, a unit test file for Hive Metastore federation has been updated to reflect the use of `CREATE_FOREIGN_SECURABLE` in the import statements and test functions, although this change is limited to the test file and does not affect production code. Thorough testing is recommended after applying this update to ensure that the project functions as expected and to benefit from the potential security improvements associated with the updated privilege handling; see the grant sketch below.
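A hedged sketch of granting the privilege on a connection with the databricks-sdk (>=0.39, which the entry above ties to `CREATE_FOREIGN_SECURABLE`); the connection name and principal are placeholders, not values from the PR:

```python
from databricks.sdk import WorkspaceClient
from databricks.sdk.service.catalog import PermissionsChange, Privilege, SecurableType

w = WorkspaceClient()
w.grants.update(
    SecurableType.CONNECTION,
    "hms_federation_connection",  # placeholder connection name
    changes=[
        PermissionsChange(
            principal="account users",  # placeholder principal
            add=[Privilege.CREATE_FOREIGN_SECURABLE],
        )
    ],
)
```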
Dependency updates:
- Updated databricks-sdk requirement from <0.40,>=0.39 to >=0.39,<0.41 (#3456).
Contributors: @JCZuurmond, @FastLee, @dependabot[bot]