Release v0.28.2 · databrickslabs/ucx

Fixed Table Access Control is not enabled on this cluster error (#2167). The Table Access Control is not enabled on this cluster error is now logged as a warning instead of being raised. A new constant, CLUSTER_WITHOUT_ACL_FRAGMENT, represents the error message, and the snapshot and grants methods now conditionally log a warning instead of raising an error when the exception is caught. This makes the integration tests more robust when many test schemas are created and deleted in quick succession, and it introduces no new functionality. However, the change has not been thoroughly tested.
Fixed infinite recursion when checking module of expression (#2159). This release fixes an infinite recursion that occurred when checking the module of an expression. When appending trees, the append_statements method no longer overwrites the existing statements for a global; it extends the existing list of statements for that global with the new values. This makes module checks more accurate and prevents the infinite recursion. Unit tests have been added to verify both the recursion fix and the appending behavior. This change was a collaborative effort with Eric Vergnaud.
Fixed parsing unsupported magic syntax (#2157). This update fixes a crash that occurred when parsing unsupported magic syntax in a notebook's source code. The _read_notebook_path function in the cells.py file now obtains the start variable, which marks the position of the command in a line, with the find() method instead of the index() method. Unlike index(), find() returns -1 rather than raising ValueError when the substring is absent, so the parser no longer crashes on magic syntax it does not support. The commit also includes a manual test to confirm the fix, which addresses one of the two reported issues.
Infer values from child notebook in magic line (#2091). This commit introduces improvements to the notebook linter for enhanced value inference during linting. By utilizing values from child notebooks loaded via the %run magic line, the linter can now provide more accurate suggestions and error detection. The FileLinter class has been updated to include a session_state parameter, allowing it to access variables and objects defined in child notebooks. New methods such as append_tree(), append_nodes(), and append_globals() have been added to the BaseLinter class for better code tree manipulation, enabling more accurate linting of combined code trees. Additionally, unit tests have been added to ensure the correct behavior of this feature. This change addresses issue #1201 and progresses issue #1901.
Updated databricks-labs-lsql requirement from ~=0.5.0 to >=0.5,<0.7 (#2160). The version constraint for the databricks-labs-lsql library has been relaxed from ~=0.5.0, which accepts only 0.5.x releases, to >=0.5,<0.7, so the project can also pick up the features and bug fixes in the 0.6.x releases while maintaining compatibility with the existing codebase. For reference, the release notes for databricks-labs-lsql version 0.6.0 have been included in the commit.
Whitelist phonetics (#2163). This release adds the phonetics library to the known.json configuration file, covering five modules: phonetics, phonetics.metaphone, phonetics.nysiis, phonetics.soundex, and phonetics.utils. The change has been manually tested and progresses issue #1901.
Whitelist pydantic (#2162). In this release, we have added the Pydantic library to the known.json file, which manages our project's third-party libraries. Pydantic is a data validation library for Python that allows developers to define data models and enforce type constraints. With this change, Pydantic and its submodules are whitelisted and are no longer flagged as unknown libraries.
Whitelist statsmodels (#2161). The statsmodels library has been whitelisted by adding it to the project's configuration file. Statsmodels is a Python library for statistics and econometrics that offers tools for statistical modeling, testing, and visualization. The change does not affect existing functionality, and a test has been included to verify the addition.
Whitelist dbignite (#2132). This commit whitelists dbignite in the known.json file and adds a set of codes and messages related to the use of RDD APIs on UC Shared Clusters and to the change in the default format from Parquet to Delta in Databricks Runtime 8.0. The affected components include dbignite.fhir_mapping_model, dbignite.fhir_resource, dbignite.hosp_feeds, dbignite.hosp_feeds.adt, dbignite.omop, dbignite.omop.data_model, dbignite.omop.schemas, dbignite.omop.utils, and dbignite.readers. The codes and messages provide information and warnings about the use of the specified APIs on UC Shared Clusters and about the change in default format. No new methods have been added, and no existing functionality has been changed.
Whitelist duckdb (#2134). In this release, we have whitelisted the DuckDB library by adding it to the known.json file. DuckDB is an in-memory analytical database written in C++. The addition includes several modules such as adbc_driver_duckdb, duckdb.bytes_io_wrapper, duckdb.experimental, duckdb.filesystem, duckdb.functional, and duckdb.typing. Of particular note is the duckdb.experimental.spark.sql.session module, which carries the table-migrate code and a message about the change in the default format from Parquet to Delta in Databricks Runtime 8.0. The commit also includes tests that have been manually verified.
Whitelist fs (#2136). In this release, we have added the fs package to the known.json file. The fs package contains a wide range of modules and sub-packages, including fs._bulk, fs.appfs, fs.base, fs.compress, fs.copy, fs.error_tools, fs.errors, fs.filesize, fs.ftpfs, fs.glob, fs.info, fs.iotools, fs.lrucache, fs.memoryfs, fs.mirror, fs.mode, fs.mountfs, fs.move, fs.multifs, fs.opener, fs.osfs, fs.path, fs.permissions, fs.subfs, fs.tarfs, fs.tempfs, fs.time, fs.tools, fs.tree, fs.walk, fs.wildcard, fs.wrap, fs.wrapfs, and fs.zipfs. These additions address issue #1901 and have been manually tested.
Whitelist httpx (#2139). In this release, we have updated the known.json file to include the httpx library along with all its submodules. The change affects only the library whitelist: it introduces no new methods or functions and does not alter existing functionality. The change has been manually tested.
Whitelist jsonschema and jsonschema-specifications (#2140). In this release, we have updated the known.json file to whitelist the jsonschema and jsonschema-specifications libraries. The jsonschema library is used for schema validation, while the jsonschema-specifications library offers additional specifications for the jsonschema library. With both libraries in known.json, they are recognized as known dependencies and are no longer flagged as unknown. This change addresses issue #1901 and does not introduce any new functionality or tests.
Whitelist pickleshare (#2141). This commit whitelists Pickleshare, a Python module for managing persistent data structures, in the known.json file. The change aligns with issue #1901. The Pillow module is already included in the whitelist. No new functionality has been introduced, and existing functionality remains unchanged.
Whitelist referencing (#2142). This commit whitelists the referencing library by adding a referencing section to the known.json file. The new section contains several entries, including referencing._attrs, referencing._core, referencing.exceptions, referencing.jsonschema, referencing.retrieval, and referencing.typing, all of which are initially empty. The change addresses issue #1901 and has been manually tested. It was co-authored by Eric Vergnaud.
Whitelist slicer (#2143). The slicer library has been whitelisted by adding an entry for it to the known.json file, where the whitelist is kept as a JSON object. A test has been included to verify the whitelist entry. No new methods were added and existing functionality remains unchanged. This change was co-authored by Eric Vergnaud.
Whitelist sparse (#2144). In this release, we have whitelisted the sparse module by adding it to the known.json file. The module encompasses sub-modules and components such as _common, _compressed, _coo, _dok, _io, _numba_extension, _settings, _slicing, _sparse_array, _umath, _utils, finch_backend, and numba_backend; the numba_backend sub-module includes further sub-components. The change references issue #1901 for additional context. Comprehensive testing has been carried out to verify the whitelisting.
Whitelist splink (#2145). In this release, we have added the splink library, which includes various modules and functions for entity resolution and data linking, to our known.json file. This change is in line with issue #190
Whitelist toolz (#2146). In this release, we have whitelisted the toolz library and added it to the known.json file. The toolz library is a collection of functional utilities, compatible with CPython, PyPy, Jython, and IronPython, and is a port of various modules from Python's standard library and other open-source packages. The newly added modules include tlz, toolz, toolz._signatures, toolz._version, toolz.compatibility, toolz.curried, toolz.dicttoolz, toolz.functoolz, toolz.itertoolz, toolz.recipes, toolz.sandbox, toolz.sandbox.core, toolz.sandbox.parallel, and toolz.utils. These changes have been manually tested and may address issue #1901.
Whitelist xmod (#2147). In this release, we have whitelisted xmod in the known.json file by adding a new key for xmod with an empty array as its initial value. The change does not alter existing functionality and has been manually tested. This update progresses issue #1901.
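
The parsing fix in #2157 relies on the difference between Python's str.index() and str.find(). The sketch below illustrates only that difference; the helper name command_start and the sample lines are hypothetical and are not taken from cells.py.

```python
def command_start(line: str, command: str) -> int:
    """Return the position of `command` in `line`, or -1 when it is absent.

    Hypothetical helper for illustration only; it is not the actual
    _read_notebook_path implementation.
    """
    # str.find() returns -1 on a miss, whereas str.index() raises ValueError.
    return line.find(command)


supported = "%run ./child_notebook"
unsupported = "%sql SELECT 1"

print(command_start(supported, "%run"))    # 0
print(command_start(unsupported, "%run"))  # -1

# The same lookup with index() raises instead of returning a sentinel.
try:
    unsupported.index("%run")
except ValueError:
    print("index() raised ValueError")
```

A caller can then treat a negative start as "no supported command on this line" rather than handling an exception.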

Dependency updates:

Updated databricks-labs-lsql requirement from ~=0.5.0 to >=0.5,<0.7 (#2160).
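
To make the effect of the relaxed constraint concrete, the sketch below checks a few release numbers against both specifiers by hand. It is a simplified illustration that handles only plain numeric versions such as 0.5.3; the function names are hypothetical, and real tooling such as pip evaluates specifiers with its own, more complete rules.

```python
def parse(version: str) -> tuple[int, ...]:
    """Split a plain numeric version like '0.5.3' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def matches_old(version: str) -> bool:
    """Approximate ~=0.5.0, which is equivalent to >=0.5.0 and ==0.5.*."""
    parsed = parse(version)
    return parsed >= (0, 5, 0) and parsed[:2] == (0, 5)


def matches_new(version: str) -> bool:
    """Approximate >=0.5,<0.7."""
    parsed = parse(version)
    return (0, 5) <= parsed < (0, 7)


for candidate in ["0.4.9", "0.5.0", "0.5.3", "0.6.0", "0.7.0"]:
    print(candidate, matches_old(candidate), matches_new(candidate))
```

Under these assumptions, the only releases that change status are the 0.6.x ones, which the old constraint rejected and the new one accepts.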

Contributors: @ericvergnaud, @dependabot[bot]
