Databricks-labs-ucx

Latest version: v0.57.0


0.57.0

* Convert UCX job ids to `int` before passing to `JobsCrawler` ([3816](https://github.com/databrickslabs/ucx/issues/3816)). In this release, we have addressed issue [#3722](https://github.com/databrickslabs/ucx/issues/3722) by making the `jobs_crawler` method handle job IDs more robustly. Previously, job IDs were passed directly to the `exclude_job_ids` parameter, which could cause issues when they were not integers. The `jobs_crawler` method now converts all job IDs to integers using a list comprehension before passing them on, ensuring that only valid integer job IDs are used. The commit includes a manual test confirming the correct behavior of this modification.
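The conversion described above can be sketched as follows; the helper name `coerce_job_ids` is hypothetical, but the list-comprehension shape matches the fix described in the entry:

```python
def coerce_job_ids(raw_ids: list) -> list[int]:
    """Convert job ids (which may arrive as strings) to integers."""
    # Mirrors the fix: int() each id before handing the list to JobsCrawler.
    return [int(job_id) for job_id in raw_ids]
```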
* Exclude UCX jobs from crawling ([3733](https://github.com/databrickslabs/ucx/issues/3733)). In this release, we have made modifications to the `JobsCrawler` and the existing `assessment` workflow to exclude UCX jobs from crawling, avoiding confusion for users when they appear in assessment reports. This change addresses issues [#3656](https://github.com/databrickslabs/ucx/issues/3656) and [#3722](https://github.com/databrickslabs/ucx/issues/3722), and is a follow-up to previous issue [#3732](https://github.com/databrickslabs/ucx/issues/3732). We have also incorporated updates from pull requests [#3767](https://github.com/databrickslabs/ucx/issues/3767) and [#3759](https://github.com/databrickslabs/ucx/issues/3759) to improve integration tests and linting. Additionally, a retry mechanism has been added to wait for grants to exist before crawling, addressing issue [#3758](https://github.com/databrickslabs/ucx/issues/3758). The changes include the addition of unit and integration tests to ensure the correctness of the modifications. A new `exclude_job_ids` parameter has been added to the `JobsCrawler` constructor, which is initialized with the list of UCX job IDs, ensuring that UCX jobs are not included in the assessment report. The `_list_jobs` method now excludes jobs based on the provided `exclude_job_ids` and `include_job_ids` arguments. The `_crawl` method now uses the `_list_jobs` method to list the jobs to be crawled. The `_assess_jobs` method has been updated to take into account the exclusion of specific job IDs. The `test_grant_detail` file, an integration test for the Hive Metastore grants functionality, has been updated to include a retry mechanism to wait for grants to exist before crawling and to check if the SELECT permission on ANY FILE is present in the grants.
* Let `WorkflowLinter.refresh_report` lint jobs from `JobsCrawler` ([3732](https://github.com/databrickslabs/ucx/issues/3732)). In this release, the `WorkflowLinter.refresh_report` method has been updated to lint jobs from the `JobsCrawler` class, ensuring that only jobs within the scope of the crawler are processed. This change resolves issue [#3662](https://github.com/databrickslabs/ucx/issues/3662) and progresses issue [#3722](https://github.com/databrickslabs/ucx/issues/3722). The workflow linting code, the `assessment` workflow, and the `JobsCrawler` class have been modified. The `JobsCrawler` class now includes a `snapshot` method, which is used in the `WorkflowLinter.refresh_report` method to retrieve necessary data about jobs. Unit and integration tests have been updated correspondingly, with the integration test for workflows now verifying that all rows returned from a query to the `workflow_problems` table have a valid `path` field. The `WorkflowLinter` constructor now includes an instance of `JobsCrawler`, allowing for more targeted linting of jobs. The introduction of the `JobsCrawler` class enables more efficient and precise linting of jobs, improving the overall accuracy of workflow assessment.
* Let dashboard name adhere to naming convention ([3789](https://github.com/databrickslabs/ucx/issues/3789)). In this release, the naming convention for dashboard names in the `ucx` library has been enforced, restricting them to alphanumeric characters, hyphens, and underscores. Any non-conforming characters in existing dashboard names are replaced with hyphens or underscores, addressing several issues ([#3761](https://github.com/databrickslabs/ucx/issues/3761) through [#3788](https://github.com/databrickslabs/ucx/issues/3788)). A temporary fix, marked with a TODO comment, has been added to the `_create_dashboard` method to ensure newly created dashboard names adhere to the new convention. This release also resolves a test failure in a specific GitHub Actions run and addresses a total of 29 issues. The commit includes the deletion of the `02_0_owner.filter.yml` file, and all changes have been manually tested.
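The character replacement described above can be approximated as below; the real code may map some characters to hyphens rather than underscores, so treat this as an illustrative sketch:

```python
import re


def normalize_dashboard_name(name: str) -> str:
    """Replace any character outside [A-Za-z0-9_-] with an underscore."""
    return re.sub(r"[^a-zA-Z0-9_-]", "_", name)
```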
* Partial revert `Let dashboard name adhere to naming convention` ([3794](https://github.com/databrickslabs/ucx/issues/3794)). In this release, we have partially reverted a previous change to the migration progress dashboard, reintroducing the owner filter. This change was made in response to feedback from users who found the previous modification to the dashboard less intuitive. The new owner filter has been defined in a new file, '02_0_owner.filter.yml', which includes the title, column name, type, and width of the filter. To ensure proper functionality, this change requires the release of lsql after merging. The change has been thoroughly tested to guarantee its correct operation and to provide the best possible user experience.
* Partial revert `Let dashboard name adhere to naming convention` ([3795](https://github.com/databrickslabs/ucx/issues/3795)). In this release, we have partially reversed a previous change that enforced a naming convention for dashboard names, allowing the use of special characters such as spaces and brackets again. The `_create_dashboard` method in the `install.py` file and the `_name` method in the `mixins.py` file have been updated to reflect this change, affecting the migration progress dashboard. The `display_name` attribute of the `metadata` object has been updated to use the original format, which may include special characters. The `reference` variable has also been updated accordingly. The functions `created_job_tasks` and `created_job` have been updated to use the new naming convention when retrieving installation jobs with specific names. These changes have been manually tested and the tests have been verified to work correctly after the reversion. This change is related to issues [#3799](https://github.com/databrickslabs/ucx/issues/3799), [#3789](https://github.com/databrickslabs/ucx/issues/3789), and reverts commit 048bc8f9df27.
* Put back dashboard names ([3808](https://github.com/databrickslabs/ucx/issues/3808)). In the lsql release v0.16.0, the naming convention for dashboards has been updated to support non-alphanumeric characters in the dashboard names. This change modifies the `_create_dashboard` function in `install.py` and the `_name` method in `mixins.py` to create dashboard names with a format like `[UCX] assessment (Main)`, which includes parent and child folder names. This update addresses issues reported in tickets [#3797](https://github.com/databrickslabs/ucx/issues/3797) and [#3790](https://github.com/databrickslabs/ucx/issues/3790), and partially reverses previous changes made in commits 4017a2525ecd and 834ef14a4cffe891. The functionality of other methods remains unchanged. With this release, the `created_job_tasks` and `created_job` functions now accept dashboard names with non-alphanumeric characters as input.
* Updated databricks-labs-lsql requirement from <0.15,>=0.14.0 to >=0.14.0,<0.17 ([3801](https://github.com/databrickslabs/ucx/issues/3801)). In this update, the required version range of the `databricks-labs-lsql` package has been widened from at least 0.14.0 and below 0.15 to at least 0.14.0 and below 0.17. This change allows the latest version of the package, which includes various bug fixes and dependency updates, to be used. The package is exercised by the acceptance tests that run as part of the CI/CD pipeline, so those tests now execute against the most recent version, resulting in enhanced functionality and reliability.
* Updated databricks-sdk requirement from <0.42,>=0.40 to >=0.44,<0.45 ([3686](https://github.com/databrickslabs/ucx/issues/3686)). In this release, we have updated the version requirement for the `databricks-sdk` package to be greater than or equal to 0.44.0 and less than 0.45.0. This update allows for the use of the latest version of the `databricks-sdk`, which includes new methods, fields, and bug fixes. For instance, the `get_message_query_result_by_attachment` method has been added for the `w.genie.workspace_level_service`, and several fields such as `review_state`, `reviews`, and `runner_collaborators` have been removed for the `databricks.sdk.service.clean_rooms.CleanRoomAssetNotebook` object. Additionally, the `securable_kind` field has been removed for various objects such as `CatalogInfo` and `ConnectionInfo`. We recommend thoroughly testing this update to ensure compatibility with your project. The release notes for versions 0.44.0 and 0.43.0 can be found in the commit history. Please note that there are several backward-incompatible changes listed in the changelog for both versions.

Dependency updates:

* Updated databricks-labs-lsql requirement from <0.15,>=0.14.0 to >=0.14.0,<0.17 ([3801](https://github.com/databrickslabs/ucx/pull/3801)).
* Updated databricks-sdk requirement from <0.42,>=0.40 to >=0.44,<0.45 ([3686](https://github.com/databrickslabs/ucx/pull/3686)).

0.56.0

* Added documentation to use Delta Live Tables migration ([3587](https://github.com/databrickslabs/ucx/issues/3587)). In this documentation update, we introduce a new section for migrating Delta Live Table pipelines to the Unity Catalog as part of the migration process. This workflow allows for the original and cloned pipelines to run independently after the cloned pipeline reaches the `RUNNING` state. The update includes an example of stopping and renaming an existing HMS DLT pipeline, and creating a new cloned pipeline. Additionally, known issues and limitations are outlined, such as supported streaming sources, maintenance pausing, and querying by timestamp. To streamline the migration process, the `migrate-dlt-pipelines` command is introduced with optional parameters for including or excluding specific pipeline IDs. This feature is intended for developers and administrators managing data pipelines and handling table aliasing issues. Relevant user documentation has been added and the changes have been manually tested.
* Added support for MSSQL and POSTGRESQL to HMS Federation ([3701](https://github.com/databrickslabs/ucx/issues/3701)). In this enhancement, the open-source library now supports Microsoft SQL Server (MSSQL) and PostgreSQL databases in the Hive Metastore Federation (HMS Federation) feature. This update introduces classes for handling external Hive Metastore instances and their versions, and refactors a regex pattern for better support of various JDBC URL formats. A new `supported_databases_port` class variable is added to map supported databases to default ports, allowing the code to handle SQL Server's distinct default port. Additionally, a `supported_hms_versions` class variable is created, outlining supported Hive Metastore versions. The `_external_hms` method is updated to extract HMS version information more accurately, and the `_split_jdbc_url` method is refactored for better URL format compatibility and parameter extraction. The test file `test_federation.py` has been updated with new unit tests for external catalog creation with MSSQL and PostgreSQL, further enhancing compatibility with various databases and expanding HMS Federation's capabilities.
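A rough sketch of the port-defaulting and URL-splitting behaviour described above; the regex, the contents of the port table, and the function signature are illustrative, not the library's actual implementation:

```python
import re

# Illustrative default-port map; SQL Server's default differs from the others.
SUPPORTED_DATABASE_PORTS = {"mysql": 3306, "postgresql": 5432, "sqlserver": 1433}


def split_jdbc_url(url: str) -> tuple[str, str, int, str]:
    """Split a JDBC URL into (database type, host, port, database name)."""
    match = re.match(
        r"jdbc:(?P<db>\w+)://(?P<host>[^:/]+)(:(?P<port>\d+))?/(?P<database>\w+)", url
    )
    if not match:
        raise ValueError(f"Unsupported JDBC URL: {url}")
    db = match.group("db")
    # Fall back to the database's default port when the URL omits one.
    port = int(match.group("port") or SUPPORTED_DATABASE_PORTS[db])
    return db, match.group("host"), port, match.group("database")
```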
* Added the CLI command for migrating DLT pipelines ([3579](https://github.com/databrickslabs/ucx/issues/3579)). A new CLI command, "migrate-dlt-pipelines," has been added for migrating DLT pipelines from HMS to UC using the DLT Migration API. This command allows users to include or exclude specific pipeline IDs during migration using the `--include-pipeline-ids` and `--exclude-pipeline-ids` flags, respectively. The change impacts the `PipelinesMigrator` class, which has been updated to accept and use these new parameters. Currently, there is no information available about testing, but the changes are expected to be manually tested and accompanied by corresponding unit and integration tests in the future. The changes are isolated to the `PipelinesMigrator` class and related functionality, with no impact on existing methods or functionality.
* Addressed Bug with Dashboard migration ([3663](https://github.com/databrickslabs/ucx/issues/3663)). In this release, the `_crawl` method in `dashboards.py` has been enhanced to exclude SDK dashboards that lack IDs during the dashboard migration process. This modification enhances migration efficiency by avoiding unnecessary processing of incomplete dashboards. Additionally, the `_list_dashboards` method now includes a check for dashboards with no IDs while iterating through the `dashboards_iterator`. If a dashboard with no ID is found, the method fetches the dashboard details using the `_get_dashboard` method and adds them to the `dashboards` list, ensuring proper processing. Furthermore, a bug fix for issue [#3663](https://github.com/databrickslabs/ucx/issues/3663) has been implemented in the `RedashDashboardCrawler` class in `assessment/test_dashboards.py`. The `get` method has been added as a side effect to the `WorkspaceClient` mock's `dashboards` attribute, enabling the retrieval of individual dashboard objects by their IDs. This modification ensures that the `RedashDashboardCrawler` can correctly retrieve and process dashboard objects from the `WorkspaceClient` mock, preventing errors due to missing dashboard objects.
* Broaden safe read text caught exception scope ([3705](https://github.com/databrickslabs/ucx/issues/3705)). In this release, the `safe_read_text` function has been enhanced to handle a broader range of exceptions that may occur while reading a text file, including `OSError` and `UnicodeError`, making it more robust and safe. The function previously caught specific exceptions such as `FileNotFoundError`, `UnicodeDecodeError`, and `PermissionError`. Additionally, the codebase has been improved with updated unit tests, ensuring that the new functionality works correctly. The linting parts of the code have also been updated, enhancing the readability and maintainability of the project for other software engineers. A new method, `safe_read_text`, has been added to the `source_code` module, with several new test cases designed to ensure that the method handles edge cases correctly, such as when the file does not exist, when the path is a directory, or when an OSError occurs. These changes make the open-source library more reliable and robust for various use cases.
* Case sensitive/insensitive table validation ([3580](https://github.com/databrickslabs/ucx/issues/3580)). In this release, the library has been updated to enable more flexible and customizable metadata comparison for tables. A case sensitive flag has been introduced for metadata comparison, which allows for consideration or ignoring of column name case during validation. The `TableMetadataRetriever` abstract base class now includes a new parameter `column_name_transformer` in the `get_metadata` method, which is a callable that can be used to transform column names as needed for comparison. Additionally, a new `case_sensitive` parameter has been added to the `StandardSchemaComparator` constructor to determine whether column names should be compared case sensitively or not. A new parametrized test function `test_schema_comparison_case` has also been included to ensure that this functionality works as expected. These changes provide users with more control over the metadata comparison process and improve the library's handling of cases where column names in the source and target tables may have different cases.
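The case-sensitivity switch amounts to choosing a column-name transformer before comparing; `columns_match` below is an illustrative stand-in for the `StandardSchemaComparator` behaviour, not its actual code:

```python
def columns_match(source: list[str], target: list[str], case_sensitive: bool = True) -> bool:
    """Compare column names, optionally ignoring case via a transformer."""
    transform = (lambda name: name) if case_sensitive else str.lower
    return [transform(c) for c in source] == [transform(c) for c in target]
```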
* Catch `AttributeError` in `InferredValue._safe_infer_internal` ([3684](https://github.com/databrickslabs/ucx/issues/3684)). In this release, we have implemented a change to the `_safe_infer_internal` method in the `InferredValue` class to catch `AttributeError`. This change addresses an issue in the Astroid library reported in their GitHub repository (<https://github.com/pylint-dev/astroid/issues/2683>) and resolves issue [#3659](https://github.com/databrickslabs/ucx/issues/3659) in our project. By handling `AttributeError` during the inference process, we have made the code more robust and safer. When an exception occurs, an error message is logged with debug-level logging, and the method yields the `Uninferable` sentinel value to indicate that inference failed for the node. This enhancement strengthens the source code linting code through value inference in our open-source library.
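The guard can be sketched as follows; `UNINFERABLE` stands in for astroid's `Uninferable` sentinel and `safe_infer` for the real `_safe_infer_internal` method:

```python
import logging

logger = logging.getLogger(__name__)
UNINFERABLE = object()  # stand-in for astroid's Uninferable sentinel


def safe_infer(node):
    """Yield inferred values; yield the sentinel instead of raising on failure."""
    try:
        yield from node.infer()
    except AttributeError as e:  # works around pylint-dev/astroid#2683
        logger.debug("Inference failed for node: %s", e)
        yield UNINFERABLE
```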
* Document to run `validate-groups-membership` before groups migration, not after ([3631](https://github.com/databrickslabs/ucx/issues/3631)). In this release, we have updated the order of executing the `validate-groups-membership` command in the group migration process. Previously, the command was recommended to be run after the groups migration, but it has been updated to be executed before the migration. This change ensures that the groups have the correct membership and the number of groups and users in the workspace and account are the same before migration, providing an extra level of safety. Additionally, we have updated the `remove-workspace-local-backup-groups` command to remove workspace-level backup groups and their permissions only after confirming the successful migration of all groups. We have also updated the spelling of the `validate-group-membership` command to `validate-groups-membership` in a documentation file. This release is aimed at software engineers who are adopting the project and looking to migrate their groups to the account level.
* Extend code migration progress documentation ([3588](https://github.com/databrickslabs/ucx/issues/3588)). In this documentation update, we have added two new sections, `Code Migration` and "Final details," to the open-source library's migration process documentation. The `Code Migration` section provides a detailed walkthrough of the steps to migrate code after completing table migration and data reconciliation, including using the linter to investigate compatibility issues and linted workspace resources. The "[linter advices](/docs/reference/linter_codes)" provide codes and messages on detected issues and resolution methods. The migrated code can then be prioritized and tracked using the `migration-progress` dashboard, and migrated using the `migrate-` commands. The `Final details` section outlines the steps to take once code migration is complete, including running the `cluster-remap` command to remap clusters to be Unity Catalog compatible. This update resolves issue [#2231](https://github.com/databrickslabs/ucx/issues/2231) and includes updated user documentation, with new methods for linting and migrating local code, managing dashboard migrations, and syncing workspace information. Additional commands for creating and validating table mappings, migrating locations, and assigning metastores are also included, with the aim of improving the code migration process by providing more detailed documentation and new commands for managing the migration.
* Fixed Skip/Unskip schema functionality ([3567](https://github.com/databrickslabs/ucx/issues/3567)). In this release, we have addressed the improper handling of skip/unskip schema functionality in our open-source library. The `skip_schema` and `unskip_schema` methods in the `mapping.py` file have been updated to include the `hive_metastore` schema prefix while setting or unsetting the database property that determines whether a schema should be skipped. Additionally, the `_get_database_in_scope_task` and `_get_table_in_scope_task` methods have been modified to parse table properties as a dictionary, allowing for more straightforward lookup of the skip property for a table. The `test_skip_with_schema` and `test_unskip_with_schema` methods in the `tests/unit/test_cli.py` file have also been updated. The `test_skip_with_schema` method now includes the catalog name `hive_metastore` in the `ALTER SCHEMA` statement, ensuring that the schema is properly skipped. The `test_unskip_with_schema` method has been modified to use the `SET DBPROPERTIES` statement to set the value of the `databricks.labs.ucx.skip` property to `false`, effectively unskipping the schema. Furthermore, the `execute` method in the `sbe` module and the queries in the `mock_backend` module have been updated to match the new commands. These changes address the issue of improperly skipping schemas and ensure that the code functions as intended, allowing users to skip and unskip schemas as needed. Overall, these modifications improve the reliability and correctness of the skip/unskip schema functionality, ensuring that it behaves as expected in different scenarios.
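The statements described above roughly take the following shape; the exact SQL emitted by UCX may differ, so the builders below are an approximation keyed on the details quoted in the entry (the `hive_metastore` prefix and the `databricks.labs.ucx.skip` property):

```python
def skip_schema_sql(schema: str) -> str:
    # The missing hive_metastore catalog prefix was the original bug.
    return (
        f"ALTER SCHEMA hive_metastore.{schema} "
        "SET DBPROPERTIES ('databricks.labs.ucx.skip' = true)"
    )


def unskip_schema_sql(schema: str) -> str:
    return (
        f"ALTER SCHEMA hive_metastore.{schema} "
        "SET DBPROPERTIES ('databricks.labs.ucx.skip' = false)"
    )
```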
* Fixed `Total Tables` widget in assessment to only show table counts ([3738](https://github.com/databrickslabs/ucx/issues/3738)). In this release, we have addressed the issue with the `Total Tables` widget in the assessment dashboard as part of resolving [#3738](https://github.com/databrickslabs/ucx/issues/3738) and in relation to [#3252](https://github.com/databrickslabs/ucx/issues/3252). The revised `00_3_count_total_tables.sql` query in the `src/databricks/labs/ucx/queries/assessment/main/` directory now includes a WHERE clause to filter out views from the table count query. By excluding views and only displaying table counts in the `Total Tables` widget, the scope of changes is limited to the SQL query itself. The diff reflects the addition of the WHERE clause and necessary indentation. The commit has been manually tested as part of our quality assurance process, and the successful test results are documented in the `Tests` section of the commit message.
* Fixed broken anchor for doc release ([3720](https://github.com/databrickslabs/ucx/issues/3720)). In this release, we have developed and implemented fixes to address issues with the Databricks workflows documentation used in the migration process. The previous version contained a broken anchor reference for the workflow process, which has now been corrected. This improvement includes the addition of a manual test to verify the fix. The revised documentation enables users to view the status of deployed workflows and rerun failed workflows using the `workflows` and `repair-run` commands, respectively. These updates simplify the management and troubleshooting of workflows, enhancing the overall user experience.
* Fixed broken anchors in documentation ([3712](https://github.com/databrickslabs/ucx/issues/3712)). In this release, we have made significant improvements to the UCX process documentation, addressing issues related to broken anchors, outdated command names, and syntax. The commands `enable_hms_federation` and `create_federated_catalog` have been renamed to `enable-hms-federation` and `create-federated-catalog`, respectively. These updates include corresponding changes to the command syntax and have been manually tested to ensure accuracy. Additionally, we have added a new command, `validate-groups-membership`, which can be executed prior to the group migration workflow for added confidence. In case of no matching account group in the UCX-installed workspace, the `create-account-groups` command is now available. This release also includes updates to the section titles and links to enhance clarity and reflect current functionality.
* Fixed notebook sources with `NotebookLinter.apply` ([3693](https://github.com/databrickslabs/ucx/issues/3693)). A new `github.py` file has been added to the `databricks/labs/ucx/` directory, providing functionality for working with GitHub issues. It includes an `IssueType` enum, a `construct_new_issue_url` function, and constants for constructing URLs to the documentation and GitHub repository. The `NotebookLinter` class has been updated to include notebook fixing functionality, and the `PythonLinter` class has been introduced to run `apply` on an abstract syntax tree. The `Notebook.apply` method has been implemented to apply changes to notebook sources, and the legacy `NotebookMigrator` has been removed. These changes also include various unit and integration tests and modifications to the existing `databricks labs ucx migrate-local-code` command. The `DOCS_URL` constant has been added to the `databricks.labs.ucx.github` module, and the error message for external metastore connectivity issues now includes a link to the UCX installation instructions in the documentation.
* Fixed the broken documentation links in dashboards ([3726](https://github.com/databrickslabs/ucx/issues/3726)). This revision updates documentation links in various dashboards to correct broken links and enhance the user experience. Specifically, it addresses issues [#3725](https://github.com/databrickslabs/ucx/issues/3725) and [#3726](https://github.com/databrickslabs/ucx/issues/3726) by updating links in the "Assessment Overview," "Assessment Summary," and `Compute summary` dashboards, as well as the `group migration` and `table upgrade` documentation. The changes include replacing local Markdown file links with online documentation links and updating links to point to the correct documentation sections in the UCX GitHub repository. Although the changes have been manually tested, no unit or integration tests have been added, and staging environment verification has not been performed. Despite this, the revisions ensure accurate and up-to-date documentation links, improving the usability of the dashboards.
* Force `MaybeDependency` to have a `Dependency` OR `list[Problem]`, not neither nor both ([3635](https://github.com/databrickslabs/ucx/issues/3635)). This commit enforces the `MaybeDependency` object to have either a `Dependency` or a `list[Problem]`, but not neither or both, in order to handle known libraries during import registration. It resolves issue [#3585](https://github.com/databrickslabs/ucx/issues/3585), breaks up issue [#3626](https://github.com/databrickslabs/ucx/issues/3626), and progresses issue [#1527](https://github.com/databrickslabs/ucx/issues/1527), while modifying code linting logic and updating unit tests to accommodate these changes. Specifically, new classes like `KnownLoader`, `KnownDependency`, and `KnownProblem` have been introduced, and the `_resolve_allow_list` method has been updated to reflect the new enforcement. Additionally, tests have been added and modified to ensure the correct behavior of the modified logic, with a focus on handling directories, resolving children in context, and detecting known problems in imported libraries.
* HMS Federation Documentation ([3688](https://github.com/databrickslabs/ucx/issues/3688)). The HMS Federation feature allows Hive Metastore (HMS) to be federated to a catalog, acting as a step towards migrating to Unity Catalog or as a hybrid solution where both HMS and UC access to the data is required. This feature provides an alternative to the table migration process, eliminating the need for table mapping, creating catalogs and schemas, and migrating Hive metastore data objects. The `enable-hms-federation` command enables the Hive Metastore federation process, while the `create-federated-catalog` command creates a UC catalog that mirrors all the schemas and tables in the source Hive Metastore. The `migrate-glue-credentials` command, which is AWS-only, creates a UC Service Credential for GLUE. These new commands are documented in the HMS Federation Documentation section and are now part of the migration process documentation with the data reconciliation step following it.
* Make `MaybeTree` the main Python AST entrypoint for constructing the syntax tree ([3550](https://github.com/databrickslabs/ucx/issues/3550)). In this release, the main entry point for constructing the Python AST syntax tree has been changed from `Tree` to `MaybeTree` in the open-source library. This change involves moving class methods and static methods that construct a `MaybeTree` from the `Tree` class to the `MaybeTree` class, and making the class method that normalizes the source code before parsing the only entry point. The `normalized_parse` method has been renamed to `from_source_code` to match the commonly used naming for class methods within UCX. The `walk` and `first_statement` methods have been removed from `MaybeTree` as they were repetitions from `Tree`'s methods. These changes aim to enforce normalization and improve code consistency. Additionally, unit tests have been added and the Python linting related code has been modified to work with the new `MaybeTree` class. This change resolves issues [#3457](https://github.com/databrickslabs/ucx/issues/3457) and [#3213](https://github.com/databrickslabs/ucx/issues/3213).
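The single-entry-point pattern looks roughly like this; the sketch uses the stdlib `ast` module where UCX uses astroid, and the normalization step is a placeholder for whatever the real `from_source_code` does:

```python
import ast


class MaybeTree:
    """Holds either a parsed tree or a parse failure message, never both."""

    def __init__(self, tree=None, failure=None):
        self.tree, self.failure = tree, failure

    @classmethod
    def from_source_code(cls, code: str) -> "MaybeTree":
        # Normalize the source before parsing: the class method is the ONLY
        # entry point, so normalization can never be skipped by callers.
        normalized = code.replace("\r\n", "\n")  # placeholder normalization
        try:
            return cls(tree=ast.parse(normalized))
        except SyntaxError as e:
            return cls(failure=str(e))
```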
* Make fixer diagnostic codes unique ([3582](https://github.com/databrickslabs/ucx/issues/3582)). This commit modifies the `databricks labs ucx migrate-local-code` command to make fixer diagnostic codes unique, ensuring accurate code migration and fixing. Two new methods have been added for modifying and adding unit and integration tests. Diagnostic codes for the `table-migrated-to-uc` issue are now unique depending on the context where the table is referenced: SQL, Python, or Python-SQL. This ensures the appropriate fixer is applied when addressing code migration issues, improving overall functionality and user experience. Additionally, the commit updates the documentation to include the new postfixes for the `table-migrated-to-uc` linter code and their descriptions, making it clearer for developers to diagnose and resolve issues related to table migration.
* Removed the linting false positive for missing table format warning when using `spark.table` ([3589](https://github.com/databrickslabs/ucx/issues/3589)). In this release, linting false positives related to missing table format warnings when using `spark.table` have been addressed, resolving issue [#3545](https://github.com/databrickslabs/ucx/issues/3545). The linting logic and unit tests have been updated to handle changes in the default format for table references in Databricks Runtime 8.0, which now uses Delta as the default format. These changes improve the accuracy of the linting process, reducing unnecessary warnings and enhancing the overall developer experience. Additionally, the `test_linting_walker_populates_paths` unit test in the `test_jobs.py` file has been updated to use a different file path for testing.
* Removed tree from `PythonSequentialLinter` ([3535](https://github.com/databrickslabs/ucx/issues/3535)). In this release, the `PythonSequentialLinter` has been refactored to no longer manipulate the code tree, and instead, the tree manipulation logic has been moved to `NotebookLinter`. This change improves the separation of concerns between the two components, resulting in a more modular and maintainable codebase. The `NotebookLinter` now handles early failure when resolving the code used by a notebook and attaches `%run` notebook trees as a child tree to the cell that calls the notebook. The code linting functionality has been modified, and the `databricks labs ucx lint-local-code` command has been updated. These changes resolve [#3543](https://github.com/databrickslabs/ucx/issues/3543) and progress [#3514](https://github.com/databrickslabs/ucx/issues/3514) and are dependent on PRs [#3529](https://github.com/databrickslabs/ucx/issues/3529) and [#3550](https://github.com/databrickslabs/ucx/issues/3550). The changes have been manually tested and include added and modified unit tests. Additionally, the `Advice` class has been updated to include a type variable `T`, which allows for more specific type hinting when creating instances of the class and its subclasses.
* Rename file language helper function ([3661](https://github.com/databrickslabs/ucx/issues/3661)). The helper function that determines a file's language and checks whether the linter supports it has been renamed from `file_language` to `infer_file_language_if_supported`. The new name makes the function's dual purpose explicit: it infers the file language and simultaneously acts as a filter, returning a `Language` object if the file is supported or `None` if it is not. Call sites such as the `is_a_notebook` function have been updated to use the new name, improving the codebase's readability and maintainability.
* Scope crawled jobs in `JobsCrawler` with `include_job_ids` ([3658](https://github.com/databrickslabs/ucx/issues/3658)). In this release, the `JobsCrawler` class in the `workflow_task.py` file has been updated to include a new optional parameter `include_job_ids` in the constructor. This parameter allows users to specify a list of job IDs to include in the crawling process, improving efficiency in large workspaces. Additionally, a check has been added to the `_assess_jobs` method to skip jobs whose IDs are not in the list of included IDs. Integration tests have been added to ensure the correct behavior of the new feature. This change resolves issue [#3656](https://github.com/databrickslabs/ucx/issues/3656), which requested the ability to crawl jobs based on a specific list of job IDs.
* Support fixing `LocalFile`'s with `FileLinter` ([3660](https://github.com/databrickslabs/ucx/issues/3660)). In this release, we have added new methods `write_text`, `safe_write_text`, `back_up_path`, and `revert_back_up_path` to the `base.py` file to support fixing files in `LocalFile` containers, along with new unit and integration tests. The `LocalFile` class in the `files.py` file has been extended with new methods and properties, such as `apply`, `migrated_code`, `back_up_path`, and `back_up_original_and_flush_migrated_code`, enabling linters to fix files and write the changes back to the container. The `databricks labs ucx migrate-local-code` command has also been updated to utilize the new functionality. These changes address issue [#3514](https://github.com/databrickslabs/ucx/issues/3514), ensuring proper handling of errors during file writing and providing automated fixing of code issues within `LocalFile`s.
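The back-up-then-flush pattern described above can be sketched as below. The function names mirror the release notes, but the bodies are an illustrative assumption, not UCX's actual code:

```python
from pathlib import Path

def back_up_path(path: Path) -> Path:
    # Keep the backup next to the original, e.g. code.py -> code.py.bak.
    return path.with_suffix(path.suffix + ".bak")

def back_up_original_and_flush_migrated_code(path: Path, migrated_code: str) -> None:
    # Preserve the original first, so a bad fix can always be reverted.
    back_up_path(path).write_text(path.read_text())
    path.write_text(migrated_code)

def revert_back_up_path(path: Path) -> None:
    # Restore the original content and discard the backup.
    backup = back_up_path(path)
    path.write_text(backup.read_text())
    backup.unlink()
```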
* Updated `migrate-local-code` to use latest linter functionality ([3700](https://github.com/databrickslabs/ucx/issues/3700)). In this update, the `migrate-local-code` command has been enhanced by incorporating the latest linter functionality. The `LocalFileMigrator` and `LocalCodeLinter` classes have been merged, and the interfaces of `.fix` and `.apply` methods have been aligned. A new `FixerWalker` has been introduced to address dependencies in the dependency graph, and the existing `databricks labs ucx migrate-local-code` command has been updated accordingly. Relevant unit tests and integration tests have been added and modified to ensure the correctness of the changes, which resolve issue [#3514](https://github.com/databrickslabs/ucx/issues/3514) and supersede issue [#3520](https://github.com/databrickslabs/ucx/issues/3520). The `lint-local-code` command has also been updated with a flag to specify the path for linting. The `migrate-local-code` command now lints local code, generates advice on how to make it compatible with the Unity Catalog, and can also apply local code fixes to make them compatible.
* Updated sqlglot requirement from <26.3,>=25.5.0 to >=25.5.0,<26.4 ([3572](https://github.com/databrickslabs/ucx/issues/3572)). In this pull request, we have updated the requirement for the `sqlglot` library in the `pyproject.toml` file, changing it from being greater than or equal to version 25.5.0 and less than 26.3, to being greater than or equal to version 25.5.0 and less than 26.4. This change is part of issue [#3572](https://github.com/databrickslabs/ucx/issues/3572) and was made to allow for the use of the latest version of `sqlglot`. The pull request includes a changelog from the `sqlglot` repository, detailing the changes made in each version between 25.5.0 and 26.4. The commits relevant to this update include bumping the version of `sqlglotrs` to various versions between 0.3.7 and 0.3.14. This pull request was automatically generated by Dependabot, a tool that creates pull requests to update the dependencies in a project.
* Updated sqlglot requirement from <26.4,>=25.5.0 to >=25.5.0,<26.7 ([3677](https://github.com/databrickslabs/ucx/issues/3677)). In this release, we have updated the `sqlglot` dependency from version `>=25.5.0,<26.4` to `>=25.5.0,<26.7`. This change allows us to leverage the latest version of `sqlglot`, which includes various bug fixes and improvements, such as avoiding redundant casts in FROM/TO_UTC_TIMESTAMP and enhancing UUID support. Although there are some breaking changes introduced in the latest version, they should not affect our project's functionality. Additionally, this update includes several bug fixes and improvements for specific dialects such as Redshift, BigQuery, and TSQL. Overall, this update enhances the performance and functionality of the `sqlglot` library, ensuring compatibility with the latest version.
* Use cached property for table migration index on local checkout context ([3711](https://github.com/databrickslabs/ucx/issues/3711)). In this release, we introduce a new cached property, `_migration_index`, to the `LocalCheckoutContext` class, designed to store the table migration index for the local checkout context. This change aims to prevent multiple recrawling when the migration index is empty. The `linter_context_factory` method has been refactored to utilize the new `_migration_index` property, and the `CurrentSessionState` parameter is removed. Additionally, the `local_code_linter` method has been updated to leverage the new `LinterContext` instance with the `_migration_index` property, instead of using the `linter_context_factory` method. The `LocalCodeLinter` object now accepts a new callable lambda function, returning a `LinterContext` instance with the `_migration_index` property. These enhancements improve code performance by reducing the migration index crawls in the local checkout context and simplify the code by eliminating the `CurrentSessionState` parameter.
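The caching technique behind this change is the standard library's `functools.cached_property`: the property body runs once on first access and the result is reused afterwards. The class and index contents below are an illustrative stand-in for the real `LocalCheckoutContext`:

```python
from functools import cached_property

class LocalCheckoutContextSketch:
    def __init__(self) -> None:
        self.crawl_count = 0  # tracks how often the expensive step runs

    @cached_property
    def _migration_index(self) -> dict[str, str]:
        # Stands in for the expensive migration-index crawl; with
        # cached_property this executes only on the first access, so an
        # empty index is not recrawled on every lookup.
        self.crawl_count += 1
        return {"hive_metastore.db.tbl": "catalog.db.tbl"}
```

Repeated reads of `ctx._migration_index` hit the cached value, which is exactly what avoids the multiple recrawls described above.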
* [DOCS] Explain when to run `remove-workspace-local-backup-groups` workflow ([3707](https://github.com/databrickslabs/ucx/issues/3707)). In this release, the UCX component of the application has been enhanced with new Databricks workflows for orchestrating the group migration process. The `workflows` command displays the status of the workflows, and the `repair-run` command allows for rerunning failed workflows. The group migration workflow is specifically designed to be executed after a successful assessment workflow, and running it is followed by an optional `remove-workspace-local-backup-groups` workflow. This final step removes unnecessary workspace-level backup groups and their associated permissions, keeping the workspace clean and organized. The `remove-workspace-local-backup-groups` workflow should only be executed after confirming the successful migration of all groups involved.

Dependency updates:

* Updated sqlglot requirement from <26.3,>=25.5.0 to >=25.5.0,<26.4 ([3572](https://github.com/databrickslabs/ucx/pull/3572)).
* Updated sqlglot requirement from <26.4,>=25.5.0 to >=25.5.0,<26.7 ([3677](https://github.com/databrickslabs/ucx/pull/3677)).

0.55.0

* Introducing UCX docs! ([3458](https://github.com/databrickslabs/ucx/pull/3458)). In this release, we introduced the new UCX documentation, available at [https://databrickslabs.github.io/ucx/](https://databrickslabs.github.io/ucx/).
* Hosted Runner for release ([3532](https://github.com/databrickslabs/ucx/issues/3532)). In this release, we have made improvements to the release job's security and control by moving the release.yml file to a new location within a hosted runner group labeled "linux-ubuntu-latest." This change ensures that the release job now runs in a protected runner group, enhancing the overall security and reliability of the release process. The job's environment remains set to "release," and it retains the same authentication and artifact signing permissions as before the move, ensuring a seamless transition while improving the security and control of the release process.

0.54.0

* Implement disposition field in SQL backend ([3477](https://github.com/databrickslabs/ucx/issues/3477)). This commit adds a `query_statement_disposition` configuration option for the SQL backend in the UCX tool, allowing users to specify the disposition of SQL statements during assessment results export and preventing failures when dealing with large workspaces and a large number of findings. The new configuration option is added to the `config.yml` file and used by the `SqlBackend` definition. The `databricks labs install ucx` and `databricks labs ucx export-assessment` commands have been modified to support this new functionality. A new `Disposition` enum has been added to the `databricks.sdk.service.sql` module. This change resolves issue [#3447](https://github.com/databrickslabs/ucx/issues/3447) and is related to pull request [#3455](https://github.com/databrickslabs/ucx/issues/3455). The functionality has been manually tested.
* AWS role issue with external locations pointing to the root of a storage account ([3510](https://github.com/databrickslabs/ucx/issues/3510)). The `AWSResources` class in the `aws.py` file has been updated to enhance the regular expression pattern for matching S3 bucket names, now including an optional group for trailing slashes and any subsequent characters. This allows for recognition of external locations pointing to the root of a storage account, addressing issue [#3505](https://github.com/databrickslabs/ucx/issues/3505). The `access.py` file within the AWS module has also been updated, introducing a new `path` variable and updating a for loop condition to accurately identify missing paths in external locations referencing the root of a storage account. New unit tests have been added to `tests/unit/aws/test_access.py`, including a `test_uc_roles_create_all_roles` method that checks the creation of all possible UC roles when none exist and external locations with and without folders. Additionally, the `backend` fixture has been updated to include a new external location `s3://BUCKET4`, and various tests have been updated to incorporate this location and handle errors appropriately.
* Added assert to make sure installation is finished before re-installation ([3546](https://github.com/databrickslabs/ucx/issues/3546)). In this release, we have added an assertion to ensure that the installation process is completed before attempting to reinstall, addressing a previous issue where the reinstallation was starting before the first installation was finished, causing a warning to not be raised and resulting in a test failure. We have introduced a new function `wait_for_installation_to_finish()`, which retries loading the installation if it is not found, with a timeout of 2 minutes. This function is utilized in the `test_compare_remote_local_install_versions` test to ensure that the installation is finished before proceeding. Furthermore, we have extracted the warning message to a variable `error_message` for better readability. This change enhances the reliability of the installation process.
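The wait-with-timeout retry described above follows a common pattern: poll a loader until it returns a result or the deadline passes. The helper below is a generic sketch under that assumption, with the loader callable standing in for loading the installation:

```python
import time

def wait_for(condition, timeout: float = 120.0, interval: float = 1.0):
    """Poll `condition` until it returns a non-None result or `timeout` elapses."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        result = condition()  # e.g. try to load the installation
        if result is not None:
            return result
        time.sleep(interval)
    raise TimeoutError("installation did not finish in time")
```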
* Added dashboards to migration progress dashboard ([3314](https://github.com/databrickslabs/ucx/issues/3314)). This commit introduces significant updates to the migration progress dashboard, adding dashboards, linting resources, and modifying existing components. The changes include a new dashboard displaying the number of dashboards pending migration, with the data sourced from the `ucx_catalog.multiworkspace.objects_snapshot` table. The existing 'Migration [main]' dashboard has been updated, and unit and integration tests have been adapted accordingly. The commit also renames several SQL files, updates the percentage UDF, grant, job, cluster, table, and pipeline migration progress queries, and resolves linting compatibility issues related to Unity Catalog. The changes depend on issue [#3424](https://github.com/databrickslabs/ucx/issues/3424), progress issue [#3045](https://github.com/databrickslabs/ucx/issues/3045), and break up issue [#3112](https://github.com/databrickslabs/ucx/issues/3112). The new dashboard aims to enhance the migration process and ensure a smooth transition to the Unity Catalog.
* Added history log encoder for dashboards ([3424](https://github.com/databrickslabs/ucx/issues/3424)). A new history log encoder for dashboards has been added, addressing issues [#3368](https://github.com/databrickslabs/ucx/issues/3368) and [#3369](https://github.com/databrickslabs/ucx/issues/3369), and modifying the existing `experimental-migration-progress` workflow. This update includes the addition of the `DashboardOwnership` class, used to generate ownership information for dashboards, and the `DashboardProgressEncoder` class, responsible for encoding progress data related to dashboards. The new functionality is tested through manual, unit, and integration testing. In the `Table` class, the `from_table_info` and `from_historical_data` methods have been added, allowing for the creation of `Table` instances from `TableInfo` objects and historical data dictionaries with more flexibility and safety. The `test_tables.py` file in the `integration/progress` directory has also been updated to include a new test function for checking table failures. These changes improve the tracking and management of dashboard IDs, enhance user name retrieval, and ensure the accurate determination of object ownership.
* Create specific failure for Python syntax error while parsing with Astroid ([3498](https://github.com/databrickslabs/ucx/issues/3498)). This commit enhances the Python linting functionality in our open-source library by introducing a specific failure message, `python-parse-error`, for syntax errors encountered during code parsing using Astroid. Previously, a generic `system-error` message was used, which has been renamed to maintain consistency with the existing `sql-parse-error` message. This change provides clearer failure indicators and includes more detailed information about the error location. Additionally, modifications to Python linting-related code, unit test additions, and updates to the README guide users on handling these new error types have been implemented. A new method, `Tree.maybe_parse()`, has been introduced to parse Python code and detect syntax errors, ensuring more precise error handling for users.
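The maybe-parse pattern can be sketched with the stdlib `ast` module as a stand-in for Astroid (which UCX actually uses): return the tree on success, or a specific `python-parse-error` failure with the error location instead of a generic system error:

```python
import ast

def maybe_parse(code: str):
    """Parse Python code; on a syntax error, return a specific failure tuple."""
    try:
        return ast.parse(code), None
    except SyntaxError as error:
        # Specific failure code plus location detail, rather than "system-error".
        return None, ("python-parse-error", error.lineno, error.offset)
```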
* DBR 16 and later support ([3481](https://github.com/databrickslabs/ucx/issues/3481)). This pull request introduces support for Databricks Runtime (DBR) 16 and later in the code that converts Hive Metastore (HMS) tables to external tables within the `migrate-tables` workflow. The changes include the addition of a new static method `_get_entity_storage_locations` to handle the new `entityStorageLocations` property in DBR16 and the modification of the `_convert_hms_table_to_external` method to account for this property. Additionally, the `run_workflow` function in the `assessment` workflow now has the `skip_job_wait` parameter set to `True`, which allows the workflow to continue running even if a job within it fails. The changes have been manually tested for DBR16, verified in a staging environment, and existing integration tests have been run for DBR 15. The diff also includes updates to the `test_table_migration_convert_manged_to_external` method to skip job waiting during testing, enabling the test to run successfully on DBR 16.
* Delete stale code: `NotebookLinter._load_source_from_run_cell` ([3529](https://github.com/databrickslabs/ucx/issues/3529)). In this update, we have removed the stale code `NotebookLinter._load_source_from_run_cell`, which was responsible for loading the source code from a run cell in a notebook. This change is a part of the ongoing effort to address issue [#3514](https://github.com/databrickslabs/ucx/issues/3514) and enhances the overall codebase. Additionally, we have modified the existing `databricks labs ucx lint-local-code` command to update the code linting functionality. We have conducted manual testing to ensure that the changes function as intended and have added and modified several unit tests. The `_load_source_from_run_cell` method is no longer needed, as it was part of a deprecated functionality. The modifications to the `databricks labs ucx lint-local-code` command impact the way code linting is performed, ultimately improving the efficiency and maintainability of the codebase.
* Exclude ucx dashboards from Lakeview dashboard crawler ([3450](https://github.com/databrickslabs/ucx/issues/3450)). In this release, we have enhanced the `lakeview_crawler` method in the open-source library to exclude UCX dashboards and prevent false positives. This has been achieved by adding a new optional argument, `exclude_dashboard_ids`, to the `__init__` method, which takes a list of dashboard IDs to exclude from the crawler. The `_crawl` method has been updated to skip dashboards whose IDs match the ones in the `exclude_dashboard_ids` list. The change includes unit tests and manual testing to ensure proper functionality and has been verified on the staging environment. These updates improve the accuracy and reliability of the dashboard crawler, providing better results for software engineers utilizing this library.
* Fixed issue in installing UCX on UC enabled workspace ([3501](https://github.com/databrickslabs/ucx/issues/3501)). This PR introduces changes to the `ClusterPolicyInstaller` class, updating the `spark_version` policy definition from a fixed value to an allowlist with a default value. This resolves an issue where, when UC is enabled on a workspace, the cluster definition takes on `single_user` and `user_isolation` values instead of `Legacy_Single_User` and 'Legacy_Table_ACL'. The job definition is also updated to use the default value when not explicitly provided. These changes improve compatibility with UC-enabled workspaces, ensuring the correct values for `spark_version` in the cluster definition. The PR includes updates to unit tests and installation tests, addressing issue [#3420](https://github.com/databrickslabs/ucx/issues/3420).
* Fixed typo in workflow name (in error message) ([3491](https://github.com/databrickslabs/ucx/issues/3491)). This PR (Pull Request) addresses a minor typo in the error message displayed by the `validate_groups_permissions` method in the `workflows.py` file. The typo occurred in the workflow name mentioned in the error message, where `group` was incorrectly spelled as "groups." The corrected spelling is now `validate-groups-permissions`. This change does not introduce any new methods or modify any existing functionality, but instead focuses on enhancing the clarity and accuracy of the error message. Ensuring that error messages are free from typos and other inaccuracies is essential for maintaining the usability and effectiveness of the code, as it enables users to more easily troubleshoot any issues that may arise during its usage.
* HMS Federation Glue Support ([3526](https://github.com/databrickslabs/ucx/issues/3526)). This commit introduces support for HMS Federation Glue in the open-source library, resolving issue [#3011](https://github.com/databrickslabs/ucx/issues/3011). The changes include adding a new command, `migrate-glue-credentials`, to migrate Glue credentials to UC storage credentials in the federation glue for HMS. The `AWSResourcePermissions` class has been updated to include a new parameter `config` for HMS Federation Glue configuration and the `load_uc_compatible_roles` method now accepts an optional `resource_type` parameter for filtering compatible roles based on the provided type. Additionally, the `ExternalLocations` class has been updated to handle S3 resource type when identifying missing external locations. The commit also includes several bug fixes, new classes, methods, and changes to the existing methods to handle AWS Glue resources, and updates to the integration tests. Overall, these changes add significant functionality for AWS Glue support in the HMS Federation Glue feature.
* Make link to issue template url safe ([3508](https://github.com/databrickslabs/ucx/issues/3508)). In this release, we have updated the `python_ast.py` file to enhance the encoding of the link to the issue template for bug reports. By utilizing the `urllib.parse.quote_plus()` function from Python's standard library, we have ensured that any special characters in the provided source code will be properly encoded. This eliminates the risk of issues arising from incorrectly interpreted characters, enhancing the reliability of the bug reporting process. This change, initially introduced in issue [#3498](https://github.com/databrickslabs/ucx/issues/3498), has been thoroughly tested to guarantee its correct functioning. The rest of the file remains unaffected, preserving its original functionality.
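`urllib.parse.quote_plus` is a standard-library function, and the sketch below shows how it makes arbitrary source code safe to embed in an issue-template URL query string; the URL shape is an assumption for illustration:

```python
from urllib.parse import quote_plus

def issue_url(source_code: str) -> str:
    # quote_plus percent-encodes special characters and turns spaces into '+',
    # so the snippet cannot break the query string.
    return ("https://github.com/databrickslabs/ucx/issues/new?body="
            + quote_plus(source_code))
```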
* Refactor `PipelineMigrator`'s to add `include_pipeline_ids` ([3495](https://github.com/databrickslabs/ucx/issues/3495)). In this release, the `PipelineMigrator` class has been refactored to enhance pipeline migration functionality. The `skip-pipeline-ids` flag has been replaced with `include-pipeline-ids`, allowing users to specify a list of pipelines to migrate, rather than listing pipelines to skip. Additionally, the `exclude_pipeline_ids` functionality has been added to provide even more granularity in pipeline selection. The `migrate_pipelines` method now prioritizes `include_pipeline_ids` and `exclude_pipeline_ids` parameters to determine the list of pipelines for migration. The `_migrate_pipeline` method has been updated to accept a string pipeline ID and now checks if the pipeline has already been migrated. Several support methods, such as `_clone_pipeline`, have also been refactored for improved functionality. Although no new methods were added, the behavior of the `migrate_pipelines` method has changed. While unit tests have been updated to cover the changes, integration tests have not been modified yet. Ensure thorough testing to prevent any new issues or breaks in existing functionality.
* Release v0.54.0 ([3530](https://github.com/databrickslabs/ucx/issues/3530)). 0.54.0 brings several enhancements and bug fixes to the UCX library. A `query_statement_disposition` option is added to the SQL backend to handle large SQL queries during assessment results export, preventing potential failures in large workspaces with high volumes of findings. AWS role compatibility checks are improved for external locations pointing to the root of a storage account. Dashboards are enhanced with added migration progress dashboards and a history log encoder. New failure types are introduced for Python syntax errors during parsing and SQL parsing errors. The library now supports DBR 16 and later versions, with optional conversion of Hive Metastore tables to external tables in the `migrate-tables` workflow. The `PipelineMigrator` functionality is refactored to add an `include_pipeline_ids` parameter for better control over the migration process. Various dependency updates, including `databricks-labs-blueprint`, `databricks-sdk`, and `sqlglot`, are included in this release, which bring new features, improvements, and bug fixes, as well as API changes. Please thoroughly test and review the changes to ensure seamless functionality.
* Rename Python AST's `Tree` methods for clarity ([3524](https://github.com/databrickslabs/ucx/issues/3524)). In this release, we have made significant improvements to the clarity of the Python AST's `Tree` methods in the `python_analyzer.py` file. The `append_` and `extend_` methods have been renamed to `attach_` to better reflect their functionality. These methods now always return `None`. New methods such as `attach_child_tree`, `attach_nodes`, and `extend_globals` have been introduced to enhance the functionality of the library. The `attach_child_tree` method allows for attaching one tree as a child of another tree, propagating module references and enabling traversal from both the parent and child trees. The `attach_nodes` method sets the parent of the attached nodes and adds them to the body of the tree. Additionally, docstrings have been added, and unit testing has been expanded. The changes include modifications to code linting, existing command functionalities, and manual testing to ensure compatibility. These enhancements improve the clarity, functionality, and flexibility of the Python AST's `Tree` methods.
* Revert "Release v0.54.0" ([3569](https://github.com/databrickslabs/ucx/issues/3569)). In version 0.53.1, we have reverted changes from 0.54.0 to address issues with the previous release and ensure proper propagation to PyPI. This version includes various updates such as implementing a disposition field in the SQL backend, improving ARN pattern matching for AWS roles, adding dashboards to migration progress, enhancing Python linting functionality, and adding support for DBR 16 in converting Hive Metastore tables to external tables. We have also excluded UCX dashboards from the Lakeview dashboard crawler, refactored PipelineMigrator's to add include_pipeline_ids, and updated the sqlglot and databricks-labs-blueprint requirements. Additionally, several issues related to installation, typo in workflow name, and table-migration workflows have been fixed. The sqlglot requirement has been updated from <26.1,>=25.5.0 to >=25.5.0,<26.2, and databricks-labs-blueprint from <0.10,>=0.9.1 to >=0.9.1,<0.11. This release does not introduce any new methods or change existing functionality, but focuses on addressing bugs and improving functionality.
* Schedule the migration progress workflow to run daily ([3485](https://github.com/databrickslabs/ucx/issues/3485)). This PR introduces a daily scheduling mechanism for the UCX installation's migration progress workflow, allowing it to run automatically once per day at 5 a.m. UTC. It includes refactoring the plumbing for managing and installing workflows, enabling them to have a Cron-based schedule. Relevant user documentation has been updated, and existing unit and integration tests have been added to ensure the changes function as intended. A new test has been added to verify the migration-progress workflow is installed with a schedule attached, checking the workflow schedule's quartz cron expression, time zone, and pause status, as well as confirming that the workflow is unpaused upon installation. The PR also introduces new methods to manage workflow scheduling and configure cron-based schedules.
* Scope crawled pipelines in PipelineCrawler ([3513](https://github.com/databrickslabs/ucx/issues/3513)). In the latest release, we have introduced a new optional argument, `include_pipeline_ids`, in the constructor of the PipelinesCrawler class located in the `databricks/labs/ucx/assessment` module. This argument allows users to filter pipelines based on a list of pipeline IDs, improving the crawler's flexibility and efficiency in processing pipelines. In the `_crawl` method of the PipelinesCrawler class, a new behavior has been implemented based on the value of `include_pipeline_ids`. If the argument is not None, then the method uses the pipeline IDs from this list instead of retrieving all pipelines. Additionally, two unit tests have been added to verify the functionality of this new argument and ensure that the crawler handles cases where a pipeline is not found or its specification is missing. A new parameter, `force_refresh`, has also been added to the `snapshot` function. This release aims to provide a more efficient and customizable pipeline crawling experience for users.
* Updated databricks-labs-blueprint requirement from <0.10,>=0.9.1 to >=0.9.1,<0.11 ([3519](https://github.com/databrickslabs/ucx/issues/3519)). In this update, the requirement for the `databricks-labs-blueprint` library has been changed from version range '<0.10,>=0.9.1>' to a new range of '>=0.9.1,<0.11'. This change allows for the use of the latest version of the library while maintaining compatibility with the current project setup, and is based on information from the library's releases and changelog. The commit includes a list of commits and dependencies for the updated library. This update was automatically implemented by Dependabot, a tool that handles dependency updates and conflict resolution, ensuring a seamless integration process for engineers adopting the project.
* Updated databricks-sdk requirement from <0.41,>=0.40 to >=0.40,<0.42 ([3553](https://github.com/databrickslabs/ucx/issues/3553)). In this release, we have updated the `databricks-sdk` package requirement to permit version 0.41 while excluding version 0.42. This update includes several improvements and new features in version 0.41, such as the addition of the `serving.http_request` method for calling external functions and enhancements to the Files API client to recover from download failures. The commit also includes bug fixes, internal changes, and updates to the API for better functionality and compatibility. It is essential to note that these changes have been made to ensure compatibility with the latest features and improvements in the `databricks-sdk` package.
* Updated sqlglot requirement from <26.1,>=25.5.0 to >=25.5.0,<26.2 ([3500](https://github.com/databrickslabs/ucx/issues/3500)). In this release, we have updated the version requirement for the sqlglot package. The minimum version required is now 25.5.0 and less than 26.2, previously it was 25.5.0 and less than 26.1. This change allows for the most recent version of sqlglot to be installed, while still maintaining compatibility with the current codebase. The update is necessary due to breaking changes introduced in version 26.1.0 of sqlglot, including normalizing before qualifying tables, requiring the `AS` token in CTEs for all dialects except spark and databricks, supporting Unicode in sqlite, mysql, tsql, postgres, and oracle, parsing ASCII into Unicode to facilitate transpilation, and improving transpilation of CHAR[ACTER]_LENGTH. Additionally, several bug fixes and new features have been added in this update.
* Updated sqlglot requirement from <26.2,>=25.5.0 to >=25.5.0,<26.3 ([3528](https://github.com/databrickslabs/ucx/issues/3528)). In this release, we have updated the version constraint for the `sqlglot` dependency in our project's "pyproject.toml" file. The previous constraint allowed versions between 25.5.0 and 26.2, while the new constraint allows versions between 25.5.0 and 26.3. This change was made to ensure that we can use the latest version of sqlglot while also preventing the version from exceeding 26.3. Additionally, the commit includes detailed information about the specific commits and changes made in the updated version of sqlglot, providing valuable insights for software engineers working with this open-source library.
* Updated table-migration workflows to also capture updated migration progress into the history log ([3239](https://github.com/databrickslabs/ucx/issues/3239)). The table-migration workflows have been updated to log not only the tables that still need to be migrated, but also the updated progress information into the history log, ensuring a more comprehensive record of migration progress. The affected workflows include `migrate-tables`, `migrate-external-hiveserde-tables-in-place-experimental`, `migrate-external-tables-ctas`, `scan-tables-in-mounts-experimental`, and `migrate-tables-in-mounts-experimental`. The encoder for table-history has been updated to prevent implicit refresh of `TableMigrationStatus` data during initialization. Additionally, the documentation has been updated to reflect which workflows update which tables. New and updated unit and integration tests, as well as manual testing, have been conducted to ensure the functionality of the changes.

Dependency updates:

* Updated sqlglot requirement from <26.1,>=25.5.0 to >=25.5.0,<26.2 ([3500](https://github.com/databrickslabs/ucx/pull/3500)).
* Updated databricks-labs-blueprint requirement from <0.10,>=0.9.1 to >=0.9.1,<0.11 ([3519](https://github.com/databrickslabs/ucx/pull/3519)).
* Updated databricks-sdk requirement from <0.41,>=0.40 to >=0.40,<0.42 ([3553](https://github.com/databrickslabs/ucx/pull/3553)).

0.53.1

* Removed `packaging` package dependency ([3469](https://github.com/databrickslabs/ucx/issues/3469)). In this release, we have removed the dependency on the `packaging` package to address a release issue. The import statements for `packaging.version.Version` and `packaging.version.InvalidVersion` have been removed. The `_external_hms` function in `federation.py` has been updated to retrieve the Hive Metastore version using the `spark.sql.hive.metastore.version` configuration key and validate it using a regular expression pattern. If the version is not valid, the function logs an informational message and returns `None`. This change modifies the Hive Metastore version validation logic and improves the overall reliability and maintainability of the library.

0.53.0

* Added dashboard crawlers ([3397](https://github.com/databrickslabs/ucx/issues/3397)). The open-source library has been updated with new dashboard crawlers for the assessment workflow, Redash migration, and QueryLinter. These crawlers are responsible for crawling and persisting dashboards, as well as migrating or reverting them during Redash migration. They also lint the queries of the crawled dashboards using QueryLinter. This change resolves issues [#3366](https://github.com/databrickslabs/ucx/issues/3366) and [#3367](https://github.com/databrickslabs/ucx/issues/3367), and progresses [#2854](https://github.com/databrickslabs/ucx/issues/2854). The 'databricks labs ucx {migrate-dbsql-dashboards|revert-dbsql-dashboards}' command and the `assessment` workflow have been modified to incorporate these new features. Unit tests and integration tests have been added to ensure proper functionality of the new dashboard crawlers. Additionally, two new tables, _.redash_dashboards and _.lakeview_dashboards, have been introduced to hold a list of all Redash or Lakeview dashboards and are used by the `QueryLinter` and `Redash` migration. These changes improve the assessment, migration, and linting processes for dashboards in the library.
* DBFS Root Support for HMS Federation ([3425](https://github.com/databrickslabs/ucx/issues/3425)). The commit `DBFS Root Support for HMS Federation` introduces changes to support the DBFS root location for HMS federation. A new method, `external_locations_with_root`, is added to the `ExternalLocations` class to return a list of external locations including the DBFS root location. This method is used in various functions and test cases, such as `test_create_uber_principal_no_storage`, `test_create_uc_role_multiple_raises_error`, `test_create_uc_no_roles`, `test_save_spn_permissions`, and `test_create_access_connectors_for_storage_accounts`, to ensure that the DBFS root location is correctly identified and tested in different scenarios. Additionally, the `external_locations.snapshot.return_value` is changed to `external_locations.external_locations_with_root.return_value` in test functions `test_create_federated_catalog` and `test_already_existing_connection` to retrieve a list of external locations including the DBFS root location. This commit closes issue [#3406](https://github.com/databrickslabs/ucx/issues/3406), which was related to this functionality. Overall, these changes improve the handling and testing of DBFS root location in HMS federation.
* Log message as error when legacy permissions API is enabled/disabled depending on the workflow run ([3443](https://github.com/databrickslabs/ucx/issues/3443)). In this release, logging behavior has been updated in several methods in `workflows.py`. When the `use_legacy_permission_migration` configuration is set to `False` and specific conditions are met, error messages are now logged instead of info messages for the methods `verify_metastore_attached`, `rename_workspace_local_groups`, `reflect_account_groups_on_workspace`, `apply_permissions_to_account_groups`, `apply_permissions`, and `validate_groups_permissions`. This change addresses issue [#3388](https://github.com/databrickslabs/ucx/issues/3388) and provides clearer guidance when the legacy permissions API is not functioning as expected: users now see an error message advising them to run the `migrate-groups` job or set `use_legacy_permission_migration` to `True` in the `config.yml` file. These updates help ensure smoother workflow runs and more accurate logging for troubleshooting.
* MySQL External HMS Support for HMS Federation ([3385](https://github.com/databrickslabs/ucx/issues/3385)). This commit adds support for MySQL-based Hive Metastore (HMS) in HMS Federation, enhances the CLI for creating a federated catalog, and improves external HMS functionality. It introduces a new parameter `enable_hms_federation` in the `Locations` class constructor, allowing users to enable or disable MySQL-based HMS federation. The `external_locations` method in `application.py` now accepts `enable_hms_federation` as a parameter, enabling more granular control of the federation feature. Additionally, the CLI for creating a federated catalog has been updated to accept a `prompts` parameter, providing more flexibility. The commit also introduces a new dataclass `ExternalHmsInfo` for external HMS connection information and updates the `HiveMetastoreFederationEnabler` and `HiveMetastoreFederation` classes to support non-Glue external metastores. Furthermore, it adds methods to handle the creation of a Federated Catalog from the command-line interface, split JDBC URLs, and manage external connections and permissions.
* Skip listing built-in catalogs to update table migration process ([3464](https://github.com/databrickslabs/ucx/issues/3464)). In this release, the migration process for updating tables in the Hive Metastore has been optimized with the introduction of the `TableMigrationStatusRefresher` class, which inherits from `CrawlerBase`. Its `_iter_schemas` method now filters out built-in catalogs and schemas when listing them, skipping unnecessary processing during table migration. Additionally, the `get_seen_tables` method has been updated to include checks for `schema.name` and `schema.catalog_name`, and the `_crawl` and `_try_fetch` methods have been modified to reflect changes in the `TableMigrationStatus` constructor. The release also modifies the existing `migrate-tables` workflow and adds unit tests that demonstrate the exclusion of built-in catalogs during the table migration status update; the test case uses the `CatalogInfoSecurableKind` enumeration to specify the kind of catalog and verifies that the seen tables include only non-built-in catalogs. Together, these changes avoid unnecessary processing of built-in catalogs and schemas, improving the efficiency and performance of the migration process.
* Updated databricks-sdk requirement from <0.39,>=0.38 to >=0.39,<0.40 ([3434](https://github.com/databrickslabs/ucx/issues/3434)). In this release, the requirement for the `databricks-sdk` package has been updated in the `pyproject.toml` file to be greater than or equal to 0.39 and less than 0.40, allowing the use of the latest version of the package while preventing upgrades to 0.40 and above. This change is based on the release notes and changelog for version 0.39 of the package, which includes bug fixes, internal changes, and API changes such as the addition of the `cleanrooms` package, a delete() method for workspace-level services, and new fields for various request and response objects.
* Updated databricks-sdk requirement from <0.40,>=0.39 to >=0.39,<0.41 ([3456](https://github.com/databrickslabs/ucx/issues/3456)). In this pull request, the version range of the `databricks-sdk` dependency has been updated from '<0.40,>=0.39' to '>=0.39,<0.41', allowing the use of the latest version of the `databricks-sdk` while ensuring that it is less than 0.41. The pull request also includes release notes detailing the API changes in version 0.40.0, such as the addition of new fields to various compute, dashboard, job, and pipeline services. A changelog is provided, outlining the bug fixes, internal changes, new features, and improvements in versions 0.39.0, 0.40.0, and 0.38.0. A list of commits is also included, showing the development progress of these versions.
* Use LTS Databricks runtime version ([3459](https://github.com/databrickslabs/ucx/issues/3459)). This release changes the Databricks runtime version to a Long-Term Support (LTS) release to address issues encountered during the migration to external tables. The previous runtime version caused the `convert to external table` migration strategy to fail, and this change serves as a temporary solution. The `migrate-tables` workflow has been modified, and existing integration tests have been reused to ensure functionality. The `test_job_cluster_policy` function now uses the LTS version instead of the latest version, ensuring a specified Spark version for the cluster policy, and checks for a matching node type ID, Spark version, and necessary resources. However, users may still encounter problems with the latest UCX release. The `_convert_hms_table_to_external` method in `table_migrate.py` has been updated to return a boolean value, with a new TODO comment about a possible failure on Databricks Runtime 16.0 due to a JDK update.
* Use `CREATE_FOREIGN_CATALOG` instead of `CREATE_FOREIGN_SECURABLE` with HMS federation enablement commands ([3309](https://github.com/databrickslabs/ucx/issues/3309)). A change has been made to update the `databricks-sdk` dependency version from `>=0.38,<0.39` to `>=0.39` in the `pyproject.toml` file, which may affect the project's functionality related to the `databricks-sdk` library. In the Hive Metastore Federation codebase, `CREATE_FOREIGN_CATALOG` is now used instead of `CREATE_FOREIGN_SECURABLE` for HMS federation enablement commands, aligned with issue [#3308](https://github.com/databrickslabs/ucx/issues/3308). The `_add_missing_permissions_if_needed` method has been updated to check for `CREATE_FOREIGN_SECURABLE` instead of `CREATE_FOREIGN_CATALOG` when granting permissions. Additionally, a unit test file for HiveMetastore Federation has been updated to reflect the use of `CREATE_FOREIGN_SECURABLE` in the import statements and test functions, although this change is limited to the test file and does not affect production code. Thorough testing is recommended after applying this update to ensure that the project functions as expected and to benefit from the potential security improvements associated with the updated privilege handling.
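
The MySQL federation entry above mentions splitting JDBC URLs into connection components. A minimal sketch of that idea, using only the standard library (the helper name and returned fields are assumptions, not ucx's actual API):

```python
from urllib.parse import urlparse, parse_qs

def split_jdbc_url(jdbc_url: str) -> dict:
    """Split a MySQL JDBC URL into host, port, database, and query parameters.

    Hypothetical helper for illustration; ucx's own URL handling may differ.
    """
    if not jdbc_url.startswith("jdbc:"):
        raise ValueError(f"not a JDBC URL: {jdbc_url}")
    # Drop the 'jdbc:' prefix so urlparse sees an ordinary 'mysql://...' URL.
    parsed = urlparse(jdbc_url[len("jdbc:"):])
    return {
        "scheme": parsed.scheme,
        "host": parsed.hostname,
        "port": parsed.port,
        "database": parsed.path.lstrip("/"),
        "params": {key: values[0] for key, values in parse_qs(parsed.query).items()},
    }

info = split_jdbc_url("jdbc:mysql://metastore.example.com:3306/hive?useSSL=true")
print(info["host"], info["port"], info["database"])
```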
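
The built-in-catalog entry above filters catalogs by their securable kind. A simplified sketch of that filtering, with stand-in types (the `BUILTIN_KINDS` values and class shapes are assumptions; the real enumeration is the SDK's `CatalogInfoSecurableKind`):

```python
from dataclasses import dataclass
from typing import Iterator

# Assumed set of built-in kinds; the real values come from the Databricks SDK.
BUILTIN_KINDS = {"CATALOG_SYSTEM", "CATALOG_INTERNAL"}

@dataclass
class CatalogInfo:
    """Minimal stand-in for the SDK's catalog description."""
    name: str
    securable_kind: str

def non_builtin_catalogs(catalogs: list[CatalogInfo]) -> Iterator[CatalogInfo]:
    """Yield only user catalogs, skipping built-ins as the refresher does."""
    for catalog in catalogs:
        if catalog.securable_kind in BUILTIN_KINDS:
            continue  # no migration status to refresh for built-in catalogs
        yield catalog

catalogs = [
    CatalogInfo("system", "CATALOG_SYSTEM"),
    CatalogInfo("main", "CATALOG_STANDARD"),
]
print([c.name for c in non_builtin_catalogs(catalogs)])  # -> ['main']
```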

Dependency updates:

* Updated databricks-sdk requirement from <0.40,>=0.39 to >=0.39,<0.41 ([3456](https://github.com/databrickslabs/ucx/pull/3456)).

