Databricks-labs-remorph

Latest version: v0.8.0

0.3.0

* Added Oracle ojdbc8 dependent library during reconcile Installation ([474](https://github.com/databrickslabs/remorph/issues/474)). In this release, the `deployment.py` file in the `databricks/labs/remorph/helpers` directory has been updated to add the `ojdbc8` library as a `MavenLibrary` in the `_job_recon_task` function, enabling the reconciliation process to access the Oracle data source and pull data for reconciliation between Oracle and Databricks. The `JDBCReaderMixin` class in the `jdbc_reader.py` file has also been updated to include the Oracle ojdbc8 dependent library for reconciliation during the `reconcile` process. This involves installing the `com.oracle.database.jdbc:ojdbc8:23.4.0.24.05` jar as a dependent library and updating the driver class to `oracle.jdbc.driver.OracleDriver` from `oracle`. A new dictionary, `driver_class`, has been added, which maps the driver name to the corresponding class name, allowing for dynamic driver class selection during the `_get_jdbc_reader` method call. The `test_read_data_with_options` unit test has been updated to test the Oracle connector for reading data with specific options, including the use of the correct driver class and specifying the database table for data retrieval, improving the accuracy and reliability of the reconciliation process. A minimal sketch of the driver-class lookup appears after this list.
* Added TSQL coverage tests in the generated report artifact ([452](https://github.com/databrickslabs/remorph/issues/452)). In this release, we have added new TSQL coverage tests and Snowflake coverage tests to the generated report artifact in the CI/CD pipeline. These tests are executed using Maven with the updated commands `mvn --update-snapshots -B test -pl coverage --file pom.xml --fail-at-end` and `mvn --update-snapshots -B exec:java -pl coverage --file pom.xml --fail-at-end -Dexec.args="-i tests/resources/functional/snowflake -o coverage-result.json"` respectively, and the `continue-on-error: true` option is added to allow the pipeline to proceed even if the tests fail. Additionally, we have introduced a new constructor to the `CommentBasedQueryExtractor` class, which accepts a `dialect` parameter and allows for easier configuration of the start and end comments for different SQL dialects. We have also updated the `CommentBasedQueryExtractor` for Snowflake and added two TSQL coverage tests to the generated report artifact to ensure that the `QueryExtractor` is working correctly for TSQL queries. These changes will help ensure thorough testing and identification of TSQL and Snowflake queries during the CI/CD process.
* Added full support for analytical windowing functions ([401](https://github.com/databrickslabs/remorph/issues/401)). In this release, full support for analytical windowing functions has been implemented, addressing issue [#401](https://github.com/databrickslabs/remorph/issues/401). The functions were previously specified in the parser grammar but have been moved to the standard function lookup table for more consistent handling. This enhancement allows for the use of analytical aggregate functions, such as FIRST_VALUE and PERCENTILE_CONT, with a `WITHIN GROUP` syntax and an `OVER` clause, enabling more complex queries and data analysis. The `FixedArity` and `VariableArity` classes have been updated with new methods for the supported functions, and appropriate examples have been provided to demonstrate their usage in SQL.
* Added parsing for STRPOS in presto ([462](https://github.com/databrickslabs/remorph/issues/462)). A new feature has been added to the remorph/snow package's presto module to parse the STRPOS function in SQL code. This has been achieved by importing the locate_to_strposition function from sqlglot.dialects.dialect and incorporating it into the FUNCTIONS dictionary in the Parser class. This change enables the parsing of the STRPOS function, which returns the position of the first occurrence of a substring in a string. The implementation has been tested with a SQL file containing two queries for Presto SQL using STRPOS and Databricks SQL using LOCATE, both aimed at finding the position of the letter `l` in the string 'Hello world', starting the search from the second position. This feature is particularly relevant for software engineers working on data processing and analytics projects involving both Presto and Databricks SQL, as it ensures compatibility and consistent behavior between the two for string manipulation functions. The commit is part of issue [#462](https://github.com/databrickslabs/remorph/issues/462), and the diff provided includes a new SQL file with test cases for the STRPOS function in Presto and the LOCATE function in Databricks SQL. The test cases confirm whether the `hello` string is present in the greeting_message column of the greetings_table. This feature allows users to utilize the STRPOS function in Presto to determine if a specific substring is present in a string; a quick transpilation check appears after this list.
* Added validation for join columns for all query builders and limiting rows for reports ([413](https://github.com/databrickslabs/remorph/issues/413)). In this release, we've added validation for join columns in all query builders, ensuring consistent and accurate data joins. A limit on the number of rows displayed for reports has been implemented with a default of 50. The `compare.py` and `execute.py` files have been updated to include validation, and the `QueryBuilder` and `HashQueryBuilder` classes have new methods for validating join columns. The `SamplingQueryBuilder`, `ThresholdQueryBuilder`, and `recon_capture.py` files have similar updates for validation and limiting rows for reports. The `recon_config.py` file now has a new return type for the `get_join_columns` method, and a new method `test_no_join_columns_raise_exception()` has been added in the `test_threshold_query.py` file. These changes aim to enhance data consistency, accuracy, and efficiency for software engineers.
* Adds more coverage tests for functions to TSQL coverage ([420](https://github.com/databrickslabs/remorph/issues/420)). This commit adds new coverage tests for various TSQL functions, focusing on the COUNT, MAX, MIN, STDEV, STDEVP, SUM, and VARP functions, which are identical in Databricks SQL. The tests include cases with and without the DISTINCT keyword to ensure consistent behavior between TSQL and Databricks. For the GROUPING and GROUPING_ID functions, which have some differences, tests and examples of TSQL and Databricks SQL code are provided. The CHECKSUM_AGG function, not directly supported in Databricks SQL, is tested using MD5 and CONCAT_WS for equivalence. The CUME_DIST function, identical in both systems, is also tested. Additionally, a new test file for the STDEV function and updated tests for the VAR function are introduced, enhancing the reliability and robustness of TSQL conversions in the project.
* Catalog, Schema Permission checks ([492](https://github.com/databrickslabs/remorph/issues/492)). This release introduces enhancements to the Catalog and Schema functionality, with the addition of permission checks that raise explicit `Permission Denied` exceptions. The logger messages have been updated for clarity and a new variable, README_RECON_REPO, has been created to reference the readme file for the recon_config repository. The ReconcileUtils class has been modified to handle scenarios where the recon_config file is not found or corrupted during loading, providing clear error messages and guidance for users. The unit tests for the install feature have been updated with permission checks for Catalog and Schema operations, ensuring robust handling of permission denied errors. These changes improve the system's error handling and provide clearer guidance for users encountering permission issues.
* Changing the secret name according to the install script ([432](https://github.com/databrickslabs/remorph/issues/432)). In this release, the `recon` function in the `execute.py` file of the `databricks.labs.remorph.reconcile` package has been updated to dynamically generate the secret name instead of hardcoding it as "secret_scope". This change utilizes the new `get_key_form_dialect` function to create a secret name specific to the source dialect being used in the reconciliation process. The `get_dialect` function, along with `DatabaseConfig`, `TableRecon`, and the newly added `get_key_form_dialect`, have been imported from `databricks.labs.remorph.config`. This enhancement improves the security and flexibility of the reconciliation process by generating dynamic and dialect-specific secret names.
* Feature/recon documentation ([395](https://github.com/databrickslabs/remorph/issues/395)). This commit introduces a new reconciliation process, enhancing data consistency between sources, co-authored by Ganesh Dogiparthi, ganeshdogiparthi-db, and SundarShankar89. The README.md file provides detailed documentation for the reconciliation process. A new binary file, docs/transpile-install.gif, offers installation instructions or visual aids, while a mermaid flowchart in `report_types_visualisation.md` illustrates report generation for data, rows, schema, and overall reconciliation. No existing functionality was modified, ensuring the addition of valuable features for software engineers adopting this project.
* Fixing issues in sample query builder to handle NULLs and zeroes ([457](https://github.com/databrickslabs/remorph/issues/457)). This commit introduces improvements to the sample query builder's handling of NULLs and zeroes, addressing bug [#450](https://github.com/databrickslabs/remorph/issues/450). The changes include updated SQL queries in the test threshold query file with COALESCE and TRIM functions to replace NULL values with a specified string, ensuring consistent comparison of datasets. The query store in test_execute.py has also been enhanced to handle NULL and zero values using COALESCE, improving overall robustness and consistency. Additionally, new methods such as build_join_clause, trim, and coalesce have been added to enhance null handling in the query builder. The commit also introduces the MockDataSource class, a likely test implementation of a data source, and updates the log_and_throw_exception function for clearer error messaging.
* Implement Lakeview Dashboard Publisher ([405](https://github.com/databrickslabs/remorph/issues/405)). In this release, we've introduced the `DashboardPublisher` class in the `dashboard_publisher.py` module to streamline the process of creating and publishing dashboards in Databricks Workspace. This class simplifies dashboard creation by accepting an instance of `WorkspaceClient` and `Installation` and providing methods for creating and publishing dashboards with optional parameter substitution. Additionally, we've added a new JSON file, 'Remorph-Reconciliation-Substituted.lvdash.json', which contains a dashboard definition for a data reconciliation feature. This dashboard includes various widgets for filtering and displaying reconciliation results. We've also added a test file for the Lakeview Dashboard Publisher feature, which includes tests to ensure that the `DashboardPublisher` can create dashboards using specified file paths and parameters. These new features and enhancements are aimed at improving the user experience and streamlining the process of creating and publishing dashboards in Databricks Workspace.
* Integrate recon metadata reconcile cli ([444](https://github.com/databrickslabs/remorph/issues/444)). A new CLI command, `databricks labs remorph reconcile`, has been added to initiate the Data Reconciliation process, loading `reconcile.yml` and `recon_config.json` configuration files from the Databricks Workspace. If these files are missing, the user is prompted to reinstall the `reconcile` module and exit the command. The command then triggers the `Remorph_Reconciliation_Job` based on the Job ID stored in the `reconcile.yml` file. This simplifies the reconcile execution process, requiring users to first configure the `reconcile` module and generate the `recon_config_<SOURCE>.json` file using `databricks labs remorph install` and `databricks labs remorph generate-recon-config` commands. The new CLI command has been manually tested and includes unit tests. Integration tests and verification on the staging environment are pending. This feature was co-authored by Bishwajit, Ganesh Dogiparthi, and SundarShankar89.
* Introduce coverage tests ([382](https://github.com/databrickslabs/remorph/issues/382)). This commit introduces coverage tests and updates the GitHub Actions workflow to use Java 11 with Corretto distribution, improving testing and coverage analysis for the project. Coverage tests are added as part of the remorph project with the introduction of a new module for coverage and updating the artifact version to 0.2.0-SNAPSHOT. The pom.xml file is modified to change the parent project version to 0.2.0-SNAPSHOT, ensuring accurate assessment and maintenance of code coverage during development. In addition, a new Main object within the com.databricks.labs.remorph.coverage package is implemented for running coverage tests using command-line arguments, along with the addition of a new file QueryRunner.scala and case classes for ReportEntryHeader, ReportEntryReport, and ReportEntry for capturing and reporting on the status and results of parsing and transpilation processes. The `Cache Maven packages` step is removed and replaced with two new steps: `Run Unit Tests with Maven` and `Run Coverage Tests with Maven`. The former executes unit tests and generates a test coverage report, while the latter downloads remorph-core jars as artifacts, executes coverage tests with Maven, and uploads coverage test results as JSON artifacts. The `coverage-tests` job runs after the `test-core` job and uses the same environment, checking out the code with full history, setting up Java 11 with Corretto distribution, downloading remorph-core-jars artifacts, and running coverage tests with Maven, even if there are errors. The JUnit report is also published, and the coverage test results are uploaded as JSON artifacts, providing better test coverage and more reliable code for software engineers adopting the project.
* Presto approx percentile func fix ([411](https://github.com/databrickslabs/remorph/issues/411)). The remorph library has been updated to support the Presto database system, with a new module added to the config.py file to enable robust and maintainable interaction. An `APPROX_PERCENTILE` function has been implemented in the `presto.py` file of the `sqlglot.dialects.presto` package, allowing for approximate percentile calculations in Presto and Databricks SQL. A test file has been included for both SQL dialects, with queries calculating the approximate median of the height column in the people table. The new functionality enhances the compatibility and versatility of the remorph library in working with Presto databases and improves overall project functionality. Additionally, a new test file for Presto in the snowflakedriver project has been introduced to test expected exceptions, further ensuring robustness and reliability.
* Raise exception if reconciliation fails for any table ([412](https://github.com/databrickslabs/remorph/issues/412)). In this release, we have implemented significant changes to improve exception handling and raise meaningful exceptions when reconciliation fails for any table in our open-source library. A new exception class, `ReconciliationException`, has been added as a child of the `Exception` class, which takes two optional parameters in its constructor, `message` and `reconcile_output`. The `ReconcileOutput` property has been created for accessing the reconcile output object. The `InvalidInputException` class now inherits from `ValueError`, making the code more explicit with the type of errors being handled. A new method, `_verify_successful_reconciliation`, has been introduced to check the reconciliation output status and raise a `ReconciliationException` if any table fails reconciliation. The `test_execute.py` file has been updated to raise a `ReconciliationException` if reconciliation for a specific report type fails, and new tests have been added to the test suite to ensure the correct behavior of the `reconcile` function with and without raising exceptions. A sketch of this exception hierarchy appears after this list.
* Removed USE catalog/schema statement as lsql has added the feature ([465](https://github.com/databrickslabs/remorph/issues/465)). In this release, the usage of `USE` statements for selecting a catalog and schema has been removed in the `get_sql_backend` function, thanks to the new feature provided by the lsql library. This enhancement improves code readability, maintainability, and enables better integration with the SQL backend. The commit also includes changes to the installation process for reconciliation metadata tables, providing more clarity and simplicity in the code. Additionally, several test functions have been added or modified to ensure the proper functioning of the `get_sql_backend` function in various scenarios, including cases where a warehouse ID is not provided or when executing SQL statements in a notebook environment. An error simulation test has also been added for handling `DatabricksError` exceptions when executing SQL statements using the `DatabricksConnectBackend` class.
* Sampling with clause query to have `from dual` in from clause for oracle source ([464](https://github.com/databrickslabs/remorph/issues/464)). In this release, we've added the `get_key_from_dialect` function, replacing the previous `get_key_form_dialect` function, to retrieve the key associated with a given dialect object, serving as a unique identifier for the dialect. This improvement enhances the flexibility and readability of the codebase, making it easier to locate and manipulate dialect objects. Additionally, we've modified the 'sampling_query.py' file to include `from dual` in the `from` clause for Oracle sources in a sampling query with a clause, enabling sampling from Oracle databases. The `_insert_into_main_table` method in the `recon_capture.py` file of the `databricks.labs.remorph.reconcile` module has been updated to ensure accurate key retrieval for the specified dialect, thereby improving the reconciliation process. These changes resolve issues [#458](https://github.com/databrickslabs/remorph/issues/458) and [#464](https://github.com/databrickslabs/remorph/issues/464), enhancing the functionality of the sampling query builder and providing better support for various databases.
* Support function translation to Databricks SQL in TSql and Snowflake ([414](https://github.com/databrickslabs/remorph/issues/414)). This commit introduces a dialect-aware FunctionBuilder system and a ConversionStrategy system to enable seamless translation of SQL functions between TSQL, Snowflake, and Databricks SQL IR. The new FunctionBuilder system can handle both simple name translations and more complex conversions when there is no direct equivalent. For instance, TSQL's ISNULL function translates to IFNULL in Databricks SQL, while Snowflake's ISNULL remains unchanged. The commit also includes updates to the TSqlExpressionBuilder and new methods for building and visiting various contexts, enhancing compatibility and expanding the range of supported SQL dialects. Additionally, new tests have been added in the FunctionBuilderSpec to ensure the correct arity and function type for various SQL functions.
* TSQL: Create coverage tests for TSQL -> Databricks functions ([415](https://github.com/databrickslabs/remorph/issues/415)). This commit introduces coverage tests for T-SQL functions and their equivalent Databricks SQL implementations, focusing on the DATEADD function's `yy` keyword. The DATEADD function is translated to the ADD_MONTHS function in Databricks SQL, with the number of months multiplied by 12. This ensures functional equivalence between T-SQL and Databricks SQL for date addition involving years. The tests are written as SQL scripts and are located in the `tests/resources/functional/tsql/functions` directory, covering various scenarios and possible engine differences between T-SQL and Databricks SQL. The conversion process is documented, and future automation of this documentation is considered.
* TSQL: Implement WITH CTE ([443](https://github.com/databrickslabs/remorph/issues/443)). With this commit, we have extended the TSQL functionality by adding support for Common Table Expressions (CTEs). CTEs are temporary result sets that can be defined within a single execution of a SELECT, INSERT, UPDATE, DELETE, or CREATE VIEW statement, allowing for more complex and efficient queries. The implementation includes the ability to create a CTE with an optional name and a column list, followed by a SELECT statement that defines the CTE. CTEs can be self-referential and can be used to simplify complex queries, improving code readability and performance. This feature is particularly useful for cases where multiple queries rely on the same intermediate result set, as it enables reusing the results without having to repeat the query.
* TSQL: Implement functions with specialized syntax ([430](https://github.com/databrickslabs/remorph/issues/430)). This commit introduces new data type conversion functions and JSON manipulation capabilities to T-SQL, addressing issue [#430](https://github.com/databrickslabs/remorph/issues/430). The newly implemented features include `NEXT VALUE FOR sequence`, `CAST(col TO sometype)`, `TRY_CAST(col TO sometype)`, `JSON_ARRAY`, and `JSON_OBJECT`. These functions support specialized syntax for handling data type conversions and JSON operations, including NULL value handling using `NULL ON NULL` and `ABSENT ON NULL` syntax. The `TSqlFunctionBuilder` class has been updated to accommodate these changes, and new test cases have been added to the `TSqlFunctionSpec` test class in Scala. This enhancement enables SQL-based querying and data manipulation with increased functionality for T-SQL parser and function evaluations.
* TSQL: Support DISTINCT in SELECT list and aggregate functions ([400](https://github.com/databrickslabs/remorph/issues/400)). This commit adds support for the `DISTINCT` keyword in T-SQL for use in the `SELECT` list and aggregate functions such as `COUNT`. When used in the `SELECT` list, `DISTINCT` ensures unique values of the specified expression are returned, and in aggregate functions like `COUNT`, it considers only distinct values of the specified argument. This change aligns with the SQL standard and enhances the functionality of the T-SQL parser, providing developers with greater flexibility and control when using `DISTINCT` in complex queries and aggregate functions. The default behavior in SQL, `ALL`, remains unchanged, and the parser has been updated to accommodate these improvements.
* TSQL: Update the SELECT statement to support XML workspaces ([451](https://github.com/databrickslabs/remorph/issues/451)). This release introduces updates to the TSQL Select statement grammar to correctly support XMLWORKSPACES in accordance with the latest specification. Although Databricks SQL does not currently support XMLWORKSPACES, this change is a syntax-only update to enable compatibility with other platforms that do support it. Newly added components include 'xmlNamespaces', 'xmlDeclaration', 'xmlSchemaCollection', 'xmlTypeDefinition', 'createXmlSchemaCollection', 'xmlIndexOptions', 'xmlIndexOption', 'openXml', 'xmlCommonDirectives', and 'xmlColumnDefinition'. These additions enable the creation, configuration, and usage of XML schemas and indexes, as well as the specification of XML namespaces and directives. A new test file for functional tests has been included to demonstrate the use of XMLWORKSPACES in TSQL and its equivalent syntax in Databricks SQL. While this update does not affect the existing codebase's functionality, it does enable support for XMLWORKSPACES syntax in TSQL, facilitating easier integration with other platforms that support it. Please note that Databricks SQL does not currently support XML workspaces.
* Test merge queue ([424](https://github.com/databrickslabs/remorph/issues/424)). In this release, the Scalafmt configuration has been updated to version 3.8.0, with changes to the formatting of Scala code. The `danglingParentheses` preset option has been set to "false", removing dangling parentheses from the code. Additionally, the `configStyleArguments` option has been set to `false` under "optIn". These modifications to the configuration file are likely to affect the formatting and style of the Scala code in the project, ensuring consistent and organized code. This change aims to enhance the readability and maintainability of the codebase.
* Updated bug and feature yml to support reconcile ([390](https://github.com/databrickslabs/remorph/issues/390)). The open-source library has been updated to improve issue and feature categorization. In the `.github/ISSUE_TEMPLATE/bug.yml` file, new options for TranspileParserError, TranspileValidationError, and TranspileLateralColumnAliasError have been added to the `label: Category of Bug / Issue` field. Additionally, a new option for ReconcileError has been included. The `feature.yml` file in the `.github/ISSUE_TEMPLATE` directory has also been updated, introducing a required dropdown menu labeled "Category of feature request." This dropdown offers options for Transpile, Reconcile, and Other categories, ensuring accurate classification and organization of incoming feature requests. The modifications aim to enhance clarity for maintainers in reviewing and prioritizing issue resolutions and feature implementations related to reconciliation functionality.
* Updated the documentation with json config examples ([486](https://github.com/databrickslabs/remorph/issues/486)). In this release, the Remorph Reconciliation tool on Databricks has been updated to include JSON config examples for various config elements such as jdbc_reader_options, column_mapping, transformations, thresholds, and filters. These config elements enable users to define source and target data, join columns, JDBC reader options, select and drop columns, column mappings, transformations, thresholds, and filters. The update also provides examples in both Python and JSON formats, as well as instructions for installing the necessary Oracle JDBC library on a Databricks cluster. This update enhances the tool's functionality, making it easier for software engineers to reconcile source data with target data on Databricks.
* Updated uninstall flow ([476](https://github.com/databrickslabs/remorph/issues/476)). In this release, the `uninstall` functionality of the `databricks labs remorph` tool has been updated to align with the latest changes made to the `install` refactoring. The `uninstall` flow now utilizes a new `MockInstallation` class, which handles the uninstallation process and takes a dictionary of configuration files and their corresponding contents as input. The `uninstall` function has been modified to return `False` in two cases, either when there is no remorph directory or when the user decides not to uninstall. A `MockInstallation` object is created for the reconcile.yml file, and appropriate exceptions are raised in the aforementioned cases. The `uninstall` function now uses a `WorkspaceUnInstallation` or `WorkspaceUnInstaller` object, depending on the input arguments, to handle the uninstallation process. Additionally, the `MockPrompts` class is used to prompt the user for confirmation before uninstalling remorph.
* Updates to developer documentation and add grammar formatting to maven ([490](https://github.com/databrickslabs/remorph/issues/490)). The developer documentation has been updated to include grammar formatting instructions and support for dialects other than Snowflake. The Maven build cycle has been modified to format grammars before ANTLR processes them, enhancing readability and easing conflict resolution during maintenance. The TSqlLexer.g4 file has been updated with formatting instructions and added dialect recognition. These changes ensure that grammars are consistently formatted and easily resolvable during merges. Engineers adopting this project should reformat the grammar file before each commit, following the provided formatting instructions and reference link. Grammar modifications in the TSqlParser.g4 file, such as alterations in partitionFunction and freetextFunction rules, improve structure and readability.
* Upgrade sqlglot from 23.13.7 to 25.1.0 ([473](https://github.com/databrickslabs/remorph/issues/473)). In the latest release, the sqlglot package has been upgraded from version 23.13.7 to 25.1.0, offering potential new features, bug fixes, and performance improvements for SQL processing. The package dependency for numpy has been updated to version 1.26.4, which may introduce new functionality, improve existing features, or fix numpy integration issues. Furthermore, the addition of the types-pytz package as a dependency provides type hints for pytz, enhancing codebase type checking and static analysis capabilities. Specific modifications to the test_sql_transpiler.py file include updating the expected result in the test_parse_query function and removing unnecessary whitespaces in the transpiled_sql assertion in the test_procedure_conversion function. Although the find_root_tables function remains unchanged, the upgrade to sqlglot promises overall functionality enhancements, which software engineers can leverage in their projects.
* Use default_factory in recon_config.py ([431](https://github.com/databrickslabs/remorph/issues/431)). In this release, the default value handling for the `status` field in the `DataReconcileOutput` and `ReconcileTableOutput` classes has been improved to comply with Python 3.11. Previously, a mutable default value was used, causing a `ValueError` issue. This has been addressed by implementing the `default_factory` argument in the `field` function to ensure a new instance of `StatusOutput` is created for each class. Additionally, `MismatchOutput` and `ThresholdOutput` classes now also utilize `default_factory` for consistent and robust default value handling, enhancing the overall code quality and preventing potential issues arising from mutable default values; a dataclass sketch appears after this list.
* Edit distance ([501](https://github.com/databrickslabs/remorph/issues/501)). In this release, we have implemented an `edit distance` feature for calculating the difference between two strings using the LEVENSHTEIN function. This has been achieved by adding a new method, `anonymous_sql`, to the `Generator` class in the `databricks.py` file. The method takes expressions of the `Anonymous` type as arguments and calls the `LEVENSHTEIN` function if the `this` attribute of the expression is equal to "EDITDISTANCE". Additionally, a new test file covering anonymous function handling has been introduced in the functional Snowflake test suite to ensure the accurate calculation of string similarity using the EDITDISTANCE function. This change includes examples of using the EDITDISTANCE function with different parameters and compares it with the LEVENSHTEIN function available in Databricks. It addresses issue [#500](https://github.com/databrickslabs/remorph/issues/500), which was related to testing the edit distance functionality. A generator-hook sketch appears after this list.
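
Following the ojdbc8 item above, a minimal sketch of the dynamic driver-class selection, assuming a Spark session is available on `self.spark`; the class name follows the changelog, while the method signature and option names are illustrative:

```python
# Sketch only: dictionary-based JDBC driver lookup as described in the
# ojdbc8 entry; signatures and option names are assumptions.
class JDBCReaderMixin:
    _DRIVER_CLASS = {
        "oracle": "oracle.jdbc.driver.OracleDriver",
        "snowflake": "net.snowflake.client.jdbc.SnowflakeDriver",
    }

    def _get_jdbc_reader(self, query: str, jdbc_url: str, driver: str):
        # Fall back to the raw value so fully qualified class names still work.
        driver_class = self._DRIVER_CLASS.get(driver, driver)
        return (
            self.spark.read.format("jdbc")
            .option("url", jdbc_url)
            .option("driver", driver_class)
            .option("dbtable", f"({query}) tmp")
        )
```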
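The STRPOS mapping can be sanity-checked with stock sqlglot, which rewrites Presto's `STRPOS(string, substring)` into Databricks' `LOCATE(substring, string)`; the printed output below is indicative, not a guaranteed rendering for every sqlglot version:

```python
import sqlglot

# Presto passes (string, substring); Databricks' LOCATE takes the substring
# first, so the transpiler swaps the argument order.
sql = "SELECT STRPOS('Hello world', 'l')"
print(sqlglot.transpile(sql, read="presto", write="databricks")[0])
# Indicative output: SELECT LOCATE('l', 'Hello world')
```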
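For the reconciliation-failure item, a minimal sketch of the exception hierarchy; the shape of `reconcile_output` and the failure check are assumptions, not the project's exact logic:

```python
class ReconciliationException(Exception):
    """Raised when reconciliation fails for any table."""

    def __init__(self, message: str = "", reconcile_output=None):
        self._reconcile_output = reconcile_output
        super().__init__(message)

    @property
    def reconcile_output(self):
        return self._reconcile_output


def _verify_successful_reconciliation(recon_output) -> None:
    # Hypothetical status check: raise if any table reports a failed status.
    if any(table.status == "failed" for table in recon_output.results):
        raise ReconciliationException(
            "Reconciliation failed for one or more tables",
            reconcile_output=recon_output,
        )
```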
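The `default_factory` fix is easiest to see in a small dataclass sketch (field names here are illustrative): a mutable default instance is rejected by Python 3.11's dataclass machinery, while a factory builds a fresh `StatusOutput` per object:

```python
from dataclasses import dataclass, field


@dataclass
class StatusOutput:
    row: bool | None = None
    column: bool | None = None


@dataclass
class ReconcileTableOutput:
    # `status: StatusOutput = StatusOutput()` would raise
    # "ValueError: mutable default ... use default_factory" on Python 3.11,
    # and would share one instance across objects on older versions.
    status: StatusOutput = field(default_factory=StatusOutput)
```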
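And for the edit-distance item, a sketch of the generator hook using sqlglot's public `Generator.anonymous_sql` extension point; special-casing `EDITDISTANCE` while deferring everything else to the base class is the assumed behaviour:

```python
from sqlglot import expressions as exp
from sqlglot.dialects.databricks import Databricks


class _Databricks(Databricks):
    class Generator(Databricks.Generator):
        def anonymous_sql(self, expression: exp.Anonymous) -> str:
            # Snowflake's EDITDISTANCE surfaces as an Anonymous function;
            # emit Databricks' LEVENSHTEIN instead.
            if expression.this.upper() == "EDITDISTANCE":
                return self.func("LEVENSHTEIN", *expression.expressions)
            return super().anonymous_sql(expression)
```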

0.2.0

* Capture Reconcile metadata in delta tables for dashboards ([369](https://github.com/databrickslabs/remorph/issues/369)). In this release, changes have been made to improve version control management, reduce repository size, and enhance build times. A new directory, "spark-warehouse/", has been added to the Git ignore file to prevent unnecessary files from being tracked and included in the project. The `WriteToTableException` class has been added to the `exception.py` file to raise an error when a runtime exception occurs while writing data to a table. A new `ReconCapture` class has been implemented in the `reconcile` package to capture and persist reconciliation metadata in delta tables. The `recon` function has been updated to initialize this new class, passing in the required parameters. Additionally, a new file, `recon_capture.py`, has been added to the reconcile package, which implements the `ReconCapture` class responsible for capturing metadata related to data reconciliation. The `recon_config.py` file has been modified to introduce a new class, `ReconcileProcessDuration`, and restructure the classes `ReconcileOutput`, `MismatchOutput`, and `ThresholdOutput`. The commit also captures reconcile metadata in delta tables for dashboards in the context of unit tests in the `test_execute.py` file and includes a new file, `test_recon_capture.py`, to test the reconcile capture functionality of the `ReconCapture` class.
* Expand translation of Snowflake `expr` ([351](https://github.com/databrickslabs/remorph/issues/351)). In this release, the translation of the `expr` category in the Snowflake language has been significantly expanded, addressing uncovered grammar areas, incorrect interpretations, and duplicates. The `subquery` is now excluded as a valid `expr`, and new case classes such as `NextValue`, `ArrayAccess`, `JsonAccess`, `Collate`, and `Iff` have been added to the `Expression` class. These changes improve the comprehensiveness and accuracy of the Snowflake parser, allowing for a more flexible and accurate translation of various operations. Additionally, the `SnowflakeExpressionBuilder` class has been updated to handle previously unsupported cases, enhancing the parser's ability to parse Snowflake SQL expressions.
* Fixed Oracle missing datatypes ([333](https://github.com/databrickslabs/remorph/issues/333)). In the latest release, the Oracle class of the Tokenizer in the open-source library has undergone a fix to address missing datatypes. Previously, several Oracle-specific datatypes were absent from the KEYWORDS token mapping, leaving them unsupported. This issue has been resolved by mapping all such Oracle datatypes, including LONG, NCLOB, ROWID, UROWID, ANYTYPE, ANYDATA, ANYDATASET, XMLTYPE, SDO_GEOMETRY, SDO_TOPO_GEOMETRY, and SDO_GEORASTER, to the TEXT TokenType, with the test_schema_compare.py file updated accordingly. This improvement enhances the compatibility of the code with Oracle datatypes and increases the reliability of the schema comparison functionality, as demonstrated by the test function test_schema_compare, which now returns is_valid as True and a count of 0 rows where `is_valid` is `false` in the resulting dataframe (see the tokenizer sketch after this list).
* Fixed the recon_config functions to handle null values ([399](https://github.com/databrickslabs/remorph/issues/399)). In this release, the recon_config functions have been enhanced to manage null values and provide more flexible column mapping for reconciliation purposes. A `__post_init__` method has been added to certain classes to convert specified attributes to lowercase and handle null values. A new helper method, `_get_is_string`, has been introduced to determine if a column is of string type. Additionally, new functions such as `get_tgt_to_src_col_mapping_list`, `get_layer_tgt_to_src_col_mapping`, `get_src_to_tgt_col_mapping_list`, and `get_layer_src_to_tgt_col_mapping` have been added to retrieve column mappings, enhancing the overall functionality and robustness of the reconciliation process. These improvements will benefit software engineers by ensuring more accurate and reliable configuration handling, as well as providing more flexibility in mapping source and target columns during reconciliation (a `__post_init__` sketch follows this list).
* Improve Exception handling ([392](https://github.com/databrickslabs/remorph/issues/392)). The commit titled `Improve Exception Handling` enhances error handling in the project, addressing issues [#388](https://github.com/databrickslabs/remorph/issues/388) and [#392](https://github.com/databrickslabs/remorph/issues/392). Changes include refactoring the `create_adapter` method in the `DataSourceAdapter` class, updating method arguments in test functions, and adding new methods in the `test_execute.py` file for better test doubles. The `DataSourceAdapter` class is replaced with the `create_adapter` function, which takes the same arguments and returns an instance of the appropriate `DataSource` subclass based on the provided `engine` parameter. The diff also modifies the behavior of certain test methods to raise more specific and accurate exceptions. Overall, these changes improve exception handling, streamline the codebase, and provide clearer error messages for software engineers.
* Introduced morph_sql and morph_column_expr functions for inline transpilation and validation ([328](https://github.com/databrickslabs/remorph/issues/328)). Two new classes, TranspilationResult and ValidationResult, have been added to the config module of the remorph package to store the results of transpilation and validation. The morph_sql and morph_column_exp functions have been introduced to support inline transpilation and validation of SQL code and column expressions. A new class, Validator, has been added to the validation module to handle validation, and the validate_format_result method within this class has been updated to return a ValidationResult object. The _query method has also been added to the class, which executes a given SQL query and returns a tuple containing a boolean indicating success, any exception message, and the result of the query. Unit tests for these new functions have been updated to ensure proper functionality.
* Output for the reconcile function ([389](https://github.com/databrickslabs/remorph/issues/389)). A new function `get_key_form_dialect` has been added to the `config.py` module, which takes a `Dialect` object and returns the corresponding key used in the `SQLGLOT_DIALECTS` dictionary. Additionally, the `MorphConfig` dataclass has been updated to include a new attribute `__file__`, which sets the filename to "config.yml". The `get_dialect` function remains unchanged. Two new exceptions, `WriteToTableException` and `InvalidInputException`, have been introduced, and the existing `DataSourceRuntimeException` has been modified in the same module to improve error handling. The `execute.py` file's reconcile function has undergone several changes, including adding imports for `InvalidInputException`, `ReconCapture`, and `generate_final_reconcile_output` from `recon_exception` and `recon_capture` modules, and modifying the `ReconcileOutput` type. The `hash_query.py` file's reconcile function has been updated to include a new `_get_with_clause` method, which returns a `Select` object for a given DataFrame, and the `build_query` method has been updated to include a new query construction step using the `with_clause` object. The `threshold_query.py` file's reconcile function's output has been updated to include query and logger statements, a new method for allowing user transformations on threshold aliases, and the dialect specified in the sql method. A new `generate_final_reconcile_output` function has been added to the `recon_capture.py` file, which generates a reconcile output given a recon_id and a SparkSession. New classes and dataclasses, including `SchemaReconcileOutput`, `ReconcileProcessDuration`, `StatusOutput`, `ReconcileTableOutput`, and `ReconcileOutput`, have been introduced in the `reconcile/recon_config.py` file. The `tests/unit/reconcile/test_execute.py` file has been updated to include new test cases for the `recon` function, including tests for different report types and scenarios, such as data, schema, and all report types, exceptions, and incorrect report types. A new test case, `test_initialise_data_source`, has been added to test the `initialise_data_source` function, and the `test_recon_for_wrong_report_type` test case has been updated to expect an `InvalidInputException` when an incorrect report type is passed to the `recon` function. The `test_reconcile_data_with_threshold_and_row_report_type` test case has been added to test the `reconcile_data` method of the `Reconciliation` class with a row report type and threshold options. Overall, these changes improve the functionality and robustness of the reconcile process by providing more fine-grained control over the generation of the final reconcile output and better handling of exceptions and errors.
* Threshold Source and Target query builder ([348](https://github.com/databrickslabs/remorph/issues/348)). In this release, we've introduced a new method, `build_threshold_query`, that constructs a customizable threshold query based on a table's partition, join, and threshold columns configuration. The method identifies necessary columns, applies specified transformations, and includes a WHERE clause based on the filter defined in the table configuration. The resulting query is then converted to a SQL string using the dialect of the source database. Additionally, we've updated the test file for the threshold query builder in the reconcile package, including refactoring of function names and updated assertions for query comparison. We've added two new test methods: `test_build_threshold_query_with_single_threshold` and `test_build_threshold_query_with_multiple_thresholds`. These changes enhance the library's functionality, providing a more robust and customizable threshold query builder, and improve test coverage for various configurations and scenarios.
* Unpack nested alias ([336](https://github.com/databrickslabs/remorph/issues/336)). This release introduces a significant update to the 'lca_utils.py' file, addressing the limitation of not handling nested aliases in window expressions and where clauses, which resolves issue [#334](https://github.com/databrickslabs/remorph/issues/334). The `unalias_lca_in_select` method has been implemented to recursively parse nested selects and unalias lateral column aliases, thereby identifying and handling unsupported lateral column aliases. This method is utilized in the `check_for_unsupported_lca` method to handle unsupported lateral column aliases in the input SQL string. Furthermore, the 'test_lca_utils.py' file has undergone changes, impacting several test functions and introducing two new ones, `test_fix_nested_lca` and 'test_fix_nested_lca_with_no_scope', to ensure the code's reliability and accuracy by preventing unnecessary assumptions and hallucinations. These updates demonstrate our commitment to improving the library's functionality and test coverage.
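
A sketch of the datatype fix from the Oracle item above, using sqlglot's tokenizer extension point; the type list mirrors the changelog, while the subclassing style is an assumption about how the mapping is wired in:

```python
from sqlglot.dialects.oracle import Oracle
from sqlglot.tokens import TokenType

_ORACLE_ONLY_TYPES = (
    "LONG", "NCLOB", "ROWID", "UROWID", "ANYTYPE", "ANYDATA", "ANYDATASET",
    "XMLTYPE", "SDO_GEOMETRY", "SDO_TOPO_GEOMETRY", "SDO_GEORASTER",
)


class _OracleTokenizer(Oracle.Tokenizer):
    # Map Oracle-only datatypes to TEXT so schema comparison proceeds
    # instead of failing on unknown keywords.
    KEYWORDS = {
        **Oracle.Tokenizer.KEYWORDS,
        **{t: TokenType.TEXT for t in _ORACLE_ONLY_TYPES},
    }
```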
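The null-tolerant normalisation from the recon_config item can be sketched as a `__post_init__` hook; the field names here are hypothetical:

```python
from dataclasses import dataclass


@dataclass
class ColumnMapping:
    source_name: str | None = None
    target_name: str | None = None

    def __post_init__(self):
        # Lowercase defensively and tolerate missing (None) values rather
        # than raising AttributeError on NoneType.
        if self.source_name is not None:
            self.source_name = self.source_name.lower()
        if self.target_name is not None:
            self.target_name = self.target_name.lower()
```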

0.1.7

* Added `Configure Secrets` support to `databricks labs remorph configure-secrets` cli command ([254](https://github.com/databrickslabs/remorph/issues/254)). The `Configure Secrets` feature has been implemented in the `databricks labs remorph` CLI command, specifically for the new `configure-secrets` command. This addition allows users to establish Scope and Secrets within their Databricks Workspace, enhancing security and control over resource access. The implementation includes a new `recon_config_utils.py` file in the `databricks/labs/remorph/helpers` directory, which contains classes and methods for managing Databricks Workspace secrets. Furthermore, the `ReconConfigPrompts` helper class has been updated to handle prompts for selecting sources, entering secret scope names, and handling overwrites. The CLI command has also been updated with a new `configure_secrets` function and corresponding tests to ensure correct functionality.
* Added handling for invalid alias usage by manipulating the AST ([219](https://github.com/databrickslabs/remorph/issues/219)). The recent commit addresses the issue of invalid alias usage in SQL queries by manipulating the Abstract Syntax Tree (AST). It introduces a new method, `unalias_lca_in_select`, which unaliases Lateral Column Aliases (LCA) in the SELECT clause of a query. The AliasInfo class is added to manage aliases more effectively, with attributes for the name, expression, and a flag indicating if the alias name is the same as a column. Additionally, the execute.py file is modified to check for unsupported LCA using the `lca_utils.check_for_unsupported_lca` method, improving the system's robustness when handling invalid aliases. Test cases are also added in the new file, test_lca_utils.py, to validate the behavior of the `check_for_unsupported_lca` function, ensuring that SQL queries are correctly formatted for Snowflake dialect and avoiding errors due to invalid alias usage.
* Added support for `databricks labs remorph generate-lineage` CLI command ([238](https://github.com/databrickslabs/remorph/issues/238)). A new CLI command, `databricks labs remorph generate-lineage`, has been added to generate lineage for input SQL files, taking the source dialect, input, and output directories as arguments. The command uses existing logic to generate a directed acyclic graph (DAG) and then creates a DOT file in the output directory using the DAG. The new command is supported by new functions `_generate_dot_file_contents`, `lineage_generator`, and methods in the `RootTableIdentifier` and `DAG` classes. The command has been manually tested and includes unit tests, with plans for adding integration tests in the future. The commit also includes a new method `temp_dirs_for_lineage` and updates to the `configure_secrets_databricks` method to handle a new source type "databricks". The command handles invalid input and raises appropriate exceptions.
* Custom oracle tokenizer ([316](https://github.com/databrickslabs/remorph/issues/316)). In this release, the remorph library has been updated to enhance its handling of Oracle databases. A custom Oracle tokenizer has been developed to map the `LONG` datatype to text (string) in the tokenizer, allowing for more precise parsing and manipulation of `LONG` columns in Oracle databases. The Oracle dialect in the configuration file has also been updated to utilize the new custom Oracle tokenizer. Additionally, the Oracle class from the snow module has been imported and integrated into the Oracle dialect. These improvements will enable the remorph library to manage Oracle databases more efficiently, with a particular focus on improving the handling of the `LONG` datatype. The commit also includes updates to test files in the functional/oracle/test_long_datatype directory, which ensure the proper conversion of the `LONG` datatype to text. Furthermore, a new test file has been added to the tests/unit/snow directory, which checks for compatibility with Oracle's long data type. These changes enhance the library's compatibility with Oracle databases, ensuring accurate handling and manipulation of the `LONG` datatype in Oracle SQL and Databricks SQL.
* Removed strict source dialect checks ([284](https://github.com/databrickslabs/remorph/issues/284)). In the latest release, the `transpile` and `generate_lineage` functions in `cli.py` have undergone changes to allow for greater flexibility in source dialect selection. Previously, only `snowflake` or `tsql` dialects were supported, but now any source dialect supported by SQLGLOT can be used, controlled by the `SQLGLOT_DIALECTS` dictionary. Providing an unsupported source dialect will result in a validation error. Additionally, the input and output folder paths for the `generate_lineage` function are now validated against the file system to ensure their existence and validity. In the `install.py` file of the `databricks/labs/remorph` package, the source dialect selection has been updated to use `SQLGLOT_DIALECTS.keys()`, replacing the previous hardcoded list. This change allows for more flexibility in selecting the source dialect. Furthermore, recent updates to various test functions in the `test_install.py` file suggest that the source selection process has been modified, possibly indicating the addition of new sources or a change in source identification. These modifications provide greater flexibility in testing and potentially in the actual application.
* Set Catalog, Schema from default Config ([312](https://github.com/databrickslabs/remorph/issues/312)). A new feature has been added to our open-source library that allows users to specify the `catalog` and `schema` configuration options as part of the `transpile` command-line interface (CLI). If these options are not provided, the `transpile` function in the `cli.py` file will now set them to the values specified in `default_config`. This ensures that a default catalog and schema are used if they are not explicitly set by the user. The `labs.yml` file has been updated to reflect these changes, with the addition of the `catalog-name` and `schema-name` options to the `commands` object. The `default` property of the `validation` object has also been updated to `true`, indicating that the validation step will be skipped by default. These changes provide increased flexibility and ease-of-use for users of the `transpile` functionality.
* Support for Null safe equality join for databricks generator ([280](https://github.com/databrickslabs/remorph/issues/280)). In this release, we have implemented support for a null-safe equality join in the Databricks generator, addressing issue [#280](https://github.com/databrickslabs/remorph/issues/280). This feature introduces the use of the `<=>` operator in the generated SQL code instead of the `is not distinct from` syntax to ensure accurate comparisons when NULL values are present in the columns being joined. The Generator class has been updated with a new method, NullSafeEQ, which takes in an expression and returns the binary version of the expression using the `<=>` operator. The preprocess method in the Generator class has also been modified to include this new functionality. It is important to note that this change may require users to update their existing code to align with the new syntax in the Databricks environment. With this enhancement, the Databricks generator is now capable of performing null-safe equality joins, resulting in consistent results regardless of the presence of NULL values in the join conditions (see the generator sketch after this list).
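
For the null-safe join item above, a sketch of how such an override looks against sqlglot's `TRANSFORMS` table; stock sqlglot already models the node as `exp.NullSafeEQ`, and the lambda form here is an assumption about the change, not the project's verbatim code:

```python
from sqlglot import expressions as exp
from sqlglot.dialects.databricks import Databricks


class _Databricks(Databricks):
    class Generator(Databricks.Generator):
        TRANSFORMS = {
            **Databricks.Generator.TRANSFORMS,
            # Emit Spark/Databricks' null-safe equality operator instead of
            # the ANSI "IS NOT DISTINCT FROM" spelling.
            exp.NullSafeEQ: lambda self, e: self.binary(e, "<=>"),
        }
```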

0.1.6

* Added serverless validation using lsql library ([176](https://github.com/databrickslabs/remorph/issues/176)). A `WorkspaceClient` object is created with the `product` name and `product_version`, along with the corresponding `cluster_id` or `warehouse_id`, as `sdk_config` in the `MorphConfig` object.
* Enhanced install script to enforce usage of a warehouse or cluster when `skip-validation` is set to `False` ([213](https://github.com/databrickslabs/remorph/issues/213)). In this release, the installation process has been enhanced to mandate the use of a warehouse or cluster when the `skip-validation` parameter is set to `False`. This change has been implemented across various components, including the install script, `transpile` function, and `get_sql_backend` function. Additionally, new pytest fixtures and methods have been added to improve test configuration and resource management during testing. Unit tests have been updated to enforce usage of a warehouse or cluster when the `skip-validation` flag is set to `False`, ensuring proper resource allocation and validation process improvement. This development focuses on promoting a proper setup and usage of the system, guiding new users towards a correct configuration and improving the overall reliability of the tool.
* Patch subquery with json column access ([190](https://github.com/databrickslabs/remorph/issues/190)). The open-source library has been updated with new functionality to modify how subqueries with JSON column access are handled in the `snowflake.py` file. This change includes the addition of a check for an opening parenthesis after the `FROM` keyword to detect and break loops when a subquery is found, as opposed to a table name. This improvement enhances the handling of complex subqueries and JSON column access, making the code more robust and adaptable to different query structures. Additionally, a new test method, `test_nested_query_with_json`, has been introduced to the `tests/unit/snow/test_databricks.py` file to test the behavior of nested queries involving JSON column access when using a Snowflake dialect. This new method validates the expected output of a specific nested query when it is transpiled to Snowflake's SQL dialect, allowing for more comprehensive testing of JSON column access and type casting in Snowflake dialects. The existing `test_delete_from_keyword` method remains unchanged.
* Snowflake `UPDATE FROM` to Databricks `MERGE INTO` implementation ([198](https://github.com/databrickslabs/remorph/issues/198)).
* Use Runtime SQL backend in Notebooks ([211](https://github.com/databrickslabs/remorph/issues/211)). In this update, the `db_sql.py` file in the `databricks/labs/remorph/helpers` directory has been modified to support the use of the Runtime SQL backend in Notebooks. This change includes the addition of a new `RuntimeBackend` class in the `backends` module and an import statement for `os`. The `get_sql_backend` function now returns a `RuntimeBackend` instance when the `DATABRICKS_RUNTIME_VERSION` environment variable is present, allowing for more efficient and secure SQL statement execution in Databricks notebooks. Additionally, a new test case for the `get_sql_backend` function has been added to ensure the correct behavior of the function in various runtime environments. These enhancements improve SQL execution performance and security in Databricks notebooks and increase the project's versatility for different use cases; a backend-selection sketch follows this list.
* Added Issue Templates for bugs, feature and config ([194](https://github.com/databrickslabs/remorph/issues/194)). Two new issue templates have been added to the project's GitHub repository to improve issue creation and management. The first template, located in `.github/ISSUE_TEMPLATE/bug.yml`, is for reporting bugs and prompts users to provide detailed information about the issue, including the current and expected behavior, steps to reproduce, relevant log output, and sample query. The second template, added under the path `.github/ISSUE_TEMPLATE/config.yml`, is for configuration-related issues and includes support contact links for general Databricks questions and Remorph documentation, as well as fields for specifying the operating system and software version. A new issue template for feature requests, named "Feature Request", has also been added, providing a structured format for users to submit requests for new functionality for the Remorph project. These templates will help streamline the issue creation process, improve the quality of information provided, and make it easier for the development team to quickly identify and address bugs and feature requests.
* Added Databricks Source Adapter ([185](https://github.com/databrickslabs/remorph/issues/185)). In this release, the project has been enhanced with several new features for the Databricks Source Adapter. A new `engine` parameter has been added to the `DataSource` class, replacing the original `source` parameter. The `_get_secrets` and `_get_table_or_query` methods have been updated to use the `engine` parameter for key naming and handling queries with a `select` statement differently, respectively. A Databricks Source Adapter for Oracle databases has been introduced, which includes a new `OracleDataSource` class that provides functionality to connect to an Oracle database using JDBC. A Databricks Source Adapter for Snowflake has also been added, featuring the `SnowflakeDataSource` class that handles data reading and schema retrieval from Snowflake. The `DatabricksDataSource` class has been updated to handle data reading and schema retrieval from Databricks, including a new `get_schema_query` method that generates the query to fetch the schema based on the provided catalog and table name. Exception handling for reading data and fetching schema has been implemented for all new classes. These changes provide increased flexibility for working with various data sources, improved code maintainability, and better support for different use cases.
* Added Threshold Query Builder ([188](https://github.com/databrickslabs/remorph/issues/188)). In this release, the open-source library has added a Threshold Query Builder feature, which includes several changes to the existing functionality in the data source connector. A new import statement adds the `re` module for regular expressions, and new parameters have been added to the `read_data` and `get_schema` abstract methods. The `_get_jdbc_reader_options` method has been updated to accept a `options` parameter of type "JdbcReaderOptions", and a new static method, "_get_table_or_query", has been added to construct the table or query string based on provided parameters. Additionally, a new class, "QueryConfig", has been introduced in the "databricks.labs.remorph.reconcile" package to configure queries for data reconciliation tasks. A new abstract base class QueryBuilder has been added to the query_builder.py file, along with HashQueryBuilder and ThresholdQueryBuilder classes to construct SQL queries for generating hash values and selecting columns based on threshold values, transformation rules, and filtering conditions. These changes aim to enhance the functionality of the data source connector, add modularity, customizability, and reusability to the query builder, and improve data reconciliation tasks.
* Added snowflake connector code ([177](https://github.com/databrickslabs/remorph/issues/177)). This release adds a Snowflake connector for data extraction and schema manipulation. The new `SnowflakeDataSource` class reads data from Snowflake using PySpark and provides methods for building the JDBC URL, reading data with and without JDBC reader options, fetching the schema, and handling exceptions (see the connector sketch after this list). A new constant, `SNOWFLAKE`, has been added to the `SourceDriver` enum in `constants.py`, representing the Snowflake JDBC driver class. The constructor of the `DataSource` abstract base class now takes a new `scope` parameter, and the `_get_secrets` method accepts a `key_name` parameter instead of `key`. A test file, `test_snowflake.py`, has been added to exercise the `SnowflakeDataSource` class. This release also updates `pyproject.toml` to version-lock dependencies such as black, ruff, and isort, and modifies the coverage report configuration to exclude certain files and lines from coverage checks. These changes were completed by Ravikumar Thangaraj and SundarShankar89.
* `remorph reconcile` baseline for Query Builder and Source Adapter for Oracle as source ([150](https://github.com/databrickslabs/remorph/issues/150)).
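
To make the adapter changes concrete, here is a minimal, hypothetical sketch of the `engine`-keyed secret lookup and the `select`-aware `_get_table_or_query` handling described in the Databricks Source Adapter item above. The class and method names mirror the changelog; the bodies, the workspace-client call, and the key-naming scheme are illustrative assumptions, not the shipped remorph code.

```python
from abc import ABC, abstractmethod


class DataSource(ABC):
    """Illustrative base class; `engine` replaces the old `source` parameter."""

    def __init__(self, engine: str, spark, ws, scope: str):
        self.engine = engine  # e.g. "oracle", "snowflake", "databricks"
        self.spark = spark    # active SparkSession
        self.ws = ws          # assumed Databricks workspace client
        self.scope = scope    # secret scope to read credentials from

    def _get_secrets(self, key_name: str) -> str:
        # Secrets are namespaced by engine, e.g. "snowflake_sfUrl" (assumed scheme).
        return self.ws.secrets.get_secret(self.scope, f"{self.engine}_{key_name}")

    @staticmethod
    def _get_table_or_query(catalog: str, schema: str, query: str) -> str:
        # An explicit `select` statement is wrapped as a subquery; a bare table
        # name becomes a fully qualified SELECT *.
        if query.strip().lower().startswith("select"):
            return f"({query}) tmp"
        return f"select * from {catalog}.{schema}.{query}"

    @abstractmethod
    def read_data(self, catalog: str, schema: str, query: str, options): ...

    @abstractmethod
    def get_schema(self, catalog: str, schema: str, table: str): ...
```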
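The query-builder hierarchy can be pictured with a similarly hedged sketch: the `QueryBuilder`, `HashQueryBuilder`, and `ThresholdQueryBuilder` names follow the changelog, while the constructor signatures and the exact SQL shapes are assumptions for illustration.

```python
from abc import ABC, abstractmethod


class QueryBuilder(ABC):
    """Illustrative base class holding configuration shared by all builders."""

    def __init__(self, table: str, columns: list[str], filter_clause: str = ""):
        self.table = table
        self.columns = columns
        self.filter_clause = filter_clause

    def _where(self) -> str:
        return f" where {self.filter_clause}" if self.filter_clause else ""

    @abstractmethod
    def build_query(self) -> str: ...


class HashQueryBuilder(QueryBuilder):
    """Hashes each row so the two sides of a reconciliation compare cheaply."""

    def build_query(self) -> str:
        concat = ", ".join(f"coalesce(cast({c} as string), '')" for c in self.columns)
        return (
            f"select sha2(concat_ws('|', {concat}), 256) as hash_value "
            f"from {self.table}{self._where()}"
        )


class ThresholdQueryBuilder(QueryBuilder):
    """Selects only the columns that participate in threshold comparisons."""

    def __init__(self, table, columns, threshold_columns, filter_clause=""):
        super().__init__(table, columns, filter_clause)
        self.threshold_columns = threshold_columns

    def build_query(self) -> str:
        cols = ", ".join(self.threshold_columns)
        return f"select {cols} from {self.table}{self._where()}"
```

For example, `ThresholdQueryBuilder("sales", ["id", "amount"], ["amount"], "region = 'EMEA'").build_query()` would yield `select amount from sales where region = 'EMEA'`.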
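Finally, a minimal sketch of the Snowflake read path. The JDBC URL shape, option names, and driver class follow standard Snowflake-over-Spark conventions; the constructor parameters and method bodies are assumptions rather than the actual `SnowflakeDataSource` implementation.

```python
class SnowflakeDataSource:
    """Illustrative PySpark-based reader for Snowflake over JDBC."""

    DRIVER = "net.snowflake.client.jdbc.SnowflakeDriver"

    def __init__(self, spark, account: str, warehouse: str, database: str):
        self.spark = spark
        self.account = account
        self.warehouse = warehouse
        self.database = database

    def get_jdbc_url(self) -> str:
        # Standard Snowflake JDBC URL shape (parameters assumed for the sketch).
        return (
            f"jdbc:snowflake://{self.account}.snowflakecomputing.com"
            f"/?warehouse={self.warehouse}&db={self.database}"
        )

    def read_data(self, query: str, user: str, password: str):
        try:
            return (
                self.spark.read.format("jdbc")
                .option("url", self.get_jdbc_url())
                .option("driver", self.DRIVER)
                .option("dbtable", f"({query}) tmp")
                .option("user", user)
                .option("password", password)
                .load()
            )
        except Exception as e:
            # Wrap and re-raise so callers see a reconciliation-level error.
            raise RuntimeError(f"Failed to read from Snowflake: {e}") from e
```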

Dependency updates:

* Bump sqlglot from 22.4.0 to 22.5.0 ([175](https://github.com/databrickslabs/remorph/pull/175)).
* Updated databricks-sdk requirement from <0.22,>=0.18 to >=0.18,<0.23 ([178](https://github.com/databrickslabs/remorph/pull/178)).
* Updated databricks-sdk requirement from <0.23,>=0.18 to >=0.18,<0.24 ([189](https://github.com/databrickslabs/remorph/pull/189)).
* Bump actions/checkout from 3 to 4 ([203](https://github.com/databrickslabs/remorph/pull/203)).
* Bump actions/setup-python from 4 to 5 ([201](https://github.com/databrickslabs/remorph/pull/201)).
* Bump codecov/codecov-action from 1 to 4 ([202](https://github.com/databrickslabs/remorph/pull/202)).
* Bump softprops/action-gh-release from 1 to 2 ([204](https://github.com/databrickslabs/remorph/pull/204)).

0.1.5

* Added Pylint Checker ([149](https://github.com/databrickslabs/remorph/issues/149)). This diff adds a Pylint checker to the project to enforce a consistent code style, flag potential bugs, and catch errors in the Python code. The Pylint configuration includes settings such as a line-length limit, the maximum number of arguments for a function, and the maximum number of lines in a module. Several plugins are loaded to add further checks and features, and the configuration also customizes Pylint's naming-convention checks and its handling of constructs such as exceptions, logging statements, and import statements. Running Pylint helps keep the code high quality, easy to understand, and less prone to bugs. The diff touches various files, including `cli.py`, `morph_status.py`, `validate.py`, and several SQL-related files, to bring them in line with the chosen Pylint configuration and best practices for code quality and organization.
* Fixed edge case where column name is same as alias name ([164](https://github.com/databrickslabs/remorph/issues/164)). A recent commit fixes edge cases where column names conflict with alias names in SQL queries, addressing issues [#164](https://github.com/databrickslabs/remorph/issues/164) and [#130](https://github.com/databrickslabs/remorph/issues/130). The `check_for_unsupported_lca` function gains two helpers: `_find_aliases_in_select`, which detects aliases that share a name with a column in a SELECT expression, and `_find_invalid_lca_in_window`, which identifies invalid lateral column aliases (LCAs) in window functions. The `find_windows_in_select` function has been refactored and renamed to `_find_windows_in_select` for improved code readability. The `transpile` and `parse` functions in `sql_transpiler.py` now use try-except blocks to handle cases where a column name matches an alias name, preventing errors such as `ParseError`, `TokenError`, and `UnsupportedError`. A new unit test, `test_query_with_same_alias_and_column_name`, verifies the fix: it passes a SQL query whose subquery defines a column alias `ca_zip` that is also used as a column name in the same query, confirming that the conflict is handled correctly (a detection sketch follows this list).
* `TO_NUMBER` without `format` edge case ([172](https://github.com/databrickslabs/remorph/issues/172)). This commit addresses an unsupported usage of the `TO_NUMBER` function in the Databricks SQL dialect when the `format` parameter is not provided. The implementation introduces the constants `PRECISION_CONST` and `SCALE_CONST` (set to 38 and 0 respectively) as default values for the `precision` and `scale` parameters, and the `_to_number` method has been modified to incorporate them so that its output meets Databricks SQL dialect requirements. An `UnsupportedError` is now raised when `TO_NUMBER` is called without a `format` parameter, improving error handling and making the required parameter explicit to users (see the sketch after this list). Test cases have been added for the `TO_DECIMAL`, `TO_NUMERIC`, and `TO_NUMBER` functions with format strings, including cases where the format is taken from table columns, and an error is now also raised when `TO_DECIMAL` is called without a format parameter.
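
To illustrate the alias/column conflict detection described above, the following sketch uses sqlglot (the parser remorph builds on) to flag aliases that share a name with a column. The helper name and traversal are assumptions for illustration, not the project's `_find_aliases_in_select`.

```python
import sqlglot
from sqlglot import expressions as exp


def find_alias_column_conflicts(sql: str, dialect: str = "snowflake") -> set[str]:
    """Return alias names that also appear as column names in the query."""
    tree = sqlglot.parse_one(sql, read=dialect)
    aliases = {a.alias for a in tree.find_all(exp.Alias)}
    columns = {c.name for c in tree.find_all(exp.Column)}
    return aliases & columns


# The `ca_zip` alias in the subquery is also referenced as a column outside it.
print(find_alias_column_conflicts(
    "select ca_zip from (select substr(ca_zip, 1, 5) as ca_zip from customer_address)"
))  # -> {'ca_zip'}
```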
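And a hedged sketch in the spirit of the `_to_number` conversion: the constants match the changelog, while the argument handling and the emitted `CAST` are illustrative assumptions built on sqlglot's public expression API.

```python
from sqlglot import expressions as exp
from sqlglot.errors import UnsupportedError

# Defaults from the changelog: Databricks DECIMAL tops out at precision 38.
PRECISION_CONST = 38
SCALE_CONST = 0


def _to_number(args: list[exp.Expression]) -> exp.Expression:
    """Illustrative TO_NUMBER(value, format) -> CAST(... AS DECIMAL(38, 0))."""
    if len(args) < 2:
        # Without a format string there is no unambiguous Databricks translation.
        raise UnsupportedError("TO_NUMBER requires a `format` argument")
    to_number = exp.Anonymous(this="TO_NUMBER", expressions=list(args[:2]))
    return exp.Cast(
        this=to_number,
        to=exp.DataType.build(f"DECIMAL({PRECISION_CONST}, {SCALE_CONST})"),
    )
```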

Dependency updates:

* Bump sqlglot from 21.2.1 to 22.0.1 ([152](https://github.com/databrickslabs/remorph/pull/152)).
* Bump sqlglot from 22.0.1 to 22.1.1 ([159](https://github.com/databrickslabs/remorph/pull/159)).
* Updated databricks-labs-blueprint[yaml] requirement from ~=0.2.3 to >=0.2.3,<0.4.0 ([162](https://github.com/databrickslabs/remorph/pull/162)).
* Bump sqlglot from 22.1.1 to 22.2.0 ([161](https://github.com/databrickslabs/remorph/pull/161)).
* Bump sqlglot from 22.2.0 to 22.2.1 ([163](https://github.com/databrickslabs/remorph/pull/163)).
* Updated databricks-sdk requirement from <0.21,>=0.18 to >=0.18,<0.22 ([168](https://github.com/databrickslabs/remorph/pull/168)).
* Bump sqlglot from 22.2.1 to 22.3.1 ([170](https://github.com/databrickslabs/remorph/pull/170)).
* Updated databricks-labs-blueprint[yaml] requirement from <0.4.0,>=0.2.3 to >=0.2.3,<0.5.0 ([171](https://github.com/databrickslabs/remorph/pull/171)).
* Bump sqlglot from 22.3.1 to 22.4.0 ([173](https://github.com/databrickslabs/remorph/pull/173)).

0.1.4

* Added conversion logic for Try_to_Decimal without format ([142](https://github.com/databrickslabs/remorph/pull/142)).
* Identify Root Table for folder containing SQLs ([124](https://github.com/databrickslabs/remorph/pull/124)).
* Install Script ([106](https://github.com/databrickslabs/remorph/pull/106)).
* Integration Test Suite ([145](https://github.com/databrickslabs/remorph/pull/145)).

Dependency updates:

* Updated databricks-sdk requirement from <0.20,>=0.18 to >=0.18,<0.21 ([143](https://github.com/databrickslabs/remorph/pull/143)).
* Bump sqlglot from 21.0.0 to 21.1.2 ([137](https://github.com/databrickslabs/remorph/pull/137)).
* Bump sqlglot from 21.1.2 to 21.2.0 ([147](https://github.com/databrickslabs/remorph/pull/147)).
* Bump sqlglot from 21.2.0 to 21.2.1 ([148](https://github.com/databrickslabs/remorph/pull/148)).
