The Major Release of Airflow ClickHouse Plugin 🎉
The plugin has undergone significant refactoring, preserving and extending its functionality. Here are the key changes:
Two Operator Families
The plugin introduces two families of operators:
1. `ClickHouseOperator`, `ClickHouseSensor`, and `ClickHouseHook`
These operators are based on the [`Client.execute`](https://clickhouse-driver.readthedocs.io/en/latest/api.html#clickhouse_driver.Client.execute) method of [mymarilyn/clickhouse-driver](https://github.com/mymarilyn/clickhouse-driver) and expose its full functionality. In particular, the following new arguments are now supported:
- Templated `settings`, `external_tables`, and `query_id`.
- Boolean options: `with_column_types`, `types_check`, `columnar`.
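As an illustration of the new arguments, here is a minimal sketch of a task definition. The table name, setting value, and task name are hypothetical, and the import is deferred into a helper function so the snippet can be read and loaded without Airflow installed; in a real DAG file you would normally import the operator at the top.

```python
# Keyword arguments for ClickHouseOperator. The table, the setting value,
# and the task name are hypothetical and only serve as an illustration.
CLICKHOUSE_TASK_KWARGS = {
    "task_id": "daily_income",
    "sql": "SELECT sum(price * qty) FROM sales WHERE eventDt = '{{ ds }}'",
    # Templated; forwarded to Client.execute as per-query ClickHouse settings.
    "settings": {"max_execution_time": 60},
    # Templated, so every task try gets an id that is easy to find in logs.
    "query_id": "{{ ti.dag_id }}-{{ ti.task_id }}-{{ ti.try_number }}",
    # One of the boolean options passed through to Client.execute.
    "with_column_types": True,
}


def make_clickhouse_task(dag):
    """Attach the task to an existing DAG (requires Airflow and the plugin)."""
    # Deferred import: only needed when the task is actually created.
    from airflow_clickhouse_plugin.operators.clickhouse import ClickHouseOperator

    return ClickHouseOperator(dag=dag, **CLICKHOUSE_TASK_KWARGS)
```

Calling `make_clickhouse_task(dag)` inside a DAG definition creates the task; the connection falls back to the operator's default unless `clickhouse_conn_id` is passed as well.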
2. Standardized DB API 2.0-Compatible Counterparts
These operators are based on [`clickhouse_driver.dbapi`](https://clickhouse-driver.readthedocs.io/en/latest/dbapi.html). They do not support most `Client.execute` arguments but are compatible with Airflow's `common.sql` package, which makes them suitable as a drop-in replacement when migrating from other SQL databases. Key features include:
- Operators, hooks, and sensors such as `ClickHouseSQLExecuteQueryOperator`, `ClickHouseSQLColumnCheckOperator`, and many others from `common.sql`.
- Support for `pandas.DataFrame`.
- Organized in `clickhouse_dbapi` modules, as opposed to `clickhouse` modules for their “regular” counterparts.
- Requires an extra dependency: [apache-airflow-providers-common-sql](https://airflow.apache.org/docs/apache-airflow-providers-common-sql/stable/index.html). You can install it using `pip install airflow-clickhouse-plugin[common.sql]`.
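A DB API-based task could look roughly like the sketch below. The import path is an assumption derived from the `clickhouse_dbapi` module naming described above, and the table and task names are hypothetical; check the README for the exact module layout before relying on it.

```python
# Keyword arguments shared with any common.sql-style execute operator.
# The table and task names are hypothetical.
DBAPI_TASK_KWARGS = {
    "task_id": "count_sales",
    "sql": "SELECT count() FROM sales",
}


def make_dbapi_task(dag):
    """Attach the task to an existing DAG (requires the common.sql extra)."""
    # Assumed import path, following the clickhouse_dbapi module naming.
    from airflow_clickhouse_plugin.operators.clickhouse_dbapi import (
        ClickHouseSQLExecuteQueryOperator,
    )

    return ClickHouseSQLExecuteQueryOperator(dag=dag, **DBAPI_TASK_KWARGS)
```

Because the arguments mirror those of the generic `common.sql` operators, migrating an existing task should mostly come down to swapping the operator class.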
Compatibility and Ease of Use
The newest version of the plugin aims to maintain compatibility with existing installations and interfaces as much as possible. It should serve as a seamless drop-in replacement for your current setup.
Additional notes on support:
- Support for Python 3.7 is dropped. Python 3.8+ is required.
- Support for Airflow 2.1 is restored. All versions of Airflow 2 (2.0–2.6) are supported now.
- Installation of `pandas` is not required unless you use [some specific functionality of `common.sql`](https://github.com/apache/airflow/blob/0d93cc5cab8cef56faf3be1705ec6784e2d8a74a/airflow/providers/common/sql/hooks/sql.py#L201).
- The project's `Makefile` has been dropped: it introduced implicit behaviour that broke the integrity of the GitHub workflows. If you want to keep using it, maintain a local copy in your own development environment.
Improved Codebase
The code has been refactored and decomposed, simplifying maintenance and encouraging contributions.
Development Status Update
The project now has "Production/Stable" status: it has been extensively tested by numerous companies worldwide and has 100K+ monthly downloads from PyPI, demonstrating its reliability and maturity.
---
For more details, check out [README.md](https://github.com/bryzgaloff/airflow-clickhouse-plugin#readme).
Kudos to 1ng4lipt for the inspiration: 67, which introduced `query_id`, motivated me to bring this huge update 🤝
Available on PyPI: https://pypi.org/project/airflow-clickhouse-plugin/1.0.0.post0/ (post-release just fixes authors, no functional changes)
Full Changelog: https://github.com/bryzgaloff/airflow-clickhouse-plugin/compare/v0.11.0...v1.0.0