Koheesio

Latest version: v0.10.1


0.10.1

What's Changed
* [BUGFIX] OktaAccessToken bug and enhance logging and tests by dannymeijer in https://github.com/Nike-Inc/koheesio/pull/174


**Full Changelog**: https://github.com/Nike-Inc/koheesio/compare/koheesio-v0.10.0...koheesio-v0.10.1

0.10.1a0

Bugfix related to OktaAccessToken

**Full Changelog**: https://github.com/Nike-Inc/koheesio/compare/koheesio-v0.10.0...koheesio-v0.10.1a0

0.10

> Note: this release is incomplete; further features are still being developed. This release is ALPHA and should not be used in a production setting until the actual v0.10 release.

What's Changed
* [FEATURE] Add transformation to download file from url by dannymeijer in https://github.com/Nike-Inc/koheesio/pull/143
* [BUGFIX] Restore Box CSVReader behavior (144) by dannymeijer in https://github.com/Nike-Inc/koheesio/pull/147
* [BUGFIX] ColumnConfig defaults in ColumnsTransformation for run_for_all_data_type and limit_data_type were not working correctly by dannymeijer in https://github.com/Nike-Inc/koheesio/pull/142
* [FEATURE] add partial method to BaseModel by dannymeijer in https://github.com/Nike-Inc/koheesio/pull/150
* [BUGFIX] Update PySpark connect support check by dannymeijer in https://github.com/Nike-Inc/koheesio/pull/154
* [BUGFIX]: Improve merge clause handling in DeltaTableWriter by mikita-sakalouski in https://github.com/Nike-Inc/koheesio/pull/155


**Full Changelog**: https://github.com/Nike-Inc/koheesio/compare/koheesio-v0.9.1...koheesio-v0.10.0a0

0.10.0

What's Changed

v0.10.0 brings several important features, security improvements, and bug fixes across different modules of Koheesio.
The overall API remains unchanged.

New Features / Refactors

The following new features are included:

* [feature] **Core** Introduces Koheesio specific `SecretStr` and `SecretBytes` classes for handling secret strings and bytes with enhanced security by dannymeijer in https://github.com/Nike-Inc/koheesio/pull/164
  * Introduces Koheesio-specific `SecretStr` and `SecretBytes` classes, along with a `SecretMixin` class that reduces code duplication between them
  * The new secret classes are compatible with Pydantic's `SecretStr` and `SecretBytes` and integrate seamlessly with existing code
  * To use the enhanced secret handling, replace `from pydantic import SecretStr, SecretBytes` with `from koheesio.models import SecretStr, SecretBytes`
  * These classes also support f-strings (or `.format`) and `"string" + "other_string"` concatenation while remaining secure
* [feature] **Core > BaseModel** Added `partial` classmethod to `BaseModel` for enhanced customization and flexibility by dannymeijer in https://github.com/Nike-Inc/koheesio/pull/150
  * `partial` allows creating a new instance of a model with only the specified fields updated (such as overwriting or setting a field's default values)
* [feature] **Box** Added a buffered version of `BoxFileWriter` by riccamini in https://github.com/Nike-Inc/koheesio/pull/161 (#87, 148)
  * Added the `BoxBufferFileWriter` class for writing files to Box when physical storage isn't available. Data is instead buffered in memory before being written to Box.
  * Also improves `BoxCsvFileReader` logging output by providing the file name in addition to the file ID.
* [feature] **Dev Experience** Easier debugging and dev improvements by dannymeijer in https://github.com/Nike-Inc/koheesio/pull/168
  * To make debugging easier, `pyproject.toml` was updated so that Spark Connect is easier to run in your local dev environment:
    * Added extra dependencies for `pyspark[connect]==3.5.4`.
    * Added environment variables for Spark Connect in the development environment.
  * Changed to verbose mode logging in the pytest output (also visible in the GitHub Actions test run output).
* [refactor] **Snowflake** Snowflake classes now use `params` over `options` by dannymeijer in https://github.com/Nike-Inc/koheesio/pull/168
  * Snowflake classes now also inherit from `ExtraParamsMixin`.
  * Renamed the `options` field to `params` and added the alias `options` for backwards compatibility.
  * Introduced `SF_DEFAULT_PARAMS`.
* [feature] **Delta** Support for Delta table history by zarembat in https://github.com/Nike-Inc/koheesio/pull/163
  * Enables fetching Delta table history and checking data staleness based on defined intervals and refresh days.
  * Changes to `DeltaTableStep` class:
    * Added `describe_history()` method to `DeltaTableStep` for fetching Delta table history as a Spark DataFrame.
    * Added `is_date_stale()` method to `DeltaTableStep` to check data staleness based on time intervals or specific refresh days.
* [feature] **Http** Added support for authorization headers with proper masking for improved security by dannymeijer in https://github.com/Nike-Inc/koheesio/pull/158 and https://github.com/Nike-Inc/koheesio/pull/170 (#157)
  * Addresses potential data leaks in authorization headers, ensuring secure handling of sensitive information.
  * Comprehensive unit tests added to prevent regressions and ensure expected behavior.
  * Changes to `HttpStep` class:
    * Added `decode_sensitive_headers` method to decode `SecretStr` values in headers.
    * Modified `get_headers` method to dump headers into JSON without `SecretStr` masking.
    * Added `auth_header` field to handle authorization headers.
    * Implemented masking for bearer tokens to maintain their 'secret' status.
* [feature] **Step & Spark** Added a transformation to download files from a URL through Python or Spark by mikita-sakalouski and dannymeijer in https://github.com/Nike-Inc/koheesio/pull/143 (#75)
  * Allows downloading files from a given URL
  * Added `DownloadFileStep` class in a new module `koheesio.steps.download_file`
  * Added `FileWriteMode` enum with supported write modes: `OVERWRITE`, `APPEND`, `IGNORE`, `EXCLUSIVE`, `BACKUP`
  * Also made available as a Spark `Transformation` in `DownloadFileFromUrlTransformation` in a new module `koheesio.spark.transformations.download_files`
    * The Spark implementation allows passing URLs through a column in a given DataFrame
    * All URLs are then downloaded by the Spark Driver to a given location
* [refactor] **Spark > Reader > JDBC** Updated JDBC behavior by dannymeijer in https://github.com/Nike-Inc/koheesio/pull/168
  * `JDBCReader` class now also inherits from `ExtraParamsMixin`.
  * Renamed the `options` field to `params` and added the alias `options` for backwards compatibility.
  * `dbtable` and `query` validation is now handled upon initialization rather than at runtime.
  * Using JDBC now requires either `dbtable` or `query` to be provided.
* [refactor] **Spark > Reader > HanaReader** Updated `HanaReader` behavior by dannymeijer in https://github.com/Nike-Inc/koheesio/pull/168
  * `HanaReader` class no longer has an `options` field.
  * Instead uses `params` and the alias `options` for backwards compatibility (see `JDBCReader` changes mentioned above).
* [refactor] **Spark > Reader > TeradataReader** Updated `TeradataReader` behavior by dannymeijer in https://github.com/Nike-Inc/koheesio/pull/168
  * `TeradataReader` class no longer has an `options` field.
  * Instead uses `params` and the alias `options` for backwards compatibility (see `JDBCReader` changes mentioned above).
* [feature] **Spark > Transformation > CamelToSnake** added more efficient Spark 3.4+ supported operation for `CamelToSnakeTransformation` by dannymeijer in https://github.com/Nike-Inc/koheesio/pull/142
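
The secret handling described in the **Core** item above (masking in f-strings and `.format`, plus concatenation that stays secret) can be illustrated with a small stand-alone sketch. The class `MaskedSecret`, the helper `_raw`, and the mask text are hypothetical names chosen for this illustration; these notes do not show how Koheesio implements `SecretStr`, so treat this as one plausible approach rather than the library's code.

```python
class MaskedSecret:
    """Hypothetical stand-in for a secret string; not Koheesio's actual class."""

    MASK = "**********"  # assumed mask text

    def __init__(self, value: str) -> None:
        self._value = value

    def get_secret_value(self) -> str:
        """Return the raw value; the only explicit way to read the secret."""
        return self._value

    def __str__(self) -> str:
        return self.MASK

    def __repr__(self) -> str:
        return f"MaskedSecret('{self.MASK}')"

    def __format__(self, format_spec: str) -> str:
        # f-strings and str.format() call __format__, so only the mask is rendered
        return format(self.MASK, format_spec)

    def __add__(self, other: object) -> "MaskedSecret":
        # secret + "text": the result is wrapped again, so it stays masked
        return MaskedSecret(self._value + _raw(other))

    def __radd__(self, other: object) -> "MaskedSecret":
        # "text" + secret: str.__add__ gives up, then Python calls __radd__
        return MaskedSecret(_raw(other) + self._value)


def _raw(value: object) -> str:
    """Unwrap another MaskedSecret, or fall back to str() for plain values."""
    return value.get_secret_value() if isinstance(value, MaskedSecret) else str(value)


token = MaskedSecret("abc123")
header = "Bearer " + token  # still a MaskedSecret
```

In this sketch the raw value only leaves the object through `get_secret_value()`, so logging `header` or interpolating it into a message prints the mask.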
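
The idea behind the `partial` classmethod in the **Core > BaseModel** item above can be sketched with the standard library. The `ReaderConfig` dataclass and its fields are hypothetical, and the use of `functools.partial` is an assumption made for illustration; Koheesio's `BaseModel.partial` may behave differently in its details.

```python
import functools
from dataclasses import dataclass


@dataclass
class ReaderConfig:
    """Hypothetical model used only for this illustration."""

    table: str
    fmt: str = "delta"
    limit: int = 100

    @classmethod
    def partial(cls, **preset):
        # Pre-fill some fields; the remaining ones are supplied at call time
        return functools.partial(cls, **preset)


# A constructor with new defaults for `fmt` and `limit`
CsvReaderConfig = ReaderConfig.partial(fmt="csv", limit=10)

config = CsvReaderConfig(table="sales")            # uses the preset values
smaller = CsvReaderConfig(table="sales", limit=5)  # a preset can still be overridden
```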
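
The `JDBCReader` changes above (an `options` alias for `params`, and `dbtable`/`query` validation at initialization) can be sketched as follows. `JdbcReaderSketch` is a hypothetical stand-in written with a plain dataclass; only the field names `dbtable`, `query`, `params`, and `options` come from these notes, and the merge order of the two dictionaries is an assumption.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass
class JdbcReaderSketch:
    """Hypothetical stand-in; not Koheesio's JDBCReader."""

    url: str
    dbtable: Optional[str] = None
    query: Optional[str] = None
    params: Dict[str, Any] = field(default_factory=dict)
    options: Optional[Dict[str, Any]] = None  # legacy keyword, kept as an alias

    def __post_init__(self) -> None:
        # Fold the legacy `options` keyword into `params` (assumed precedence:
        # explicitly passed `params` win over `options`)
        if self.options:
            self.params = {**self.options, **self.params}
        self.options = None
        # Fail at construction time instead of when the read is executed
        if not (self.dbtable or self.query):
            raise ValueError("Either `dbtable` or `query` must be provided")


reader = JdbcReaderSketch(
    url="jdbc:postgresql://localhost/db",  # placeholder URL
    query="SELECT 1",
    options={"fetchsize": "100"},
)
```

Validating in the constructor means a misconfigured reader is rejected when the pipeline is assembled, not partway through a run.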

Bug fixes

The following bug fixes are included:

* [bugfix] **Core > Context** Fix Context initialization with another Context object and dotted notation by dannymeijer in https://github.com/Nike-Inc/koheesio/pull/160 (#159)
  * The `__init__` method of the `Context` class incorrectly updated `kwargs`, making it return `None`; as a result, calls to `Context` containing another `Context` object would previously fail.
  * Also fixed an issue with how `Context` handled get operations for nested keys when using dotted notation.
* [bugfix] **Core > Step** Fixed duplicate logging issues in nested Step classes by dannymeijer in https://github.com/Nike-Inc/koheesio/pull/168
  * We observed log duplication when using specific super call sequences in nested Step classes
  * Several changes were made to the `StepMetaClass` to address duplicate logs when using `super()` in the execute method of a Step class under specific circumstances.
    * Updated `_is_called_through_super` method to traverse the entire method resolution order (MRO) and correctly identify `super()` calls.
    * Ensured `_execute_wrapper` method triggers logging only once per execute call.
  * This change prevents duplicate logs and ensures accurate log entries. The `_is_called_through_super` method was also used for `Output` validation, ensuring it is called only once.
* [bugfix] **Delta** Improve merge clause handling in `DeltaTableWriter` by mikita-sakalouski in https://github.com/Nike-Inc/koheesio/pull/155 (#149)
  * Previously, when a Delta merge configuration (as a dict) was used to provide the merge condition to the merge builder and the merge operation was called multiple times (e.g. for each batch in streaming), the implementation broke due to a `pop` call on that dictionary.
* [bugfix] **Spark** Pyspark Connect support fixes by nogitting and dannymeijer in https://github.com/Nike-Inc/koheesio/pull/154 (#153)
  * The Connect support check previously excluded Spark 3.4 incorrectly.
  * The fix gets rid of false positives in our Spark Connect check utility.
* [bugfix] **Spark > ColumnsTransformation** `ColumnConfig` defaults in `ColumnsTransformation` not working correctly by dannymeijer in https://github.com/Nike-Inc/koheesio/pull/142
  * `run_for_all_data_type` and `limit_data_type` were previously not working correctly
* [bugfix] **Spark > Transformation > Hash** Fix error handling for missing columns in Spark Connect by dannymeijer in https://github.com/Nike-Inc/koheesio/pull/168
  * Updated `sha2` function call to use named parameters.
  * Changes to `Sha2Hash` class:
    * Added check for missing columns.
    * Improved handling when no columns are provided.
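
The **Delta** merge fix above comes down to a general Python pitfall: `dict.pop` mutates the caller's dictionary, so a configuration that is reused for every batch loses its key after the first call. The sketch below reproduces that pattern in isolation. The key name `merge_cond` and all function names are hypothetical and are not taken from `DeltaTableWriter`.

```python
from typing import Any, Callable, Dict, List


def build_merge_buggy(merge_config: Dict[str, Any]) -> str:
    # pop() removes the key from the shared dict, so a second call raises KeyError
    return merge_config.pop("merge_cond")


def build_merge_fixed(merge_config: Dict[str, Any]) -> str:
    # Read without mutating, so the same config can be reused for every batch
    return merge_config["merge_cond"]


def run_batches(
    build: Callable[[Dict[str, Any]], str],
    merge_config: Dict[str, Any],
    n_batches: int,
) -> List[str]:
    """Call the builder once per batch with the same config object."""
    return [build(merge_config) for _ in range(n_batches)]


config = {"merge_cond": "target.id = source.id"}
conditions = run_batches(build_merge_fixed, config, 3)  # works for every batch
```

Calling `run_batches(build_merge_buggy, config, 2)` instead fails on the second batch, which mirrors the streaming scenario described in the bug report.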
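
For the **Core > Context** item above, a dotted-notation lookup over nested keys can be sketched like this. `get_nested` is a hypothetical helper operating on plain dictionaries; it illustrates the lookup semantics only and is not how Koheesio's `Context` is implemented.

```python
from collections.abc import Mapping
from typing import Any


def get_nested(data: Mapping, dotted_key: str, default: Any = None) -> Any:
    """Resolve 'a.b.c' as data['a']['b']['c']; return default if any part is missing."""
    current: Any = data
    for part in dotted_key.split("."):
        # Stop as soon as we hit a non-mapping value or an unknown key
        if not isinstance(current, Mapping) or part not in current:
            return default
        current = current[part]
    return current


settings = {"spark": {"connect": {"enabled": True}}, "name": "demo"}
enabled = get_nested(settings, "spark.connect.enabled")
missing = get_nested(settings, "spark.missing.key", default="n/a")
```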

New Contributors

Big shout out to all contributors and a heartfelt welcome to our new contributors:

* nogitting made their first contribution in https://github.com/Nike-Inc/koheesio/pull/154
* zarembat made their first contribution in https://github.com/Nike-Inc/koheesio/pull/163

**Full Changelog**: https://github.com/Nike-Inc/koheesio/compare/koheesio-v0.9.1...koheesio-v0.10.0

0.10.0a0

Not secure

0.9.1

Not secure
What's Changed
* [BUGFIX] 144 Box CSV Reader handling data types incorrectly by louis-paulvlx in https://github.com/Nike-Inc/koheesio/pull/145


**Full Changelog**: https://github.com/Nike-Inc/koheesio/compare/koheesio-v0.9.0...koheesio-v0.9.1

