Kedro

Latest version: v0.19.6

Page 16 of 21

0.18.5

Not secure
> This release introduced a bug that causes a failure in experiment tracking within the `Kedro-Viz` console. We recommend that you use Kedro version `0.18.6` in preference.

Major features and improvements
* Added new `OmegaConfigLoader` which uses `OmegaConf` for loading and merging configuration.
* Added the `--conf-source` option to `kedro run`, allowing users to specify a source for project configuration for the run.
* Added `omegaconf` syntax as an option for `--params`. Keys and values can now be separated by colons or equals signs.
* Added support for generator functions as nodes, i.e. using `yield` instead of `return`.
    * Enable chunk-wise processing in nodes with generator functions.
    * Save node outputs after every `yield` before proceeding with the next chunk.
* Fixed incorrect parsing of Azure Data Lake Storage Gen2 URIs used in datasets.
* Added support for loading credentials from environment variables using `OmegaConfigLoader`.
* Added new `--namespace` flag to `kedro run` to enable filtering by node namespace.
* Added a new argument `node` for all four dataset hooks.
* Added the `kedro run` flags `--nodes`, `--tags`, and `--load-versions` to replace `--node`, `--tag`, and `--load-version`.
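
The generator-node behaviour listed above can be pictured with a plain Python generator. This is a minimal sketch rather than Kedro code: the function name `process_in_chunks`, the chunk size, and the sample rows are illustrative assumptions. In a project, a function of this shape would be passed to `node(...)`, and, per the notes above, each yielded chunk would be saved before the next one is requested.

```python
from typing import Iterable, Iterator, List


def process_in_chunks(rows: Iterable[int], chunk_size: int = 2) -> Iterator[List[int]]:
    """Yield doubled values chunk by chunk instead of returning one big list."""
    chunk: List[int] = []
    for row in rows:
        chunk.append(row * 2)
        if len(chunk) == chunk_size:
            yield chunk
            chunk = []
    if chunk:  # emit the final, possibly shorter, chunk
        yield chunk


# Consume the generator one chunk at a time, as a runner might.
chunks = list(process_in_chunks([1, 2, 3, 4, 5]))
```

Because the function yields rather than returns, only one chunk needs to be held in memory at a time.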

Bug fixes and other changes
* Commas surrounded by square brackets (only possible for nodes with default names) will no longer split the arguments to `kedro run` options which take a list of nodes as inputs (`--from-nodes` and `--to-nodes`).
* Fixed bug where `micropkg` manifest section in `pyproject.toml` isn't recognised as allowed configuration.
* Fixed bug causing `load_ipython_extension` not to register the `%reload_kedro` line magic when called in a directory that does not contain a Kedro project.
* Added `anyconfig`'s `ac_context` parameter to `kedro.config.commons` module functions for more flexible `ConfigLoader` customizations.
* Replaced references to the `kedro.pipeline.Pipeline` object throughout the test suite with the `kedro.pipeline.modular_pipeline.pipeline` factory.
* Fixed bug causing the `after_dataset_saved` hook only to be called for one output dataset when multiple are saved in a single node and async saving is in use.
* Log level for "Credentials not found in your Kedro project config" was changed from `WARNING` to `DEBUG`.
* Added safe extraction of tar files in `micropkg pull` to fix vulnerability caused by [CVE-2007-4559](https://github.com/advisories/GHSA-gw9q-c7gh-j9vm).
* Documentation improvements
    * Bug fix in table font size
    * Updated API docs links for datasets
    * Improved CLI docs for `kedro run`
    * Revised documentation for visualisation to build plots and for experiment tracking
    * Added example for loading external credentials to the Hooks documentation
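
The tar-extraction fix above addresses path traversal, where an archive member such as `../evil.txt` escapes the target directory. The following is a minimal sketch of the general technique using only the standard library. It is not Kedro's actual implementation, and the helper names `is_within_directory`, `safe_extract`, and `build_archive` are assumptions made for illustration. It checks member names only and does not cover symlink members.

```python
import io
import tarfile
import tempfile
from pathlib import Path


def is_within_directory(directory: Path, target: Path) -> bool:
    """Return True if `target` resolves to `directory` or a path beneath it."""
    directory = directory.resolve()
    target = target.resolve()
    return target == directory or directory in target.parents


def safe_extract(tar: tarfile.TarFile, destination: Path) -> None:
    """Extract `tar` only if every member stays inside `destination`."""
    for member in tar.getmembers():
        if not is_within_directory(destination, destination / member.name):
            raise ValueError(f"Blocked path traversal: {member.name}")
    tar.extractall(destination)


def build_archive(member_name: str) -> io.BytesIO:
    """Build a one-file in-memory tar archive for the demonstration."""
    payload = b"demo"
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as archive:
        info = tarfile.TarInfo(name=member_name)
        info.size = len(payload)
        archive.addfile(info, io.BytesIO(payload))
    buffer.seek(0)
    return buffer


with tempfile.TemporaryDirectory() as tmp:
    target_dir = Path(tmp) / "extracted"
    target_dir.mkdir()

    # A well-behaved archive extracts normally.
    with tarfile.open(fileobj=build_archive("ok.txt")) as archive:
        safe_extract(archive, target_dir)
    extracted_ok = (target_dir / "ok.txt").read_bytes() == b"demo"

    # An archive that tries to climb out of the target directory is rejected.
    try:
        with tarfile.open(fileobj=build_archive("../evil.txt")) as archive:
            safe_extract(archive, target_dir)
        blocked = False
    except ValueError:
        blocked = True
```

On Python 3.12 and later, `tarfile` also offers extraction filters (for example `filter="data"`), which apply similar checks inside the standard library.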

Breaking changes to the API

Community contributions
Many thanks to the following Kedroids for contributing PRs to this release:

* [adamfrly](https://github.com/adamfrly)
* [corymaklin](https://github.com/corymaklin)
* [Emiliopb](https://github.com/Emiliopb)
* [grhaonan](https://github.com/grhaonan)
* [JStumpp](https://github.com/JStumpp)
* [michalbrys](https://github.com/michalbrys)
* [sbrugman](https://github.com/sbrugman)

0.18.4

Not secure
Major features and improvements
* Made Kedro instantiate datasets from `kedro_datasets` with higher priority than `kedro.extras.datasets`. `kedro_datasets` is the namespace for the new `kedro-datasets` Python package.
* The config loader objects now implement `UserDict` and the configuration is accessed through `conf_loader['catalog']`.
* You can configure config file patterns through `settings.py` without creating a custom config loader.
* Added the following new datasets:

| Type | Description | Location |
| ------------------------------------ | -------------------------------------------------------------------------- | -------------------------------- |
| `svmlight.SVMLightDataSet` | Work with svmlight/libsvm files using scikit-learn library | `kedro.extras.datasets.svmlight` |
| `video.VideoDataSet` | Read and write video files from a filesystem | `kedro.extras.datasets.video` |
| `video.video_dataset.SequenceVideo` | Create a video object from an iterable sequence to use with `VideoDataSet` | `kedro.extras.datasets.video` |
| `video.video_dataset.GeneratorVideo` | Create a video object from a generator to use with `VideoDataSet` | `kedro.extras.datasets.video` |
* Implemented support for a functional definition of schema in `dask.ParquetDataSet` to work with the `dask.to_parquet` API.
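
The dictionary-style access mentioned above (`conf_loader['catalog']`) follows from the loader behaving like a `UserDict`. The sketch below is a toy illustration of that pattern, not Kedro's loader: `ToyConfigLoader` and its in-memory `sources` argument are hypothetical stand-ins for a loader that reads and merges configuration files.

```python
from collections import UserDict
from typing import Any, Dict


class ToyConfigLoader(UserDict):
    """Hypothetical loader: maps a config key to settings, loading on first access."""

    def __init__(self, sources: Dict[str, Dict[str, Any]]) -> None:
        super().__init__()
        self._sources = sources  # stands in for files matched by patterns

    def __getitem__(self, key: str) -> Dict[str, Any]:
        if key not in self.data:
            # Load lazily; a real loader would read and merge config files here.
            self.data[key] = dict(self._sources[key])
        return self.data[key]


conf_loader = ToyConfigLoader({"catalog": {"cars": {"type": "pandas.CSVDataSet"}}})
catalog_conf = conf_loader["catalog"]
```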

Bug fixes and other changes
* Fixed `kedro micropkg pull` for packages on PyPI.
* Fixed `format` in `save_args` for `SparkHiveDataSet`; previously, it didn't allow you to save in delta format.
* Fixed save errors in `TensorFlowModelDataset` when used without versioning; previously, it wouldn't overwrite an existing model.
* Added support for `tf.device` in `TensorFlowModelDataset`.
* Updated error message for `VersionNotFoundError` to handle insufficient permission issues for cloud storage.
* Updated Experiment Tracking docs with working examples.
* Updated `MatplotlibWriter`, `text.TextDataSet`, `plotly.PlotlyDataSet` and `plotly.JSONDataSet` docs with working examples.
* Modified implementation of the Kedro IPython extension to use `local_ns` rather than a global variable.
* Refactored `ShelveStore` to its own module to ensure multiprocessing works with it.
* `kedro.extras.datasets.pandas.SQLQueryDataSet` now takes optional argument `execution_options`.
* Removed `attrs` upper bound to support newer versions of Airflow.
* Bumped the lower bound for the `setuptools` dependency to <=61.5.1.

Minor breaking changes to the API

0.18.3

Not secure
Major features and improvements
* Implemented autodiscovery of project pipelines. A pipeline created with `kedro pipeline create <pipeline_name>` can now be accessed immediately without needing to explicitly register it in `src/<package_name>/pipeline_registry.py`, either individually by name (e.g. `kedro run --pipeline=<pipeline_name>`) or as part of the combined default pipeline (e.g. `kedro run`). By default, the simplified `register_pipelines()` function in `pipeline_registry.py` looks like:

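```python
from typing import Dict

from kedro.framework.project import find_pipelines
from kedro.pipeline import Pipeline


def register_pipelines() -> Dict[str, Pipeline]:
    """Register the project's pipelines.

    Returns:
        A mapping from pipeline names to ``Pipeline`` objects.
    """
    # Discover every pipeline created under the project's pipelines package.
    pipelines = find_pipelines()
    # Combine all discovered pipelines into the default pipeline.
    pipelines["__default__"] = sum(pipelines.values())
    return pipelines
```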
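
The `sum(pipelines.values())` call above works because pipeline objects can be added together. A toy sketch of the mechanism follows. `ToyPipeline` is a hypothetical stand-in that models a pipeline as a set of node names. It is not Kedro's `Pipeline` class.

```python
from typing import FrozenSet, Iterable


class ToyPipeline:
    """Hypothetical stand-in for a pipeline: just a set of node names."""

    def __init__(self, nodes: Iterable[str] = ()) -> None:
        self.nodes: FrozenSet[str] = frozenset(nodes)

    def __add__(self, other: "ToyPipeline") -> "ToyPipeline":
        return ToyPipeline(self.nodes | other.nodes)

    def __radd__(self, other: object) -> "ToyPipeline":
        # sum() starts from the integer 0, so 0 + pipeline must return the pipeline.
        if other == 0:
            return self
        return NotImplemented


pipelines = {
    "data_processing": ToyPipeline(["clean", "join"]),
    "data_science": ToyPipeline(["train", "evaluate"]),
}
# The right-hand side is evaluated before "__default__" is added to the dict.
pipelines["__default__"] = sum(pipelines.values())
default_nodes = sorted(pipelines["__default__"].nodes)
```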


* The Kedro IPython extension should now be loaded with `%load_ext kedro.ipython`.
* The line magic `%reload_kedro` now accepts keyword arguments, e.g. `%reload_kedro --env=prod`.
* Improved the resume-pipeline suggestion for `SequentialRunner`: it now backtracks to the closest persisted inputs to resume from.

Bug fixes and other changes

* Changed the default value of `show_locals` for rich logging to `False`, so that credentials and other sensitive data aren't shown in logs.
* Rich traceback handling is disabled on Databricks so that exceptions now halt execution as expected. This is a workaround for a [bug in `rich`](https://github.com/Textualize/rich/issues/2455).
* When using `kedro run -n [some_node]`, if `some_node` is missing a namespace the resulting error message will suggest the correct node name.
* Updated documentation for `rich` logging.
* Updated Prefect deployment documentation to allow for reruns with saved versioned datasets.
* The Kedro IPython extension now surfaces errors when it cannot load a Kedro project.
* Relaxed `delta-spark` upper bound to allow compatibility with Spark 3.1.x and 3.2.x.
* Added `gdrive` to list of cloud protocols, enabling Google Drive paths for datasets.
* Added an SVG logo resource for the IPython kernel.

0.18.2

Not secure
Major features and improvements
* Added `abfss` to list of cloud protocols, enabling abfss paths.
* Kedro now uses the [Rich](https://github.com/Textualize/rich) library to format terminal logs and tracebacks.
* The file `conf/base/logging.yml` is now optional. See [our documentation](https://docs.kedro.org/en/0.18.2/logging/logging.html) for details.
* Introduced a `kedro.starters` entry point. This enables plugins to create custom starter aliases used by `kedro starter list` and `kedro new`.
* Reduced the `kedro new` prompts to just one question asking for the project name.

Bug fixes and other changes
* Bumped `pyyaml` upper bound to make Kedro compatible with the [pyodide](https://pyodide.org/en/stable/usage/loading-packages.html#micropip) stack.
* Updated project template's Sphinx configuration to use `myst_parser` instead of `recommonmark`.
* Reduced number of log lines by changing the logging level from `INFO` to `DEBUG` for low priority messages.
* Kedro's framework-side logging configuration no longer performs file-based logging. Hence superfluous `info.log`/`errors.log` files are no longer created in your project root, and running Kedro on read-only file systems such as Databricks Repos is now possible.
* The `root` logger is now set to the Python default level of `WARNING` rather than `INFO`. Kedro's logger is still set to emit `INFO` level messages.
* `SequentialRunner` now has consistent execution order across multiple runs with sorted nodes.
* Bumped the upper bound for the Flake8 dependency to <5.0.
* `kedro jupyter notebook/lab` no longer reuses a Jupyter kernel.
* Required `cookiecutter>=2.1.1` to address a [known command injection vulnerability](https://security.snyk.io/vuln/SNYK-PYTHON-COOKIECUTTER-2414281).
* The session store no longer fails if a username cannot be found with `getpass.getuser`.
* Added generic typing for `AbstractDataSet` and `AbstractVersionedDataSet` as well as typing to all datasets.
* Rendered the deployment guide flowchart as a Mermaid diagram, and added Dask.
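
The logger levels described above can be reproduced with the standard library's `logging.config.dictConfig`. This is a minimal sketch under stated assumptions and not Kedro's shipped default configuration: the handler choice and the logger name `some_library` are illustrative. It sets the root logger to `WARNING`, keeps the `kedro` logger at `INFO`, and defines no file handlers.

```python
import logging
import logging.config

LOGGING_CONFIG = {
    "version": 1,
    "disable_existing_loggers": False,
    "handlers": {
        # Console only: no file-based handlers are defined.
        "console": {"class": "logging.StreamHandler", "level": "DEBUG"},
    },
    "loggers": {
        # Kedro's own logger keeps emitting INFO messages.
        "kedro": {"level": "INFO"},
    },
    # Everything else inherits the Python default level of WARNING.
    "root": {"level": "WARNING", "handlers": ["console"]},
}

logging.config.dictConfig(LOGGING_CONFIG)

# Child loggers inherit the level of their nearest configured ancestor.
kedro_level = logging.getLogger("kedro.io").getEffectiveLevel()
other_level = logging.getLogger("some_library").getEffectiveLevel()
```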

Minor breaking changes to the API
* The module `kedro.config.default_logger` no longer exists; default logging configuration is now set automatically through `kedro.framework.project.LOGGING`. Unless you explicitly import `kedro.config.default_logger` you do not need to make any changes.

0.18.1

Not secure
Major features and improvements
* Added a new hook `after_context_created` that passes the `KedroContext` instance as `context`.
* Added a new CLI hook `after_command_run`.
* Added more detail to YAML `ParserError` exception error message.
* Added option to `SparkDataSet` to specify a `schema` load argument that allows for supplying a user-defined schema as opposed to relying on the schema inference of Spark.
* The Kedro package no longer contains a built version of the Kedro documentation, significantly reducing the package size.

Bug fixes and other changes
* Removed fatal error from being logged when a Kedro session is created in a directory without git.
* `KedroContext` is now an `attrs` dataclass; `config_loader` is available as a public attribute.
* Fixed `CONFIG_LOADER_CLASS` validation so that `TemplatedConfigLoader` can be specified in settings.py. Any `CONFIG_LOADER_CLASS` must be a subclass of `AbstractConfigLoader`.
* Added runner name to the `run_params` dictionary used in pipeline hooks.
* Updated [Databricks documentation](https://docs.kedro.org/en/0.18.1/deployment/databricks.html) to include how to get it working with IPython extension and Kedro-Viz.
* Updated sections on visualisation, namespacing, and experiment tracking in the spaceflights tutorial to correspond to the complete spaceflights starter.
* Fixed `Jinja2` syntax loading with `TemplatedConfigLoader` using `globals.yml`.
* Removed global `_active_session`, `_activate_session` and `_deactivate_session`. Plugins that need to access objects such as the config loader should now do so through `context` in the new `after_context_created` hook.
* `config_loader` is available as a public read-only attribute of `KedroContext`.
* Made `hook_manager` argument optional for `runner.run`.
* `kedro docs` now opens an online version of the Kedro documentation instead of a locally built version.
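
The `after_context_created` hook mentioned above gives plugins a place to pick up objects such as the config loader. The following is a minimal sketch: the class name `ConfigAccessHooks` and the `SimpleNamespace` stand-in context are assumptions for illustration, and the `ImportError` fallback exists only so the snippet runs where Kedro is not installed.

```python
from types import SimpleNamespace

try:
    from kedro.framework.hooks import hook_impl
except ImportError:
    # Fallback so the sketch still runs where Kedro is not installed.
    def hook_impl(func):
        return func


class ConfigAccessHooks:
    """Hypothetical plugin hooks that keep a reference to the config loader."""

    def __init__(self) -> None:
        self.config_loader = None

    @hook_impl
    def after_context_created(self, context) -> None:
        # `context` is the KedroContext; `config_loader` is a public attribute.
        self.config_loader = context.config_loader


# Stand-in context for demonstration; Kedro passes a real KedroContext.
fake_context = SimpleNamespace(config_loader={"parameters": {"test_size": 0.2}})
hooks = ConfigAccessHooks()
hooks.after_context_created(context=fake_context)
```

In a real project, an instance of such a class would be registered through `HOOKS` in `settings.py`.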

0.18.0

Not secure
* `kedro.framework.context.load_context` will be removed in release 0.18.0.
* `kedro.framework.cli.get_project_context` will be removed in release 0.18.0.
* We've added a `DeprecationWarning` to the decorator API for both `node` and `pipeline`. These will be removed in release 0.18.0. Use Hooks to extend a node's behaviour instead.
* We've added a `DeprecationWarning` to the Transformers API when adding a transformer to the catalog. These will be removed in release 0.18.0. Use Hooks to customise the `load` and `save` methods.

Thanks for supporting contributions
[Deepyaman Datta](https://github.com/deepyaman),
[Zach Schuster](https://github.com/zschuster)

Migration guide from Kedro 0.16.* to 0.17.*

**Reminder:** Our documentation on [how to upgrade Kedro](https://docs.kedro.org/en/0.17.0/12_faq/01_faq.html#how-do-i-upgrade-kedro) covers a few key things to remember when updating any Kedro version.

The Kedro 0.17.0 release contains some breaking changes. If you update Kedro to 0.17.0 and then try to work with projects created against earlier versions of Kedro, you may encounter some issues when trying to run `kedro` commands in the terminal for that project. Here's a short guide to getting your projects running against the new version of Kedro.


>*Note*: As always, if you hit any problems, please check out our documentation:
>* [How can I find out more about Kedro?](https://docs.kedro.org/en/0.17.0/12_faq/01_faq.html#how-can-i-find-out-more-about-kedro)
>* [How can I get my questions answered?](https://docs.kedro.org/en/0.17.0/12_faq/01_faq.html#how-can-i-get-my-question-answered).

To get an existing Kedro project to work after you upgrade to Kedro 0.17.0, we recommend that you create a new project against Kedro 0.17.0 and move the code from your existing project into it. Let's go through the changes, but first, note that if you create a new Kedro project with Kedro 0.17.0 you will not be asked whether you want to include the boilerplate code for the Iris dataset example. We've removed this option (you should now use a Kedro starter if you want to create a project that is pre-populated with code).

Create a new, blank Kedro 0.17.0 project to drop your existing code into, as always, with `kedro new`. We also recommend creating a new virtual environment for your new project, or you might run into conflicts with existing dependencies.

* **Update `pyproject.toml`**: Copy the following three keys from the `.kedro.yml` of your existing Kedro project into the `pyproject.toml` file of your new Kedro 0.17.0 project:


```toml
[tool.kedro]
package_name = "<package_name>"
project_name = "<project_name>"
project_version = "0.17.0"
```


Check your source directory. If you defined a different source directory (`source_dir`), make sure you also move that to `pyproject.toml`.


* **Copy files from your existing project**:

+ Copy subfolders of `project/src/project_name/pipelines` from the existing project to the new one.
+ Copy subfolders of `project/src/test/pipelines` from the existing project to the new one.
+ Copy the requirements your project needs into `requirements.txt` and/or `requirements.in`.
+ Copy your project configuration from the `conf` folder. Take note of the new locations needed for modular pipeline configuration (move it from `conf/<env>/pipeline_name/catalog.yml` to `conf/<env>/catalog/pipeline_name.yml` and likewise for `parameters.yml`).
+ Copy from the `data/` folder of your existing project, if needed, into the same location in your new project.
+ Copy any Hooks from `src/<package_name>/hooks.py`.
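
The relocation of modular pipeline configuration described above (from `conf/<env>/pipeline_name/catalog.yml` to `conf/<env>/catalog/pipeline_name.yml`) can be expressed as a small path transformation. This is an illustrative sketch only: `new_config_location` is a hypothetical helper, not part of Kedro, and it only computes the new path without moving any files.

```python
from pathlib import PurePosixPath


def new_config_location(old_path: str) -> str:
    """Map <env_dir>/<pipeline>/<kind>.yml to <env_dir>/<kind>/<pipeline>.yml."""
    path = PurePosixPath(old_path)
    pipeline_name = path.parent.name  # e.g. "data_science"
    env_dir = path.parent.parent      # e.g. conf/base
    # path.stem is the config kind: "catalog" or "parameters".
    return str(env_dir / path.stem / f"{pipeline_name}{path.suffix}")


moved_catalog = new_config_location("conf/base/data_science/catalog.yml")
moved_params = new_config_location("conf/base/data_science/parameters.yml")
```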

* **Update your new project's README and docs as necessary**.

* **Update `settings.py`**: For example, if you specified additional Hook implementations in `hooks`, or listed plugins under `disable_hooks_by_plugin` in your `.kedro.yml`, you will need to move them to `settings.py` accordingly:

```python
from <package_name>.hooks import MyCustomHooks, ProjectHooks

HOOKS = (ProjectHooks(), MyCustomHooks())

DISABLE_HOOKS_FOR_PLUGINS = ("my_plugin1",)
```


* **Migration for `node` names**. From 0.17.0, the only characters allowed in node names are letters, digits, hyphens, underscores and full stops. If you previously defined node names that contain spaces or other characters that are no longer permitted, you will need to rename those nodes.
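
To find node names that need renaming, a check along the following lines may help. It is a sketch under stated assumptions: the ASCII-only pattern approximates the rule above and may differ from Kedro's own validation, and `is_valid_node_name` and `suggest_node_name` are hypothetical helpers.

```python
import re

# Assumed pattern: letters, digits, hyphens, underscores and full stops only.
VALID_NODE_NAME = re.compile(r"[A-Za-z0-9_.-]+")


def is_valid_node_name(name: str) -> bool:
    """Return True if the whole name uses only the allowed characters."""
    return VALID_NODE_NAME.fullmatch(name) is not None


def suggest_node_name(name: str) -> str:
    """Replace each run of disallowed characters with a single underscore."""
    return re.sub(r"[^A-Za-z0-9_.-]+", "_", name)


checks = {
    name: is_valid_node_name(name)
    for name in ["split_data", "train.model-v2", "train model!"]
}
renamed = suggest_node_name("train model!")
```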

* **Copy changes to `kedro_cli.py`**. If you previously customised the `kedro run` command or added more CLI commands to your `kedro_cli.py`, you should move them into `<project_root>/src/<package_name>/cli.py`. Note, however, that the new way to run a Kedro pipeline is via a `KedroSession`, rather than using the `KedroContext`:

```python
from kedro.framework.session import KedroSession

with KedroSession.create(package_name=...) as session:
    session.run()
```


* **Copy changes made to `ConfigLoader`**. If you have defined a custom class, such as `TemplatedConfigLoader`, by overriding `ProjectContext._create_config_loader`, you should move the contents of that function into `src/<package_name>/hooks.py`, under `register_config_loader`.

* **Copy changes made to `DataCatalog`**. Likewise, if you have `DataCatalog` defined with `ProjectContext._create_catalog`, you should copy-paste the contents into `register_catalog`.

* **Optional**: If you have plugins such as [Kedro-Viz](https://github.com/kedro-org/kedro-viz) installed, it's likely that Kedro 0.17.0 won't work with their older versions, so please either upgrade to the plugin's newest version or follow their migration guides.


© 2024 Safety CLI Cybersecurity Inc. All Rights Reserved.