Kedro

Latest version: v0.19.6

Page 17 of 21

0.17.7

Not secure
Major features and improvements
* `pipeline` now accepts `tags` and a collection of `Node`s and/or `Pipeline`s rather than just a single `Pipeline` object. `pipeline` should be used in preference to `Pipeline` when creating a Kedro pipeline.
* `pandas.SQLTableDataSet` and `pandas.SQLQueryDataSet` now only open one connection per database, at instantiation time (therefore at catalog creation time), rather than one per load/save operation.
* Added new command group, `micropkg`, to replace `kedro pipeline pull` and `kedro pipeline package` with `kedro micropkg pull` and `kedro micropkg package` for Kedro 0.18.0. `kedro micropkg package` saves packages to `project/dist` while `kedro pipeline package` saves packages to `project/src/dist`.
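
The single-connection behaviour of the SQL datasets can be pictured with a small standard-library sketch. It is an illustration only, not Kedro's implementation: `SharedConnectionDataSet` and its class-level cache are hypothetical names, and `sqlite3` stands in for whichever database driver is actually in use.

```python
import sqlite3


class SharedConnectionDataSet:
    """Hypothetical dataset that reuses one connection per connection string."""

    # Class-level cache shared by every instance: connection string -> connection.
    _connections = {}

    def __init__(self, con, table_name):
        self._con_str = con
        self._table_name = table_name
        # The connection is opened here, at instantiation, and only if no
        # other dataset has already opened one for the same database.
        if con not in self._connections:
            self._connections[con] = sqlite3.connect(con)

    @property
    def connection(self):
        return self._connections[self._con_str]

    def load(self):
        # Table names cannot be bound as SQL parameters; this sketch trusts the name.
        cursor = self.connection.execute(f"SELECT * FROM {self._table_name}")
        return cursor.fetchall()


cars = SharedConnectionDataSet(":memory:", "cars")
trucks = SharedConnectionDataSet(":memory:", "trucks")
cars.connection.execute("CREATE TABLE cars (name TEXT)")
cars.connection.execute("INSERT INTO cars VALUES ('mini')")
rows = cars.load()
same_connection = cars.connection is trucks.connection
```

Both datasets point at the same database, so they end up sharing one connection object, and repeated `load` calls reuse it instead of reconnecting.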

Bug fixes and other changes
* Added tutorial documentation for [experiment tracking](https://docs.kedro.org/en/0.17.7/08_logging/02_experiment_tracking.html).
* Added [Plotly dataset documentation](https://docs.kedro.org/en/0.17.7/03_tutorial/05_visualise_pipeline.html#visualise-plotly-charts-in-kedro-viz).
* Added the upper limit `pandas<1.4` to maintain compatibility with `xlrd~=1.0`.
* Bumped the `Pillow` minimum version requirement to 9.0 (Python 3.7+ only) following [CVE-2022-22817](https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2022-22817).
* Fixed `PickleDataSet` to be copyable and hence work with the parallel runner.
* Upgraded `pip-tools`, which is used by `kedro build-reqs`, to 6.5 (Python 3.7+ only). This `pip-tools` version is compatible with `pip>=21.2`, including the most recent releases of `pip`. Python 3.6 users should continue to use `pip-tools` 6.4 and `pip<22`.
* Added `astro-iris` as an alias for `astro-airflow-iris`, so that old tutorials can still be followed.
* Added details about [Kedro's Technical Steering Committee and governance model](https://docs.kedro.org/en/0.17.7/14_contribution/technical_steering_committee.html).

0.17.6

Not secure
Major features and improvements
* Added `pipelines` global variable to IPython extension, allowing you to access the project's pipelines in `kedro ipython` or `kedro jupyter notebook`.
* Enabled overriding nested parameters with `params` in the CLI, e.g. `kedro run --params="model.model_tuning.booster:gbtree"` updates parameters to `{"model": {"model_tuning": {"booster": "gbtree"}}}`.
* Added option to `pandas.SQLQueryDataSet` to specify a `filepath` with a SQL query, in addition to the current method of supplying the query itself in the `sql` argument.
* Extended `ExcelDataSet` to support saving Excel files with multiple sheets.
* Added the following new datasets:

| Type | Description | Location |
| ------------------------- | ---------------------------------------------------------------------------------------------------------------------- | ------------------------------ |
| `plotly.JSONDataSet` | Works with plotly graph object Figures (saves as JSON file) | `kedro.extras.datasets.plotly` |
| `pandas.GenericDataSet` | Provides a 'best effort' facility to read / write any format provided by the `pandas` library | `kedro.extras.datasets.pandas` |
| `pandas.GBQQueryDataSet` | Loads data from a Google BigQuery table using provided SQL query | `kedro.extras.datasets.pandas` |
| `spark.DeltaTableDataSet` | Dataset designed to handle Delta Lake Tables and their CRUD-style operations, including `update`, `merge` and `delete` | `kedro.extras.datasets.spark` |
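
The nested `--params` override listed above maps a dotted key onto a nested dictionary. The sketch below shows that mapping in plain Python. `parse_nested_params` is a hypothetical helper, not Kedro's parser, and it leaves every value as a string.

```python
def parse_nested_params(raw):
    """Turn 'a.b.c:value,x:1' into a nested dict (hypothetical helper)."""
    result = {}
    for item in raw.split(","):
        # Split each item on the first colon into dotted key and value.
        key, _, value = item.partition(":")
        *parents, leaf = key.strip().split(".")
        target = result
        # Walk (and create) one dictionary level per dotted segment.
        for name in parents:
            target = target.setdefault(name, {})
        target[leaf] = value.strip()
    return result


params = parse_nested_params("model.model_tuning.booster:gbtree")
print(params)  # {'model': {'model_tuning': {'booster': 'gbtree'}}}
```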

Bug fixes and other changes
* Fixed an issue where `kedro new --config config.yml` was ignoring the config file when `prompts.yml` didn't exist.
* Added documentation for `kedro viz --autoreload`.
* Added support for arbitrary backends (via importable module paths) that satisfy the `pickle` interface to `PickleDataSet`.
* Added support for `sum` syntax for connecting pipeline objects.
* Upgraded `pip-tools`, which is used by `kedro build-reqs`, to 6.4. This `pip-tools` version requires `pip>=21.2` while [adding support for `pip>=21.3`](https://github.com/jazzband/pip-tools/pull/1501). To upgrade `pip`, please refer to [their documentation](https://pip.pypa.io/en/stable/installing/#upgrading-pip).
* Relaxed the bounds on the `plotly` requirement for `plotly.PlotlyDataSet` and the `pyarrow` requirement for `pandas.ParquetDataSet`.
* `kedro pipeline package <pipeline>` now raises an error if the `<pipeline>` argument doesn't look like a valid Python module path (e.g. has `/` instead of `.`).
* Added new `overwrite` argument to `PartitionedDataSet` and `MatplotlibWriter` to enable deletion of existing partitions and plots on dataset `save`.
* `kedro pipeline pull` now works when the project requirements contain entries such as `-r`, `--extra-index-url` and local wheel files ([Issue 913](https://github.com/kedro-org/kedro/issues/913)).
* Fixed slow startup caused by catalog processing by reducing the exponential growth of extra processing during `_FrozenDatasets` creation.
* Removed `.coveragerc` from the Kedro project template. `coverage` settings are now given in `pyproject.toml`.
* Fixed a bug where packaging or pulling a modular pipeline with the same name as the project's package name would throw an error (or silently pass without including the pipeline source code in the wheel file).
* Removed unintentional dependency on `git`.
* Fixed an issue where nested pipeline configuration was not included in the packaged pipeline.
* Deprecated the "Thanks for supporting contributions" section of release notes to simplify the contribution process; Kedro 0.17.6 is the last release that includes this. This process has been replaced with the [automatic GitHub feature](https://github.com/kedro-org/kedro/graphs/contributors).
* Fixed a bug where the version on the tracking datasets didn't match the session id and the versions of regular versioned datasets.
* Fixed an issue where datasets in `load_versions` that are not found in the data catalog would silently pass.
* Altered the string representation of nodes so that node inputs/outputs order is preserved rather than being alphabetically sorted.
* Updated `APIDataSet` to accept `auth` through `credentials` and allow any iterable for `auth`.
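
The `sum` syntax for connecting pipeline objects, listed above, relies on Python's operator protocol: `sum` starts from the integer `0`, so the objects being summed have to handle `0 + obj`. The sketch below shows the idea with a hypothetical `MiniPipeline` stand-in; it does not reproduce Kedro's `Pipeline` class.

```python
class MiniPipeline:
    """Hypothetical stand-in for a pipeline: just an ordered list of node names."""

    def __init__(self, nodes=()):
        self.nodes = list(nodes)

    def __add__(self, other):
        if not isinstance(other, MiniPipeline):
            return NotImplemented
        return MiniPipeline(self.nodes + other.nodes)

    def __radd__(self, other):
        # sum() starts from the integer 0, so 0 + pipeline must return the pipeline.
        if other == 0:
            return self
        return NotImplemented


preprocessing = MiniPipeline(["clean", "join"])
training = MiniPipeline(["split", "fit"])
combined = sum([preprocessing, training])
print(combined.nodes)  # ['clean', 'join', 'split', 'fit']
```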

0.17.5

Not secure
Major features and improvements
* Added new CLI group `registry`, with the associated commands `kedro registry list` and `kedro registry describe`, to replace `kedro pipeline list` and `kedro pipeline describe`.
* Added support for dependency management at a modular pipeline level. When a pipeline with `requirements.txt` is packaged, its dependencies are embedded in the modular pipeline wheel file. Upon pulling the pipeline, Kedro will append dependencies to the project's `requirements.in`. More information is available in [our documentation](https://docs.kedro.org/en/0.17.5/06_nodes_and_pipelines/03_modular_pipelines.html).
* Added support for bulk packaging/pulling modular pipelines using `kedro pipeline package/pull --all` and `pyproject.toml`.
* Removed `cli.py` from the Kedro project template. By default all CLI commands, including `kedro run`, are now defined on the Kedro framework side. These can be overridden in turn by a plugin or a `cli.py` file in your project. A packaged Kedro project will respect the same hierarchy when executed with `python -m my_package`.
* Removed `.ipython/profile_default/startup/` from the Kedro project template in favour of `.ipython/profile_default/ipython_config.py` and the `kedro.extras.extensions.ipython`.
* Added support for `dill` backend to `PickleDataSet`.
* Imports are now refactored at `kedro pipeline package` and `kedro pipeline pull` time, so that _aliasing_ a modular pipeline doesn't break it.
* Added the following new datasets to support basic Experiment Tracking:

| Type | Description | Location |
| ------------------------- | -------------------------------------------------------- | -------------------------------- |
| `tracking.MetricsDataSet` | Dataset to track numeric metrics for experiment tracking | `kedro.extras.datasets.tracking` |
| `tracking.JSONDataSet` | Dataset to track data for experiment tracking | `kedro.extras.datasets.tracking` |

Bug fixes and other changes
* Bumped minimum required `fsspec` version to 2021.04.
* Fixed the `kedro install` and `kedro build-reqs` flows when uninstalled dependencies are present in a project's `settings.py`, `context.py` or `hooks.py` ([Issue 829](https://github.com/kedro-org/kedro/issues/829)).

Minor breaking changes to the API
* Pinned `dynaconf` to `<3.1.6` because the signature of the `_validate_items` method, which Kedro uses, changed.

0.17.4

Not secure
Major features and improvements
* Added the following new datasets:

| Type | Description | Location |
| ---------------------- | ----------------------------------------------------------- | ------------------------------ |
| `plotly.PlotlyDataSet` | Works with plotly graph object Figures (saves as JSON file) | `kedro.extras.datasets.plotly` |

Bug fixes and other changes
* Defined our set of Kedro Principles! Have a read through [our docs](https://docs.kedro.org/en/0.17.4/12_faq/03_kedro_principles.html).
* `ConfigLoader.get()` now raises a `BadConfigException`, with a more helpful error message, if a configuration file cannot be loaded (for instance due to wrong syntax or poor formatting).
* `run_id` now defaults to `save_version` when `after_catalog_created` is called, similarly to what happens during a `kedro run`.
* Fixed a bug where `kedro ipython` and `kedro jupyter notebook` didn't work if the `PYTHONPATH` was already set.
* Updated the IPython extension to allow passing `env` and `extra_params` to `reload_kedro`, similarly to how the IPython script works.
* `kedro info` now outputs if a plugin has any `hooks` or `cli_hooks` implemented.
* `PartitionedDataSet` now supports lazily materializing data on save.
* `kedro pipeline describe` now defaults to the `__default__` pipeline when no pipeline name is provided and also shows the namespace the nodes belong to.
* Fixed an issue where `spark.SparkDataSet` with versioning enabled would throw a `VersionNotFoundError` when using `databricks-connect` from a remote machine and saving to the `dbfs` filesystem.
* `EmailMessageDataSet` added to doctree.
* When node inputs do not pass validation, the error message is now shown as the most recent exception in the traceback ([Issue 761](https://github.com/kedro-org/kedro/issues/761)).
* `kedro pipeline package` now only packages the parameter file that exactly matches the pipeline name specified and the parameter files in a directory with the pipeline name.
* Extended support to newer versions of third-party dependencies ([Issue 735](https://github.com/kedro-org/kedro/issues/735)).
* Ensured consistent references to `model input` tables in accordance with our Data Engineering convention.
* Changed `kedro pipeline package` to take the pipeline package version rather than the kedro package version. If the pipeline package version is not present, then the package version is used.
* Launched [GitHub Discussions](https://github.com/kedro-org/kedro/discussions/) and the [Kedro Discord Server](https://discord.gg/akJDeVaxnB).
* Improved error message when versioning is enabled for a dataset previously saved as non-versioned ([Issue 625](https://github.com/kedro-org/kedro/issues/625)).
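
Lazy materialisation on save, as listed above for `PartitionedDataSet`, can be sketched in plain Python. The sketch assumes a partition value may be either the data itself or a zero-argument callable that produces it. `save_partitions` and the dictionary `store` are hypothetical stand-ins, not Kedro's API.

```python
def save_partitions(partitions, store):
    """Write each partition to `store`, resolving callables only at save time."""
    for partition_id, data in partitions.items():
        if callable(data):
            # Materialise lazily: the value is produced just before it is written.
            data = data()
        store[partition_id] = data


calls = []


def make_partition(name):
    def build():
        # Record when the partition is actually built.
        calls.append(name)
        return f"rows for {name}"

    return build


pending = {"2021-01": make_partition("2021-01"), "2021-02": "already in memory"}
store = {}
save_partitions(pending, store)
```

Only the callable partition is built during the save, so large partitions never need to be held in memory at the same time.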

0.17.3

Not secure
Major features and improvements
* Kedro plugins can now override built-in CLI commands.
* Added a `before_command_run` hook for plugins to add extra behaviour before Kedro CLI commands run.
* `pipelines` from `pipeline_registry.py` and `register_pipeline` hooks are now loaded lazily when they are first accessed, not on startup:

```python
from kedro.framework.project import pipelines

print(pipelines["__default__"])  # pipeline loading is only triggered here
```
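
Lazy loading of this kind can be pictured as a mapping that defers its loader until the first lookup. The sketch below uses only the standard library. `LazyPipelines` and `register_pipelines` are hypothetical stand-ins, not the objects Kedro exposes.

```python
from collections.abc import Mapping


class LazyPipelines(Mapping):
    """Hypothetical mapping that calls its loader on first access only."""

    def __init__(self, loader):
        self._loader = loader
        self._content = None

    def _load(self):
        # Run the loader once and cache the result for later lookups.
        if self._content is None:
            self._content = self._loader()
        return self._content

    def __getitem__(self, key):
        return self._load()[key]

    def __iter__(self):
        return iter(self._load())

    def __len__(self):
        return len(self._load())


load_calls = []


def register_pipelines():
    load_calls.append("loaded")
    return {"__default__": ["node_a", "node_b"]}


pipelines = LazyPipelines(register_pipelines)
calls_before_access = len(load_calls)  # nothing has been loaded yet
default_pipeline = pipelines["__default__"]  # first access triggers the loader
pipelines["__default__"]  # second access reuses the cached result
calls_after_access = len(load_calls)
```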


Bug fixes and other changes
* `TemplatedConfigLoader` now correctly inserts default values when no globals are supplied.
* Fixed a bug where the `KEDRO_ENV` environment variable had no effect on instantiating the `context` variable in an IPython session or a Jupyter notebook.
* Plugins with empty CLI groups are no longer displayed in the Kedro CLI help screen.
* Duplicate commands will no longer appear twice in the Kedro CLI help screen.
* CLI commands from sources with the same name will show under one list in the help screen.
* The setup of a Kedro project, including adding src to path and configuring settings, is now handled via the `bootstrap_project` method.
* `configure_project` is invoked if a `package_name` is supplied to `KedroSession.create`. This was added for backward compatibility, to support a workflow that creates `Session` manually. It will be removed in `0.18.0`.
* Stopped swallowing every `ModuleNotFoundError` when `register_pipelines` is not found, so that a more helpful error message appears when a dependency is missing, e.g. [Issue 722](https://github.com/kedro-org/kedro/issues/722).
* When `kedro new` is invoked using a configuration YAML file, `output_dir` is no longer a required key; by default the current working directory will be used.
* When `kedro new` is invoked using a configuration YAML file, the appropriate `prompts.yml` file is now used for validating the provided configuration. Previously, validation was always performed against the Kedro project template `prompts.yml` file.
* When a relative path to a starter template is provided, `kedro new` now generates user prompts to obtain configuration rather than supplying empty configuration.
* Fixed an error when using starters on Windows with Python 3.7 ([Issue 722](https://github.com/kedro-org/kedro/issues/722)).
* Fixed decoding error of config files that contain accented characters by opening them for reading in UTF-8.
* Fixed an issue where the `after_dataset_loaded` hook would finish before a dataset was actually loaded when using the `--async` flag.

0.17.2

Not secure
Major features and improvements
* Added support for `compress_pickle` backend to `PickleDataSet`.
* Enabled loading pipelines without creating a `KedroContext` instance:

```python
from kedro.framework.project import pipelines

print(pipelines)
```


* Projects generated with `kedro>=0.17.2`:
  - should define pipelines in `pipeline_registry.py` rather than `hooks.py`.
  - when run as a package, will behave the same as `kedro run`.
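
A backend such as `compress_pickle` can stand in for the standard `pickle` module when it offers the same functions. The sketch below selects a backend by module name and checks for `dump` and `load` before using it. `load_backend` is a hypothetical helper, and the choice of which function names to check is an assumption, not a description of `PickleDataSet` internals.

```python
import importlib
import io


def load_backend(module_path):
    """Import a backend module and check it offers pickle-style dump/load."""
    backend = importlib.import_module(module_path)
    missing = [name for name in ("dump", "load") if not hasattr(backend, name)]
    if missing:
        raise ValueError(f"Backend '{module_path}' is missing: {missing}")
    return backend


# The standard library module is used here so the sketch runs anywhere.
backend = load_backend("pickle")
buffer = io.BytesIO()
backend.dump({"rows": [1, 2, 3]}, buffer)
buffer.seek(0)
restored = backend.load(buffer)
```

A module without `dump` and `load`, for example `math`, is rejected with a `ValueError` before any data is written.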

Bug fixes and other changes
* If `settings.py` is not importable, the errors will be surfaced earlier in the process, rather than at runtime.

Minor breaking changes to the API
* `kedro pipeline list` and `kedro pipeline describe` no longer accept redundant `--env` parameter.
* `from kedro.framework.cli.cli import cli` no longer includes the `new` and `starter` commands.
