=======
Plateau
=======

Latest version: v4.4.0


3.6.0
=====

New functionality
^^^^^^^^^^^^^^^^^
- The partition on shuffle algorithm in ``kartothek.io.dask.dataframe.update_dataset_from_ddf`` now supports
  producing deterministic buckets based on hashed input data.
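
Deterministic bucketing of this kind can be sketched independently of dask: hash the values of the bucketing columns with a stable hash so that identical keys always land in the same bucket. The following stdlib-only sketch is illustrative; ``assign_bucket`` and its signature are hypothetical, not kartothek's API.

```python
import hashlib

def assign_bucket(row, bucket_by, num_buckets):
    """Deterministically map a row (a dict) to a bucket.

    Uses a stable hash (md5) over the values of the bucketing
    columns, so identical keys always yield the same bucket --
    unlike Python's built-in hash(), which is salted per process.
    """
    key = "|".join(str(row[col]) for col in bucket_by)
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_buckets

rows = [
    {"user": "alice", "value": 1},
    {"user": "bob", "value": 2},
    {"user": "alice", "value": 3},
]
buckets = [assign_bucket(r, ["user"], num_buckets=4) for r in rows]
# Rows sharing the same bucketing key always land in the same bucket.
assert buckets[0] == buckets[2]
```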

Bug fixes
^^^^^^^^^
- Fix addition of bogus index columns to Parquet files when using ``sort_partitions_by``.
- Fix bug where ``partition_on`` in write path drops empty DataFrames and can lead to datasets without tables.

3.5.1
=====
- Fix potential ``pyarrow.lib.ArrowNotImplementedError`` when trying to store or pickle empty
  ``kartothek.core.index.ExplicitSecondaryIndex`` objects
- Fix pickling of ``kartothek.core.index.ExplicitSecondaryIndex`` unloaded in
  ``dispatch_metapartitions_from_factory``

3.5.0
=====

New functionality
^^^^^^^^^^^^^^^^^
- Add support for pyarrow 0.15.0
- Additional functions in the ``kartothek.serialization`` module for dealing with predicates

  * ``kartothek.serialization.check_predicates``
  * ``kartothek.serialization.filter_predicates_by_column``
  * ``kartothek.serialization.columns_in_predicates``

- Added types for type annotations when dealing with predicates

  * ``kartothek.serialization.PredicatesType``
  * ``kartothek.serialization.ConjunctionType``
  * ``kartothek.serialization.LiteralType``

- Make ``kartothek.io.*read_table*`` methods use default table name if unspecified
- ``MetaPartition.parse_input_to_metapartition`` now accepts dicts and lists of tuples as ``obj`` input
- Added ``secondary_indices`` as a default argument to the ``write`` pipelines
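
kartothek predicates are expressed in disjunctive normal form: a list of conjunctions, where each conjunction is a list of ``(column, operator, value)`` tuples. A minimal stdlib-only sketch of evaluating such predicates against a single row of data; ``matches_predicates`` is an illustrative helper, not part of ``kartothek.serialization``.

```python
import operator

# Literal operators; "in" tests membership in a collection of values.
OPS = {
    "==": operator.eq,
    "!=": operator.ne,
    "<": operator.lt,
    "<=": operator.le,
    ">": operator.gt,
    ">=": operator.ge,
    "in": lambda value, allowed: value in allowed,
}

def matches_predicates(row, predicates):
    """True if `row` satisfies predicates in disjunctive normal form:
    an OR over conjunctions, each an AND of (column, op, value) literals."""
    return any(
        all(OPS[op](row[col], value) for col, op, value in conjunction)
        for conjunction in predicates
    )

# (country == "DE" AND amount > 10) OR country in ("US", "UK")
predicates = [
    [("country", "==", "DE"), ("amount", ">", 10)],
    [("country", "in", ["US", "UK"])],
]
assert matches_predicates({"country": "DE", "amount": 42}, predicates)
assert not matches_predicates({"country": "FR", "amount": 42}, predicates)
```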

Bug fixes
^^^^^^^^^
- Input to ``normalize_args`` is properly normalized to ``list``
- ``MetaPartition.load_dataframes`` now raises if a table given in the ``columns`` argument doesn't exist
- require ``urlquote>=1.1.0`` (where ``urlquote.quoting`` was introduced)
- Improve performance for some cases where predicates are used with the ``in`` operator.
- Correctly preserve :class:`kartothek.core.index.ExplicitSecondaryIndex` dtype when the index is empty
- Fixed DeprecationWarning in pandas ``CategoricalDtype``
- Fixed broken docstring for ``store_dataframes_as_dataset``
- Internal operations no longer perform schema validations. This will improve
  performance for batched partition operations (e.g. ``partition_on``) but will
  defer the validation in case of inconsistencies to the final commit. Exception
  messages will be less verbose in these cases than before.
- Fix an issue where an empty dataframe of a partition in a multi-table dataset
  would raise a schema validation exception
- Fix an issue where the ``dispatch_by`` keyword would disable partition pruning
- Creating a dataset with non-existing columns as an explicit index now raises a ``ValueError``
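
The last fix above is straightforward to illustrate: before building an explicit index, the requested index columns should be checked against the columns actually present in the data. A hedged sketch; ``validate_index_columns`` is an illustrative name, not kartothek's internal helper.

```python
def validate_index_columns(df_columns, secondary_indices):
    """Raise ValueError if a requested index column is absent from the data."""
    missing = set(secondary_indices) - set(df_columns)
    if missing:
        raise ValueError(
            f"Cannot create index on non-existing columns: {sorted(missing)}"
        )

validate_index_columns(["user", "amount"], ["user"])  # passes silently
```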

Breaking changes
^^^^^^^^^^^^^^^^
- Remove support for pyarrow < 0.13.0
- Move the docs module from ``io_components`` to ``core``

3.4.0
=====
- Add support for pyarrow 0.14.1
- Use urlquote for faster quoting/unquoting

3.3.0
=====
- Fix rejection of bool predicates in ``kartothek.serialization.filter_array_like`` when a bool column contains
  ``None``
- Streamline behavior of ``store_dataset_from_ddf`` when passing an empty ddf.
- Fix an issue where a segmentation fault could occur when comparing ``MetaPartition`` instances
- Expose a ``date_as_object`` flag in ``kartothek.core.index.as_flat_series``

3.2.0
=====
- Fix gh:66 where predicate pushdown could evaluate to incorrect results when evaluated
  using improper types. The behavior now is to raise in these situations.
- Predicate pushdown and ``kartothek.serialization.filter_array_like`` will now properly handle pandas Categoricals.
- Add ``kartothek.io.dask.bag.read_dataset_as_dataframe_bag``
- Add ``kartothek.io.dask.bag.read_dataset_as_metapartitions_bag``
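
The type-strictness fix for gh:66 above can be sketched in plain Python: instead of letting a comparison between mismatched types silently evaluate to ``False`` everywhere, check type compatibility first and raise. ``strict_eq`` is a hypothetical illustration, not kartothek's internal implementation.

```python
def strict_eq(left, right):
    """Equality check that raises on incompatible types instead of
    silently evaluating to False (e.g. "1" == 1 in plain Python)."""
    numeric = (int, float)
    # bool subclasses int, but mixing bool with numbers in a
    # predicate is almost always a bug, so treat them as distinct.
    is_num = lambda v: isinstance(v, numeric) and not isinstance(v, bool)
    compatible = type(left) is type(right) or (is_num(left) and is_num(right))
    if not compatible:
        raise TypeError(
            f"cannot compare {type(left).__name__} with {type(right).__name__}"
        )
    return left == right

assert strict_eq(1, 1.0)        # numeric types are interchangeable
assert not strict_eq("a", "b")  # same type, honest False
```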
