Plateau

Latest version: v4.4.0

3.8.2
==========================

Improvements
^^^^^^^^^^^^

* Read performance improved, especially for partitioned datasets and queries with empty payload columns.

Bug fixes
^^^^^^^^^
* GH262: Raise an exception when trying to partition on a column with null values to prevent silent data loss
* Fix multiple index creation issues (cutting data, crashing) for ``uint`` data
* Fix index update issues for some types resulting in ``TypeError: Trying to update an index with different types...``
  messages.
* Fix issues where index creation with empty partitions can lead to ``ValueError: Trying to create non-typesafe index``
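
The GH262 behaviour can be pictured with a small sketch in plain Python. This is not the library's implementation: the helper name ``partition_rows`` and the row layout (a list of dicts) are hypothetical and chosen only for illustration. The sketch groups rows by a partition column and raises instead of silently dropping rows whose partition value is null:

.. code-block:: python

    from collections import defaultdict


    def partition_rows(rows, column):
        """Group rows (dicts) by the value of ``column``.

        Hypothetical helper for illustration only. It raises ``ValueError``
        when a row has a null (``None``) partition value instead of
        dropping that row.
        """
        partitions = defaultdict(list)
        for row in rows:
            value = row.get(column)
            if value is None:
                raise ValueError(
                    f"Cannot partition on column {column!r}: it contains null values"
                )
            partitions[value].append(row)
        return dict(partitions)


    rows = [{"country": "DE", "sales": 1}, {"country": None, "sales": 2}]
    try:
        partition_rows(rows, "country")
    except ValueError as exc:
        # The second row would otherwise have no partition to land in.
        print(exc)

Failing early makes the problem visible at write time, rather than leaving a dataset that is missing rows.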

3.8.1
==========================

Improvements
^^^^^^^^^^^^

* Only fix column ordering when restoring ``DataFrame`` if the ordering is incorrect.

Bug fixes
^^^^^^^^^
* GH248 Fix an issue causing a ``ValueError`` to be raised when using ``dask_index_on`` on non-integer columns
* GH255 Fix an issue causing the Python interpreter to shut down when reading an
  empty file (see also https://issues.apache.org/jira/browse/ARROW-8142)

3.8.0
==========================

Improvements
^^^^^^^^^^^^

* Add keyword argument ``dask_index_on`` which reconstructs a dask index from a kartothek index when loading the dataset
* Add method ``kartothek.core.index.IndexBase.observed_values`` which returns an array of all observed values of the index column
* Updated and improved documentation w.r.t. guides and API documentation
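
As a rough mental model of ``observed_values``, the sketch below uses a toy index class. It is not the kartothek class: ``ToyIndex`` and its ``index_dct`` attribute are hypothetical names, and the sketch assumes an index that maps each column value to the partitions containing it. It returns a plain list where the real method is described as returning an array.

.. code-block:: python

    class ToyIndex:
        """Hypothetical stand-in for a secondary index; not the kartothek class."""

        def __init__(self, column, index_dct):
            self.column = column
            # Maps each observed column value to the partitions that contain it.
            self.index_dct = index_dct

        def observed_values(self):
            """Return all values of the indexed column seen in any partition."""
            return list(self.index_dct.keys())


    index = ToyIndex(
        column="country",
        index_dct={"DE": ["part_1", "part_2"], "US": ["part_2"]},
    )
    print(index.observed_values())

Under this model the observed values are simply the keys of the mapping, so listing them does not require reading any partition data.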

Bug fixes
^^^^^^^^^
* GH227 Fix a type error when loading categorical data in dask without
  specifying it explicitly
* No longer trigger the ``SettingWithCopyWarning`` when using bucketing
* GH228 Fix an issue where empty header creation from a pyarrow schema would not
  normalize the schema, which caused schema violations during update.
* Fix an issue where ``kartothek.io.eager.create_empty_dataset_header``
  would not accept a store factory.

3.7.0
==========================

Improvements
^^^^^^^^^^^^

* Support for pyarrow 0.16.0
* Decrease scheduling overhead for dask based pipelines
* Performance improvements for categorical data when using pyarrow>=0.15.0
* Dask is now able to calculate better size estimates for the following classes:

  * ``kartothek.core.dataset.DatasetMetadata``
  * ``kartothek.core.factory.DatasetFactory``
  * ``kartothek.io_components.metapartition.MetaPartition``
  * ``kartothek.core.index.ExplicitSecondaryIndex``
  * ``kartothek.core.index.PartitionIndex``
  * ``kartothek.core.partition.Partition``
  * ``kartothek.core.common_metadata.SchemaWrapper``

3.6.2
==========================

Improvements
^^^^^^^^^^^^

* Add more explicit typing to ``kartothek.io.eager``.

Bug fixes
^^^^^^^^^
* Fix an issue where ``kartothek.io.dask.dataframe.update_dataset_from_ddf`` would create a column named "_KTK_HASH_BUCKET" in the dataset

3.6.1
==========================

Bug fixes
^^^^^^^^^
* Fix a regression introduced in 3.5.0 where predicates which allow multiple
  values for a field would generate duplicates
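
The expected behaviour (each matching row returned once, however many predicate branches accept it) can be illustrated with a small stand-alone sketch. It is not the library's code and does not reproduce the cause of the regression. The helper names ``matches`` and ``filter_rows`` are hypothetical, and the predicate layout is an assumption made for this example: an outer list of alternatives, each a list of ``(column, operator, value)`` tuples that must all hold.

.. code-block:: python

    import operator

    # Supported comparison operators for the illustrative predicate tuples.
    OPS = {
        "==": operator.eq,
        "!=": operator.ne,
        "<": operator.lt,
        "<=": operator.le,
        ">": operator.gt,
        ">=": operator.ge,
        "in": lambda value, allowed: value in allowed,
    }


    def matches(row, predicates):
        """Return True if ``row`` satisfies at least one conjunction."""
        return any(
            all(OPS[op](row[col], val) for col, op, val in conjunction)
            for conjunction in predicates
        )


    def filter_rows(rows, predicates):
        """Keep each matching row exactly once, in its original order."""
        return [row for row in rows if matches(row, predicates)]


    rows = [{"x": 1}, {"x": 2}, {"x": 3}]
    # Both alternatives accept x == 2; the row must still appear only once.
    predicates = [[("x", "in", [1, 2])], [("x", ">=", 2), ("x", "<", 3)]]
    print(filter_rows(rows, predicates))

Evaluating the predicates per row, rather than collecting results per alternative and concatenating them, is one simple way to rule out duplicates by construction.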
