Sklearn2pmml

Latest version: v0.116.2


4.3

See [SkLearn2PMML-423](https://github.com/jpmml/sklearn2pmml/issues/423#issuecomment-2264552688)

Complex downgrades will be implemented based on customer needs.

Minor improvements and fixes

None.

0.116.2

Breaking changes

* Refactored the `ExpressionTransformer` class to verify that the target PMML element for the `ExpressionTransformer.map_missing_to`, `ExpressionTransformer.default_value` and `ExpressionTransformer.invalid_value_treatment` attributes is unambiguously known. If any ambiguity is found, an error is raised, which suggests refactoring the expression from inline string representation to UDF representation.

Previously, it was assumed that the target PMML element is the "outermost" transformer element (typically, the `Apply` element).

Before:

```python
from sklearn2pmml.preprocessing import ExpressionTransformer

# Ambiguous, because it yields a hierarchy of Apply elements,
# where the target Apply element (`Apply@function="greaterOrEqual"`)
# is shielded by a non-target Apply element (`Apply@function="if"`).
# Furthermore, the map_missing_to value should be boolean, not integer
transformer = ExpressionTransformer("1 if X[0] >= 0 else 0", map_missing_to = 0)
```

After:

```python
from sklearn2pmml.preprocessing import ExpressionTransformer
from sklearn2pmml.util import Expression

def _binarize(x):
    return (1 if x >= 0 else 0)

# Unambiguous, because it yields a single Apply element (`Apply@function="_binarize"`)
transformer = ExpressionTransformer(Expression("_binarize(X[0])", function_defs = [_binarize]), map_missing_to = 0)
```

See [SkLearn2PMML-446](https://github.com/jpmml/sklearn2pmml/issues/446)

* Refactored the PMML representation of UDFs.

Previously, they were translated to `DerivedField` elements, whereas now they are translated to `DefineFunction` elements.

New features

None.

Minor improvements and fixes

* Improved the parsing of Python statements.

See [SkLearn2PMML-447](https://github.com/jpmml/sklearn2pmml/issues/446)

* Ensured compatibility with Python 3.13.

0.116.1

Breaking changes

* Refactored the cast to Python string data types (`str` and `"unicode"`).

Previously, the cast was implemented using data container-native `apply` methods (e.g. `X.astype(str)`). However, these methods act destructively with regard to missing value constants such as `None`, `numpy.nan`, `pandas.NA` and `pandas.NaT`, replacing them with the corresponding string literals. For example, a `None` constant becomes a `"None"` string.

The `sklearn2pmml.util.cast()` utility function, which implements casts across SkLearn2PMML transformers, now contains extra masking logic to detect missing value constants and preserve them unchanged. This is critical for the correct functioning of downstream missing value-aware steps such as imputers, encoders and expression transformers.

See [SkLearn2PMML-445](https://github.com/jpmml/sklearn2pmml/issues/445)
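The difference between a destructive cast and a mask-aware cast can be sketched in plain Python. This is only an illustration of the idea, not the actual `sklearn2pmml.util.cast()` implementation: the helper names `is_missing`, `naive_cast_to_str` and `masked_cast_to_str` are hypothetical, and only `None` and float NaN are treated as missing here.

```python
import math

def is_missing(value):
    # Hypothetical helper: recognizes None and float NaN only
    return value is None or (isinstance(value, float) and math.isnan(value))

def naive_cast_to_str(values):
    # Destructive: every element is stringified, including missing ones
    return [str(value) for value in values]

def masked_cast_to_str(values):
    # Mask-aware: missing elements pass through unchanged
    return [value if is_missing(value) else str(value) for value in values]

values = [1, None, 2.5, float("nan")]

naive = naive_cast_to_str(values)
masked = masked_cast_to_str(values)

print(naive)   # None and NaN have become the strings 'None' and 'nan'
print(masked)  # None and NaN are still missing value constants
```

With the masked variant, a downstream imputer can still recognize the second and fourth elements as missing.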

New features

* Added `LagTransformer.block_indicators` and `RollingAggregateTransformer.block_indicators` attributes.

These attributes enhance the base transformation with "group by" functionality.

For example, calculating a moving average over a mixed stock prices dataset:

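```python
from sklearn_pandas import DataFrameMapper
from sklearn2pmml.preprocessing import RollingAggregateTransformer

# Compute the moving average of "price" separately for each "stock" value
mapper = DataFrameMapper([
    (["stock", "price"], RollingAggregateTransformer(function = "avg", n = 100, block_indicators = ["stock"]))
], input_df = True, df_out = True)
```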
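The "group by" effect can be illustrated without any libraries. The sketch below is not the transformer's implementation: `grouped_rolling_avg` is a hypothetical helper, the window is taken over preceding rows of the same block only, and returning `None` when a block has no preceding rows is an assumption made for this example.

```python
from collections import defaultdict, deque

def grouped_rolling_avg(rows, n):
    """For each (block, value) row, average up to n preceding values of the same block."""
    # One bounded history per block; deque(maxlen = n) drops the oldest value automatically
    history = defaultdict(lambda: deque(maxlen = n))
    result = []
    for block, value in rows:
        window = history[block]
        # Assumption: no preceding rows in this block means no aggregate value
        result.append(sum(window) / len(window) if window else None)
        window.append(value)
    return result

rows = [("AAPL", 10.0), ("MSFT", 100.0), ("AAPL", 20.0), ("MSFT", 200.0), ("AAPL", 30.0)]

averages = grouped_rolling_avg(rows, n = 2)
print(averages)
```

Without the block indicator, the interleaved `AAPL` and `MSFT` prices would be averaged together, which is rarely meaningful.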

* Added package up-to-date check.

Before doing any actual work, the Java side of the package computes the timedelta between the current timestamp and the package build timestamp. If this timedelta is greater than 180 days (6 months), a warning is issued. If it is greater than 360 days (12 months), an error is raised.
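The two-threshold rule can be sketched in Python as follows. The real check runs on the Java side, so the function name `check_up_to_date`, the return values and the exception type are all hypothetical; only the 180-day and 360-day limits come from the description above.

```python
from datetime import datetime, timedelta
import warnings

WARN_AFTER = timedelta(days = 180)
FAIL_AFTER = timedelta(days = 360)

def check_up_to_date(build_time, now):
    """Return "ok" or "warning"; raise if the build is older than FAIL_AFTER."""
    age = now - build_time
    if age > FAIL_AFTER:
        # Hypothetical error type; the actual Java exception is not specified here
        raise RuntimeError("Package build is {} days old, please upgrade".format(age.days))
    if age > WARN_AFTER:
        warnings.warn("Package build is {} days old, consider upgrading".format(age.days))
        return "warning"
    return "ok"

build_time = datetime(2025, 1, 1)

print(check_up_to_date(build_time, build_time + timedelta(days = 30)))
print(check_up_to_date(build_time, build_time + timedelta(days = 200)))
```

The stricter limit is tested first so that a very old build fails outright instead of merely warning.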

Minor improvements and fixes

* Added `LagTransformer.get_feature_names_out()` and `RollingAggregateTransformer.get_feature_names_out()` methods.

* Fixed the cast of wildcard features.

Previously, if a `CastTransformer` transformer was applied to a wildcard feature, then the newly assigned operational type was not guaranteed to stick.

See [SkLearn2PMML-445](https://github.com/jpmml/sklearn2pmml/issues/445#issuecomment-2737638764)

0.116.0

Breaking changes

* Renamed `sklearn2pmml.preprocessing.Aggregator` class to `AggregateTransformer`.

In order to support archived pipeline objects, the SkLearn2PMML package shall keep recognizing the old name variant alongside the new one.

New features

* Added support for [`sklearn.model_selection.FixedThresholdClassifier`](https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.FixedThresholdClassifier.html) and [`sklearn.model_selection.TunedThresholdClassifierCV`](https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.TunedThresholdClassifierCV.html) classes.

The post-fit tuned target is exposed in the model schema as an extra `thresholded(<target field name>)` output field.
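As a rough illustration of what such an output field carries, the sketch below applies a fixed decision threshold to a positive-class probability. The function `thresholded_output`, the target name `y` and the dictionary layout are hypothetical; the `>=` comparison follows scikit-learn's thresholding convention.

```python
def thresholded_output(target_name, probability, threshold, classes = (0, 1)):
    """Map a positive-class probability to a label under a fixed threshold."""
    negative_label, positive_label = classes
    label = positive_label if probability >= threshold else negative_label
    # Field name follows the `thresholded(<target field name>)` pattern
    return {"thresholded({})".format(target_name): label}

low = thresholded_output("y", probability = 0.35, threshold = 0.4)
high = thresholded_output("y", probability = 0.45, threshold = 0.4)

print(low)
print(high)
```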

* Added support for `sklearn2pmml.preprocessing.LagTransformer` class.

Implements a "shift" operation using PMML's [`Lag`](https://dmg.org/pmml/v4-4-1/Transformations.html#lag) element.

* Added support for `sklearn2pmml.preprocessing.RollingAggregateTransformer` class.

Implements a "rolling aggregate" operation using PMML's [`Lag`](https://dmg.org/pmml/v4-4-1/Transformations.html#lag) element.

The PMML implementation differs from Pandas' default implementation in that it excludes the current row. For example, with a window size of five, PMML considers the five rows preceding the current row (i.e. `X.rolling(window = 5, closed = "left")`), whereas Pandas considers the four rows preceding the current row plus the current row (i.e. `X.rolling(window = 5, closed = "right")`).

A Pandas-equivalent "rolling aggregate" operation can be emulated using `AggregateTransformer` and `LagTransformer` transformers directly.
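The difference between the two window conventions is easiest to see on a small series. The following plain-Python sketch uses a hypothetical `rolling_mean` helper. It emits `None` until a full window is available, which mirrors Pandas' default `min_periods` for integer windows; how the PMML side treats incomplete windows is not covered here.

```python
def rolling_mean(values, window, include_current):
    """Rolling mean over `window` rows, with or without the current row."""
    result = []
    for i in range(len(values)):
        # Window ends just before the current row (PMML style) or at it (Pandas default)
        end = i + 1 if include_current else i
        start = max(0, end - window)
        chunk = values[start:end]
        result.append(sum(chunk) / window if len(chunk) == window else None)
    return result

values = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]

pmml_style = rolling_mean(values, window = 3, include_current = False)
pandas_style = rolling_mean(values, window = 3, include_current = True)

print(pmml_style)
print(pandas_style)
```

The PMML-style result is the Pandas-style result shifted down by one row.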

Minor improvements and fixes

None.

0.115.0

Breaking changes

None.

New features

* Added support for `sklearn2pmml.preprocessing.StringLengthTransformer` class.

Minor improvements and fixes

* Fixed the `StringNormalizer.transform(X)` method to preserve the original data container shape.

See [SkLearn2PMML-443](https://github.com/jpmml/sklearn2pmml/issues/443)

* Ensured compatibility with PCRE2 0.5.0.

The 0.5.X development branch underwent breaking changes, with the goal of migrating from a proprietary API to a Python RE-compatible API. For example, the compiled pattern object now provides both `search(x)` and `sub(replacement, x)` convenience methods.
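For reference, this is the call shape that the standard library `re` module uses for the same two methods. The snippet uses `re` as a stand-in. It has not been verified here that PCRE2 compiled patterns accept identical arguments or return identical match objects.

```python
import re

# Compile once, then reuse the pattern object
pattern = re.compile(r"\s+")

text = "hello   world"

# search(x): find the first match anywhere in the string
match = pattern.search(text)

# sub(replacement, x): replace every match
collapsed = pattern.sub(" ", text)

print(match.span())
print(collapsed)
```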

* Ensured compatibility with BorutaPy 0.4.3, Category-Encoders 2.6.4, CHAID 5.4.2, Hyperopt-sklearn 1.0.3, Imbalanced-Learn 0.13.0, InterpretML 0.6.9, OptBinning 0.20.1, PyCaret 3.3.2, Scikit-Lego 0.9.4, Scikit-Tree 0.8.0 and TPOT 0.12.2.

0.114.0

Breaking changes

* Required Java 11 or newer.

New features

None.

Minor improvements and fixes

None.
