Sklearn2pmml

Latest version: v0.111.1


PMML schema downgrade to version 4.3 (see 0.110.0)

See [SkLearn2PMML-423](https://github.com/jpmml/sklearn2pmml/issues/423#issuecomment-2264552688)

Complex downgrades will be implemented based on customer needs.

Minor improvements and fixes

None.

0.111.0

Breaking changes

* Assume `re` as the default regular expression (RE) flavour.

* Removed support for multi-column mode from `StringNormalizer` class.
String transformations are specific and rare enough that they should be specified on a column-by-column basis.

New features

* Added `MatchesTransformer.re_flavour` and `ReplaceTransformer.re_flavour` attributes.
The Python environment offers a choice of RE engines, whose RE syntax differs to a material degree.
Unambiguous identification of the RE engine improves the portability of RE transformers between applications (training vs. deployment) and environments.

Supported RE flavours:

| RE flavour | Implementation |
|---|---|
| `pcre` | [PCRE](https://pypi.org/project/python-pcre/) package |
| `pcre2`| [PCRE2](https://pypi.org/project/pcre2/) package |
| `re` | Built-in `re` module |

PMML only supports Perl Compatible Regular Expression (PCRE) syntax.

It is recommended to use a PCRE-based RE engine on the Python side as well, to minimize the chance of "communication errors" between Python and PMML environments.
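
To make the syntax gap concrete, the following minimal sketch uses only the built-in `re` module to check whether a pattern written with PCRE in mind also compiles under `re`. The helper name `compiles_with_builtin_re` is hypothetical and not part of SkLearn2PMML. The `\K` (match reset) escape serves as the test case: it is a PCRE construct that `re` rejects as an unknown escape.

```python
import re

def compiles_with_builtin_re(pattern):
    """Return True if the built-in `re` module accepts the pattern."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False

# A plain quantifier is understood by `re` and by PCRE-based engines alike
print(compiles_with_builtin_re(r"B+"))

# `\K` is valid PCRE syntax, but an unknown escape for `re`
print(compiles_with_builtin_re(r"foo\Kbar"))
```

Compiling successfully is a necessary check rather than a sufficient one: a pattern may be accepted by both engines and still match differently.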

* Added `sklearn2pmml.preprocessing.regex.make_regex_engine(pattern, re_flavour)` utility function.

This utility function pre-compiles and wraps the specified RE pattern into a `sklearn2pmml.preprocessing.regex.RegExEngine` object.

The `RegExEngine` class provides `matches(x)` and `replace(replacement, x)` methods, which correspond to PMML's [`matches`](https://dmg.org/pmml/v4-4-1/BuiltinFunctions.html#matches) and [`replace`](https://dmg.org/pmml/v4-4-1/BuiltinFunctions.html#replace) built-in functions, respectively.

For example, unit testing a RE engine:

```python
from sklearn2pmml.preprocessing.regex import make_regex_engine

regex_engine = make_regex_engine("B+", re_flavour = "pcre2")

assert regex_engine.matches("ABBA") == True
assert regex_engine.replace("c", "ABBA") == "AcA"
```

See [SkLearn2PMML-228](https://github.com/jpmml/sklearn2pmml/issues/228)
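
For comparison, the behaviour exercised by the unit test above can be approximated with the built-in `re` module alone. This is a minimal sketch under the assumption that `matches` means "the pattern occurs somewhere in the input" and `replace` means "substitute every occurrence"; the class `BuiltinReEngine` is a hypothetical stand-in, not the `RegExEngine` implementation.

```python
import re

class BuiltinReEngine:
    """Hypothetical stand-in that mimics matches/replace using `re`."""

    def __init__(self, pattern):
        # Pre-compile once, so that repeated calls reuse the same pattern object
        self.pattern = re.compile(pattern)

    def matches(self, x):
        # Search semantics: True if the pattern is found anywhere in x
        return self.pattern.search(x) is not None

    def replace(self, replacement, x):
        # Substitute every non-overlapping occurrence of the pattern
        return self.pattern.sub(replacement, x)

engine = BuiltinReEngine("B+")

assert engine.matches("ABBA")
assert not engine.matches("ACDC")
assert engine.replace("c", "ABBA") == "AcA"
```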

* Refactored `StringNormalizer.transform(X)` and `SubstringTransformer.transform(X)` methods to support Pandas' Series input and output.

See [SkLearn2PMML-434](https://github.com/jpmml/sklearn2pmml/issues/434)

Minor improvements and fixes

* Ensured compatibility with Scikit-Learn 1.5.1 and 1.5.2.

0.110.0

Breaking changes

None.

New features

* Added `pmml_schema` parameter to the `sklearn2pmml.sklearn2pmml(estimator, pmml_path)` utility function.

This parameter allows downgrading the PMML schema version from the default 4.4 to any 3.X or 4.X version.
However, the downgrade is "soft", meaning that it only succeeds if the in-memory PMML document is naturally compatible with the requested PMML schema version.
The downgrade fails if structural changes are needed.

Exporting a pipeline into a PMML schema version 4.3 document:

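```python
from sklearn.pipeline import Pipeline
from sklearn2pmml import sklearn2pmml

pipeline = Pipeline([...])
pipeline.fit(X, y)

# Request schema version 4.3 instead of the default 4.4 (file name is illustrative)
sklearn2pmml(pipeline, "Pipeline-v43.pmml", pmml_schema = "4.3")
```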
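
One way to confirm which schema version an exported file declares is to read the root element of the PMML document. The sketch below uses only the standard library and a hand-written sample document; the helper name `read_pmml_version` and the sample XML are illustrative assumptions, not output produced by SkLearn2PMML.

```python
import xml.etree.ElementTree as ET

def read_pmml_version(pmml_text):
    """Return the (version attribute, XML namespace) of a PMML document."""
    root = ET.fromstring(pmml_text)
    # ElementTree encodes a namespaced tag as "{namespace}localname"
    namespace = root.tag[1:].split("}")[0] if root.tag.startswith("{") else None
    return root.get("version"), namespace

# Hand-written sample document, standing in for an exported file
sample = (
    '<PMML xmlns="http://www.dmg.org/PMML-4_3" version="4.3">'
    '<Header/>'
    '</PMML>'
)

version, namespace = read_pmml_version(sample)
print(version, namespace)
```

For a file on disk, the same helper can be fed the file contents read as text.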

0.109.0

Breaking changes

None.

New features

* Added support for Scikit-Learn 1.5.X.

* Added support for `yeo-johnson` power transform method in [`PowerTransformer`](https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.PowerTransformer.html) class.

This method is the default for this transformer.
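
As a reminder of what the method computes, here is a minimal per-value sketch of the Yeo-Johnson transform for a given power parameter. The function name `yeo_johnson` and the argument name `lmbda` are illustrative; the sketch leaves out the estimation of the power parameter and the optional standardization step that the Scikit-Learn transformer performs.

```python
import math

def yeo_johnson(x, lmbda):
    """Apply the Yeo-Johnson transform to a single value."""
    if x >= 0:
        if lmbda == 0:
            return math.log1p(x)
        return ((x + 1.0) ** lmbda - 1.0) / lmbda
    # Negative inputs use the mirrored formula with exponent (2 - lambda)
    if lmbda == 2:
        return -math.log1p(-x)
    return -(((-x + 1.0) ** (2.0 - lmbda)) - 1.0) / (2.0 - lmbda)

# With lambda = 1 the transform is the identity on both sides of zero
assert yeo_johnson(3.0, 1.0) == 3.0
assert yeo_johnson(-3.0, 1.0) == -3.0
```

Unlike the Box-Cox method, this formula is defined for zero and negative inputs, which is why it needs two branches.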

Minor improvements and fixes

* Fixed the initialization of Python expression evaluation environments.

The environment is "pre-loaded" with a small number of Python built-in (`math`, `re`) and third-party (`numpy`, `pandas`, `scipy` and optionally `pcre`) module imports.

All imports use canonical module names (eg. `import numpy`). There is **no** module name aliasing taking place (eg. `import numpy as np`).
Therefore, the evaluatable Python expressions must also spell out canonical module names.

See [SkLearn2PMML-421](https://github.com/jpmml/sklearn2pmml/issues/421)
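
The practical consequence can be illustrated with a toy evaluation environment built from the standard library. This is a sketch of the general idea only, not the SkLearn2PMML implementation; the `env` dictionary and the `evaluate` helper are hypothetical names.

```python
import math
import re

# Toy evaluation environment, pre-loaded under canonical module names only
env = {"math": math, "re": re}

def evaluate(expr, **variables):
    """Evaluate a Python expression against the pre-loaded modules."""
    return eval(expr, dict(env), variables)

# The canonical module name resolves
print(evaluate("math.sqrt(x)", x = 16.0))

try:
    # The alias `m` was never bound, so the name lookup fails
    evaluate("m.sqrt(x)", x = 16.0)
except NameError as e:
    print("NameError:", e)
```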

* Added support for `log` link function in `ExplainableBoostingRegressor` class.

See [SkLearn2PMML-422](https://github.com/jpmml/sklearn2pmml/issues/422)
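
Assuming the usual meaning of a log link in generalized additive models, the additive score lives on the log scale and the prediction is obtained by exponentiating it. A minimal sketch, with the hypothetical helper `predict_with_log_link` and made-up term scores:

```python
import math

def predict_with_log_link(intercept, term_scores):
    """Map an additive score to the response scale via the inverse log link."""
    score = intercept + sum(term_scores)
    return math.exp(score)

# A zero score maps to a prediction of one
assert predict_with_log_link(0.0, [0.0, 0.0]) == 1.0
```

One effect of this link is that predictions are always positive, whatever the sign of the individual term scores.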

0.108.0

Breaking changes

None.

New features

* Added support for [`interpret.glassbox.ClassificationTree`](https://interpret.ml/docs/python/api/ClassificationTree.html) and [`interpret.glassbox.RegressionTree`](https://interpret.ml/docs/python/api/RegressionTree.html) classes.

* Added support for [`interpret.glassbox.LinearRegression`](https://interpret.ml/docs/python/api/LinearRegression.html) and [`interpret.glassbox.LogisticRegression`](https://interpret.ml/docs/python/api/LogisticRegression.html) classes.

* Added support for [`interpret.glassbox.ExplainableBoostingClassifier`](https://interpret.ml/docs/python/api/ExplainableBoostingClassifier.html) and [`interpret.glassbox.ExplainableBoostingRegressor`](https://interpret.ml/docs/python/api/ExplainableBoostingRegressor.html) classes.

See [InterpretML-536](https://github.com/interpretml/interpret/issues/536)

Minor improvements and fixes

* Ensured compatibility with Scikit-Learn 1.4.2.

0.107.1

Breaking changes

None.

New features

* Added support for [`H2OExtendedIsolationForestEstimator`](https://docs.h2o.ai/h2o/latest-stable/h2o-py/docs/modeling.html#h2oextendedisolationforestestimator) class.

This class implements the isolation forest algorithm using oblique tree models.
It is claimed to outperform the [`H2OIsolationForestEstimator`](https://docs.h2o.ai/h2o/latest-stable/h2o-py/docs/modeling.html#h2oisolationforestestimator) class, which does the same using plain (ie. non-oblique) tree models.
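
The difference between the two kinds of tree models comes down to the branching test. The sketch below assumes the common formulation in which a plain split compares one feature against a threshold, whereas an oblique split tests on which side of a hyperplane the data point lies; the function and argument names are illustrative and unrelated to the H2O API.

```python
def plain_split(x, feature, threshold):
    """Axis-parallel test: compare a single feature to a threshold."""
    return x[feature] < threshold

def oblique_split(x, normal, intercept):
    """Hyperplane test: compare a linear combination of features to zero."""
    return sum(n * (xi - p) for n, xi, p in zip(normal, x, intercept)) < 0.0

point = [2.0, 1.0]

# Looks at the first feature only
print(plain_split(point, feature = 0, threshold = 3.0))

# Looks at both features: (1)(2 - 0) + (-3)(1 - 0) = -1
print(oblique_split(point, normal = [1.0, -3.0], intercept = [0.0, 0.0]))
```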

* Made `lightgbm.Booster` class directly exportable to PMML.

The SkLearn2PMML package now supports both LightGBM [Training API](https://lightgbm.readthedocs.io/en/latest/Python-API.html#training-api) and [Scikit-Learn API](https://lightgbm.readthedocs.io/en/latest/Python-API.html#scikit-learn-api):

```python
from lightgbm import train, Dataset
from sklearn2pmml import sklearn2pmml

ds = Dataset(data = X, label = y)

booster = train(params = {...}, train_set = ds)

sklearn2pmml(booster, "LightGBM.pmml")
```

* Made `xgboost.Booster` class directly exportable to PMML.

The SkLearn2PMML package now supports both XGBoost [Learning API](https://xgboost.readthedocs.io/en/latest/python/python_api.html#module-xgboost.training) and [Scikit-Learn API](https://xgboost.readthedocs.io/en/latest/python/python_api.html#module-xgboost.sklearn):

```python
from xgboost import train, DMatrix
from sklearn2pmml import sklearn2pmml

dmatrix = DMatrix(data = X, label = y)

booster = train(params = {...}, dtrain = dmatrix)

sklearn2pmml(booster, "XGBoost.pmml")
```

* Added `xgboost.Booster.fmap` attribute.

This attribute allows overriding the embedded feature map with a user-defined feature map.

The main use case is refining the category levels of categorical features.

A suitable feature map object can be generated from the training dataset using the `sklearn2pmml.xgboost.make_feature_map(X)` utility function:

```python
from xgboost import train, DMatrix
from sklearn2pmml.xgboost import make_feature_map

# Enable categorical features
dmatrix = DMatrix(X, label = y, enable_categorical = True)

# Generate a feature map with detailed description of all continuous and categorical features in the dataset
fmap = make_feature_map(X)

booster = train(params = {...}, dtrain = dmatrix)
booster.fmap = fmap
```

* Added `input_float` conversion option for XGBoost models.

Minor improvements and fixes

None.
