Rulekit

Latest version: v2.1.24.0



3.3

More changes have been made for survival rules.

First, there is a new class, `rulekit.kaplan_meier.KaplanMeierEstimator`, which represents a rule's Kaplan-Meier estimator. In the future, predictions for survival problems will probably change from arrays of dictionaries to arrays of such objects, but that would unfortunately be a breaking change.

In addition, one can now easily access the Kaplan-Meier curve of the entire training dataset using the `rulekit.survival.SurvivalRules.get_train_set_kaplan_meier` method.

Such curves can be easily plotted using the charting package of your choice.

```python
import pandas as pd
import matplotlib.pyplot as plt
from rulekit.arff import read_arff
from rulekit.survival import SurvivalRules
from rulekit.rules import RuleSet, SurvivalRule
from rulekit.kaplan_meier import KaplanMeierEstimator  # this is a new class

DATASET_URL: str = (
    'https://raw.githubusercontent.com/'
    'adaa-polsl/RuleKit/master/data/bmt/'
    'bmt.arff'
)
df: pd.DataFrame = read_arff(DATASET_URL)
X, y = df.drop('survival_status', axis=1), df['survival_status']

surv = SurvivalRules(survival_time_attr='survival_time')
surv.fit(X, y)

ruleset: RuleSet[SurvivalRule] = surv.model
rule: SurvivalRule = ruleset.rules[0]

# you can now easily access the Kaplan-Meier estimator of a rule
rule_estimator: KaplanMeierEstimator = rule.kaplan_meier_estimator
plt.step(
    rule_estimator.times,
    rule_estimator.probabilities,
    label='First rule'
)
# you can also easily access the Kaplan-Meier estimator of the training dataset
train_dataset_estimator: KaplanMeierEstimator = surv.get_train_set_kaplan_meier()
plt.step(
    train_dataset_estimator.times,
    train_dataset_estimator.probabilities,
    label='Training dataset'
)
plt.legend(title='Kaplan-Meier curves:')
```
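For readers unfamiliar with what the `times` and `probabilities` arrays hold, the following is a minimal pure-Python sketch of the standard Kaplan-Meier (product-limit) estimate. It is not RuleKit's implementation; the function name `kaplan_meier` and the toy data are illustrative assumptions, and event flags are assumed to be 1 for an observed event and 0 for a censored observation.

```python
from typing import List, Sequence, Tuple


def kaplan_meier(
    times: Sequence[float],
    events: Sequence[int],
) -> Tuple[List[float], List[float]]:
    """Return (event_times, survival_probabilities) for right-censored data.

    Hypothetical helper, not part of rulekit. events[i] is assumed to be 1
    if the event was observed at times[i] and 0 if the record was censored.
    """
    # Sort observations by time so the risk set only ever shrinks.
    observations = sorted(zip(times, events))
    at_risk = len(observations)
    survival = 1.0
    event_times: List[float] = []
    probabilities: List[float] = []

    i = 0
    while i < len(observations):
        t = observations[i][0]
        deaths = 0
        leaving = 0
        # Group all observations that share the same time point.
        while i < len(observations) and observations[i][0] == t:
            deaths += observations[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            # Product-limit step: S(t) = S(t-) * (1 - deaths / at_risk).
            survival *= 1.0 - deaths / at_risk
            event_times.append(t)
            probabilities.append(survival)
        # Both events and censored records leave the risk set.
        at_risk -= leaving
    return event_times, probabilities


# Toy data (made up for illustration): five subjects, two of them censored.
km_times, km_probs = kaplan_meier(
    times=[2, 3, 3, 5, 8],
    events=[1, 1, 0, 1, 0],
)
print(km_times, km_probs)
```

The two returned lists play the same role as the `times` and `probabilities` passed to `plt.step` above: one survival probability per distinct event time.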


4. Changes in expert rules induction for regression and survival `❗BREAKING CHANGES`

> Note that these changes will likely be reverted in the next version. They are caused by a known bug in the original RuleKit library; fixing it is beyond the scope of this package, which is merely a wrapper around that library.

Starting with this version, expert rules and conditions for regression and survival problems are specified differently. All you have to do is remove the conclusion part of those rules (everything after **THEN**).

Expert rules before:

```python
expert_rules = [
    (
        'rule-0',
        'IF [[CD34kgx10d6 = (-inf, 10.0)]] AND [[extcGvHD = {0}]] THEN survival_status = {NaN}'
    )
]

expert_preferred_conditions = [
    (
        'attr-preferred-0',
        'inf: IF [CD34kgx10d6 = Any] THEN survival_status = {NaN}'
    )
]

expert_forbidden_conditions = [
    ('attr-forbidden-0', 'IF [ANCrecovery = Any] THEN survival_status = {NaN}')
]
```

And now:

```python
expert_rules = [
    (
        'rule-0',
        'IF [[CD34kgx10d6 = (-inf, 10.0)]] AND [[extcGvHD = {0}]] THEN'
    )
]

expert_preferred_conditions = [
    (
        'attr-preferred-0',
        'inf: IF [CD34kgx10d6 = Any] THEN'
    )
]

expert_forbidden_conditions = [
    ('attr-forbidden-0', 'IF [ANCrecovery = Any] THEN')
]
```
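If you have many expert rules in the old format, the conclusion part can be stripped programmatically. The sketch below is a plain string helper, not part of rulekit; the names `strip_conclusion` and `migrate` are hypothetical, and it assumes the keyword `THEN` does not appear inside attribute names or values.

```python
from typing import List, Tuple

# (name, expression) pairs, in the same shape as the lists shown above.
ExpertEntry = Tuple[str, str]


def strip_conclusion(expression: str) -> str:
    """Drop everything after the last 'THEN', keeping the keyword itself.

    Hypothetical helper; assumes 'THEN' only occurs as the rule keyword.
    """
    head, keyword, _ = expression.rpartition('THEN')
    if not keyword:
        # No THEN found: leave the expression untouched.
        return expression
    return head + keyword


def migrate(entries: List[ExpertEntry]) -> List[ExpertEntry]:
    """Apply strip_conclusion to every (name, expression) pair."""
    return [(name, strip_conclusion(expr)) for name, expr in entries]


old_expert_rules = [
    (
        'rule-0',
        'IF [[CD34kgx10d6 = (-inf, 10.0)]] AND [[extcGvHD = {0}]] '
        'THEN survival_status = {NaN}'
    )
]
new_expert_rules = migrate(old_expert_rules)
print(new_expert_rules[0][1])
```

The helper is idempotent: expressions that already end with `THEN` are returned unchanged, so it can be applied safely to a mix of old-format and new-format entries.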


Other changes

* Fix expert rules parsing.
* Conditions are now printed in the order in which they were added to the rule.
* Fixed bug when using `sklearn.base.clone` function with RuleKit model classes.
* Update tutorials in the documentation.

3.2

You can now access a rule's decision attribute value via the `rulekit.rules.RegressionRule.conclusion_value` field. Example below:

```python
import pandas as pd
from rulekit.arff import read_arff
from rulekit.regression import RuleRegressor
from rulekit.rules import RuleSet, RegressionRule

DATASET_URL: str = (
    'https://raw.githubusercontent.com/'
    'adaa-polsl/RuleKit/master/data/methane/'
    'methane-train.arff'
)
df: pd.DataFrame = read_arff(DATASET_URL)
X, y = df.drop('MM116_pred', axis=1), df['MM116_pred']

reg = RuleRegressor()
reg.fit(X, y)

ruleset: RuleSet[RegressionRule] = reg.model
rule: RegressionRule = ruleset.rules[0]
print('Decision value of the first rule: ', rule.conclusion_value)
```
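The same field can be read for every rule in the set rather than just the first one. The sketch below relies only on the `rules` and `conclusion_value` attributes shown above; the helper name `conclusion_values` is hypothetical, and the stand-in objects with made-up values exist only so the snippet runs without training a model.

```python
from types import SimpleNamespace
from typing import Any, List


def conclusion_values(ruleset: Any) -> List[float]:
    """Collect the conclusion value of every rule, in rule order.

    Hypothetical helper: works with any object exposing `.rules`, where
    each rule exposes `.conclusion_value`.
    """
    return [rule.conclusion_value for rule in ruleset.rules]


# Stand-in for a trained model's rule set; the numbers are made up.
stand_in_ruleset = SimpleNamespace(rules=[
    SimpleNamespace(conclusion_value=0.4),
    SimpleNamespace(conclusion_value=1.25),
    SimpleNamespace(conclusion_value=0.9),
])

values = conclusion_values(stand_in_ruleset)
print('Smallest and largest decision values:', min(values), max(values))
```

With a fitted model, passing `reg.model` in place of the stand-in object should work the same way, assuming the attributes behave as in the example above.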

3.1

You can now access a rule's decision class via the `rulekit.rules.ClassificationRule.decision_class` field. Example below:

```python
import pandas as pd
from rulekit.arff import read_arff
from rulekit.classification import RuleClassifier
from rulekit.rules import RuleSet, ClassificationRule

DATASET_URL: str = (
    'https://raw.githubusercontent.com/'
    'adaa-polsl/RuleKit/refs/heads/master/data/seismic-bumps/'
    'seismic-bumps.arff'
)
df: pd.DataFrame = read_arff(DATASET_URL)
X, y = df.drop('class', axis=1), df['class']

clf: RuleClassifier = RuleClassifier()
clf.fit(X, y)

# the RuleSet class is now generic
ruleset: RuleSet[ClassificationRule] = clf.model
rule: ClassificationRule = ruleset.rules[0]
print('Decision class of the first rule: ', rule.decision_class)
```
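One use of this field is summarising how many rules conclude each class. The sketch below uses only the `rules` and `decision_class` attributes shown above; the helper name `rules_per_class` is hypothetical, and the stand-in objects and class labels are made up so the snippet runs without a trained classifier.

```python
from collections import Counter
from types import SimpleNamespace
from typing import Any


def rules_per_class(ruleset: Any) -> Counter:
    """Count how many rules conclude each decision class.

    Hypothetical helper: works with any object exposing `.rules`, where
    each rule exposes `.decision_class`.
    """
    return Counter(rule.decision_class for rule in ruleset.rules)


# Stand-in for a trained model's rule set; the labels are made up.
stand_in_ruleset = SimpleNamespace(rules=[
    SimpleNamespace(decision_class='0'),
    SimpleNamespace(decision_class='1'),
    SimpleNamespace(decision_class='0'),
])

class_counts = rules_per_class(stand_in_ruleset)
print(class_counts.most_common())
```

A strongly skewed count can hint that the induced rule set covers one class with many narrow rules, which may be worth inspecting rule by rule.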

2.1.24

2.1.24.0

1. Revert breaking changes in expert rules induction for regression and survival

Version 2.1.21.0 introduced some breaking changes, which you can read about in its [release note](https://github.com/adaa-polsl/RuleKit-python/releases/tag/v2.1.21.0).
Rules and expert conditions can now be defined in both the old and the new format; see the example below.

```python
# both variants will work the same
expert_rules = [
    (
        'rule-0',
        'IF [[CD34kgx10d6 = (-inf, 10.0)]] AND [[extcGvHD = {0}]] THEN survival_status = {NaN}'
    ),
    (
        'rule-0',
        'IF [[CD34kgx10d6 = (-inf, 10.0)]] AND [[extcGvHD = {0}]] THEN'
    ),
]
```


2. Upgrade to new version of RuleKit

In the new version of the Java RuleKit library, many bugs regarding expert induction have been corrected.

Other changes

* Improve flake8 score.
* Add more unit tests.

2.1.23

Fixed multiple bugs in the expert mode:
- crash when adjusting conditions with nominal attributes.
- `mincov` parameter not set to the number of uncovered examples in the expert classification rules.
- complementary conditions not supported in the expert knowledge.
