A major update to the BOOMER algorithm that introduces the following changes.
{warning}
This release comes with several API changes. For an updated overview of the available parameters and command line arguments, please refer to the [documentation](https://mlrl-boomer.readthedocs.io/en/0.10.0/).
Algorithmic Enhancements
- **The project does now provide a Separate-and-Conquer (SeCo) algorithm** based on traditional rule learning techniques that are particularly well-suited for learning interpretable models.
- **Space-efficient data structures are now used for storing feature values**, depending on whether the feature is numerical, ordinal, nominal, or binary. This also enables to use optimized code paths for dealing with these different types of features.
- **The implementation of feature binning has been reworked** in a way that avoids redundant code and results in a reduction of training times due to the use of the data structures mentioned above.
- **The value to be used for sparse elements of a feature matrix** can now be specified via the C++ or Python API.
- **Nominal and ordinal feature values are now represented as integers** to avoid issues due to limited floating point precision.
- **Safe comparisons of floating point values** are now used to avoid issues due to limited floating point precision.
- **Fundamental data structures for vectors and matrices have been reworked** to ease reusing existing functionality and avoiding redundant code.
Additions to the Command Line API
- **Information about the program can now be printed** via the argument `-v` or `--version`.
- **Data characteristics do now include the number of ordinal features** when printed on the console or written to a file via the command line argument `--print-data-characteristics` or `--store-data-characteristics`.
Bugfixes
- An issue has been fixed that caused the number of numerical and nominal features to be swapped when using the command line arguments `--print-data-characteristics` or `--store-data-characteristics`.
- The correct directory is now used for loading and saving parameter settings when using the command line arguments `--parameter-dir` and `--store-parameters`.
API Changes
- The option `num_threads` of the parameters `--parallel-rule-refinement`, `--parallel-statistic-update` and `--parallel-prediction` has been renamed to `num_preferred_threads`.
Quality-of-Life Improvements
- The documentation has been updated to a more modern theme supporting light and dark theme variants.
- A build option that allows disabling multi-threading support via OpenMP at compile-time has been added.
- The groundwork for GPU support was laid. It can be disabled at compile-time via a build option.
- Added support for unit testing the project's C++ code. Compilation of the tests can be disabled via a build option.
- The Python code is now checked for common issues by applying `pylint` via continuous integration.
- The Makefile has been replaced with wrapper scripts triggering a [SCons](https://scons.org/) build.
- Development versions of wheel packages are now regularly built via continuous integration, uploaded as artifacts, and published on [Test-PyPI](https://test.pypi.org/).
- Continuous integration is now used to maintain separate branches for major, feature, and bugfix releases and keep them up-to-date.
- The runtime of continuous integration jobs has been optimized by running individual steps only if necessary, caching files across subsequent runs, and making use of parallelization.
- When tests are run via continuous integration, a summary of the test results is now added to merge requests and Github workflows.
- Markdown files are now used for writing the documentation.
- A consistent style is now enforced for Markdown files by applying the tool `mdformat` via continuous integration.
- C++ 17 or newer is now required for compiling the project.