Added
- The global cache helper now supports algorithms with multiple action methods. To use this, specify the name of the
action method you want to cache.
(https://github.com/mad-lab-fau/tpcp/pull/118)
- The global disk cache helper should now be able to cache the action methods of algorithm classes defined in the main
script.
(https://github.com/mad-lab-fau/tpcp/pull/118)
- There are new builtin `FloatAggregator` and `MacroFloatAggregator` aggregators that should cover many of the use
cases that previously required custom aggregators.
(https://github.com/mad-lab-fau/tpcp/pull/118)
- Scorers now support passing a `final_aggregator`. It is called after all scoring and aggregation has happened and
allows you to implement complicated "meta" aggregation that depends on the results of all scores of all datapoints.
Note that we are not sure yet whether this should be treated as an escape hatch, where overusing it would be
considered an anti-pattern, or whether it is exactly the other way around.
We need to experiment with a couple of real-life applications to figure this out.
(https://github.com/mad-lab-fau/tpcp/pull/120)
- Dataset classes now have a proper `__eq__` implementation.
(https://github.com/mad-lab-fau/tpcp/pull/120)
Changed
- Relatively major overhaul of how aggregators in scoring functions work. Before, aggregators were classes that were
initialized with the value of a score. Now they are instances of a class that are called with the value of a score.
This change makes it possible to create "configurable" aggregators that get their configuration at initialization time.
(https://github.com/mad-lab-fau/tpcp/pull/118)
This comes with a couple of breaking changes:
- The most "user-facing" one is that the `NoAgg` aggregator is now called `no_agg` indicating that it is an instance
of a class and not a class itself.
- All custom aggregators need to be rewritten, but you will likely find that they are much simpler now
(see the reworked examples for custom aggregators).
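
The difference between the two aggregator styles can be illustrated with a minimal, library-independent sketch. All class names, method names, and signatures below are hypothetical and chosen only for illustration; they do not reproduce the actual tpcp base class or its API. See the reworked examples for real custom aggregators.

```python
from statistics import mean


class OldStyleMean:
    """Old pattern (illustrative): the class wraps a single score value."""

    def __init__(self, value):
        self.value = value

    @classmethod
    def aggregate(cls, wrapped_values):
        # No per-instance configuration is available here.
        return mean(w.value for w in wrapped_values)


class ClippedMean:
    """New pattern (illustrative): configured at init, called with a score value."""

    def __init__(self, upper):
        self.upper = upper  # configuration lives on the instance

    def __call__(self, value):
        # Pair the raw value with the configured aggregator instance.
        return (self, value)

    def aggregate(self, values):
        return mean(min(v, self.upper) for v in values)


# Old style: the class itself is used to wrap each value.
old_scores = [OldStyleMean(v) for v in (1.0, 2.0, 6.0)]
old_result = OldStyleMean.aggregate(old_scores)

# New style: one configured instance is called with each value.
clipped = ClippedMean(upper=3.0)
new_scores = [clipped(v) for v in (1.0, 2.0, 6.0)]
aggregator = new_scores[0][0]
new_result = aggregator.aggregate([v for _, v in new_scores])
```

Because the configuration (`upper` in this sketch) is stored on the instance, the same aggregator class can be reused with different settings without subclassing.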
Fixed
- Fixed a massive performance regression in version 0.34.1 affecting people who had tensorflow or torch installed but
did not use them in their code.
The reason was that we imported the two modules in the global scope, which made importing tpcp very
slow.
This was particularly noticeable with multiprocessing, as the modules were imported in every worker process.
We now import each module only within the clone function, and only if you have already imported it yourself.
(https://github.com/mad-lab-fau/tpcp/pull/118)
- The custom hash function now has a different way of hashing functions and classes defined in local scopes.
This should prevent strange pickling errors from just using "tpcp" normally.
(https://github.com/mad-lab-fau/tpcp/pull/118)
Removed
- `score` functions implemented directly as methods on the pipeline class are no longer supported.
Score functions now need to be independent functions that take a pipeline instance as their first argument.
For this reason, it is also no longer supported to pass `None` as the `scoring` argument in any validate or optimize
method.
(https://github.com/mad-lab-fau/tpcp/pull/120)
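
A standalone score function might look roughly like the following sketch. `ThresholdPipeline` is a plain stand-in class and not a real tpcp pipeline; the second `datapoint` argument, the `run` method, and the returned dictionary are likewise assumptions for illustration.

```python
class ThresholdPipeline:
    """Hypothetical stand-in for a pipeline; not derived from any tpcp class."""

    def __init__(self, threshold=0.5):
        self.threshold = threshold

    def run(self, datapoint):
        # Store the result on the instance and return self to allow chaining.
        self.prediction_ = datapoint["value"] > self.threshold
        return self


def score(pipeline, datapoint):
    """Independent score function: the pipeline instance is the first argument."""
    prediction = pipeline.run(datapoint).prediction_
    return {"correct": float(prediction == datapoint["label"])}


result = score(ThresholdPipeline(threshold=0.5), {"value": 0.8, "label": True})
```

Such a function is then passed explicitly as the `scoring` argument instead of relying on a method of the pipeline.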