Significantly increased predictiveness by doing cross validation on the training data. A model is fitted for each fold, and these models are then merged into a final model. This increases predictiveness by reducing model variance (an effect similar to bagging). The drawback is that training takes longer with the default number of randomly selected cv folds (5). However, the user can specify the number of randomly selected cv folds, or directly specify the folds and how particular training observations are used in each fold. The latter can be used to emulate the train/validation split from prior versions of APLR.
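As a minimal sketch of the latter option, the snippet below builds a single "fold" column that marks each training observation as either train or validation, emulating a fixed train/validation split. The encoding (1 = train, -1 = validate) and the parameter names `cv_folds` and `cv_observations` shown in the comments are assumptions for illustration, not confirmed API details:

```python
import random

def single_split_fold(n_rows, validation_ratio, seed=0):
    """Return one fold column marking each observation as train (1) or validation (-1).

    Hypothetical sketch: the 1/-1 encoding is an assumption about how APLR
    expects fold membership to be specified, not confirmed library behavior.
    """
    rng = random.Random(seed)
    n_val = int(n_rows * validation_ratio)
    val_rows = set(rng.sample(range(n_rows), n_val))
    return [[-1 if i in val_rows else 1] for i in range(n_rows)]

fold = single_split_fold(10, 0.2)

# Hypothetical usage when fitting (parameter names are assumptions):
# model = APLRRegressor(cv_folds=1)
# model.fit(X, y, cv_observations=fold)
```

With this approach a 20% validation share behaves like the old validation_ratio of 0.2, while still going through the new fold-based interface.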
Other changes:
- The feature importance calculation should now be more realistic.
Deprecated:
- Pruning of terms.
- "rankability" validation_tuning_metric.
- The validation_ratio and validation_indexes fields, as well as related methods.