💡 New features 💡
Dependencies
* RSMTool is now compatible with [SKLL v3.0](https://skll.readthedocs.io/en/latest/) and, therefore, [scikit-learn](https://scikit-learn.org) v1.0.2.
* RSMTool now supports Python 3.10, in addition to 3.8 and 3.9. **Python 3.7 is no longer supported.**
* [tqdm](https://tqdm.github.io) is now a required dependency.
Native cross-validation support
* Add native support for cross-validation experiments to RSMTool. Performance estimates from a single train-test split can be biased because they depend on the specific characteristics of that split. Cross-validation can provide more accurate estimates of scoring model performance because the estimates are averaged over multiple train-test splits drawn at random from the data.
* Add new command-line utility [`rsmxval`](https://github.com/EducationalTestingService/rsmtool/blob/main/rsmtool/rsmxval.py) to run cross-validation experiments. Under the hood, it uses the RSMTool API functions `run_experiment()`, `run_evaluation()`, and `run_summary()` to generate multiple useful reports for users.
* Add support for [automated configuration generation](https://rsmtool.readthedocs.io/en/stable/automated_configuration.html) to `rsmxval` in both batch and interactive mode.
* Add [comprehensive documentation](https://rsmtool.readthedocs.io/en/stable/advanced_usage.html#rsmxval-run-cross-validation-experiments) on how to run cross-validation experiments.
* Add comprehensive [functional tests](https://github.com/EducationalTestingService/rsmtool/blob/main/rsmtool/test_experiment_rsmxval.py) for cross-validation.
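
To illustrate why averaging over several train-test splits gives a steadier estimate than a single split, here is a minimal, standard-library-only sketch of k-fold cross-validation. It is not how `rsmxval` is implemented: the function names `k_fold_indices` and `cross_validate`, the toy "predict the training mean" model, and the sample scores are all assumptions made for illustration.

```python
import random
import statistics


def k_fold_indices(n_items, n_folds, seed=0):
    """Shuffle item indices and deal them into n_folds disjoint test folds."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    # Striding through the shuffled list gives folds of near-equal size.
    return [indices[i::n_folds] for i in range(n_folds)]


def cross_validate(scores, n_folds=5, seed=0):
    """Return one mean absolute error per fold for a toy mean-predicting model."""
    fold_errors = []
    for test_idx in k_fold_indices(len(scores), n_folds, seed):
        held_out = set(test_idx)
        train = [s for i, s in enumerate(scores) if i not in held_out]
        test = [scores[i] for i in test_idx]
        # Hypothetical stand-in for a real scoring model: predict the training mean.
        prediction = statistics.mean(train)
        fold_errors.append(statistics.mean([abs(s - prediction) for s in test]))
    return fold_errors


# Made-up human scores, used only to exercise the functions above.
human_scores = [1, 2, 2, 3, 3, 3, 4, 4, 5, 5]
errors = cross_validate(human_scores, n_folds=5)
print("per-fold errors:", errors)
print("cross-validated estimate:", statistics.mean(errors))
```

Each fold's error depends on which responses happen to land in it; the mean across folds is less sensitive to any one split.
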
API Changes
* Add two new logging functions in [`rsmtool.utils.logging`](https://github.com/EducationalTestingService/rsmtool/blob/main/rsmtool/utils/logging.py). These are only meant to be used by RSMTool developers, not users.
* Factor out the code that was used to write a dataframe to disk into a separate utility method [`DataWriter.write_frame_to_disk()`](https://github.com/EducationalTestingService/rsmtool/blob/main/rsmtool%2Fwriter.py#L22) so that it can also be used by `rsmxval`. This can prove useful to advanced RSMTool users as well.
* Add new cross-validation specific utility functions to [`rsmtool.utils.cross_validation`](https://github.com/EducationalTestingService/rsmtool/blob/main/rsmtool%2Futils%2Fcross_validation.py).
* Convert several class or static methods in various classes to instance methods in order to allow for passing and using an optional logger instance.
* Tweak the [`check_scaled_coefficients()`](https://github.com/EducationalTestingService/rsmtool/blob/main/rsmtool/test_utils.py#L1084) test utility function to take the output directory as an argument instead of taking an experiment name to allow its usage for `rsmxval` functional tests.
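
The move from class or static methods to instance methods can be pictured with the following sketch. It only shows the general pattern of accepting an optional logger at construction time; `FrameDescriber` and its `describe()` method are hypothetical names and do not correspond to any actual RSMTool class.

```python
import logging


class FrameDescriber:
    """Hypothetical helper used only to illustrate the optional-logger pattern."""

    def __init__(self, logger=None):
        # Fall back to a module-level logger when the caller does not supply one.
        self.logger = logger if logger is not None else logging.getLogger(__name__)

    def describe(self, name, n_rows):
        """Build a status message and log it through the injected logger."""
        message = f"writing {name} ({n_rows} rows)"
        self.logger.info(message)
        return message


# A caller that manages its own logger can pass it in explicitly.
custom_logger = logging.getLogger("my_experiment")
describer = FrameDescriber(logger=custom_logger)
describer.describe("predictions", 100)
```

A static method has no instance state to hold a logger, which is why an instance method is the natural fit once the logger becomes configurable per caller.
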
🛠 Bugfixes & Improvements 🛠
* Fix the behavior of the [`use_thumbnails`](https://rsmtool.readthedocs.io/en/stable/usage_rsmtool.html#use-thumbnails-optional) option in RSMTool configuration files. It was generating both the thumbnail and the full-sized figure due to the behavior of Matplotlib’s `savefig()`. The solution was to turn off interactive plotting in all header notebooks.
* Replace deprecated methods and keywords in RSMTool code as recommended by the latest versions of pandas, numpy, and scikit-learn.
* Fix several duplicate target warnings when compiling the documentation. Make sure included RST files have an extension of `.rst.inc` so that they are not compiled twice. Turn all web links into anonymous references so that there are no conflicts with the same target names.
* Make feature boxplots for subgroups in reports more flexible in terms of the number of features. Specifically, boxplots are now omitted only if the experiment has more than 150 features. Previously this limit was 30. In addition, the message that the boxplots have been omitted is displayed more prominently when it happens. Finally, if the number of features is greater than 30 but at most 150, a new message asking the user to enable thumbnails is shown.
* Update GitLab CI plan to use Python 3.8 and Azure Pipelines to use Python 3.10. Add new cross-validation tests to both CI plans.
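
The new feature-count thresholds for subgroup boxplots can be summarized as a small decision function. This is a sketch of the rule as described in these notes, not the code used in the report notebooks; the function name, the constant names, and the returned labels are assumptions.

```python
# Thresholds taken from the release notes; the names are hypothetical.
THUMBNAIL_HINT_THRESHOLD = 30
MAX_FEATURES_FOR_BOXPLOTS = 150


def boxplot_display_mode(n_features):
    """Return how the report treats subgroup boxplots for n_features features."""
    if n_features > MAX_FEATURES_FOR_BOXPLOTS:
        # Too many features: boxplots are omitted and a prominent message is shown.
        return "omit"
    if n_features > THUMBNAIL_HINT_THRESHOLD:
        # Boxplots are shown along with a message suggesting thumbnails.
        return "show_with_thumbnail_hint"
    return "show"


for count in (10, 30, 31, 150, 151):
    print(count, boxplot_display_mode(count))
```
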