SKLL

Latest version: v5.0.1



1.5.1

This is primarily a bug fix release.

Bugfixes

- Generate the "folds_file" warnings only when "folds_file" is specified (issue 404, PR 405).
- Modify `Learner.save()` to deal properly with reading in and re-saving older models (issue 406, PR 407).
- Fix regression that caused the output directories to not be automatically created (issue 408, PR 409).

1.5

This is a major new release of SKLL.

What's new
- Several new scikit-learn learners included along with reasonable default parameter grids for tuning, where appropriate (issues 256 & 375, PR 377):
  - `BayesianRidge`
  - `DummyRegressor`
  - `HuberRegressor`
  - `Lars`
  - `MLPRegressor`
  - `RANSACRegressor`
  - `TheilSenRegressor`
  - `DummyClassifier`
  - `MLPClassifier`
  - `RidgeClassifier`
- Allow computing any number of additional evaluation metrics in addition to the tuning objective (issue 350, PR 384).
- Rename `cv_folds_file` configuration option to `folds_file`. The former is still supported with a deprecation warning but will be removed in the next release (PR 367).
- Add a new configuration option [`use_folds_file_for_grid_search`](http://skll.readthedocs.io/en/latest/run_experiment.html#use-folds-file-for-grid-search-optional) which controls whether the inner-loop grid-search in a cross-validation experiment with a custom folds file also uses the folds from the file. It's set to True by default. Setting it to False means that the inner loop uses regular 3-fold cross-validation and ignores the file (PR 367).
- Also add a keyword argument called `use_custom_folds_for_grid_search` to the `Learner.cross_validate()` method (PR 367).
- Learning curves can now be plotted from existing summary files using the new [`plot_learning_curves`](http://skll.readthedocs.io/en/latest/utilities.html#plot-learning-curves) command line utility (issue 346, PR 396).
- Overhaul logging in SKLL. All messages are now logged both to the console (if running interactively) and to log files. Read more about the SKLL log files in the [Output Files section](http://skll.readthedocs.io/en/latest/run_experiment.html#output-files) of the documentation (issue 369, PR 380).
- `neg_log_loss` is now available as an objective function for classification (issue 327, PR 392).
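
As a rough illustration of the two folds-related options above, the following sketch builds a cross-validation configuration with Python's standard `configparser`. The option names `folds_file` and `use_folds_file_for_grid_search` come from these notes. Their placement in the `[Input]` and `[Tuning]` sections, the remaining field names and values, and the paths are assumptions made for illustration, so check the linked documentation before relying on them.

```python
import configparser
import io

# Hypothetical experiment settings. Only `folds_file` and
# `use_folds_file_for_grid_search` are taken from the release notes;
# everything else is an assumed, illustrative value.
config = configparser.ConfigParser()
config["General"] = {"experiment_name": "folds_demo", "task": "cross_validate"}
config["Input"] = {
    "train_directory": "train",            # assumed path
    "featuresets": '[["example_features"]]',
    "learners": '["LogisticRegression"]',
    "folds_file": "folds.csv",             # new name for `cv_folds_file`
}
config["Tuning"] = {
    "grid_search": "true",
    "objectives": '["accuracy"]',
    # "false" asks the inner grid-search loop to ignore the folds file
    "use_folds_file_for_grid_search": "false",
}
config["Output"] = {"results": "output"}

# Serialize to the INI-style text that would be saved as a .cfg file.
buffer = io.StringIO()
config.write(buffer)
config_text = buffer.getvalue()
print(config_text)
```

Leaving `use_folds_file_for_grid_search` out entirely would keep the default described above, in which the grid search also uses the folds from the file.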

Changes
- SKLL now supports Python 3.6. Although Python 3.4 and 3.5 will still work, 3.6 is now the officially supported Python 3 version. Python 2.7 is still supported (issue 355, PR 360).
- The required version of scikit-learn has been bumped up to 0.19.1 (issue 328, PR 330).
- The learning curve y-limits are now computed a bit more intelligently (issue 389, PR 390).
- Raise a warning if ablation flag is used for an experiment that uses `train_file`/`test_file` - this is not supported (issue 313, PR 392).
- Raise a warning if both `fixed_parameters` and `param_grids` are specified (issue 185, PR 297).
- Disable grid search if no default parameter grids are available in SKLL and the user doesn't provide parameter grids either (issue 376, PR 378).
- SKLL has a copy of scikit-learn's `DictVectorizer` because it needs some custom functionality. _Most_ (but not all) of our modifications have now been merged into scikit-learn so our custom version is now significantly condensed down to just a single method (issue 263, PR 374).
- Improved outputs for cross-validation tasks (issues 349 & 371, PRs 365 & 372):
  - Stop showing the full dictionary in the log when a folds file is specified.
  - Show number of cross-validation folds in results to be <n> via folds file if a folds file is specified.
  - Show grid search folds in results to be <n> via folds file if the grid search ends up using the folds file.
  - Do not show the stratified folds information in results when a folds file is specified.
  - Show the value of `use_folds_file_for_grid_search` in results when appropriate.
  - Show grid search related information in results only when we are actually doing grid search.
- The Travis CI plan was broken up into multiple jobs in order to get around the 50 minute limit (issue 385, PR 387).
- For the conda package, some of the dependencies are now sourced from the `conda-forge` channel.

Bugfixes
- Fix the bug that was causing the inner grid-search loop of a cross-validation experiment to use a single job instead of the number specified via `grid_search_jobs` (issue 363, PR 367).
- Fix unbound variable in `readers.py` (issue 340, PR 392).
- Fix bug when running a learning curve experiment via `gridmap` (issue 386, PR 390).
- Fix a mismatch between the default number of grid search folds and the default number of slots requested via `gridmap` (issue 342, PR 367).

Documentation
- Update documentation and tests for all of the above changes and new features.
- Update tutorial and installation instructions (issues 383 and 394, PR 399).
- Standardize all of the function and method docstrings to be NumPy style. Add docstrings where missing (issue 373, PR 397).

1.3

This is a major new release of SKLL.

New features
- You can now generate learning curves for multiple learners, multiple feature sets, and multiple objectives in a single experiment by using `task=learning_curve` in the configuration file. See [documentation](http://skll.readthedocs.io/en/latest/run_experiment.html#learning-curve) for more details (issue 221, PR 332).
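
To make the shape of such an experiment more concrete, here is a minimal sketch that assembles a learning-curve configuration for several learners, feature sets, and objectives. Only `task=learning_curve` is taken from these notes. The helper name `learning_curve_config`, the section layout, the other field names, and all values are assumptions for illustration and may not match the configuration format of this SKLL version exactly.

```python
import configparser
import json


def learning_curve_config(learners, featuresets, objectives):
    """Build a hypothetical learning-curve experiment configuration."""
    config = configparser.ConfigParser()
    config["General"] = {"experiment_name": "lc_demo", "task": "learning_curve"}
    config["Input"] = {
        "train_directory": "train",  # assumed path
        # List-valued fields are stored as JSON-style strings.
        "featuresets": json.dumps(featuresets),
        "learners": json.dumps(learners),
    }
    config["Tuning"] = {"objectives": json.dumps(objectives)}
    config["Output"] = {"results": "output"}
    return config


config = learning_curve_config(
    learners=["LogisticRegression", "LinearSVC"],
    featuresets=[["family"], ["family", "vitals"]],
    objectives=["accuracy", "f1_score_macro"],
)

# One curve would be expected per (learner, feature set, objective) combination.
n_curves = (
    len(json.loads(config["Input"]["learners"]))
    * len(json.loads(config["Input"]["featuresets"]))
    * len(json.loads(config["Tuning"]["objectives"]))
)
print(n_curves)
```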

Changes
- The required version of scikit-learn has been bumped up to 0.18.1 (issue 328, PR 330).
- SKLL now uses the MKL backend on macOS/Linux instead of OpenBLAS when used as a `conda` package.

Bugfixes
- Fix deprecation warning when using `Learner.model_params()` (issue 325, PR 329).
- Update the definitions of SKLL F1 metrics as a result of scikit-learn upgrade (issue 325, PR 330).
- Bring documentation for SVC parameter grids up to date with the code (issue 334, PR 337).
- Update documentation to make it clear that the SKLL `conda` package is only available for Python 3.4. For other Python versions, users should use `pip`.

1.2.1

This is primarily a bug fix release but also adds a major new API feature.

New API Feature:
- If you use the SKLL API, you can now create `FeatureSet` instances _directly_ from `pandas` data frames (issue 261, PR 292).

Bugfixes:
- Correctly parse floats in scientific notation, e.g., when specifying parameter grids and/or fixed parameters (issue 318, PR 320).
- `print_model_weights` now correctly handles models trained with `fit_intercept=False` (issue 322, PR 323).
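
The scientific-notation fix above concerns values such as `1e-3` that end up as strings rather than floats after parsing. How SKLL resolves this is not described in these notes. The sketch below only shows one generic way to coerce such strings inside a parsed parameter grid; the helper `coerce_scientific`, the regular expression, and the sample grid are all hypothetical.

```python
import ast
import re

# Matches numbers such as 1e-3, 2.5E+4 or -1e5 (an exponent is required).
SCI_FLOAT = re.compile(r"[+-]?(\d+\.?\d*|\.\d+)[eE][+-]?\d+")


def coerce_scientific(value):
    """Recursively convert strings in scientific notation to floats."""
    if isinstance(value, str) and SCI_FLOAT.fullmatch(value.strip()):
        return float(value)
    if isinstance(value, list):
        return [coerce_scientific(item) for item in value]
    if isinstance(value, dict):
        return {key: coerce_scientific(item) for key, item in value.items()}
    return value  # numbers and ordinary strings pass through unchanged


# A parameter grid in which two values were read as strings.
raw_grid = "{'C': ['1e-3', '1e-2', 1.0], 'kernel': ['linear']}"
grid = coerce_scientific(ast.literal_eval(raw_grid))
print(grid)
```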

1.2

This release includes major changes as well as a number of bugfixes.

Changes:
- The required version of scikit-learn has been bumped up to 0.17.1 (issue 273, PRs 288 and 308)
- You can now optionally save cross-validation folds to a file for later analysis (issue 259, PR 262)
- Update documentation to be clear about when two `FeatureSet` instances are deemed equal (issue 272, PR 294)
- You can now specify multiple objective functions for parameter tuning (issue 115, PR 291)

Bugfixes:
- Use a fixed random state when doing non-stratified k-fold cross-validation (issue 247, PR 286)
- Fix errors when reusing relative paths in output section (issue 252, PR 287)
- `print_model_weights` now works correctly for multi-class logistic regression models (issue 274, PR 267)
- Correctly raise an `IOError` if the config file is not correctly specified (issue 275, PR 281)
- The `evaluate` task does not crash when the test data has labels that were not seen in training data (issue 279, PR 290)
- The `fit()` method for rescaled versions of learners now works correctly when not doing grid search (issue 304, PR 306)
- Fix minor typos in the documentation and tutorial.
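
The fixed random state mentioned in the first bugfix matters because shuffled, non-stratified folds otherwise change from run to run. The following standard-library sketch illustrates the idea only and is not SKLL's implementation; the function `kfold_indices` and the seed value are hypothetical.

```python
import random


def kfold_indices(n_examples, n_folds, seed):
    """Shuffle example indices with a fixed seed and deal them into folds."""
    indices = list(range(n_examples))
    # A dedicated Random instance keeps the shuffle independent of global state.
    random.Random(seed).shuffle(indices)
    # Fold i takes every n_folds-th shuffled index starting at position i.
    return [sorted(indices[i::n_folds]) for i in range(n_folds)]


first = kfold_indices(10, 3, seed=42)
second = kfold_indices(10, 3, seed=42)

# The same seed yields identical folds, so results are reproducible.
print(first == second)
```

Writing folds like these to a file is what makes them available for later analysis, as in the optional saving of cross-validation folds listed under Changes.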

1.1.1

This is a minor bugfix release. It fixes:
- Issue where a `FileExistsError` would be raised when processing many configs (PR 260)
- Instance of `cv_folds` instead of `num_cv_folds` in the documentation (PR 248).
- Crash with `print_model_weights` and Logistic Regression models without intercepts (issue 250, PR 251)
- Division by zero error when there was only one example (issue 253, PR 254)
