AutoGluon


0.2.0

v0.2.0 introduces numerous optimizations that reduce Tabular average inference time by **4x** and average disk usage by **10x** compared to v0.1.0, as well as a refactored ImagePredictor API to better align with the other tasks and a **20x** inference speedup in Vision tasks. This release contains **42** commits from **9** contributors.

This release is non-breaking when upgrading from v0.1.0, with four exceptions:
1. `ImagePredictor.predict` and `ImagePredictor.predict_proba` have [different output formats](https://auto.gluon.ai/0.2.0/tutorials/image_prediction/beginner.html#predict-on-a-new-image).
2. [`TabularPredictor.evaluate`](https://auto.gluon.ai/0.2.0/api/autogluon.task.html#autogluon.tabular.TabularPredictor.evaluate) and [`TabularPredictor.evaluate_predictions`](https://auto.gluon.ai/0.2.0/api/autogluon.task.html#autogluon.tabular.TabularPredictor.evaluate_predictions) have [different output formats](https://auto.gluon.ai/0.2.0/tutorials/tabular_prediction/tabular-quickstart.html).
3. Custom dictionary inputs to [`TabularPredictor.fit`](https://auto.gluon.ai/0.2.0/api/autogluon.task.html#autogluon.tabular.TabularPredictor.fit)'s `hyperparameter_tune_kwargs` argument now have a [different format](https://github.com/awslabs/autogluon/pull/1002).
4. Models trained in v0.1.0 should only be loaded with v0.1.0. Loading models trained in different versions of AutoGluon is not supported.

See the full commit change-log here: https://github.com/awslabs/autogluon/compare/v0.1.0...v0.2.0

Thanks to the [**9 contributors**](https://github.com/awslabs/autogluon/graphs/contributors?from=2021-02-27&to=2021-04-27&type=c) that contributed to the v0.2.0 release!

Special thanks to the 3 first-time contributors: taesup-aws, ValerioPerrone, lukemorrill!

Full Contributor List (ordered by number of commits):

Innixma, zhreshold, gradientsky, jwmueller, mseeger, sxjscience, taesup-aws, ValerioPerrone, lukemorrill

Major Changes

Tabular

- Reduced overall inference time on `best_quality` preset by **4x** (and **2x** on others). innixma, gradientsky
- Reduced overall disk usage on `best_quality` preset by **10x**. innixma
- Reduced training time and inference time of K-Nearest-Neighbor models by **250x**, and reduced disk usage by **10x**, via:
  - Efficient out-of-fold implementation (10x training & inference speedup, 10x reduced disk usage) on the `best_quality` preset. innixma (1022)
  - [Experimental] Integration of the [scikit-learn-intelex](https://intel.github.io/scikit-learn-intelex/) package (25x training & inference speedup). innixma (#1049)
    - This is currently not installed by default. Try it via `pip install autogluon.tabular[all,skex]` or `pip install "scikit-learn-intelex<2021.3"`. Once installed, AutoGluon will automatically use it.
- Reduced training time, inference time, and disk usage of RandomForest and ExtraTrees models by **10x** via efficient out-of-fold implementation. innixma (1066, 1082)
- Reduced training time by 30% and inference time by 75% on the FastAI neural network model. gradientsky (977)
- Added `quantile` as a new `problem_type` to support quantile regression problems. taesup-aws, jwmueller (1005, 1040)
  - Try it out with the [quantile regression example script](https://github.com/awslabs/autogluon/blob/master/examples/tabular/example_quantile_regression.py) or the sketch after this list!
- [Experimental] Added GPU-accelerated RandomForest, K-Nearest-Neighbors, and Linear models via integration with [NVIDIA RAPIDS](https://rapids.ai/). innixma (#995, 997, 1000)
  - This is not enabled by default. Try it out by first [installing RAPIDS](https://rapids.ai/start.html) and then installing AutoGluon.
  - Currently, these models must be explicitly passed in `.fit`'s `hyperparameters` argument. Refer to the Kaggle kernel below or the [official RAPIDS AutoGluon example](https://github.com/rapidsai/cloud-ml-examples/tree/main/aws/autogluon).
  - See how AutoGluon + RAPIDS reaches the top 1% in the Otto Kaggle competition with this [interactive Kaggle kernel](https://www.kaggle.com/innixma/autogluon-rapids-top-1)!
- [Experimental] Added the option to specify early stopping rounds for the LightGBM, CatBoost, and XGBoost models via a new model parameter, `ag.early_stop`. innixma (1037)
  - Try it out via `hyperparameters={'XGB': {'ag.early_stop': 500}}`.
  - The API for this may change in future releases as we optimize how AutoGluon uses early stopping.
- [Experimental] Added adaptive early stopping to LightGBM. This attempts to choose when to stop training more intelligently than a fixed early stopping rounds value. innixma (1042)
- Re-ordered model training priority to perform better when `time_limit` is small. For `time_limit=3600` on datasets with over 100,000 rows, v0.2.0 has a **65%** win-rate over v0.1.0. innixma (1059, 1084)
- Adjusted time allocation to stack layers when performing multi-layer stacking to allow for longer training on earlier layers. innixma (1075)
- Updated CatBoost to v0.25. innixma (1064)
- Added `extra_metrics` argument to [`.leaderboard`](https://auto.gluon.ai/0.2.0/api/autogluon.task.html#autogluon.tabular.TabularPredictor.leaderboard). innixma (1058)
- Added feature group importance support to [`.feature_importance`](https://auto.gluon.ai/0.2.0/api/autogluon.task.html#autogluon.tabular.TabularPredictor.feature_importance). innixma (989)
  - Users can now get the combined importance of a group of features:
  - `predictor.feature_importance(test_data, features=['A', 'B', 'C', ('AB', ['A', 'B'])])`
- **[BREAKING]** Refactored [`.evaluate`](https://auto.gluon.ai/0.2.0/api/autogluon.task.html#autogluon.tabular.TabularPredictor.evaluate) and [`.evaluate_predictions`](https://auto.gluon.ai/0.2.0/api/autogluon.task.html#autogluon.tabular.TabularPredictor.evaluate_predictions) to be easier to use and to share the same code logic. innixma (1080)
  - The output type has changed, and the sign of the metric score has been flipped in some circumstances.
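Several of the new Tabular options above can be combined in a few lines. The following is a minimal sketch, not an official example: the file paths, column names, and chosen metrics are placeholders, and the `quantile_levels` argument is an assumption based on the quantile regression example script linked above.

```python
from autogluon.tabular import TabularDataset, TabularPredictor

train_data = TabularDataset('train.csv')  # placeholder dataset with a 'label' column
test_data = TabularDataset('test.csv')

# Experimental per-model early stopping rounds via the new `ag.early_stop` parameter.
predictor = TabularPredictor(label='label').fit(
    train_data,
    hyperparameters={'XGB': {'ag.early_stop': 500}},
)

# `extra_metrics` adds extra score columns to the leaderboard next to the usual eval_metric.
predictor.leaderboard(test_data, extra_metrics=['accuracy', 'balanced_accuracy'])

# Feature group importance: the ('AB', ['A', 'B']) entry reports the combined
# importance of (placeholder) features 'A' and 'B' under the name 'AB'.
predictor.feature_importance(test_data, features=['A', 'B', 'C', ('AB', ['A', 'B'])])

# Quantile regression via the new `quantile` problem_type (requires a numeric label;
# the `quantile_levels` argument shown here is assumed).
quantile_predictor = TabularPredictor(
    label='label',
    problem_type='quantile',
    quantile_levels=[0.1, 0.5, 0.9],
).fit(train_data)
quantile_predictor.predict(test_data)  # one column of predictions per quantile level
```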

Vision

- Reduced inference time by **20x** via various optimizations in inference batching. zhreshold
- Fixed an issue where models trained on GPU could not be loaded on CPU-only machines. zhreshold
- Improved model fitting performance by up to 10% for ObjectDetector when `presets` is empty. zhreshold
- **[BREAKING]** Refactored the `predict` and `predict_proba` methods in `ImagePredictor` to have the same output formats as `TabularPredictor` and `TextPredictor`. zhreshold (1044)
  - Users upgrading from v0.1.0 who relied on the old `predict` and `predict_proba` outputs must update their code to the new formats.
- Added improved support for CSV and pandas DataFrame input to `ImagePredictor`. zhreshold (1010)
  - See our new [data preparation tutorial](https://auto.gluon.ai/0.2.0/tutorials/image_prediction/dataset.html) to give it a try!
- Added early stopping strategies that significantly improve training efficiency. zhreshold (1039)

General

- [Experimental] Added a new hyperparameter tuning method: constrained Bayesian optimization. ValerioPerrone (1034)
- General HPO code improvement / cleanup. mseeger, gradientsky (971, 1002, 1050)
- Fixed ENAS issue when passing in custom datasets. lukemorrill (1015)
- Fixed incorrect dependency link between `autogluon.mxnet` and `autogluon.extra` causing crash on import. innixma (1032)
- Various minor updates and fixes. innixma, jwmueller, zhreshold, sxjscience (990, 996, 998, 1007, 1035, 1052, 1055, 1057, 1072, 1081, 1088)

0.1.0

v0.1.0 is our largest release yet, containing **173** commits from **20** contributors over the course of 5 months.

**This release is API breaking** from past releases, as AutoGluon is now a namespace package. Please refer to our [documentation](https://auto.gluon.ai/stable/index.html) for using v0.1.0. New GitHub issues based on versions earlier than v0.1.0 will not be addressed, and we recommend that all users upgrade to v0.1.0 as soon as possible.

See the full commit change-log here: https://github.com/awslabs/autogluon/compare/v0.0.15...v0.1.0

Try it out yourself in 5 minutes with our [Colab Tutorial](https://colab.research.google.com/drive/1oT3zNsj9et8s1bJNx7VeHOn_mfpBoe4q?usp=sharing).

Special thanks to the [**20 contributors**](https://github.com/awslabs/autogluon/graphs/contributors?from=2020-10-21&to=2021-03-01&type=c) that contributed to the v0.1.0 release! Contributor List:

innixma, gradientsky, sxjscience, jwmueller, zhreshold, mseeger, daikikatsuragawa, Chudbrochil, adrienatallah, jonashaag, songqiang, larroy, sackoh, muhyun, rschmucker, aaronkl, kaixinbaba, sflender, jojo19893, mak-454

Major Changes

General

- macOS is now fully supported.
- Windows is now experimentally supported. Installation instructions for Windows are still in progress.
- Python 3.8 is now supported.
- Overhauled API. APIs between TabularPredictor, TextPredictor, and ImagePredictor are now much more consistent. innixma, sxjscience, zhreshold, jwmueller, gradientsky
- Updated AutoGluon to a namespace package; individual modules can now be installed separately to improve flexibility. For example, to install only the HPO-related functionality, you can get a minimal install via `pip install autogluon.core`. For a full list of available submodules, see this [link](https://pypi.org/user/innixma/). gradientsky (#694)
- Significantly improved robustness of HPO scheduling to avoid errors for users. mseeger, gradientsky, rschmucker, innixma (713, 735, 750, 754, 824, 920, 924)
- mxnet is no longer a required dependency in AutoGluon. mseeger (726)
- Various dependency version upgrades.

Tabular

- Major API refactor. innixma (768, 855, 869)
- Multimodal Tabular + Text support ([Tutorial](https://auto.gluon.ai/stable/tutorials/tabular_prediction/tabular-multimodal-text-others.html)). Now Tabular can train a multi-modal Tabular + Text transformer model alongside its standard models, and achieve state-of-the-art results on multi-modal tabular + text datasets with 3 lines of code. sxjscience, Innixma (#740, 752, 756, 770, 776, 794, 802, 848, 852, 867, 869, 871, 877)
- GPU support for LightGBM, CatBoost, XGBoost, MXNet neural network, and FastAI neural network models. Specify `ag_args_fit={'num_gpus': 1}` in `TabularPredictor.fit()` to enable. innixma (896)
- `sample_weight` support. Tabular can now handle user-defined sample weights for imbalanced datasets (see the sketch after this list). jwmueller (942, 962)
- Multi-label prediction support ([Tutorial](https://auto.gluon.ai/stable/tutorials/tabular_prediction/tabular-multilabel.html)). Tabular can now predict across multiple label columns. jwmueller (#953)
- Added student model ensembling in model distillation. innixma (937)
- Generally improved accuracy and robustness due to a variety of internal improvements and the addition of new models. (v0.1.0 gets a better score on over 70% of datasets in benchmarking compared to v0.0.15!)
- New model: XGBoost. sackoh (691)
- New model: FastAI Tabular Neural Network. gradientsky (742, 748, 826, 839, 842)
- New model: TextPredictorModel (Multi-modal transformer) (Requires GPU). sxjscience (770)
- New experimental model: TabTransformer (Tabular transformer model ([paper](https://arxiv.org/pdf/2012.06678.pdf))). Chudbrochil (#723)
- New experimental model: FastText. songqiang (580)
- View all available models in our documentation: https://auto.gluon.ai/stable/api/autogluon.tabular.models.html
- New advanced functionality: Extract out-of-fold predictions from a fit TabularPredictor ([docs](https://auto.gluon.ai/stable/api/autogluon.task.html#autogluon.tabular.TabularPredictor.get_oof_pred_proba)). innixma (779)
- Greatly optimized and expanded upon feature importance calculation functionality. Now `predictor.feature_importance()` returns confidence bounds on importance values. innixma (803)
- New experimental functionality: `predictor.fit_extra()` enables the fitting of additional models on top of an already fit `TabularPredictor` object ([docs](https://auto.gluon.ai/stable/api/autogluon.task.html#autogluon.tabular.TabularPredictor.fit_extra)). innixma (768)
- Per-model HPO support. Now you can specify `hyperparameter_tune_kwargs` in a model's hyperparameters via `'ag_args': {'hyperparameter_tune_kwargs': hpo_args}`. innixma (883)
- Sped up preprocessing runtimes by 100x+ on large (10M+ row) datasets by subsampling data during feature duplicate resolution. Innixma (950)
- Added [SHAP notebook tutorials](https://github.com/awslabs/autogluon/tree/master/examples/tabular/interpret). jwmueller (#720)
- Heavily optimized CatBoost inference speed during online-inference. innixma (724)
- KNN models now respect time_limit. innixma (845)
- Added stack ensemble visualization method. muhyun (786)
- Added NLP token prefiltering logic for ngram generation. sflender (907)
- Added initial support for compression of model files to reduce disk usage. adrienatallah (940, 944)
- Numerous bug fixes. innixma, jwmueller, gradientsky (many...)
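A minimal sketch combining a few of the new Tabular capabilities above (`sample_weight`, GPU training via `ag_args_fit`, out-of-fold predictions, and `fit_extra`). The file path, the `label`/`weight` column names, and the extra hyperparameters are placeholders, and passing the weight column name to the `TabularPredictor` constructor is an assumption about the `sample_weight` usage:

```python
from autogluon.tabular import TabularDataset, TabularPredictor

train_data = TabularDataset('train.csv')  # placeholder; contains 'label' and 'weight' columns

# `sample_weight` points at a user-defined weight column (useful for imbalanced data),
# and `ag_args_fit={'num_gpus': 1}` requests GPU training for models that support it.
predictor = TabularPredictor(label='label', sample_weight='weight').fit(
    train_data,
    presets='best_quality',       # bagged models are required for out-of-fold predictions
    ag_args_fit={'num_gpus': 1},
)

# Out-of-fold prediction probabilities from the bagged models (new in v0.1.0).
oof_pred_proba = predictor.get_oof_pred_proba()

# `fit_extra` trains additional models on top of the already-fit predictor.
predictor.fit_extra(hyperparameters={'GBM': {}})
```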

Text

- Major API refactor. sxjscience (876, 936, 972, 975)
- Support multi-GPU inference. sxjscience (873)
- Greatly improved user time_limit adherence. innixma (877)
- Fixed bug in model deserialization. jojo19893 (708)
- Numerous bug fixes. sxjscience (836, 847, 850, 861, 865, 963, 980)

Vision

- Major API refactor. zhreshold (733, 828, 882, 930, 946)
- Greatly improved user time_limit adherence. zhreshold

0.0.15

Changes

- Restricted gluoncv install version to <0.9.0 to fix install issues related to namespace collisions (811).

0.0.14

Changes

Tabular

- Complete overhaul of feature generation, with major improvements to flexibility, speed, memory usage, and stability Innixma (584, 661).
- Revamped tabular tutorials jwmueller (636).
- Added fastai neural network tabular model (not used by default: requires Torch) gradientsky (627).
- Added LightGBM Extra Trees (LightGBM_XT) model Innixma (681).
- Updated model training priority for multiclass problems; neural networks now train ahead of tree models Innixma (676).
- Added `.persist_models()` and `.unpersist_models()` methods to `TabularPredictor` (see the sketch after this list) Innixma (640).
- Improved neural network training time jwmueller (598).
- Added example for chunked inference daveharmon (634).
- Improved memory stability on large datasets Innixma (644).
- Reduced maximum memory usage of predictor.leaderboard() Innixma (648).
- Updated LightGBM to v3.x, resulting in ~2x speedup in most cases Innixma (662).
- Updated CatBoost to v0.24.x Innixma (664).
- Updated scikit-learn to <0.24 (from <0.23) Innixma (671).
- Updated pandas version to >=1.0 (from <1.0) Innixma (670).
- Added GPU support for CatBoost Innixma (682).
- Code cleanup Innixma (645, 665, 677, 680, 689).
- Bug Fixes Innixma, gradientsky, jwmueller (643, 666, 678, 688).
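A rough sketch of the new model persistence methods, using the pre-0.1.0 task-style API; the file paths and label name are placeholders, and the exact legacy call signatures should be checked against the 0.0.14 documentation:

```python
from autogluon import TabularPrediction as task

train_data = task.Dataset(file_path='train.csv')  # placeholder path and label column
predictor = task.fit(train_data=train_data, label='label')

# Keep the trained models in memory so they are not reloaded from disk on every
# predict() call (useful for low-latency online inference).
predictor.persist_models()
predictions = predictor.predict(task.Dataset(file_path='test.csv'))

# Release the persisted models from memory when they are no longer needed.
predictor.unpersist_models()
```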

Text

- Bug Fixes sxjscience (651, 653).

General

- Upgraded to mxnet 1.7 (from 1.6) sxjscience (650).
- Updated all absolute imports to relative imports Innixma (637).
- Documentation Improvements aaronkl, rdimaio, jwmueller (638, 639, 679).
- Code cleanup tirkarthi (660).
- Bug Fixes Innixma, aaronkl (674, 686).

0.0.13

Changes

Tabular

- Added model distillation jwmueller (547).
- Added FAISS KNN model brc7 (557).
- Refactored Feature Generation (Part 1) Innixma (578).
- Added extra_info argument to predictor.leaderboard Innixma (605).
- Optimized out-of-fold feature memory usage by 50% Innixma (588).
- Added confusion matrix to predictor.evaluate_predictions() output alan-aipe (571).
- Improved output directory generation robustness songqiang (620).
- Improved stability on large datasets by reducing maximum memory usage ratio of RF, XT, and KNN models Innixma (630).

Text

- Added TextPrediction Task sxjscience (556).

General

- Added mxnet 1.7 support sxjscience (546).
- Numerous bug fixes Innixma, jwmueller, sxjscience, zhreshold, yongzhengqi (559, 568, 577, 590, 592, 597, 600, 604, 621, 625, 629).
- Documentation improvements jwmueller, sxjscience, songqiang, Bharat123rox (554, 561, 585, 609, 628, 631).

0.0.12

Changes

General

- Removed gluonnlp from the required dependencies; gluonnlp can now be installed as an optional dependency to enable the text module (512).
- Documentation improvements (503, 529, 549).

Tabular

- Added custom model support (551).
- Added support for passing test data without the label column as the `tuning_data` argument in `TabularPrediction.fit()` to improve data preprocessing and final predictive accuracy on the test data (551).
- Fixed major defect added in 0.0.11 which caused the Tabular neural network model to crash during training when categorical features with many possible values were present (542).
- Disabled usage of text ngram features in KNN models to dramatically improve inference speed on NLP problems (531).
- Added a `fit_weighted_ensemble()` function to the `TabularPredictor` class. Users can now train additional weighted ensembles post-fit using any subset of the existing trained models (see the sketch after this list) (550).
- Added `AG_args_fit` argument to enable advanced model training control such as per-model time limit and memory usage (531).
- Added `excluded_model_types` argument to `TabularPrediction.fit()` to enable simplified removal of model types without editing the `hyperparameters` argument (543).
- Added version check when loading a predictor, will log a warning if the predictor was trained on a different version of AutoGluon (536).
- Improved support for GPU on CatBoost (527).
- Moved CatBoost to lazy import to enable running Tabular without installing CatBoost (534).
- Added support for training models with no features, in order to get a best guess prediction based only on the average label value (537).
- Major refactor of internal `feature_types_metadata` object and `AutoFeatureGenerator` (548).
- Major refactor of internal variable names (551).
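A rough sketch of how a few of these additions fit together, using the pre-0.1.0 task-style API; the paths, label name, and excluded model type key are placeholders and should be checked against the 0.0.12 documentation:

```python
from autogluon import TabularPrediction as task

train_data = task.Dataset(file_path='train.csv')  # placeholder paths and label column
test_data = task.Dataset(file_path='test.csv')    # label column may be absent

predictor = task.fit(
    train_data=train_data,
    label='label',
    tuning_data=test_data,            # unlabeled test data to guide preprocessing
    excluded_model_types=['KNN'],     # drop a model type without editing `hyperparameters`
)

# Train additional weighted ensembles over the already-trained models post-fit.
predictor.fit_weighted_ensemble()
```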

Core

- Minor scheduler cleanup (523, 540).
