Safety vulnerability ID: 51918
The information on this page was manually curated by our Cybersecurity Intelligence Team.
Autogluon 0.6.0 uses yaml.safe_load() to avoid a code execution vulnerability.
https://github.com/autogluon/autogluon/pull/1987
Latest version: 1.2
Fast and Accurate ML in 3 Lines of Code
We're happy to announce the AutoGluon 0.6 release. 0.6 contains major enhancements to Tabular, Multimodal, and Time Series
modules, along with many quality of life improvements and fixes.
As always, only load previously trained models using the same version of AutoGluon that they were originally trained on.
Loading models trained in different versions of AutoGluon is not supported.
This release contains [**230** commits from **25** contributors](https://github.com/awslabs/autogluon/graphs/contributors?from=2022-07-18&to=2022-11-15&type=c)!
See the full commit change-log here: https://github.com/awslabs/autogluon/compare/v0.5.2...v0.6.0
Special thanks to cheungdaven, suzhoum, BingzhaoZhu, liangfu, Harry-zzh, gidler, yongxinw, martinschaef,
giswqs, Jalagarto, geoalgo, lujiaying and leloykun who were first time contributors to AutoGluon this release!
Full Contributor List (ordered by of commits):
shchur, yinweisu, zhiqiangdon, Innixma, canerturkmen, gradientsky, sxjscience, FANGAreNotGnu, cheungdaven,
suzhoum, bryanyzhu, BingzhaoZhu, liangfu, Harry-zzh, Raldir, gidler, yongxinw, martinschaef, giswqs,
Jalagarto, geoalgo, lujiaying, leloykun, yiqings
This version supports Python versions 3.7 to 3.9. This is the last release that will support Python 3.7.
Changes
AutoMM
AutoGluon Multimodal (a.k.a AutoMM) supports three new features: 1) object detection, 2) named entity recognition, and 3) multimodal matching. In addition, the HPO backend of AutoGluon Multimodal has been upgraded to ray 2.0. It also supports fine-tuning billion-scale FLAN-T5-XL model on a single AWS g4.2x-large instance with improved parameter-efficient finetuning. Starting from 0.6, we recommend using autogluon.multimodal rather than autogluon.text or autogluon.vision and added deprecation warnings.
New features
- Object Detection
- Add new problem_type `"object_detection"`.
- Customers can run inference with pretrained object detection models and train their own model with three lines of code.
- Integrate with [open-mmlab/mmdetection](https://github.com/open-mmlab/mmdetection), which supports classic detection architectures like Faster RCNN, and more efficient and performant architectures like YOLOV3 and VFNet.
- See [tutorials](https://auto.gluon.ai/stable/tutorials/multimodal/object_detection/index.html) and [examples](https://github.com/awslabs/autogluon/tree/master/examples/automm/object_detection) for more detail.
- Contributors and commits: FANGAreNotGnu, bryanyzhu, zhiqiangdon, yongxinw, sxjscience (2025, 2061, 2131, 2181, 2196, 2215, 2244, 2265, 2290, 2311, 2312, 2337, 2349, 2353, 2360, 2362, 2365, 2380, 2381, 2391, 2393, 2400, 2419)
- Named Entity Recognition
- Add new problem_type `"ner"`.
- Customers can train models to extract named entities with three lines of code.
- The implementation supports any backbones in huggingface/transformer, including the recently [FLAN-T5 series](https://arxiv.org/abs/2210.11416) released by Google.
- See [tutorials](https://auto.gluon.ai/stable/tutorials/multimodal/text_prediction/ner.html) for more detail.
- Contributors and commits: cheungdaven (2183, 2232, 2220, 2282, 2295, 2301, 2337, 2346, 2361, 2372, 2394)
- Multimodal Matching
- Add new problem_type `"text_similarity"`, `"image_similarity"`, `"image_text_similarity"`.
- Users can now extract semantic embeddings with pretrained models for text-text, image-image, and text-image matching problems.
- Moreover, users can further finetune these models with relevance data.
- The semantic text embedding model can also be combined with BM25 to form a hybrid indexing solution.
- Internally, AutoGluon Multimodal implements a twin-tower architecture that is flexible in the choice of backbones for each tower. It supports image backbones in TIMM, text backbones in huggingface/transformers, and also the CLIP backbone.
- See [tutorials](https://auto.gluon.ai/stable/tutorials/multimodal/matching/index.html) for more detail.
- Contributors and commits: zhiqiangdon FANGAreNotGnu cheungdaven suzhoum sxjscience bryanyzhu (1975, 1994, 2142, 2179, 2186, 2217, 2235, 2284, 2297, 2313, 2326, 2337, 2347, 2357, 2358, 2362, 2363, 2375, 2378, 2404, 2416)
Other Enhancements
- Fix the FT-Transformer implementation and support Fastformer. BingzhaoZhu yiqings (1958, 2194, 2251, 2344, 2379)
- Support finetuning billion-scale FLAN-T5-XL in a single AWS g4.2x-large instance via improved parameter-efficient finetuning. See [tutorial](https://auto.gluon.ai/stable/tutorials/multimodal/advanced_topics/efficient_finetuning_basic.html). Raldir sxjscience (#2032, 2108, 2285, 2336, 2352)
- Upgrade multimodal HPO to use ray 2.0 and also add [new tutorial](https://auto.gluon.ai/stable/tutorials/multimodal/advanced_topics/hyperparameter_optimization.html). yinweisu suzhoum bryanyzhu (#2206, 2341)
- Further improvement on model distillation. Add [example](https://github.com/awslabs/autogluon/tree/master/examples/automm/distillation) and [tutorial](https://auto.gluon.ai/stable/tutorials/multimodal/advanced_topics/model_distillation.html). FANGAreNotGnu sxjscience (#1983, 2064, 2397)
- Revise the default presets of AutoMM for image classification problems. bryanyzhu (2351)
- Support backend=“automm” in autogluon.vision. bryanyzhu (2316)
- Add deprecated warning to autogluon.vision and autogluon.text and point the usage to autogluon.multimodal. bryanyzhu sxjscience (2268, 2315)
- Examples about [Kaggle: Feedback Prize prediction competition](https://www.kaggle.com/competitions/feedback-prize-effectiveness). We created [a solution](https://www.kaggle.com/code/mountpotatoq/autogluon-finetune-solutions) with AutoGluon Multimodal that obtained 152/1557 in the public leaderboard and 170/1557 in the private leaderboard, which is among the top 12% participants. The solution is public days before the DDL of the competition and obtained more than 3000 views. suzhoum MountPOTATO (#2129, 2168)
* Improve native inference speed. zhiqiangdon (2051, 2157, 2161, 2171)
* Other improvements, security/bug fixes. zhiqiangdon sxjscience FANGAreNotGnu, yinweisu Innixma tonyhoo martinschaef giswqs (1980, 1987, 1989, 2003, 2080, 2018, 2039, 2058, 2101, 2102, 2125, 2135, 2136, 2140, 2141, 2152, 2164, 2166, 2192, 2219, 2250, 2257, 2280, 2308, 2315, 2317, 2321, 2356, 2388, 2392, 2413, 2414, 2417, 2426)
Experimental Features
- Support 11B-scale model finetuning with DeepSpeed. Raldir (2032)
- Enable few-shot learning with 11B-scale model. Raldir (2197)
- ONNX export example of hf_text model. FANGAreNotGnu (2149)
Tabular
New features
- New experimental model `FT_TRANSFORMER`. bingzhaozhu, innixma (2085, 2379, 2389, 2410)
- You can access it via specifying the `FT_TRANSFORMER` key
in the `hyperparameters` dictionary or via `presets="experimental_best_quality"`.
- It is recommended to use GPU to train this model, but CPU training is also supported.
- If given enough training time, this model generally improves the ensemble quality.
- New experimental model compilation support via `predictor.compile_models()`. liangfu, innixma (2225, 2260, 2300)
- Currently only Random Forest and Extra Trees have compilation support.
- You will need to install extra dependencies for this to work: `pip install autogluon.tabular[all,skl2onnx]`.
- Compiling models dramatically speeds up inference time (~10x) when processing small batches of samples (<10000).
- Note that a known bug exists in the current implementation: Refitting models after compilation will fail
and cause a crash. To avoid this, ensure that `.compile_models` is called only at the very end.
- Added `predictor.clone(...)` method to allow perfectly cloning a predictor object to a new directory.
This is useful to preserve the state of a predictor prior to altering it
(such as prior to calling `.save_space`, `.distill`, `.compile_models`, or `.refit_full`. innixma (2071)
- Added simplified `num_gpus` and `num_cpus` arguments to `predictor.fit` to control total resources.
yinweisu, innixma (2263)
- Improved stability and effectiveness of HPO functionality via various refactors regarding our usage of ray.
yinweisu, innixma (1974, 1990, 2094, 2121, 2133, 2195, 2253, 2263, 2330)
- Upgraded dependency versions: XGBoost 1.7, CatBoost 1.1, Scikit-learn 1.1, Pandas 1.5, Scipy 1.9, Numpy 1.23.
innixma (2373)
- Added python version compatibility check when loading a fitted TabularPredictor.
Will now error if python versions are incompatible. innixma (2054)
- Added `fit_weighted_ensemble` argument to `predictor.fit`. This allows the user to disable the weighted ensemble.
innixma (2145)
Other Enhancements
- Improved logging clarity when using `infer_limit`. innixma (2014)
- Significantly improved HPO search space of XGBoost. innixma (2123)
- Fixed HPO crashing when tuning Random Forest, Extra Trees, or KNN. innixma (2070)
- Optimized roc_auc metric scoring speed by 7x. innixma (2318, 2331)
- Fixed bug with AutoMM Tabular model crashing if not trained last. innixma (2309)
- Refactored `Scorer` classes to be easier to use, plus added comprehensive unit tests for all metrics. innixma (2242)
- Sped up TextSpecial feature generation during preprocessing by 20% gidler (2095)
- imodels integration improvements Jalagarto (2062)
- Fix crash when calling feature importance in quantile_regression. leloykun (1977)
- Add FAQ section for missing value imputation. innixma (2076)
- Various minor fixes and cleanup innixma, yinweisu (1997, 2031, 2124, 2144, 2178, 2340, 2342, 2345, 2374)
Time Series
New features
- `TimeSeriesPredictor` now supports **static features** (a.k.a. time series metadata, static covariates) and **
time-varying covariates** (a.k.a. dynamic features or related time series). shchur canerturkmen (1986, 2238,
2276, 2287)
- AutoGluon-TimeSeries now uses **PyTorch** by default (for `DeepAR` and `SimpleFeedForward`), removing the dependency
on MXNet. canerturkmen (2074, 2205, 2279)
- New models! `AutoGluonTabular` relies on XGBoost, LightGBM and CatBoost under the hood via the `autogluon.tabular`
module. `Naive` and `SeasonalNaive` forecasters are simple methods that provide strong baselines with no increase in
training time. `TemporalFusionTransformerMXNet` brings the TFT transformer architecture to AutoGluon. shchur (2106,
2188, 2258, 2266)
- Up to 20x faster parallel and memory-efficient training for statistical (local) forecasting models like `ETS`, `ARIMA`
and `Theta`, as well as `WeightedEnsemble`. shchur canerturkmen (2001, 2033, 2040, 2067, 2072, 2073, 2180,
2293, 2305)
- Up to 3x faster training for GluonTS models with data caching. GPU training enabled by default on PyTorch models.
shchur (2323)
- More accurate validation for time series models with multi-window backtesting. shchur (2013, 2038)
- `TimeSeriesPredictor` now handles irregularly sampled time series with `ignore_index`. canerturkmen, shchur (1993,
2322)
- Improved and extended presets for more accurate forecasting. shchur (2304)
- 15x faster and more robust forecast evaluation with updates to `TimeSeriesEvaluator` shchur (2147, 2150)
- Enabled Ray Tune backend for hyperparameter optimization of time series models. shchur (2167, 2203)
More tutorials and examples
Improved documentation and new tutorials:
- Updated [Quickstart tutorial](https://auto.gluon.ai/stable/tutorials/timeseries/forecasting-quickstart.html)
- New! [In-depth tutorial](https://auto.gluon.ai/stable/tutorials/timeseries/forecasting-indepth.html)
- New! [Overview of available models and hyperparameters](https://auto.gluon.ai/stable/tutorials/timeseries/forecasting-model-zoo.html)
- Updated [API documentation](https://auto.gluon.ai/stable/api/autogluon.predictor.html#module-5)
shchur (2120, 2127, 2146, 2174, 2187, 2354)
Miscellaneous
shchur
- Deprecate passing quantile_levels to TimeSeriesPredictor.predict (2277)
- Use static features in GluonTS forecasting models (2238)
- Make sure that time series splitter doesn't trim training series shorter than prediction_length + 1 (2099)
- Fix hyperparameter overloading in HPO for time series models (2189)
- Clean up the TimeSeriesDataFrame public API (2105)
- Fix item order in GluonTS models predictions (2092)
- Implement hash_ts_dataframe_items (2060)
- Speed up TimeSeriesDataFrame.slice_by_timestep (2020)
- Various backend enhancements / refactoring / cleanup (2314, 2294, 2292, 2278, 1985)
canerturkmen
- Increase the number of samples used by DeepAR at prediction time (2291)
- revise timeseries presets to minimum context length of 10 (2065)
- Fix timeseries daily frequency inferred period (2100)
- Various backend enhancements / refactoring / cleanup (2286, 2302, 2240, 2093, 2098, 2044)
Scan your Python project for dependency vulnerabilities in two minutes
Scan your application