We’re excited to announce the release of **Optimum v1.24.0**. This update expands ONNX-based model capabilities and includes several improvements, bug fixes, and new contributions from the community.
:rocket: New Features & Enhancements
- `ORTQuantizer` now supports models with ONNX subfolders.
- **ONNX Runtime IO Binding support** for all supported Transformers models (no models left behind).
- **SD3 and Flux model support** added to `ORTDiffusionPipeline`, enabling the latest diffusion models.
- **Transformers v4.47 and v4.48 compatibility**, so Optimum works with the latest Transformers releases.
- **ONNX export support** extended to more architectures, including Decision Transformer, ModernBERT, Megatron-BERT, DINOv2, OLMo, and others (see the full list under "What's Changed").
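
As a rough illustration of the IO Binding item above, the sketch below loads a Transformers checkpoint through `ORTModelForSequenceClassification` and requests `use_io_binding` only when the CUDA execution provider is selected. The helper names `pick_provider` and `load_with_io_binding` and the model ID in the usage comment are illustrative assumptions, not part of this release. Actually loading a model requires an installed `optimum` with an ONNX Runtime backend and network access.

```python
def pick_provider(use_gpu: bool) -> str:
    """Return the ONNX Runtime execution provider name to request."""
    return "CUDAExecutionProvider" if use_gpu else "CPUExecutionProvider"


def load_with_io_binding(model_id: str, use_gpu: bool = False):
    """Export a checkpoint to ONNX on the fly and load it with ONNX Runtime.

    Hypothetical helper for illustration only. IO binding is requested only
    on GPU, where it avoids extra host/device copies of inputs and outputs.
    """
    # Imported lazily so this file can be imported without optimum installed.
    from optimum.onnxruntime import ORTModelForSequenceClassification

    return ORTModelForSequenceClassification.from_pretrained(
        model_id,
        export=True,                      # convert the PyTorch checkpoint to ONNX
        provider=pick_provider(use_gpu),  # execution provider for the session
        use_io_binding=use_gpu,           # bind tensors directly on the device
    )


# Example usage (assumed model ID, downloads weights when run):
# model = load_with_io_binding("distilbert-base-uncased-finetuned-sst-2-english")
```

The returned object is meant to be used like the corresponding Transformers model, so existing tokenization and inference code should need little or no change.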
:wrench: Key Fixes & Optimizations
- **Dropped support for Python 3.8**
- **Bug fixes** in `ModelPatcher`, SDXL refiner export, and device checks for improved reliability.
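
Because Python 3.8 is no longer supported, projects that pin Optimum may want an early, explicit check before upgrading. The snippet below is a minimal sketch; the `(3, 9)` floor is an assumption inferred from the dropped 3.8 support, and `python_is_supported` is a hypothetical helper name.

```python
import sys

# Assumed minimum version, inferred from Python 3.8 being dropped.
MIN_PYTHON = (3, 9)


def python_is_supported(version_info=sys.version_info) -> bool:
    """Return True if the (major, minor) version meets the assumed minimum."""
    return tuple(version_info[:2]) >= MIN_PYTHON


if not python_is_supported():
    # Warn instead of exiting so the check is safe to import anywhere.
    print(f"Python {sys.version_info[0]}.{sys.version_info[1]} is below the assumed minimum for Optimum v1.24.0")
```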
:busts_in_silhouette: New Contributors
A huge thank you to our first-time contributors:
- gabe-l-hart
- ra9hur
- bndos
- mlynatom
- LoSealL
- sjrl
- guangy10
- LRL-ModelCloud
- pragyandev
Your contributions make Optimum better! :tada:
For a detailed list of all changes, please check out the **[full changelog](https://github.com/huggingface/optimum/compare/v1.23.3...v1.24.0)**.
:rocket: Happy optimizing!
What's Changed
<details>
* Onnx granite by gabe-l-hart in https://github.com/huggingface/optimum/pull/2043
* Drop python 3.8 by echarlaix in https://github.com/huggingface/optimum/pull/2086
* Update Dockerfile base image by echarlaix in https://github.com/huggingface/optimum/pull/2089
* add transformers 4.36 tests by echarlaix in https://github.com/huggingface/optimum/pull/2085
* [`fix`] Allow ORTQuantizer over models with subfolder ONNX files by tomaarsen in https://github.com/huggingface/optimum/pull/2094
* SD3 and Flux support by IlyasMoutawwakil in https://github.com/huggingface/optimum/pull/2073
* Remove datasets as required dependency by echarlaix in https://github.com/huggingface/optimum/pull/2087
* Add ONNX Support for Decision Transformer Model by ra9hur in https://github.com/huggingface/optimum/pull/2038
* Generate guidance for flux by IlyasMoutawwakil in https://github.com/huggingface/optimum/pull/2104
* Unbundle inputs generated by `DummyTimestepInputGenerator` by JingyaHuang in https://github.com/huggingface/optimum/pull/2107
* Pass the revision to SentenceTransformer models by bndos in https://github.com/huggingface/optimum/pull/2105
* Rembert onnx support by mlynatom in https://github.com/huggingface/optimum/pull/2108
* fix bug `ModelPatcher` returns empty outputs by LoSealL in https://github.com/huggingface/optimum/pull/2109
* Fix workflow to mark issues as stale by echarlaix in https://github.com/huggingface/optimum/pull/2110
* Remove doc-build by echarlaix in https://github.com/huggingface/optimum/pull/2111
* Downgrade stale bot to v8 and fix permissions by echarlaix in https://github.com/huggingface/optimum/pull/2112
* Update documentation color from google tpu section by echarlaix in https://github.com/huggingface/optimum/pull/2113
* Fix workflow to mark PRs as stale by echarlaix in https://github.com/huggingface/optimum/pull/2116
* Enable transformers v4.47 support by echarlaix in https://github.com/huggingface/optimum/pull/2119
* Add ONNX export support for MGP-STR by xenova in https://github.com/huggingface/optimum/pull/2099
* Add ONNX export support for OLMo and OLMo2 by xenova in https://github.com/huggingface/optimum/pull/2121
* Pass on `model_kwargs` when exporting a SentenceTransformers model by sjrl in https://github.com/huggingface/optimum/pull/2126
* Add ONNX export support for DinoV2, Hiera, Maskformer, PVT, SigLIP, SwinV2, VitMAE, and VitMSN models by xenova in https://github.com/huggingface/optimum/pull/2001
* move check_dummy_inputs_allowed to common export utils by eaidova in https://github.com/huggingface/optimum/pull/2114
* Remove CI macos runners by echarlaix in https://github.com/huggingface/optimum/pull/2129
* Enable GPTQModel by jiqing-feng in https://github.com/huggingface/optimum/pull/2064
* Skip private model loading for external contributors by echarlaix in https://github.com/huggingface/optimum/pull/2130
* fix sdxl refiner export by eaidova in https://github.com/huggingface/optimum/pull/2133
* Export to ExecuTorch: Initial Integration by guangy10 in https://github.com/huggingface/optimum/pull/2090
* Fix AutoModel can't load gptq model due to module prefix mismatch vs AutoModelForCausalLM by LRL-ModelCloud in https://github.com/huggingface/optimum/pull/2146
* Update docker files by echarlaix in https://github.com/huggingface/optimum/pull/2102
* Limit diffusers version by IlyasMoutawwakil in https://github.com/huggingface/optimum/pull/2150
* Add ONNX export support for ModernBERT by xenova in https://github.com/huggingface/optimum/pull/2131
* Allow GPTQModel to auto select Marlin or faster kernels for inference only ops by LRL-ModelCloud in https://github.com/huggingface/optimum/pull/2138
* fix device check by jiqing-feng in https://github.com/huggingface/optimum/pull/2136
* Replace check_if_xxx_greater with is_xxx_version by echarlaix in https://github.com/huggingface/optimum/pull/2152
* Add tf available and version by echarlaix in https://github.com/huggingface/optimum/pull/2154
* Add ONNX export support for `PatchTST` by xenova in https://github.com/huggingface/optimum/pull/2101
* fix infer task from model_name if model from sentence transformer by eaidova in https://github.com/huggingface/optimum/pull/2151
* Unpin diffusers and pass onnx exporters tests by IlyasMoutawwakil in https://github.com/huggingface/optimum/pull/2153
* Uncomment modernbert config by IlyasMoutawwakil in https://github.com/huggingface/optimum/pull/2155
* Skip optimum-benchmark when loading namespace modules by IlyasMoutawwakil in https://github.com/huggingface/optimum/pull/2159
* Fix PR doc upload by regisss in https://github.com/huggingface/optimum/pull/2161
* Move executorch to optimum-executorch by echarlaix in https://github.com/huggingface/optimum/pull/2165
* Adding Onnx Support For Megatron-Bert by pragyandev in https://github.com/huggingface/optimum/pull/2169
* Transformers 4.48 by IlyasMoutawwakil in https://github.com/huggingface/optimum/pull/2158
* Update ort CIs (slow, gpu, train) by IlyasMoutawwakil in https://github.com/huggingface/optimum/pull/2024
</details>