Optimum


1.8.4

* Set onnx requirement by echarlaix and regisss in https://github.com/huggingface/optimum/pull/1037

**Full Changelog**: https://github.com/huggingface/optimum/compare/v1.8.3...v1.8.4

1.8.3

* Fix Stable Diffusion model ONNX export by echarlaix in https://github.com/huggingface/optimum/pull/1020
* Add `optimum-neuron` extra by michaelbenayoun in https://github.com/huggingface/optimum/pull/1021

**Full Changelog**: https://github.com/huggingface/optimum/compare/v1.8.2...v1.8.3

1.8.2

Extended BetterTransformer support

Various improvements in the PyTorch BetterTransformer integration.
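For context, opting into these kernels remains a one-liner; a minimal sketch (the checkpoint below is just an example):

```python
from transformers import AutoModel
from optimum.bettertransformer import BetterTransformer

# Load any supported transformers model (bert-base-uncased is only an example).
model = AutoModel.from_pretrained("bert-base-uncased")

# Swap the supported layers for their BetterTransformer equivalents.
model = BetterTransformer.transform(model, keep_original_model=False)
```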

* [BT] add `BetterTransformer` support for ProphetNet by hirotasoshu in https://github.com/huggingface/optimum/pull/923
* Improve bettertransformer benchmark script by fxmarty in https://github.com/huggingface/optimum/pull/939
* Fix sdpa with batch size = 1, better benchmark by fxmarty in https://github.com/huggingface/optimum/pull/915
* Fix slow tests & sdpa dropout by fxmarty in https://github.com/huggingface/optimum/pull/974
* Remove getattr overhead in spda by fxmarty in https://github.com/huggingface/optimum/pull/934
* [`BT`] Improve docs by younesbelkada in https://github.com/huggingface/optimum/pull/944

ONNX merged seq2seq models

Instead of using two separate `decoder_model.onnx` and `decoder_with_past_model.onnx` models, a single `decoder_model_merged.onnx` decoder can now be used for encoder-decoder models. This avoids duplicating the weights shared between the without-past and with-past ONNX decoders.

By default, if available, `decoder_model_merged.onnx` is used in the ORTModel integration. This can be disabled with the `--no-post-process` option in the ONNX export CLI, and with `use_merged=False` in the `ORTModel.from_pretrained` method.

Example:


```bash
optimum-cli export onnx --model t5-small t5_onnx
```


will give:


```
└── t5_onnx
    ├── config.json
    ├── decoder_model_merged.onnx
    ├── decoder_model.onnx
    ├── decoder_with_past_model.onnx
    ├── encoder_model.onnx
    ├── generation_config.json
    ├── special_tokens_map.json
    ├── spiece.model
    ├── tokenizer_config.json
    └── tokenizer.json
```


`decoder_model_merged.onnx` alone is then enough for inference. We strongly recommend inspecting the subgraphs with Netron to understand the inputs and outputs, in case the exported model is to be used with an engine other than ONNX Runtime through the Optimum integration.
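As a rough sketch of the loading side, assuming the `t5_onnx` directory produced above and that `use_merged` is accepted by `from_pretrained` in this version:

```python
from optimum.onnxruntime import ORTModelForSeq2SeqLM

# The merged decoder (decoder_model_merged.onnx) is picked up by default when present.
model = ORTModelForSeq2SeqLM.from_pretrained("t5_onnx")

# Opt out and use the separate without-past / with-past decoders instead.
model_unmerged = ORTModelForSeq2SeqLM.from_pretrained("t5_onnx", use_merged=False)
```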

* Fix encoder-decoder ONNX merge by fxmarty in https://github.com/huggingface/optimum/pull/924
* Support the merge of decoder without/with past for encoder-decoder models in the ONNX export by fxmarty in https://github.com/huggingface/optimum/pull/926
* Support merged seq2seq models in ORTModel by fxmarty in https://github.com/huggingface/optimum/pull/930

New models in the ONNX export

* Add llama onnx export & onnxruntime support by nenkoru in https://github.com/huggingface/optimum/pull/975
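As a rough illustration, the newly supported architecture can also be exported and loaded from Python; the checkpoint name below is a placeholder, and the `export=True` argument is assumed to be available in this version:

```python
from optimum.onnxruntime import ORTModelForCausalLM

# Placeholder checkpoint name; substitute a LLaMA checkpoint you have access to.
model = ORTModelForCausalLM.from_pretrained("my-org/my-llama-checkpoint", export=True)
```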

Major bugfixes

* Remove constant output in encoder-decoder ONNX models decoder with past by fxmarty in https://github.com/huggingface/optimum/pull/920
* Hash tensor data during deduplication by VikParuchuri in https://github.com/huggingface/optimum/pull/932

Potentially breaking changes

The TasksManager replaces legacy tasks names by the canonical ones used on the Hub and in [transformers metadata](https://huggingface.co/datasets/huggingface/transformers-metadata/blob/main/pipeline_tags.json):
- `sequence-classification` becomes `text-classification`,
- `causal-lm` becomes `text-generation`,
- `seq2seq-lm` becomes `text2text-generation`,
- `speech2seq-lm` and `audio-ctc` become `automatic-speech-recognition`,
- `default` becomes `feature-extraction`,
- `masked-lm` becomes `fill-mask`,
- `vision2seq-lm` becomes `image-to-text`.

This should not break anything except if you rely on private methods and attributes from `TasksManager`.
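For illustration, the canonical names are what is now passed as the task when exporting; a rough sketch, assuming the programmatic `main_export` helper:

```python
from optimum.exporters.onnx import main_export

# "text-generation" replaces the legacy "causal-lm" task name.
main_export("gpt2", output="gpt2_onnx", task="text-generation")
```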

* Allow to use a custom class in TasksManager & use canonical tasks names by fxmarty in https://github.com/huggingface/optimum/pull/967

What's Changed
* Update ort trainer to transformers 4.27.2 by JingyaHuang in https://github.com/huggingface/optimum/pull/917
* Compute Loss inside the training step. by AdamLouly in https://github.com/huggingface/optimum/pull/686
* Fix ORTModel MRO for whisper by fxmarty in https://github.com/huggingface/optimum/pull/919
* add ORTStableDiffusionPipeline reference in documentation by echarlaix in https://github.com/huggingface/optimum/pull/890
* Fix decoder ONNX model loading from the Hub by fxmarty in https://github.com/huggingface/optimum/pull/929
* `optimum-cli onnxruntime quantize / optimize` output argument is now required by michaelbenayoun in https://github.com/huggingface/optimum/pull/927
* Register mechanism for the Optimum CLI by michaelbenayoun in https://github.com/huggingface/optimum/pull/928
* Ensure backward compatibility of ORTModel by fxmarty in https://github.com/huggingface/optimum/pull/933
* Update the README by michaelbenayoun in https://github.com/huggingface/optimum/pull/925
* Update README by echarlaix in https://github.com/huggingface/optimum/pull/941
* Update readme by echarlaix in https://github.com/huggingface/optimum/pull/942
* Remove GC from README by michaelbenayoun in https://github.com/huggingface/optimum/pull/943
* Add user and token for CI by michaelbenayoun in https://github.com/huggingface/optimum/pull/945
* Update README by echarlaix in https://github.com/huggingface/optimum/pull/946
* `optimum-cli` print the help of subcommands by michaelbenayoun in https://github.com/huggingface/optimum/pull/940
* Remove from_transformers references from the documentation by fxmarty in https://github.com/huggingface/optimum/pull/935
* Turn command import into optional by JingyaHuang in https://github.com/huggingface/optimum/pull/936
* Auto-set use_merged to False if use_cache is passed as False by fxmarty in https://github.com/huggingface/optimum/pull/954
* Raise error with use_cache=False, use_io_binding=True by fxmarty in https://github.com/huggingface/optimum/pull/955
* Add an ORT training notebook by JingyaHuang in https://github.com/huggingface/optimum/pull/959
* Fix issue with doc build sometimes failing silently in GH workflows by regisss in https://github.com/huggingface/optimum/pull/960
* Fix typos by regisss in https://github.com/huggingface/optimum/pull/963
* Disable tests upon transformers 4.28 release by fxmarty in https://github.com/huggingface/optimum/pull/976

New Contributors
* hirotasoshu made their first contribution in https://github.com/huggingface/optimum/pull/923
* VikParuchuri made their first contribution in https://github.com/huggingface/optimum/pull/932

**Full Changelog**: https://github.com/huggingface/optimum/compare/v1.7.3...v1.8.2

1.8.1

* Fix OpenVINO Trainer for transformers >= v4.29.0 by echarlaix in https://github.com/huggingface/optimum-intel/pull/328

**Full Changelog**: https://github.com/huggingface/optimum-intel/compare/v1.8.0...v1.8.1

1.8

This release is fully compatible with SynapseAI 1.8.0, which is the latest version. Check out Habana's [documentation](https://docs.habana.ai/en/v1.8.0/) for more information about the new features.


DeepSpeed's gradient checkpointing

DeepSpeed's gradient checkpointing is now automatically used when setting `gradient_checkpointing=True` in a DeepSpeed run.

- Enable DeepSpeed activation checkpointing #142
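A minimal sketch of the user-facing side, assuming a typical optimum-habana training setup; the Gaudi config name and DeepSpeed config path below are placeholders:

```python
from optimum.habana import GaudiTrainingArguments

# With gradient_checkpointing=True in a DeepSpeed run, DeepSpeed's
# activation checkpointing is now used automatically.
training_args = GaudiTrainingArguments(
    output_dir="./out",
    use_habana=True,
    use_lazy_mode=True,
    gaudi_config_name="Habana/bert-base-uncased",  # placeholder Gaudi config
    deepspeed="ds_config.json",                    # placeholder DeepSpeed config path
    gradient_checkpointing=True,
)
```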

1.8.0

Optimum INC CLI

Integration of Intel Neural Compressor dynamic quantization into the Optimum command-line interface. Example commands:

```bash
optimum-cli inc --help
optimum-cli inc quantize --help
optimum-cli inc quantize --model distilbert-base-cased-distilled-squad --output int8_distilbert/
```
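The quantized model written to `int8_distilbert/` can then be reloaded from Python; a rough sketch, assuming the `INCModelForQuestionAnswering` class is exposed by your optimum-intel version:

```python
from optimum.intel import INCModelForQuestionAnswering

# Load the dynamically quantized model produced by the CLI command above.
model = INCModelForQuestionAnswering.from_pretrained("int8_distilbert/")
```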

* Add Optimum INC CLI to apply dynamic quantization by echarlaix in https://github.com/huggingface/optimum-intel/pull/280

Leverage past key values for OpenVINO decoder models

Adds the possibility to use the pre-computed key/values to make inference faster. This is enabled by default when exporting the model.

```python
from optimum.intel import OVModelForCausalLM

# model_id is any supported causal language model checkpoint on the Hub.
model = OVModelForCausalLM.from_pretrained(model_id, export=True)
```
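For completeness, a short generation sketch with the cached model; the tokenizer loading and prompt below are illustrative:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained(model_id)
inputs = tokenizer("The weather in Paris is", return_tensors="pt")

# With use_cache enabled, past key/values are reused across decoding steps,
# so each new token only needs a forward pass over the latest position.
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```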

To disable it, `use_cache` can be set to `False` when loading the model:
```python
model = OVModelForCausalLM.from_pretrained(model_id, export=True, use_cache=False)
```

* Enable the possibility to use the pre-computed key / values for OpenVINO decoder models by echarlaix in https://github.com/huggingface/optimum-intel/pull/274

INC config summarizing optimization details
* Add `INCConfig` by echarlaix in https://github.com/huggingface/optimum-intel/pull/263

Fixes

* Remove dynamic shapes restriction for GPU devices by helena-intel in https://github.com/huggingface/optimum-intel/pull/262
* Enable OpenVINO model caching for CPU devices by helena-intel in https://github.com/huggingface/optimum-intel/pull/281
* Fix the `.to()` method for causal language models by helena-intel in https://github.com/huggingface/optimum-intel/pull/284
* Fix PyTorch model saving for `transformers>=4.28.0` when optimized with `OVTrainer` by echarlaix in https://github.com/huggingface/optimum-intel/pull/285
* Update task names for the ONNX and OpenVINO export for `optimum>=1.8.0` by echarlaix in https://github.com/huggingface/optimum-intel/pull/286
