Optimum

Latest version: v1.24.0


1.12.4

* Fix compatibility with `transformers` v4.37.0 by echarlaix in https://github.com/huggingface/optimum-intel/pull/515
* Fix compatibility with `transformers` v4.37.0 by echarlaix in https://github.com/huggingface/optimum-intel/pull/527

1.12.3

* Fix compatibility with `diffusers` v0.25.0 by eaidova in https://github.com/huggingface/optimum-intel/pull/497
* Modify minimum required `transformers` version by echarlaix in https://github.com/huggingface/optimum-intel/pull/498

1.12.2

* Fix compatibility with timm latest release by echarlaix in https://github.com/huggingface/optimum-intel/pull/482
* Fix causallm weights compression via quantizer by eaidova in https://github.com/huggingface/optimum-intel/pull/484
* Fix pkv dtype by jiqing-feng in https://github.com/huggingface/optimum-intel/pull/481
* Fix compatibility of causallm models export with optimum 1.15 by eaidova in https://github.com/huggingface/optimum-intel/pull/487
* Fix trainer compatibility with transformers>=4.36.0 by echarlaix in https://github.com/huggingface/optimum-intel/pull/490
* Fix openvino export by eaidova in https://github.com/huggingface/optimum-intel/pull/470
* Fix INC quantized model loading by echarlaix in https://github.com/huggingface/optimum-intel/pull/492

1.12.1

* Fix causal language models export by eaidova in https://github.com/huggingface/optimum-intel/pull/477

1.12

- Switch to SynapseAI v1.12.0 by regisss in https://github.com/huggingface/optimum-habana/pull/453


Various model optimizations

- Fix graph compilation error from Falcon when batch size > 1 by schoi-habana in https://github.com/huggingface/optimum-habana/pull/356
- Add MPT optimization for Gaudi by sywangyi in https://github.com/huggingface/optimum-habana/pull/363
- Improve MPT inference performance by schoi-habana in https://github.com/huggingface/optimum-habana/pull/377
- Allocate KV cache in contiguous memory for HPU performance by puneeshkhanna in https://github.com/huggingface/optimum-habana/pull/394
- Add support for attention softmax in BF16, such as for Llama, by puneeshkhanna in https://github.com/huggingface/optimum-habana/pull/396
- Add trim logit logic to reduce maximum memory usage for Llama inference by BaihuiJin in https://github.com/huggingface/optimum-habana/pull/395
- Skip HPU graph usage for first token to save memory by polisettyvarma in https://github.com/huggingface/optimum-habana/pull/397
- Llama inference: add reuse_cache to save memory by regisss in https://github.com/huggingface/optimum-habana/pull/409
- GPT2 contiguous fix by ZhaiFeiyue in https://github.com/huggingface/optimum-habana/pull/421
- Improve perf and memory usage with reuse cache by slicing inputs till token idx for 1st token generation by puneeshkhanna in https://github.com/huggingface/optimum-habana/pull/422
- GPT-J/NeoX contiguous by BaihuiJin in https://github.com/huggingface/optimum-habana/pull/454
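Several of the memory optimizations above (contiguous KV cache allocation, `reuse_cache`, slicing up to the token index) share one idea: preallocate the key/value cache once and write each new token in place, instead of concatenating a growing tensor at every decoding step. A toy pure-Python sketch of that idea, with illustrative names and shapes rather than the actual Optimum Habana implementation:

```python
class PreallocatedKVCache:
    """Toy KV cache: one contiguous buffer, filled in place per generated token."""

    def __init__(self, max_seq_len: int, head_dim: int):
        # Allocate the whole buffer up front, so memory stays contiguous
        # and no reallocation or concatenation happens during generation.
        self.keys = [[0.0] * head_dim for _ in range(max_seq_len)]
        self.length = 0

    def append(self, key_vec):
        # Write in place at the current token index instead of concatenating.
        self.keys[self.length] = key_vec
        self.length += 1

    def active(self):
        # Slice up to the current token index (cf. "slicing inputs till token idx").
        return self.keys[:self.length]


cache = PreallocatedKVCache(max_seq_len=8, head_dim=2)
cache.append([0.1, 0.2])
cache.append([0.3, 0.4])
print(len(cache.active()))  # 2
```

On accelerators that compile static graphs, keeping the buffer size fixed also means the cache tensor's shape never changes across steps, which avoids recompilation.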


TGI

- Fix GPT-J incorrect output issue in TGI by sywangyi in https://github.com/huggingface/optimum-habana/pull/340
- Enable HPU graph by htang2012 in https://github.com/huggingface/optimum-habana/pull/330
- Upgrade to TGI v1.0.3 by regisss in https://github.com/huggingface/optimum-habana/pull/373
- Accelerate inference when the input prompt length changes in TGI by sywangyi in https://github.com/huggingface/optimum-habana/pull/386
- Support static shapes in concatenate and filter in TGI by sywangyi in https://github.com/huggingface/optimum-habana/pull/389
- Fix BLOOM concatenate and filter issue by sywangyi in https://github.com/huggingface/optimum-habana/pull/401
- Fix error in logits processing in HPU graph by sywangyi in https://github.com/huggingface/optimum-habana/pull/404
- Fix first token by regisss in https://github.com/huggingface/optimum-habana/pull/408
- Temporary fix in TGI for max total tokens by hsubramony in https://github.com/huggingface/optimum-habana/pull/443
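Static-shape support matters on HPUs because each new tensor shape triggers a graph recompilation; padding variable-length batches up to a small set of fixed sizes keeps the compiled graphs reusable. A toy sketch of that bucketing idea (bucket sizes and function name are illustrative, not the actual TGI code):

```python
def pad_to_bucket(token_ids, buckets=(128, 256, 512, 1024), pad_id=0):
    """Pad a sequence to the smallest bucket that fits, so every batch
    reaching the compiled graph has one of a few static shapes."""
    target = next(b for b in buckets if b >= len(token_ids))
    return token_ids + [pad_id] * (target - len(token_ids))


padded = pad_to_bucket(list(range(100)))
print(len(padded))  # 128
```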


Check min version in examples

A utility function was added to ensure that a recent enough version of Optimum Habana is installed before running the examples.

- Add check_optimum_habana_min_version by regisss in https://github.com/huggingface/optimum-habana/pull/335
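A sketch of what such a minimum-version check might look like, using a plain tuple comparison; the function name, message, and logic here are illustrative, not Optimum Habana's actual implementation (which would also need to handle pre-release suffixes):

```python
def check_min_version(installed: str, required: str) -> None:
    """Raise if the installed version is older than the required one.

    Toy semantic-version comparison; pre-release tags like "1.8.0.dev0"
    are deliberately not handled in this sketch.
    """
    as_tuple = lambda v: tuple(int(part) for part in v.split("."))
    if as_tuple(installed) < as_tuple(required):
        raise ImportError(
            f"This example requires optimum-habana >= {required}, "
            f"but version {installed} was found."
        )


check_min_version("1.8.1", "1.8.0")  # passes silently
```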


Others

- Add support for autocast custom ops in GaudiTrainer by regisss in https://github.com/huggingface/optimum-habana/pull/308
- Add warmup arg and move stats printing to the end by polisettyvarma in https://github.com/huggingface/optimum-habana/pull/390
- Add a configurable max input tokens parameter by puneeshkhanna in https://github.com/huggingface/optimum-habana/pull/426
- Add transformers model tests for Gaudi by ankurneog in https://github.com/huggingface/optimum-habana/pull/427
- Modify loraft llama falcon by libinta in https://github.com/huggingface/optimum-habana/pull/415
- Option to not crop in dataset run by ssarkar2 in https://github.com/huggingface/optimum-habana/pull/444
- Enable auto tensor parallelism for Falcon by mandy-li in https://github.com/huggingface/optimum-habana/pull/451


Various fixes

- Fixes for streaming dataset mode by MohitIntel in https://github.com/huggingface/optimum-habana/pull/324
- Fix beam search output by puneeshkhanna in https://github.com/huggingface/optimum-habana/pull/360
- Fix DDP for LoRA by sywangyi in https://github.com/huggingface/optimum-habana/pull/368
- Load Llama checkpoint to meta device to work around OOM issue on CPU by mandy-li in https://github.com/huggingface/optimum-habana/pull/359
- Fix gradient checkpointing in LoRA example by regisss in https://github.com/huggingface/optimum-habana/pull/398
- No need to wrap DDP when using Fast DDP by ikurtchen in https://github.com/huggingface/optimum-habana/pull/430
- Fix falcon-40b error when DeepSpeed enabled by schoi-habana in https://github.com/huggingface/optimum-habana/pull/434
- Revert "Fix T5 DeepSpeed ZeRO-3 (#393)" by sywangyi in https://github.com/huggingface/optimum-habana/pull/466


Regression tests for this release are available here: https://github.com/huggingface/optimum-habana/actions/runs/6580186897

1.12.0

AutoGPTQ integration

Part of the AutoGPTQ library has been integrated into Optimum, with utilities to ease its integration into other Hugging Face libraries. Reference: https://huggingface.co/docs/optimum/llm_quantization/usage_guides/quantization

* Add GPTQ Quantization by SunMarc in https://github.com/huggingface/optimum/pull/1216
* Fix GPTQ doc by regisss in https://github.com/huggingface/optimum/pull/1267
* Add AutoGPTQ benchmark by fxmarty in https://github.com/huggingface/optimum/pull/1292
* Fix gptq params by SunMarc in https://github.com/huggingface/optimum/pull/1284
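For intuition, the 4-bit quantization GPTQ produces maps each group of weights to small integers via an affine scale and zero point. GPTQ itself additionally compensates quantization error column by column using second-order information; the sketch below deliberately omits that and shows only plain round-to-nearest affine quantization of one weight group:

```python
def quantize_dequantize(weights, bits=4):
    """Affine round-to-nearest quantization of one weight group (toy version,
    not the GPTQ algorithm: no error compensation, no Hessian information)."""
    qmax = (1 << bits) - 1                              # 15 for 4-bit
    lo, hi = min(weights), max(weights)
    scale = (hi - lo) / qmax if hi > lo else 1.0        # step between levels
    q = [round((w - lo) / scale) for w in weights]      # integers in [0, qmax]
    deq = [lo + qi * scale for qi in q]                 # reconstructed floats
    return q, deq


q, deq = quantize_dequantize([-1.0, -0.5, 0.0, 0.5, 1.0])
```

Round-to-nearest bounds the per-weight error by half a quantization step (`scale / 2`); GPTQ's refinement is to adjust the not-yet-quantized weights to absorb that error, which is why it preserves accuracy better at the same bit width.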

Extended BetterTransformer support

BetterTransformer now supports BLOOM and GPT-BigCode architectures.

* Bt bloom by baskrahmer in https://github.com/huggingface/optimum/pull/1221
* Support gpt_bigcode in bettertransformer by fxmarty in https://github.com/huggingface/optimum/pull/1252
* Fix BetterTransformer starcoder init by fxmarty in https://github.com/huggingface/optimum/pull/1254
* Fix BT starcoder fp16 by fxmarty in https://github.com/huggingface/optimum/pull/1255
* SDPA dispatches to flash for MQA by fxmarty in https://github.com/huggingface/optimum/pull/1259
* Check output_attentions is False in BetterTransformer by fxmarty in https://github.com/huggingface/optimum/pull/1306
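BetterTransformer works by walking the model's module tree and swapping supported layers (such as the BLOOM and GPT-BigCode attention blocks added above) for fused, faster equivalents. A toy illustration of that layer-swap pattern in plain Python; the class and function names here are made up for illustration and are not Optimum's API:

```python
class SlowAttention:
    """Stand-in for an original framework attention layer."""
    def __init__(self, dim):
        self.dim = dim


class FastAttention:
    """Stand-in for a fused replacement (e.g. an SDPA-based implementation)."""
    def __init__(self, dim):
        self.dim = dim

    @classmethod
    def from_slow(cls, slow):
        # Reuse the original layer's parameters/config when building the
        # replacement, so the swap preserves the model's behavior.
        return cls(slow.dim)


def transform(modules, mapping):
    """Recursively replace any module whose type appears in `mapping`."""
    for name, child in list(modules.items()):
        if type(child) in mapping:
            modules[name] = mapping[type(child)].from_slow(child)
        elif isinstance(child, dict):
            transform(child, mapping)
    return modules


model = {"embed": object(), "block0": {"attn": SlowAttention(64)}}
transform(model, {SlowAttention: FastAttention})
print(type(model["block0"]["attn"]).__name__)  # FastAttention
```

The real implementation does the same kind of recursive replacement over `nn.Module` children, with per-architecture converter classes deciding how to rebuild each supported layer.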

Other changes and bugfixes

* Update bug report template by fxmarty in https://github.com/huggingface/optimum/pull/1266
* Fix ORTModule uses fp32 model issue by jingyanwangms in https://github.com/huggingface/optimum/pull/1264
* Fix build PR doc workflow by fxmarty in https://github.com/huggingface/optimum/pull/1270
* Avoid triggering stop job on label by fxmarty in https://github.com/huggingface/optimum/pull/1274
* Update version following 1.11.1 patch by fxmarty in https://github.com/huggingface/optimum/pull/1275
* Fix fp16 ONNX detection for decoder models by fxmarty in https://github.com/huggingface/optimum/pull/1276
* Update version following 1.11.2 patch by regisss in https://github.com/huggingface/optimum/pull/1291
* Pin tensorflow<=2.12.1 by fxmarty in https://github.com/huggingface/optimum/pull/1305
* ONNX: disable text-generation models for sequence classification & fixes for transformers 4.32 by fxmarty in https://github.com/huggingface/optimum/pull/1308
* Fix staging tests following transformers 4.32 release by fxmarty in https://github.com/huggingface/optimum/pull/1309
* More fixes following transformers 4.32 release by fxmarty in https://github.com/huggingface/optimum/pull/1311

New Contributors

* SunMarc made their first contribution in https://github.com/huggingface/optimum/pull/1216
* jingyanwangms made their first contribution in https://github.com/huggingface/optimum/pull/1264

**Full Changelog**: https://github.com/huggingface/optimum/compare/v1.11.2...v1.12.0
