- Upgrade to Transformers 4.40 1027 regisss
Speculative Sampling
- Speculative sampling on Gaudi using Optimum-Habana 973 nraste
- Fix assisted decoding generation error 1080 libinta
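The entries above enable speculative (assisted) sampling, where a cheap draft model proposes tokens that the target model then verifies. As a rough illustration of the accept/reject rule only (not the Gaudi or Transformers implementation), here is a toy sketch over a fixed-vocabulary unigram "model"; all names (`speculative_sample`, `target_p`, `draft_p`) are hypothetical:

```python
import random

def speculative_sample(target_p, draft_p, vocab, k, rng):
    """Toy speculative sampling step over a unigram distribution.

    The draft model proposes k tokens cheaply; each proposal is accepted
    with probability min(1, p_target/q_draft), otherwise a token is
    resampled from the normalized residual max(0, p - q) and the loop
    stops, as in the speculative sampling scheme.
    """
    out = []
    # Draft phase: propose k tokens from the cheap model.
    proposals = rng.choices(vocab, weights=[draft_p[t] for t in vocab], k=k)
    for tok in proposals:
        p, q = target_p[tok], draft_p[tok]
        if rng.random() < min(1.0, p / q):
            out.append(tok)  # accepted: target agrees often enough
        else:
            # Rejected: sample from the residual distribution instead.
            residual = {t: max(0.0, target_p[t] - draft_p[t]) for t in vocab}
            z = sum(residual.values())
            out.append(rng.choices(vocab, weights=[residual[t] / z for t in vocab], k=1)[0])
            break  # stop at the first rejection
    return out

rng = random.Random(0)
vocab = ["a", "b", "c"]
target = {"a": 0.6, "b": 0.3, "c": 0.1}
draft = {"a": 0.5, "b": 0.3, "c": 0.2}
print(speculative_sample(target, draft, vocab, k=4, rng=rng))
```

A full implementation would additionally sample one bonus token from the target model when every proposal is accepted; this sketch omits that step for brevity.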
Model optimizations
- Add --bucket_size support for gpt_bigcode 802 jiminha
- Optimize StableLM model inference 805 XinyuYe-Intel
- Enable google/gemma-7b 747 lkk12014402
- Enable llava static generation 767 lkk12014402
- Fix perf drop in flan-t5 summarization 908 MohitIntel
- Enable Qwen2 model 774 XinyuYe-Intel
- Extend bucket_internal to SAMPLE generation mode 819 xt574chen
- SpeechT5 static consistent dropout 824 Spycsh
- Optimize inference of Persimmon model 822 XinyuYe-Intel
- Enable OWL-ViT graph mode on Gaudi platform 783 cfgfung
- Support mixtral kvcache reuse and remove kv_cache_fp8 898 jychen21
- Add fp8 related changes to mistral for text-generation 918 skaulintel
- Optimization for phi series models: support fp8 kv cache and reuse kv cache 902 yuwenzho
- Support Mistral 32K input token 931 jiminha
- Support mixtral long sequence 32k with bs 4 903 jychen21
- Adapt Mixtral long sequence handling for Mistral 985 jiminha
- Fix performance issue in mistral 1030 jiminha
- Optimize inference of Starcoder2 model 829 XinyuYe-Intel
- Add support for IBM Granite 1045 regisss
- Enable fp8 inference for Llava-hf 7B and 13B in 1.16 release 951 Luca-Calabria
- Use bf16 inputs for FusedRoPE 1026 ssarkar2
- Enhance Qwen2 model with FSDPA and bucket 1033 Zhiwei35
- Optimize seamless-m4t/vits model for text-to-speech generation 825 sywangyi
- Cache optimization 1028 ssarkar2
- Ensure KV cache is not returned as output tensor during decode phase for Falcon 993 schoi-habana
- Fast softmax 972 wszczurekhabana
- Falcon optimization 974 libinta
- Quantization for FSDPA 976 dudilester
- Falcon update 1052 ssarkar2
- Add the Llava_next support 1041 yuanwu2017
- Improve torch compile performance 1082 libinta
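Several of the optimizations above rely on bucketing (`--bucket_size`, `bucket_internal`): static-shape accelerators such as Gaudi compile one graph per input shape, so padding sequence lengths up to a small set of bucket boundaries bounds the number of recompilations during generation. A minimal sketch of that idea, with hypothetical helper names (`bucket_length`, `pad_to_bucket`):

```python
def bucket_length(seq_len: int, bucket_size: int) -> int:
    """Round seq_len up to the next multiple of bucket_size.

    Padding to bucketed lengths keeps the set of distinct input
    shapes (and thus compiled graphs) small during generation.
    """
    return ((seq_len + bucket_size - 1) // bucket_size) * bucket_size

def pad_to_bucket(tokens, bucket_size, pad_id=0):
    """Pad a token list with pad_id up to its bucketed length."""
    target = bucket_length(len(tokens), bucket_size)
    return tokens + [pad_id] * (target - len(tokens))

print(bucket_length(37, 16))        # → 48
print(pad_to_bucket([5, 6, 7], 4))  # → [5, 6, 7, 0]
```

The actual bucketing in Optimum-Habana is applied inside the generation loop (including to KV cache shapes with `bucket_internal`); this only illustrates the rounding scheme.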
Stable Video Diffusion
- Add SVD pipeline 743 dsocek
PEFT
- Add ia3 and adalora support 809 sywangyi
- Enable prompt tuning/prefix tuning/p tuning clm and example 758 sywangyi
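Prompt tuning (and its prefix/p-tuning variants enabled above) trains a small set of "virtual token" embeddings that are prepended to the input embeddings while the base model stays frozen. A toy sketch of that prepending step, with hypothetical names and made-up values (not the PEFT library API):

```python
import random

EMB_DIM = 4
NUM_VIRTUAL_TOKENS = 2

rng = random.Random(0)
# Trainable virtual-token embeddings (randomly initialized here).
virtual_tokens = [[rng.uniform(-1, 1) for _ in range(EMB_DIM)]
                  for _ in range(NUM_VIRTUAL_TOKENS)]

def embed(token_ids, table):
    """Look up embeddings in a frozen base-model table."""
    return [table[t] for t in token_ids]

# Frozen embedding table of the base model (toy values).
base_table = {0: [0.0] * EMB_DIM, 1: [1.0] * EMB_DIM}

inputs = embed([0, 1], base_table)
# The model consumes the virtual prompts followed by the real inputs;
# only virtual_tokens would receive gradient updates.
model_inputs = virtual_tokens + inputs
print(len(model_inputs))  # → 4
```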
TRL
- Finetuning stable diffusion with DDPO 733 skavulya
Object Segmentation Example
- Add an example of object segmentation (ClipSeg) 801 cfgfung
Dreambooth
- Diffusers DreamBooth full/LoRA/LoKr/LoHa/OFT finetuning, and DreamBooth XL LoRA finetuning 881 sywangyi
Others
- Text generation pipeline: Extended functionality to align with run_generation script 782 mgonchar
- Enable clip mediapipe and update G2 baseline 856 MohitIntel
- Add ci test for SFT and DPO 857 sywangyi
- Fix SFT, DPO CI on Gaudi1 893 regisss
- Add SDXL in README 894 regisss
- Fix falcon 180b oom issue if peft > 0.6.2 895 sywangyi
- Enabled additional models in CI 879 MohitIntel
- Add static shape support for vision_encoder_decoder generation if decoder supports static shape 834 sywangyi
- Add HabanaProfile to Stable Diffusion and XL 828 atakaha
- Pytest accuracy updates for Falcon, T5, GPT2 916 Luca-Calabria
- Update text-generation readme with torch.compile info. 884 libinta
- Update Wav2Vec2ModelTest::test_initialization 919 malkomes
- Add linear and dynamic RoPE to Mistral and Mixtral 892 regisss
- Fix for wav2vec2 test cases 923 lqnguyen
- Add no_grad() to prevent backward graph construction 897 astachowiczhabana
- Assisted decoding not implemented 910 tjs-intel
- Disable wav2vec2 symbolic tracing test 904 tjs-intel
- Add support for symbolic tracing of GPT2 models 913 tjs-intel
- Utils: return more reasonable error in case of attempt of non-PyTorch model loading 921 mgonchar
- Pytest accuracy updates for Bridgetower, Swin, Vit 927 Luca-Calabria
- Text generation: added langchain pipeline script 887 mgonchar
- Fix for AST models 914 vidyasiv
- Fix AttributeError for wav2vec test 929 Jianhong-Zhang
- Fix ValueError for test_summarization 939 Jianhong-Zhang
- Grad norm tensor fix 938 yeonsily
- Add information to the audio-classification examples README about --ddp_find_unused_parameters parameter 941 Alberto-Villarreal
- Add leaderboard link 947 echarlaix
- Fix formatting of arg parse help strings in the PEFT example 944 dmsuehir
- Use new Habana llama and falcon model configs 940 skaulintel
- Update based on legal requirements. 900 libinta
- Update test generation config to raise ValueError 949 malkomes
- Add --trust_remote_code for text generation examples 870 yangulei
- Added Llama-2 fp8 text-generation test cases 934 yeonsily
- Upgrade SD output image verification with CLIP score 920 MohitIntel
- Llama Guard for text classification example 871 dsmertin
- Update README logo 950 regisss
- Add Gaudi CI for Sentence Transformers 928 regisss
- Get iteration times through generate() 899 hsubramony
- Update speech recognition seq2seq example 953 regisss
- Fix wrongly all_gather for mixtral finetune 965 ccrhx4
- Add intel-mila protST example 860 sywangyi
- Small CI refacto 968 regisss
- Llama-70b single-card: infer device map with max memory limitation 963 Yantom1
- Map list to tensors 926 ssarkar2
- Fix fsdp lora torch compile issue 971 sywangyi
- Fix for the simulate_dyn_prompt flag assertion 984 alekseyfa
- Initial enablement with FP8 Training (port from OHF 91) 936 libinta
- Warn user when using --disk_offload without hqt 964 Yantom1
- Assign grad_norm for logging only if it's a single element tensor 992 yeonsily
- Update examples 998 regisss
- Fix warmup for diffusers when batch size < throughput_warmup_steps 960 dsocek
- Add torch.compile instructions for Roberta-Large 981 MohitIntel
- Fix gpt_neox, stablelm inference regression caused by RoPE dtype 999 mandy-li
- Update example READMEs with requirements.txt installation instructions 1000 imangohari1
- Initial commit for fp8 CI 995 yeonsily
- Fixed 'MixtralConfig' object has no attribute 'rope_scaling' 1009 aslanxie
- Use the length of timesteps as the number of inference steps 986 yuanwu2017
- Fix bug with output_type=np or latent 996 yuanwu2017
- Fix wav2vec test load adapter 937 malkomes
- Mark scale as const and remove --fp8 flag usage 962 Yantom1
- Add per step time collection to other methods 1004 ssarkar2
- Fix first token time 1019 ssarkar2
- Fix text-generation example 1025 regisss
- Updates test_beam_search to transformers_4.40 1017 malkomes
- Fix eos problem 1034 sywangyi
- fp8 textgen ci structure update 1029 jiminha
- Fix a return value issue caused by PR 973 1040 yafshar
- Add no_checks for sub dataset in lvwerra/stack-exchange-paired since it does not contain test split 1003 sywangyi
- README update for FSDP 980 hlahkar
- Add unifier script and disk offload flag usages to README. 1023 libinta
- Add mixtral for meta device load due to mixtral-8x22b model size 909 libinta
- Update unifier script 1010 Yantom1
- Update text-generation CI configuration for falcon and Mixtral 1044 yeonsily
- Update multi-node README to check ssh connection issue 1048 yeonsily
- Infra upgrade workflows 480 glegendre01
- Update test_text_generation_example.py 1051 ssarkar2
- BERT training migrated to torch.compile 990 ANSHUMAN87
- Update test_examples.py 1053 ssarkar2
- Update modeling_llama.py: deepspeed fix for codellama 1054 ssarkar2
- No shapes in profilings by default 1050 astachowiczhabana
- Change the way to unset environment variable for gpt-neox CI 1060 yeonsily
- Update README for Albert torch.compile mode 1061 MohitIntel
- Pin lm_evaluation_harness to a specific commit (240) 1064 astachowiczhabana
- Fix text-generation example README.md 1081 shepark