Deduplicate Embedding / LM head weight in the ONNX export
Workaround for a bug in the PyTorch ONNX export, which does not deduplicate the weight shared between the Embedding and the LM head: https://github.com/pytorch/pytorch/issues/108342. For small enough models, this reduces the serialized ONNX model size by up to 50%.
* Fix PyTorch tied weights being duplicated in the exported ONNX models by fxmarty in https://github.com/huggingface/optimum/pull/1326
* Fix initializer detection for weight deduplication by fxmarty in https://github.com/huggingface/optimum/pull/1333
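
To illustrate the idea behind the deduplication, here is a minimal conceptual sketch in plain Python. It is not the code from the PRs above: the helper name `deduplicate_initializers` and the toy `toy` dictionary are hypothetical, and a real implementation would also have to compare shapes and dtypes and rewire the graph nodes that reference a dropped initializer. The sketch only shows that a tied weight serialized twice is byte-identical, so it can be detected by hashing and stored once.

```python
import hashlib


def deduplicate_initializers(initializers):
    """Keep one copy of each byte-identical tensor.

    Returns (kept, aliases): `kept` maps a name to its raw bytes, and
    `aliases` maps each dropped duplicate name to the name that was kept.
    """
    kept = {}
    aliases = {}
    first_name_by_digest = {}
    for name, raw in initializers.items():
        digest = hashlib.sha256(raw).hexdigest()
        if digest in first_name_by_digest:
            # Same content already stored: record an alias instead of a copy.
            aliases[name] = first_name_by_digest[digest]
        else:
            first_name_by_digest[digest] = name
            kept[name] = raw
    return kept, aliases


# Hypothetical toy "model": the embedding and LM head share the same weight.
shared = bytes(range(16)) * 4
toy = {
    "embed_tokens.weight": shared,
    "lm_head.weight": shared,
    "layer0.bias": b"\x00" * 8,
}
kept, aliases = deduplicate_initializers(toy)
size_before = sum(len(raw) for raw in toy.values())
size_after = sum(len(raw) for raw in kept.values())
```

In this toy case the shared weight dominates the total size, which is why the saving approaches 50% for small models where the embedding matrix makes up most of the parameters.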
Extended ONNX Runtime support
ONNX Runtime integration now supports Pix2Struct and MPT architectures. Donut now supports IO Binding. Encoder-Decoder models are now supported as well.
* Pix2Struct onnxruntime support by krathul in https://github.com/huggingface/optimum/pull/1296
* Add MPT onnx and ORT support by jiqing-feng in https://github.com/huggingface/optimum/pull/1161
* Donut iobinding by IlyasMoutawwakil in https://github.com/huggingface/optimum/pull/1209
* Add encoder decoder model by mht-sharma in https://github.com/huggingface/optimum/pull/851
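
As a rough usage sketch, a newly supported decoder architecture such as MPT could be loaded through the existing `ORTModelForCausalLM` class, exporting to ONNX on the fly with `export=True`. The helper name `load_ort_causal_lm` is hypothetical, and the model identifier is left to the caller, since which checkpoints export cleanly is not stated here.

```python
def load_ort_causal_lm(model_id):
    """Export a causal LM checkpoint to ONNX and load it with ONNX Runtime.

    `model_id` is a Hub identifier or a local path chosen by the caller.
    """
    # Imported lazily so that defining this helper does not require
    # optimum[onnxruntime] to be installed.
    from optimum.onnxruntime import ORTModelForCausalLM

    # export=True converts the PyTorch checkpoint to ONNX before loading it.
    return ORTModelForCausalLM.from_pretrained(model_id, export=True)


# Example call (requires optimum[onnxruntime] and downloads the checkpoint):
# model = load_ort_causal_lm("<an MPT checkpoint id>")
```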
Extended ONNX export: MPT, TIMM models, Encoder-Decoder
Additionally, the SAM model is now by default exported as two files: vision_encoder.onnx and prompt_encoder_mask_decoder.onnx.
* Add MPT onnx and ORT support by jiqing-feng in https://github.com/huggingface/optimum/pull/1161
* Adds ONNX Export Support for Timm Models by mht-sharma in https://github.com/huggingface/optimum/pull/965
* Add encoder decoder model by mht-sharma in https://github.com/huggingface/optimum/pull/851
* Fix SAM ONNX export requirements with transformers 4.32, export vision encoder separately by fxmarty in https://github.com/huggingface/optimum/pull/1301
BetterTransformer supports Falcon
* [`BetterTransformer`] Add falcon to `BetterTransformer` by younesbelkada in https://github.com/huggingface/optimum/pull/1343
Major bugfix: ability to set GPTQ Exllama kernel maximum length in the transformers integration
The function `exllama_set_max_input_length` from `auto-gptq` can now be used with Transformers GPTQ models.
* Version bump + add max_input_length to gptq by SunMarc in https://github.com/huggingface/optimum/pull/1329
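
A minimal sketch of how this could be used, assuming a GPTQ-quantized checkpoint, a CUDA GPU, and both `auto-gptq` and `transformers` installed. The helper name `load_gptq_with_max_input_length` and the positivity check are illustrative additions, not part of either library.

```python
def load_gptq_with_max_input_length(model_id, max_input_length):
    """Load a GPTQ model and resize the Exllama kernel buffers.

    `model_id` is a GPTQ-quantized checkpoint chosen by the caller;
    `max_input_length` is the longest input, in tokens, to be supported.
    """
    if max_input_length <= 0:
        raise ValueError("max_input_length must be a positive integer")

    # Imported lazily so that defining this helper does not require the
    # GPU-only dependencies to be installed.
    from auto_gptq import exllama_set_max_input_length
    from transformers import AutoModelForCausalLM

    model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")
    # Returns the model with its Exllama buffers resized.
    return exllama_set_max_input_length(model, max_input_length)


# Example call (requires a GPU and a GPTQ checkpoint):
# model = load_gptq_with_max_input_length("<a GPTQ checkpoint id>", 4096)
```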
Other changes and bugfixes
* Update version to 1.12.1.dev0 following release by fxmarty in https://github.com/huggingface/optimum/pull/1312
* Add GPTQ prefill benchmark by fxmarty in https://github.com/huggingface/optimum/pull/1313
* Precise ORTModel documentation by fxmarty in https://github.com/huggingface/optimum/pull/1268
* Improve BetterTransformer backward compatibility by fxmarty in https://github.com/huggingface/optimum/pull/1314
* Improve ORTModel documentation by fxmarty in https://github.com/huggingface/optimum/pull/1245
* Add bitsandbytes benchmark by fxmarty in https://github.com/huggingface/optimum/pull/1320
* fix typo in log message by AAnirudh07 in https://github.com/huggingface/optimum/pull/1322
* Support customize dtype for dummy generators by JingyaHuang in https://github.com/huggingface/optimum/pull/1307
* Fix opset custom onnx export by mht-sharma in https://github.com/huggingface/optimum/pull/1331
* Replace mpt to ernie custom export by mht-sharma in https://github.com/huggingface/optimum/pull/1332
* Fix BT benchmark script by fxmarty in https://github.com/huggingface/optimum/pull/1344
* Add name_or_path for donut generation by fxmarty in https://github.com/huggingface/optimum/pull/1345
* send both negative prompt embeds to ORT SDXL by ssube in https://github.com/huggingface/optimum/pull/1339
* add vae image processor by echarlaix in https://github.com/huggingface/optimum/pull/1219
* add negative prompt test by echarlaix in https://github.com/huggingface/optimum/pull/1347
* Add GPT BigCode to the BT documentation by fxmarty in https://github.com/huggingface/optimum/pull/1356
* Add BT dummy objects by fxmarty in https://github.com/huggingface/optimum/pull/1355
* Add text2text-generation-with-past test for encoder-decoder model by mht-sharma in https://github.com/huggingface/optimum/pull/1338
* Fix sentence transformer export by mht-sharma in https://github.com/huggingface/optimum/pull/1366
New Contributors
* krathul made their first contribution in https://github.com/huggingface/optimum/pull/1296
* AAnirudh07 made their first contribution in https://github.com/huggingface/optimum/pull/1322
* jiqing-feng made their first contribution in https://github.com/huggingface/optimum/pull/1161
* ssube made their first contribution in https://github.com/huggingface/optimum/pull/1339
**Full Changelog**: https://github.com/huggingface/optimum/compare/v1.12.0...v1.13.0