AutoGPTQ integration
Part of the AutoGPTQ library has been integrated into Optimum, along with utilities that ease its integration into other Hugging Face libraries. Reference: https://huggingface.co/docs/optimum/llm_quantization/usage_guides/quantization
* Add GPTQ Quantization by SunMarc in https://github.com/huggingface/optimum/pull/1216
* Fix GPTQ doc by regisss in https://github.com/huggingface/optimum/pull/1267
* Add AutoGPTQ benchmark by fxmarty in https://github.com/huggingface/optimum/pull/1292
* Fix gptq params by SunMarc in https://github.com/huggingface/optimum/pull/1284
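
As a rough illustration of the GPTQ integration, the sketch below quantizes a causal language model with `GPTQQuantizer` from `optimum.gptq`. The helper name `quantize_with_gptq`, its parameters, and the choice of `"c4"` as the calibration dataset are assumptions made for this example; quantization also typically needs a GPU and the `auto-gptq` package, and the accepted arguments may differ between versions, so treat the linked usage guide as the reference.

```python
def quantize_with_gptq(model_id: str, save_dir: str, bits: int = 4, dataset: str = "c4"):
    """Quantize a causal LM with GPTQ through Optimum and save it to `save_dir`.

    Hypothetical helper written for this example; it is not part of Optimum.
    """
    # Imports live inside the function so that defining it does not require
    # torch, transformers, optimum, or auto-gptq to be installed.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from optimum.gptq import GPTQQuantizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.float16)

    # `bits` is the target weight precision; `dataset` names the calibration data.
    quantizer = GPTQQuantizer(bits=bits, dataset=dataset)
    quantized_model = quantizer.quantize_model(model, tokenizer)

    # Persist the quantized weights together with the quantization config.
    quantizer.save(quantized_model, save_dir)
    return quantized_model
```

A call could look like `quantize_with_gptq("facebook/opt-125m", "opt-125m-gptq")`, where both the model id and the output directory are placeholders.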
Extended BetterTransformer support
BetterTransformer now supports BLOOM and GPT-BigCode architectures.
* Bt bloom by baskrahmer in https://github.com/huggingface/optimum/pull/1221
* Support gpt_bigcode in bettertransformer by fxmarty in https://github.com/huggingface/optimum/pull/1252
* Fix BetterTransformer starcoder init by fxmarty in https://github.com/huggingface/optimum/pull/1254
* Fix BT starcoder fp16 by fxmarty in https://github.com/huggingface/optimum/pull/1255
* SDPA dispatches to flash for MQA by fxmarty in https://github.com/huggingface/optimum/pull/1259
* Check output_attentions is False in BetterTransformer by fxmarty in https://github.com/huggingface/optimum/pull/1306
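
A minimal sketch of using BetterTransformer with one of the newly supported architectures follows. The helper name `load_with_bettertransformer` is hypothetical, and loading in fp16 is a choice made for this example rather than a requirement.

```python
def load_with_bettertransformer(model_id: str):
    """Load a causal LM and convert it with BetterTransformer.

    Hypothetical helper written for this example; it is not part of Optimum.
    """
    # Imports live inside the function so that defining it does not require
    # torch, transformers, or optimum to be installed.
    import torch
    from transformers import AutoModelForCausalLM
    from optimum.bettertransformer import BetterTransformer

    model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.float16)

    # transform() swaps supported layers for BetterTransformer implementations.
    return BetterTransformer.transform(model)
```

For instance, `load_with_bettertransformer("bigscience/bloom-560m")` would apply this to a BLOOM checkpoint (the model id is only an example). Given the `output_attentions` check listed above, the converted model should be called without `output_attentions=True`.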
Other changes and bugfixes
* Update bug report template by fxmarty in https://github.com/huggingface/optimum/pull/1266
* Fix ORTModule uses fp32 model issue by jingyanwangms in https://github.com/huggingface/optimum/pull/1264
* Fix build PR doc workflow by fxmarty in https://github.com/huggingface/optimum/pull/1270
* Avoid triggering stop job on label by fxmarty in https://github.com/huggingface/optimum/pull/1274
* Update version following 1.11.1 patch by fxmarty in https://github.com/huggingface/optimum/pull/1275
* Fix fp16 ONNX detection for decoder models by fxmarty in https://github.com/huggingface/optimum/pull/1276
* Update version following 1.11.2 patch by regisss in https://github.com/huggingface/optimum/pull/1291
* Pin tensorflow<=2.12.1 by fxmarty in https://github.com/huggingface/optimum/pull/1305
* ONNX: disable text-generation models for sequence classification & fixes for transformers 4.32 by fxmarty in https://github.com/huggingface/optimum/pull/1308
* Fix staging tests following transformers 4.32 release by fxmarty in https://github.com/huggingface/optimum/pull/1309
* More fixes following transformers 4.32 release by fxmarty in https://github.com/huggingface/optimum/pull/1311
New Contributors
* SunMarc made their first contribution in https://github.com/huggingface/optimum/pull/1216
* jingyanwangms made their first contribution in https://github.com/huggingface/optimum/pull/1264
**Full Changelog**: https://github.com/huggingface/optimum/compare/v1.11.2...v1.12.0