What's Changed
Added IBM Granite model support. Fully automatic wheel install from PyPI with no local build. Reduced peak CPU memory usage during quantization by more than 20%. CI now covers 100% of models and features. Updated Hugging Face integration support for the latest transformers.
Fully deprecated: liger-kernel support and the exllama v1 quantization kernel.
* Fix deprecated by CSY-ModelCloud in https://github.com/ModelCloud/GPTQModel/pull/447
* [COMPAT] [FIX] vllm params by ZYC-ModelCloud in https://github.com/ModelCloud/GPTQModel/pull/448
* add estimate-vram by PZS-ModelCloud in https://github.com/ModelCloud/GPTQModel/pull/452
* add field uri by ZYC-ModelCloud in https://github.com/ModelCloud/GPTQModel/pull/449
* auto infer model base name from model files by ZYC-ModelCloud in https://github.com/ModelCloud/GPTQModel/pull/451
* remove exllama v1 by PZS-ModelCloud in https://github.com/ModelCloud/GPTQModel/pull/453
* [SECURITY] drop support of loading unsafe .bin weights by ZYC-ModelCloud in https://github.com/ModelCloud/GPTQModel/pull/460
* [MODEL] add granite support by LRL-ModelCloud in https://github.com/ModelCloud/GPTQModel/pull/466
* Split base.py file by ZYC-ModelCloud in https://github.com/ModelCloud/GPTQModel/pull/465
* Move save_quantized function into saver.py by ZYC-ModelCloud in https://github.com/ModelCloud/GPTQModel/pull/467
* remove deprecated exllama v1 code by Qubitium in https://github.com/ModelCloud/GPTQModel/pull/473
* [MISC] move model def file to model_def folder by PZS-ModelCloud in https://github.com/ModelCloud/GPTQModel/pull/479
* [FIX] Fix unit test by PZS-ModelCloud in https://github.com/ModelCloud/GPTQModel/pull/480
* Download whl in setup.py by CSY-ModelCloud in https://github.com/ModelCloud/GPTQModel/pull/481
* [Fix] cpu memory leak by ZX-ModelCloud in https://github.com/ModelCloud/GPTQModel/pull/485
* [CI] set ninja threads to 4 by CSY-ModelCloud in https://github.com/ModelCloud/GPTQModel/pull/487
* [FIX] sharded model loading error by ZX-ModelCloud in https://github.com/ModelCloud/GPTQModel/pull/490
* add internlm test by PZS-ModelCloud in https://github.com/ModelCloud/GPTQModel/pull/491
* remove needless function by ZYC-ModelCloud in https://github.com/ModelCloud/GPTQModel/pull/494
* Fix unit test by ZYC-ModelCloud in https://github.com/ModelCloud/GPTQModel/pull/495
* [FIX] fix test_integration by PZS-ModelCloud in https://github.com/ModelCloud/GPTQModel/pull/497
* [Test] add codegen and xverse test by PZS-ModelCloud in https://github.com/ModelCloud/GPTQModel/pull/496

**Full Changelog**: https://github.com/ModelCloud/GPTQModel/compare/v1.0.9...v1.1.0