What's Changed
Added partial quantization support for the Llama 3.2 Vision model. v1.0.5 allows quantization of the text layers (the layers responsible for text generation) only; vision-layer support will be added shortly. A Llama 3.2 11B Vision Instruct model will quantize to ~50% of its original size in 4-bit mode. Once vision-layer support is added, the size will shrink to the expected ~1/4.
* [MODEL] Add Llama 3.2 Vision (mllama) support by LRL-ModelCloud in https://github.com/ModelCloud/GPTQModel/pull/401
**Full Changelog**: https://github.com/ModelCloud/GPTQModel/compare/v1.0.4...v1.0.5