GPTQModel

Latest version: v2.2.0

1.0.4

What's Changed

Liger Kernel support added for ~50% VRAM reduction during the quantization stage for some models. Added a toggle to disable parallel packing to avoid OOM on larger models. Transformers dependency updated to 4.45.0 for Llama 3.2 support.
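
A hedged sketch of how the new toggle might be used during quantization. The `parallel_packing` name comes from PR #393, but its placement as a `QuantizeConfig` field, the loader entry point, and the model id are assumptions, not the verified released API:

```python
from gptqmodel import GPTQModel, QuantizeConfig

# Toy calibration set; real quantization needs a few hundred representative samples.
calibration_dataset = [
    "GPTQModel quantizes large language models to low-bit formats.",
    "The quick brown fox jumps over the lazy dog.",
]

quant_config = QuantizeConfig(
    bits=4,
    group_size=128,
    parallel_packing=False,  # assumed field (PR #393): trade packing speed for lower peak VRAM
)

model = GPTQModel.from_pretrained("meta-llama/Llama-3.2-1B", quant_config)
model.quantize(calibration_dataset)
model.save_quantized("Llama-3.2-1B-gptq-4bit")
```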

* [FEATURE] add a parallel_packing toggle by LRL-ModelCloud in https://github.com/ModelCloud/GPTQModel/pull/393
* [FEATURE] add liger_kernel support by LRL-ModelCloud in https://github.com/ModelCloud/GPTQModel/pull/394


**Full Changelog**: https://github.com/ModelCloud/GPTQModel/compare/v1.0.3...v1.0.4

1.0.3

What's Changed

* [MODEL] Add minicpm3 by LDLINGLINGLING in https://github.com/ModelCloud/GPTQModel/pull/385
* [FIX] fix minicpm3 support by LRL-ModelCloud in https://github.com/ModelCloud/GPTQModel/pull/387
* [MODEL] Added GRIN-MoE support by LRL-ModelCloud in https://github.com/ModelCloud/GPTQModel/pull/388

New Contributors
* LDLINGLINGLING made their first contribution in https://github.com/ModelCloud/GPTQModel/pull/385
* mrT23 made their first contribution in https://github.com/ModelCloud/GPTQModel/pull/386

**Full Changelog**: https://github.com/ModelCloud/GPTQModel/compare/v1.0.2...v1.0.3

1.0.2

What's Changed

Upgraded the AutoRound package to v0.3.0. Pre-built WHL and PyPI source releases are now available. Install by downloading our pre-built WHL or via `pip install gptqmodel --no-build-isolation`.

* [CORE] Autoround v0.3 by LRL-ModelCloud in https://github.com/ModelCloud/GPTQModel/pull/368
* [CI] Lots of CI fixups by CSY-ModelCloud

**Full Changelog**: https://github.com/ModelCloud/GPTQModel/compare/v1.0.0...v1.0.2

1.0.0

What's Changed

40% faster multi-threaded `packing`, a new `lm_eval` API, and fixed Python 3.9 compatibility.
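
As a hedged illustration of what the new `lm_eval` API enables, the snippet below runs the lm-evaluation-harness against a quantized checkpoint directly; the wrapper added in PR #338 may expose a different entry point, and the model id is illustrative:

```python
import lm_eval

# Evaluate a GPTQ checkpoint via the lm-evaluation-harness; GPTQModel's own
# wrapper (PR #338) may differ. The model id is illustrative.
results = lm_eval.simple_evaluate(
    model="hf",  # Hugging Face backend loads the quantized model via transformers
    model_args="pretrained=ModelCloud/Llama-3-8B-gptq-4bit",
    tasks=["arc_challenge"],
)
print(results["results"])
```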

* Add `lm_eval` api by PZS-ModelCloud in https://github.com/ModelCloud/GPTQModel/pull/338
* Multi-threaded `packing` in quantization by PZS-ModelCloud in https://github.com/ModelCloud/GPTQModel/pull/354
* [CI] Add TGI unit test by PZS-ModelCloud in https://github.com/ModelCloud/GPTQModel/pull/348
* [CI] Updates by CSY-ModelCloud in https://github.com/ModelCloud/GPTQModel/pull/347, https://github.com/ModelCloud/GPTQModel/pull/352, https://github.com/ModelCloud/GPTQModel/pull/353, https://github.com/ModelCloud/GPTQModel/pull/355, https://github.com/ModelCloud/GPTQModel/pull/357
* Fix python 3.9 compat by PZS-ModelCloud in https://github.com/ModelCloud/GPTQModel/pull/358


**Full Changelog**: https://github.com/ModelCloud/GPTQModel/compare/v0.9.11...v1.0.0

0.9.11

What's Changed

Added LG EXAONE 3.0 model support. New dynamic per-layer/module quantization where each layer/module may use different bits/params. Added proper sharding support to `BACKEND.BITBLAS`. Auto-heal quantization errors caused by overly small damp values.
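
A hedged sketch of the dynamic overrides, assuming they are passed as a `QuantizeConfig` mapping from module-name regex patterns to per-module parameters; the exact key syntax may differ from the released API:

```python
from gptqmodel import QuantizeConfig

# Assumed shape of the `dynamic` mapping (PRs #311/#319/#321/#323/#327):
# module-name regex patterns -> per-module quantization overrides.
quant_config = QuantizeConfig(
    bits=4,            # default bits for all layers/modules
    group_size=128,
    dynamic={
        r".*\.mlp\..*": {"bits": 8, "group_size": 64},  # e.g. quantize MLP modules at 8-bit
    },
)
```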

* [CORE] add support for pack and shard to bitblas by LRL-ModelCloud in https://github.com/ModelCloud/GPTQModel/pull/316
* Add `dynamic` bits by PZS-ModelCloud in https://github.com/ModelCloud/GPTQModel/pull/311, https://github.com/ModelCloud/GPTQModel/pull/319, https://github.com/ModelCloud/GPTQModel/pull/321, https://github.com/ModelCloud/GPTQModel/pull/323, https://github.com/ModelCloud/GPTQModel/pull/327
* [MISC] Adjust the validate order of QuantLinear when BACKEND is AUTO by ZX-ModelCloud in https://github.com/ModelCloud/GPTQModel/pull/318
* add save_quantized log model total size by PZS-ModelCloud in https://github.com/ModelCloud/GPTQModel/pull/320
* Auto damp recovery by CSY-ModelCloud in https://github.com/ModelCloud/GPTQModel/pull/326
* [FIX] add missing original_infeatures by CSY-ModelCloud in https://github.com/ModelCloud/GPTQModel/pull/337
* Update Transformers to 4.44.0 by Qubitium in https://github.com/ModelCloud/GPTQModel/pull/336
* [MODEL] add exaone model support by LRL-ModelCloud in https://github.com/ModelCloud/GPTQModel/pull/340
* [CI] Upload wheel to local server by CSY-ModelCloud in https://github.com/ModelCloud/GPTQModel/pull/339
* [MISC] Fix assert by CSY-ModelCloud in https://github.com/ModelCloud/GPTQModel/pull/342


**Full Changelog**: https://github.com/ModelCloud/GPTQModel/compare/v0.9.10...v0.9.11

0.9.10

What's Changed

Ported the vllm/nm gptq_marlin inference kernel with expanded bits (8-bit), group_size (64, 32), and desc_act support for all GPTQ models with `format = FORMAT.GPTQ`. Auto-calculate AutoRound nsamples/seqlen parameters based on the calibration dataset. Fixed `save_quantized()` when called on pre-quantized models with unsupported backends. HF Transformers dependency updated to ensure Llama 3.1 fixes are correctly applied to both the quantization and inference stages.
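
A hedged sketch of opting into the ported kernel at load time via the `BACKEND` enum; the loader entry point and the checkpoint id are assumptions:

```python
from gptqmodel import GPTQModel, BACKEND

# Force the ported gptq_marlin kernel instead of automatic backend selection.
# Applies to checkpoints saved with `format = FORMAT.GPTQ`; model id is illustrative.
model = GPTQModel.from_quantized(
    "ModelCloud/Llama-3-8B-gptq-4bit",
    backend=BACKEND.MARLIN,
)
```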

* [CORE] add marlin inference kernel by ZX-ModelCloud in https://github.com/ModelCloud/GPTQModel/pull/310
* [CI] Increase timeout to 40m by CSY-ModelCloud in https://github.com/ModelCloud/GPTQModel/pull/295, https://github.com/ModelCloud/GPTQModel/pull/299
* [FIX] save_quantized() by ZX-ModelCloud in https://github.com/ModelCloud/GPTQModel/pull/296
* [FIX] autoround nsample/seqlen to be actual size of calibration_dataset by LRL-ModelCloud in https://github.com/ModelCloud/GPTQModel/pull/297, https://github.com/ModelCloud/GPTQModel/pull/298
* Update HF transformers to 4.43.3 by Qubitium in https://github.com/ModelCloud/GPTQModel/pull/305
* [CI] remove test_marlin_hf_cache_serialization() by ZX-ModelCloud in https://github.com/ModelCloud/GPTQModel/pull/314

**Full Changelog**: https://github.com/ModelCloud/GPTQModel/compare/v0.9.9...v0.9.10
