GPTQModel

Latest version: v1.4.1


1.0.5

What's Changed
Added partial quantization support for the Llama 3.2 Vision model. v1.0.5 allows quantization of the text layers (the layers responsible for text generation) only; vision-layer support will be added shortly. A Llama 3.2 11B Vision Instruct model quantizes to ~50% of its original size in 4-bit mode. Once vision-layer support is added, the size will drop to the expected ~1/4.
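As a rough back-of-envelope check of the size claims above (a sketch with illustrative assumptions: FP16 base weights at 2 bytes/param, 4-bit weights at 0.5 bytes/param, and text layers holding roughly two-thirds of the parameters — none of these splits are measured figures from the release):

```python
# Rough size estimate for partial (text-only) vs. full 4-bit quantization.
# Assumptions (illustrative, not measured): FP16 base weights (2 bytes/param),
# 4-bit quantized weights (0.5 bytes/param).
FP16_BYTES = 2.0
INT4_BYTES = 0.5

def quantized_fraction(text_params: float, vision_params: float,
                       quantize_vision: bool) -> float:
    """Return quantized model size as a fraction of the full FP16 size."""
    total_fp16 = (text_params + vision_params) * FP16_BYTES
    text_bytes = text_params * INT4_BYTES
    vision_bytes = vision_params * (INT4_BYTES if quantize_vision else FP16_BYTES)
    return (text_bytes + vision_bytes) / total_fp16

# Assumed split: text layers ~2/3 and vision layers ~1/3 of the 11B params.
text, vision = 11e9 * 2 / 3, 11e9 / 3
print(quantized_fraction(text, vision, quantize_vision=False))  # text-only -> ~0.5
print(quantized_fraction(text, vision, quantize_vision=True))   # full      -> ~0.25
```

Under these assumed proportions, text-only quantization yields the ~50% figure quoted above, and quantizing everything yields the expected ~1/4.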

* [MODEL] Add Llama 3.2 Vision (mllama) support by LRL-ModelCloud in https://github.com/ModelCloud/GPTQModel/pull/401


**Full Changelog**: https://github.com/ModelCloud/GPTQModel/compare/v1.0.4...v1.0.5

1.0.4

What's Changed

Liger Kernel support added for a ~50% VRAM reduction during the quantization stage for some models. Added a toggle to disable parallel packing to avoid OOM on larger models. Transformers dependency updated to 4.45.0 for Llama 3.2 support.

* [FEATURE] add a parallel_packing toggle by LRL-ModelCloud in https://github.com/ModelCloud/GPTQModel/pull/393
* [FEATURE] add liger_kernel support by LRL-ModelCloud in https://github.com/ModelCloud/GPTQModel/pull/394


**Full Changelog**: https://github.com/ModelCloud/GPTQModel/compare/v1.0.3...v1.0.4

1.0.3

What's Changed

* [MODEL] Add minicpm3 by LDLINGLINGLING in https://github.com/ModelCloud/GPTQModel/pull/385
* [FIX] fix minicpm3 support by LRL-ModelCloud in https://github.com/ModelCloud/GPTQModel/pull/387
* [MODEL] Added GRIN-MoE support by LRL-ModelCloud in https://github.com/ModelCloud/GPTQModel/pull/388

New Contributors
* LDLINGLINGLING made their first contribution in https://github.com/ModelCloud/GPTQModel/pull/385
* mrT23 made their first contribution in https://github.com/ModelCloud/GPTQModel/pull/386

**Full Changelog**: https://github.com/ModelCloud/GPTQModel/compare/v1.0.2...v1.0.3

1.0.2

What's Changed

Upgraded the AutoRound package to v0.3.0. Pre-built wheel (WHL) and PyPI source releases are now available. Install by downloading our pre-built wheel or with `pip install gptqmodel --no-build-isolation`.

* [CORE] Autoround v0.3 by LRL-ModelCloud in https://github.com/ModelCloud/GPTQModel/pull/368
* [CI] Lots of CI fixups by CSY-ModelCloud

**Full Changelog**: https://github.com/ModelCloud/GPTQModel/compare/v1.0.0...v1.0.2

1.0.0

What's Changed

40% faster multi-threaded `packing`, a new `lm_eval` API, and fixed Python 3.9 compatibility.

* Add `lm_eval` api by PZS-ModelCloud in https://github.com/ModelCloud/GPTQModel/pull/338
* Multi-threaded `packing` in quantization by PZS-ModelCloud in https://github.com/ModelCloud/GPTQModel/pull/354
* [CI] Add TGI unit test by PZS-ModelCloud in https://github.com/ModelCloud/GPTQModel/pull/348
* [CI] Updates by CSY-ModelCloud in https://github.com/ModelCloud/GPTQModel/pull/347, https://github.com/ModelCloud/GPTQModel/pull/352, https://github.com/ModelCloud/GPTQModel/pull/353, https://github.com/ModelCloud/GPTQModel/pull/355, https://github.com/ModelCloud/GPTQModel/pull/357
* Fix python 3.9 compat by PZS-ModelCloud in https://github.com/ModelCloud/GPTQModel/pull/358
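The multi-threaded `packing` step above can be illustrated with a minimal sketch (this is not GPTQModel's actual implementation): packing eight 4-bit quantized values into each 32-bit word, with independent rows dispatched to a thread pool.

```python
from concurrent.futures import ThreadPoolExecutor

def pack_row(vals):
    """Pack eight 4-bit values (0..15) into each 32-bit word, low nibble first."""
    assert len(vals) % 8 == 0
    words = []
    for i in range(0, len(vals), 8):
        word = 0
        for j, v in enumerate(vals[i:i + 8]):
            word |= (v & 0xF) << (4 * j)
        words.append(word)
    return words

def pack_matrix(rows, workers=4):
    # Rows are independent, so they can be packed in parallel.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(pack_row, rows))

packed = pack_matrix([[1, 2, 3, 4, 5, 6, 7, 8]])  # -> [[0x87654321]]
```

Note that in pure Python the GIL limits the speedup; the sketch only illustrates the row-parallel structure, while real packing gains come from native tensor kernels that release the GIL.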


**Full Changelog**: https://github.com/ModelCloud/GPTQModel/compare/v0.9.11...v1.0.0

0.9.11

What's Changed

Added LG EXAONE 3.0 model support. New dynamic per-layer/module flexible quantization, where each layer/module may use different bits/params. Added proper sharding support to `BACKEND.BITBLAS`. Auto-heal quantization errors caused by small damp values.
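A sketch of how a dynamic per-layer/module config might be resolved. The regex-keyed override dict and the `resolve` helper below are illustrative assumptions, not the exact GPTQModel schema:

```python
import re

# Hypothetical dynamic config: the first matching regex overrides the defaults.
DEFAULTS = {"bits": 4, "group_size": 128}
DYNAMIC = {
    r"model\.layers\.0\..*": {"bits": 8},                   # keep layer 0 at 8-bit
    r".*\.mlp\.down_proj": {"bits": 4, "group_size": 64},   # finer groups for down_proj
}

def resolve(module_name, defaults=DEFAULTS, dynamic=DYNAMIC):
    """Return quantization params for a module, applying the first matching override."""
    params = dict(defaults)
    for pattern, override in dynamic.items():
        if re.fullmatch(pattern, module_name):
            params.update(override)
            break
    return params

print(resolve("model.layers.0.self_attn.q_proj"))  # {'bits': 8, 'group_size': 128}
print(resolve("model.layers.5.mlp.down_proj"))     # {'bits': 4, 'group_size': 64}
print(resolve("lm_head"))                          # defaults: {'bits': 4, 'group_size': 128}
```

First-match-wins keeps the rules predictable when multiple patterns could apply to the same module.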

* [CORE] add support for pack and shard to bitblas by LRL-ModelCloud in https://github.com/ModelCloud/GPTQModel/pull/316
* Add `dynamic` bits by PZS-ModelCloud in https://github.com/ModelCloud/GPTQModel/pull/311, https://github.com/ModelCloud/GPTQModel/pull/319, https://github.com/ModelCloud/GPTQModel/pull/321, https://github.com/ModelCloud/GPTQModel/pull/323, https://github.com/ModelCloud/GPTQModel/pull/327
* [MISC] Adjust the validate order of QuantLinear when BACKEND is AUTO by ZX-ModelCloud in https://github.com/ModelCloud/GPTQModel/pull/318
* add save_quantized log model total size by PZS-ModelCloud in https://github.com/ModelCloud/GPTQModel/pull/320
* Auto damp recovery by CSY-ModelCloud in https://github.com/ModelCloud/GPTQModel/pull/326
* [FIX] add missing original_infeatures by CSY-ModelCloud in https://github.com/ModelCloud/GPTQModel/pull/337
* Update Transformers to 4.44.0 by Qubitium in https://github.com/ModelCloud/GPTQModel/pull/336
* [MODEL] add exaone model support by LRL-ModelCloud in https://github.com/ModelCloud/GPTQModel/pull/340
* [CI] Upload wheel to local server by CSY-ModelCloud in https://github.com/ModelCloud/GPTQModel/pull/339
* [MISC] Fix assert by CSY-ModelCloud in https://github.com/ModelCloud/GPTQModel/pull/342


**Full Changelog**: https://github.com/ModelCloud/GPTQModel/compare/v0.9.10...v0.9.11
