- Support static cache compilation without using `HFGenerator` (sketch below)
- Fix various issues related to `torch.compile`
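A minimal sketch of what this enables, assuming transformers' static-cache `generate()` path works with an hqq-quantized model (the model id and quantization settings below are illustrative): quantize with hqq, set `cache_implementation="static"`, and compile the forward pass directly, with no `HFGenerator` wrapper.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from hqq.core.quantize import BaseQuantizeConfig
from hqq.models.hf.base import AutoHQQHFModel

model_id  = "meta-llama/Llama-2-7b-hf"  # illustrative choice
model     = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.float16)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Quantize in place with hqq
AutoHQQHFModel.quantize_model(
    model,
    quant_config=BaseQuantizeConfig(nbits=4, group_size=64),
    compute_dtype=torch.float16,
    device="cuda",
)

# Static cache + torch.compile via plain transformers generate(),
# no HFGenerator wrapper needed
model.generation_config.cache_implementation = "static"
model.forward = torch.compile(model.forward, mode="reduce-overhead", fullgraph=True)

inputs = tokenizer("Hello, my name is", return_tensors="pt").to("cuda")
out    = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```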
0.2.1
- `HQQLinear.state_dict()` for non-initialized layers (sketch below). Mainly used for https://github.com/huggingface/transformers/pull/33141
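A minimal sketch, assuming `linear_layer=None` yields a shell layer whose quantized weights get loaded later, as in the transformers integration (settings are illustrative):

```python
import torch
from hqq.core.quantize import HQQLinear, BaseQuantizeConfig

quant_config = BaseQuantizeConfig(nbits=4, group_size=64)

# Shell layer: no weights have been quantized yet
layer = HQQLinear(None, quant_config, compute_dtype=torch.float16, device="cpu")

# state_dict() now works even though the layer was never initialized,
# which the safetensors loading path relies on
sd = layer.state_dict()
```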
0.2.0
- Bug fixes
- Safetensors support for transformers via https://github.com/huggingface/transformers/pull/33141
- `quant_scale`, `quant_zero`, and `offload_meta` are now deprecated. You can still use them with the hqq lib, but not with the transformers lib (example below)
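For reference, a config that avoids the deprecated flags by simply omitting them; a plain config like this works with both the hqq lib and the transformers lib (the values are illustrative):

```python
from hqq.core.quantize import BaseQuantizeConfig

# quant_zero, quant_scale, and offload_meta are simply left out
quant_config = BaseQuantizeConfig(nbits=4, group_size=64)
```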
0.1.8
- Add BitBlas backend support
- Simpler `HQQLinear` creation from weights: `HQQLinear.from_weights(W, bias, etc.)` (sketch below)
- Fix memory leak while swapping layers for the TorchAO backend
- Add `HQQLinear.unpack()` call
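A minimal sketch of the new calls, assuming `from_weights` takes the weight tensor and an optional bias alongside the usual quantization arguments, and that `unpack()` needs no arguments (shapes and settings are illustrative):

```python
import torch
from hqq.core.quantize import HQQLinear, BaseQuantizeConfig

W    = torch.randn(4096, 4096, dtype=torch.float16)
bias = None

# Build a quantized layer directly from a raw weight tensor
layer = HQQLinear.from_weights(
    W, bias,
    quant_config=BaseQuantizeConfig(nbits=4, group_size=64),
    compute_dtype=torch.float16,
    device="cuda",
)

# Recover the unpacked quantized weights
W_q = layer.unpack()
```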
0.1.7.post3
- Enable CPU quantization and runtime (sketch below)
- Fix `_load_state_dict`
- Fix `extra_repr` in `HQQLinear`
- Fix `from_quantized` bugs
- Fix `|` typing
- Fix 3-bit `axis=1` slicing bug
- Add 5/6-bit quantization for testing
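A minimal sketch of CPU quantization plus inference, assuming `device="cpu"` is accepted for both steps (layer size and settings are illustrative):

```python
import torch
import torch.nn as nn
from hqq.core.quantize import HQQLinear, BaseQuantizeConfig

linear    = nn.Linear(1024, 1024, bias=False)
hqq_layer = HQQLinear(
    linear,
    quant_config=BaseQuantizeConfig(nbits=4, group_size=64),
    compute_dtype=torch.float32,
    device="cpu",
)

x = torch.randn(1, 1024, dtype=torch.float32)
y = hqq_layer(x)  # both quantization and the forward pass run on CPU
```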
0.1.7.post2
- Various bug fixes, especially with `AutoHQQHFModel` and the patching logic, to make it work with any transformers model
- Readme refactoring
- Whisper example