hqq

Latest version: v0.2.5


0.1.8

- Add BitBlas backend support
- Simpler `HQQLinear` construction from weights: `HQQLinear.from_weights(W, bias, etc.)`
- Fix memory leak when swapping layers for the TorchAO backend
- Add `HQQLinear.unpack()` call

0.1.7.post3

- Enable CPU quantization and runtime
- Fix `_load_state_dict`
- Fix `extra_repr` in `HQQLinear`
- Fix `from_quantized` bugs
- Fix `|` typing
- Fix 3-bit `axis=1` slicing bug
- Add 5/6-bit quantization for testing

0.1.7.post2

- Various bug fixes, especially with `AutoHQQHFModel` and the patching logic, to make it work with any transformers model.
- Readme refactoring.
- Whisper example.

0.1.7

- Faster inference with torchao / marlin 4-bit kernels
- Multi-GPU support for `model.quantize()`
- Custom HF generator
- Various bug fixes/improvements

0.1.6.post2

Same as v0.1.6 (https://github.com/mobiusml/hqq/releases/tag/0.1.6) with setup.py fixes:

- find_packages fix: https://github.com/mobiusml/hqq/pull/25
- Auto-build CUDA kernels via the PyPI package: https://github.com/mobiusml/hqq/pull/26

0.1.6.post1

Same as v0.1.6 (https://github.com/mobiusml/hqq/releases/tag/0.1.6) with a find_packages fix: https://github.com/mobiusml/hqq/pull/25
