hqq

Latest version: v0.2.5



0.1.2

Improvements
- Added LoRA support
- Added LoRA with fake quantization support (experimental)
- Optimizer V2 with scale update support
- Some code refactoring in quantize.py
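To illustrate the "fake quantization" idea behind the experimental LoRA mode: weights are rounded to an n-bit grid and immediately dequantized, so downstream code (e.g. a LoRA adapter being trained) sees the quantization error while all arithmetic stays in floating point. This is a minimal pure-Python sketch of the concept; the function name and signature are hypothetical, not the hqq API.

```python
def fake_quantize(weights, nbits=4):
    """Round weights to a symmetric nbits grid, then map them back to floats."""
    qmax = 2 ** (nbits - 1) - 1                # e.g. 7 for 4-bit symmetric
    scale = max(abs(w) for w in weights) / qmax or 1.0
    # Quantize: scale into the integer grid, round, and clamp.
    q = [max(-qmax - 1, min(qmax, round(w / scale))) for w in weights]
    # Dequantize: back to floats; the rounding error is the "fake" part.
    return [v * scale for v in q]

deq = fake_quantize([0.11, -0.52, 0.98, -1.0])
```

Gradients can then flow through the dequantized values (with a straight-through estimator in practice), which is what makes training LoRA adapters on top of quantized weights possible.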

0.1.1.post1

No improvements over v0.1.1. Just removed PyTorch from the dependencies and updated the README.

0.1.1

Improvements
- Added Mixtral support for Hugging Face.
- Added support for layer-wise custom quantization configs.
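A layer-wise custom quantization config typically means a default setting plus per-layer overrides keyed by layer name. The sketch below shows the general pattern; the dict keys and the `config_for` helper are illustrative assumptions, not hqq's actual API.

```python
# Default quantization settings, with per-layer overrides matched by
# name suffix. A value of None marks a layer to leave unquantized.
default_cfg = {"nbits": 4, "group_size": 64}
layer_overrides = {
    "self_attn.v_proj": {"nbits": 8, "group_size": 128},  # higher precision for V projections
    "lm_head": None,                                      # skip quantizing the output head
}

def config_for(layer_name):
    """Return the quantization config for a layer, honoring overrides."""
    for suffix, cfg in layer_overrides.items():
        if layer_name.endswith(suffix):
            return cfg
    return default_cfg

cfg = config_for("model.layers.0.self_attn.v_proj")
```

Matching by suffix keeps one override applicable to every decoder block, which is usually what you want for repeated transformer layers.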

0.1.0

Improvements
- Added compile backend support
- Added Aten C++ backend (experimental)
- Faster bit unpacking via pre-allocated empty tensor
- Added vLLM support
- Refactoring to call `quantize_model()` on instances
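The bit-unpacking speedup comes from writing into a buffer allocated once up front instead of growing an output per element. This is a pure-Python stand-in for what hqq does with a pre-allocated empty torch tensor; the function name is illustrative.

```python
def unpack_4bit(packed):
    """Unpack bytes holding two 4-bit values each (high nibble first)."""
    out = [0] * (2 * len(packed))          # pre-allocated output buffer
    for i, byte in enumerate(packed):
        out[2 * i] = byte >> 4             # high nibble
        out[2 * i + 1] = byte & 0x0F       # low nibble
    return out

vals = unpack_4bit(bytes([0xAB, 0x07]))    # -> [10, 11, 0, 7]
```

In the tensor version, the same trick avoids intermediate allocations entirely: the nibbles are masked and shifted directly into the pre-sized destination.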

Supported models
- Llama (Hugging Face + vLLM)
- ViT-CLIP (timm)

Limitations
- Hugging Face only supports single-GPU runtime.
- vLLM only supports a single GPU with a single worker.
- The compile backend sometimes creates issues with the async runtime.
- Doesn't support PEFT (LoRA, etc.).

0.1.0alpha

Alpha version with basic Hugging Face/timm support.

Supported models

- Llama (Hugging Face)
- ViT (timm)

Limitations
- Uses a pure PyTorch implementation without optimizations.
- Only supports single-GPU runtime.
- Doesn't support PEFT (LoRA, etc.) for custom training.

