Exllamav2

Latest version: v0.2.8

0.2.8

- Support Qwen2.5-VL
- Minor bugfixes

**Full Changelog**: https://github.com/turboderp-org/exllamav2/compare/v0.2.7...v0.2.8

0.2.7

- Basic video support for Qwen2-VL
- Support Cohere2 arch
- Support Granite3 arch
- Couple of bugfixes

**Full Changelog**: https://github.com/turboderp/exllamav2/compare/v0.2.6...v0.2.7

0.2.6

- Some small fixes, most notably for Qwen2-VL inference on Windows

**Full Changelog**: https://github.com/turboderp/exllamav2/compare/v0.2.5...v0.2.6

0.2.5

- Initial support for Qwen2-VL (images for now, no video)
- Some bugfixes

**Full Changelog**: https://github.com/turboderp/exllamav2/compare/v0.2.4...v0.2.5

0.2.4

- Support Pixtral
- Refactoring for more multimodal support
- Faster filter evaluation
- Various optimizations and bugfixes
- Various quality of life improvements

**Full Changelog**: https://github.com/turboderp/exllamav2/compare/v0.2.3...v0.2.4

0.2.3

- No longer use safetensors for loading weights (fixes virtual memory issues, especially on Windows)
- Disable fasttensors option (now redundant)
- Prioritize HF Tokenizers model when both HF and SPM models available
- Add XTC sampler
- Add YaRN support
- Various fixes and QoL improvements

**Full Changelog**: https://github.com/turboderp/exllamav2/compare/v0.2.2...v0.2.3
