Exllamav2

Latest version: v0.2.8


0.0.16

- Support for Cohere models
- N-gram decoding
- A few bugfixes
- Lots of optimizations

**Full Changelog**: https://github.com/turboderp/exllamav2/compare/v0.0.15...v0.0.16
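The n-gram decoding added in this release speculates future tokens by matching the recent suffix of the context against earlier occurrences of the same n-gram, then verifying the drafted tokens with the model in one batched pass. A minimal sketch of the matching step (the `ngram_draft` helper is hypothetical, not exllamav2's actual implementation):

```python
def ngram_draft(tokens, n=3, max_draft=4):
    """Propose draft tokens by matching the last (n-1) tokens against
    earlier occurrences in the context. Illustrative sketch only."""
    if len(tokens) < n:
        return []
    key = tuple(tokens[-(n - 1):])
    # Scan backwards for the most recent earlier occurrence of the suffix.
    for i in range(len(tokens) - n, -1, -1):
        if tuple(tokens[i:i + n - 1]) == key:
            # Draft the tokens that followed that occurrence.
            return tokens[i + n - 1:i + n - 1 + max_draft]
    return []
```

Drafted tokens that the model would have produced anyway are accepted for free; a mismatch falls back to normal decoding, so the output distribution is unchanged.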

0.0.15

- Q4 cache mode
- Support for StarCoder2
- Minor optimizations and a couple of bugfixes

**Full Changelog**: https://github.com/turboderp/exllamav2/compare/v0.0.14...v0.0.15
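Q4 cache mode stores the attention key/value cache at roughly 4 bits per element instead of FP16, quartering cache memory at a small accuracy cost. The idea can be sketched as a per-block affine quantizer (illustrative only; the real implementation uses grouped, fused GPU kernels):

```python
def quantize_q4(values):
    """Map floats to 4-bit codes (0..15) with a per-block scale and
    zero point. Sketch of the idea behind a Q4 KV cache."""
    lo, hi = min(values), max(values)
    scale = (hi - lo) / 15.0 or 1.0  # guard against a constant block
    q = [min(15, max(0, round((v - lo) / scale))) for v in values]
    return q, scale, lo

def dequantize_q4(q, scale, lo):
    """Recover approximate floats from the 4-bit codes."""
    return [code * scale + lo for code in q]
```

In practice two codes are packed per byte and the scale/zero point are stored per group, which is where the memory saving comes from.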

0.0.14

Adds support for Qwen1.5 and Gemma architectures.

Various fixes and optimizations.

**Full Changelog since 0.0.13**: https://github.com/turboderp/exllamav2/compare/v0.0.13...v0.0.14

0.0.13.post2

**Full Changelog**: https://github.com/turboderp/exllamav2/compare/0.0.13.post1...0.0.13.post2

0.0.13.post1

Fixes inference on models with vocab sizes that are not multiples of 32
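GPU kernels commonly assume the logit dimension is padded to a round multiple, so a vocab size that is not a multiple of 32 has to be rounded up and the padded slots masked out. A toy illustration of the rounding (not the actual fix):

```python
def pad_to_multiple(n, multiple=32):
    """Round n up to the next multiple, via the ceiling-division trick."""
    return -(-n // multiple) * multiple
```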

0.0.13

This release mainly updates the prebuilt wheels to Torch 2.2, since Torch 2.2 won't load extensions built for earlier versions.

Also adds dynamic temperature and quadratic sampling, fixes performance degradation seen on some GPUs after the recent batch optimizations, and addresses various other small issues.
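Both new samplers reshape the token distribution before sampling: dynamic temperature picks a temperature from the entropy of the distribution (confident predictions get a lower temperature), while quadratic sampling bends the logits with a parabola centred on the top logit, sharpening the head of the distribution without reordering it. A hedged sketch of the general ideas (the formulas here are illustrative; exllamav2's exact formulas may differ):

```python
import math

def dynamic_temperature(probs, t_min=0.5, t_max=1.5, exponent=1.0):
    """Interpolate temperature by normalized entropy:
    low entropy -> t_min, maximum entropy -> t_max."""
    h = -sum(p * math.log(p) for p in probs if p > 0)
    h_max = math.log(len(probs))
    norm = (h / h_max) if h_max > 0 else 0.0
    return t_min + (t_max - t_min) * norm ** exponent

def quadratic_transform(logits, smoothing=0.2):
    """Replace each logit with a downward parabola centred on the
    maximum logit; order is preserved, the head is sharpened."""
    m = max(logits)
    return [m - smoothing * (x - m) ** 2 for x in logits]
```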
