Optimum-quanto

Latest version: v0.2.7

0.2.7

What's Changed

What's new
* Add repr for QuantizedTransformersModel by imba-tjd in https://github.com/huggingface/optimum-quanto/pull/357
* Bump minimal pytorch version to 2.6 by dacorvo in https://github.com/huggingface/optimum-quanto/pull/373

Bug fixes
* [tests] enable testing for xpu (rebased) by dacorvo in https://github.com/huggingface/optimum-quanto/pull/349
* enable qbitstensor test on xpu by dacorvo in https://github.com/huggingface/optimum-quanto/pull/350
* fix(library): only compile CUDA extension on Linux by dacorvo in https://github.com/huggingface/optimum-quanto/pull/365
* Fix error when trying to access `state_dict` after activation quantization by DN6 in https://github.com/huggingface/optimum-quanto/pull/371

New Contributors
* imba-tjd made their first contribution in https://github.com/huggingface/optimum-quanto/pull/357
* DN6 made their first contribution in https://github.com/huggingface/optimum-quanto/pull/371

**Full Changelog**: https://github.com/huggingface/optimum-quanto/compare/v0.2.6...v0.2.7
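
Since 0.2.7 raises the minimal PyTorch version to 2.6, it can help to check an environment before upgrading. The following is a minimal sketch using only the standard library; the helper names (`parse_major_minor`, `meets_minimum`, `installed_torch_ok`) are hypothetical and not part of optimum-quanto, and the check assumes the installed distribution is named `torch`.

```python
import re
from importlib import metadata
from typing import Optional, Tuple

# Minimum taken from the 0.2.7 release note ("Bump minimal pytorch version to 2.6").
MINIMUM_TORCH = (2, 6)


def parse_major_minor(version: str) -> Tuple[int, int]:
    """Extract (major, minor) from a version string such as '2.6.0+cu124'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"Unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def meets_minimum(version: str, minimum: Tuple[int, int] = MINIMUM_TORCH) -> bool:
    """Return True if the version is at least the given (major, minor)."""
    return parse_major_minor(version) >= minimum


def installed_torch_ok() -> Optional[bool]:
    """Check the installed torch distribution; None means torch is not installed."""
    try:
        return meets_minimum(metadata.version("torch"))
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    print(installed_torch_ok())
```

Local build suffixes such as `+cu124` are ignored because only the leading major and minor numbers are compared.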

0.2.6

What's Changed

* Add hip support by Disty0 in https://github.com/huggingface/optimum-quanto/pull/330
* Switched linters, black -> ruff by ishandeva in https://github.com/huggingface/optimum-quanto/pull/334
* Add marlin int4 kernel by dacorvo and shcho1118 in https://github.com/huggingface/optimum-quanto/pull/333
* fix: use reshape instead of view by dacorvo in https://github.com/huggingface/optimum-quanto/pull/338
* Support QLayerNorm without weights by dacorvo in https://github.com/huggingface/optimum-quanto/pull/341

New Contributors
* ishandeva made their first contribution in https://github.com/huggingface/optimum-quanto/pull/334
* Disty0 made their first contribution in https://github.com/huggingface/optimum-quanto/pull/330
* shcho1118 made their first contribution in https://github.com/huggingface/optimum-quanto/pull/333

**Full Changelog**: https://github.com/huggingface/optimum-quanto/compare/v0.2.5...v0.2.6

0.2.5

New features

- Load and save models from the Hugging Face hub #263 by sayakpaul
- Add support for float8 e4m3fnuz #310 (from #281) by maktukmak
- Faster and less memory-intensive requantization #290 by latentCall145
- Support torch.equal for QTensor #294 by dacorvo
- Add Marlin Float8 kernel #296 (from #241) by fxmarty
- Add Whisper for speech recognition example #298 (from #242) by mattiadg
- Add ViT classification example #308 by shovan777

Bug fixes

- Fix include patterns in quantize #271 by kaibioinfo
- Enable non-strict loading of state dicts #295 by BenjaminBossan
- Fix transformers forward error #303 by dacorvo
- Fix missing call in transformers models #325 by dacorvo
- Fix 8-bit mm calls for 4D inputs #326 by dacorvo

**Full Changelog**: https://github.com/huggingface/optimum-quanto/compare/v0.2.4...v0.2.5
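
The "include patterns" fix above concerns how `quantize` selects which submodules to process. As a rough illustration only, the sketch below assumes that include and exclude patterns are shell-style wildcards matched against module names; it uses the standard-library `fnmatch` rather than the library's own code, and `select_modules` and the module names are hypothetical.

```python
from fnmatch import fnmatch


def select_modules(names, include=None, exclude=None):
    """Filter module names with shell-style wildcard patterns.

    A name is kept if it matches at least one include pattern (when given)
    and matches no exclude pattern (when given).
    """
    # Accept a single pattern or a list of patterns.
    include = [include] if isinstance(include, str) else include
    exclude = [exclude] if isinstance(exclude, str) else exclude

    selected = []
    for name in names:
        if include is not None and not any(fnmatch(name, p) for p in include):
            continue
        if exclude is not None and any(fnmatch(name, p) for p in exclude):
            continue
        selected.append(name)
    return selected


# Hypothetical module names, for illustration only.
module_names = [
    "encoder.layers.0.attn",
    "encoder.layers.0.mlp",
    "decoder.layers.0.mlp",
    "lm_head",
]

print(select_modules(module_names, include="*.mlp"))
print(select_modules(module_names, exclude=["lm_head"]))
```

With these names, the first call keeps only the two `mlp` entries and the second keeps everything except `lm_head`.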

0.2.4

Bug Fixes:

- fix import error in `optimum-cli` when diffusers is not installed by dacorvo

**Full Changelog**: https://github.com/huggingface/optimum-quanto/compare/v0.2.3...v0.2.4

0.2.3

What's Changed

* Use new int8 torch kernels by dacorvo in https://github.com/huggingface/optimum-quanto/pull/222
* Rebuild extension when pytorch is updated by dacorvo in https://github.com/huggingface/optimum-quanto/pull/223
* Use tinygemm bfloat16 / int4 kernel whenever possible by dacorvo in https://github.com/huggingface/optimum-quanto/pull/234
* Add HQQ optimizer by dacorvo in https://github.com/huggingface/optimum-quanto/pull/235
* Add QuantizedModelForCausalLM by dacorvo in https://github.com/huggingface/optimum-quanto/pull/243
* Integrate quanto commands to optimum-cli by dacorvo in https://github.com/huggingface/optimum-quanto/pull/244
* Add pixart-sigma test to image example by dacorvo in https://github.com/huggingface/optimum-quanto/pull/247
* Support diffusion models. by sayakpaul in https://github.com/huggingface/optimum-quanto/pull/255

Bug fixes

* Fix: align extension on max arch by dacorvo in https://github.com/huggingface/optimum-quanto/pull/227
* Fix TinyGemmQBitsTensor move by dacorvo in https://github.com/huggingface/optimum-quanto/pull/246
* Fix stream-lining bug by dacorvo in https://github.com/huggingface/optimum-quanto/pull/249
* Fix float/int8 matrix multiplication latency regression by dacorvo in https://github.com/huggingface/optimum-quanto/pull/250
* Fix serialization issues by dacorvo in https://github.com/huggingface/optimum-quanto/pull/258

New Contributors
* sayakpaul made their first contribution in https://github.com/huggingface/optimum-quanto/pull/255

**Full Changelog**: https://github.com/huggingface/optimum-quanto/compare/v0.2.2...v0.2.3

0.2.2

New features

- Add OWLv2 detection example by dacorvo
- Use new torch quantization kernels by dacorvo

Bug fixes

- Avoid CUDA compilation errors on older NVIDIA cards (pre-Ampere) by dacorvo
- Recompile extensions when pytorch is updated and prevent segfault by dacorvo
