Auto-round


0.4.6

**Highlights:**
1. Set torch compile to false by default (see the sketch below) in https://github.com/intel/auto-round/pull/447
2. Fixed a packing hang and forced fp16 at exporting in https://github.com/intel/auto-round/pull/430
3. Aligned auto_quantizer with Transformers 4.49 in https://github.com/intel/auto-round/pull/437
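
Since torch compile is now off by default, users who relied on it must opt back in. Below is a minimal sketch, assuming the `enable_torch_compile` keyword suggested by the PR title; verify the exact argument name against your installed version.

```python
# Hedged sketch: re-enabling torch compile now that v0.4.6 defaults it to off.
# `enable_torch_compile` is assumed from the PR title; check AutoRound's
# signature in your installed version.
from transformers import AutoModelForCausalLM, AutoTokenizer
from auto_round import AutoRound

model_name = "facebook/opt-125m"  # small placeholder model for illustration
model = AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

autoround = AutoRound(model, tokenizer, bits=4, group_size=128,
                      enable_torch_compile=True)  # assumed opt-in flag
autoround.quantize()
```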


What's Changed
* Fix packing hang, torch compile and force to fp16 at exporting by wenhuach21 in https://github.com/intel/auto-round/pull/430
* fix nblocks issues by wenhuach21 in https://github.com/intel/auto-round/pull/432
* rm gc collect in packing by wenhuach21 in https://github.com/intel/auto-round/pull/438
* align auto_quantizer with main branch in Transformers by WeiweiZhang1 in https://github.com/intel/auto-round/pull/437
* [HPU]Fix compile bug when quant layer by yiliu30 in https://github.com/intel/auto-round/pull/441
* remove tricky setting in mxfp4 by wenhuach21 in https://github.com/intel/auto-round/pull/445
* fix bug of evaluate user model by n1ck-guo in https://github.com/intel/auto-round/pull/444
* Refine funcs by WeiweiZhang1 in https://github.com/intel/auto-round/pull/446
* set torch compile to false by default by WeiweiZhang1 in https://github.com/intel/auto-round/pull/447


**Full Changelog**: https://github.com/intel/auto-round/compare/v0.4.5...v0.4.6

0.4.5

**Highlights:**
We have enhanced support for extremely large models with the following updates:

**Multi-Card Tuning Support:** Added basic support for multi-GPU tuning ([#415](https://github.com/intel/auto-round/pull/415)).
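
As a rough illustration of the naive multi-card path, the sketch below assumes the model is simply sharded across visible GPUs with `device_map="auto"` before tuning; treat this as an assumption about the mechanism rather than documented usage.

```python
# Hedged sketch of naive multi-GPU tuning (PR #415). Sharding the model with
# device_map="auto" is an assumed way to engage the multi-card path.
from transformers import AutoModelForCausalLM, AutoTokenizer
from auto_round import AutoRound

model_name = "meta-llama/Llama-2-70b-hf"  # hypothetical extremely large model
model = AutoModelForCausalLM.from_pretrained(
    model_name, device_map="auto", torch_dtype="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

autoround = AutoRound(model, tokenizer, bits=4, group_size=128)
autoround.quantize()
```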

**Accelerated Packing Stage:** Improved packing speed by 2x-4x for the AutoGPTQ and AutoAWQ formats by leveraging CUDA ([#407](https://github.com/intel/auto-round/pull/407)).

**Deepseek V3 GGUF Export:** Introduced support for exporting models to the Deepseek V3 GGUF format ([#416](https://github.com/intel/auto-round/pull/416)).
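
A minimal sketch of the new export path follows; the `"gguf:q4_0"` format string, the 32-wide group size, and the model id are assumptions, so confirm them against the release documentation.

```python
# Hedged sketch: exporting to GGUF after quantization (PR #416). The
# "gguf:q4_0" format identifier is an assumption; confirm the exact string
# accepted by save_quantized in this release.
from transformers import AutoModelForCausalLM, AutoTokenizer
from auto_round import AutoRound

model_name = "deepseek-ai/DeepSeek-V3"  # per the PR; any causal LM works here
model = AutoModelForCausalLM.from_pretrained(model_name, trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)

autoround = AutoRound(model, tokenizer, bits=4, group_size=32)  # GGUF q4_0 uses 32-wide blocks
autoround.quantize()
autoround.save_quantized("./deepseek-v3-gguf", format="gguf:q4_0")
```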


What's Changed
* update format readme by wenhuach21 in https://github.com/intel/auto-round/pull/411
* fix log bug and device "auto" bug by n1ck-guo in https://github.com/intel/auto-round/pull/409
* speedup packing stage for autogptq and autoawq format by wenhuach21 in https://github.com/intel/auto-round/pull/407
* support naive multi-card tuning by wenhuach21 in https://github.com/intel/auto-round/pull/415
* support bf16 inference for autoround format by wenhuach21 in https://github.com/intel/auto-round/pull/420
* enable backup pile dataset loading by WeiweiZhang1 in https://github.com/intel/auto-round/pull/417
* fix evaluation device bug, relate to issue 413 by n1ck-guo in https://github.com/intel/auto-round/pull/419
* support to export deepseek v3 gguf format by n1ck-guo in https://github.com/intel/auto-round/pull/416
* fix cuda UT torch_dtype by WeiweiZhang1 in https://github.com/intel/auto-round/pull/423
* fix eval trust_remote_code by n1ck-guo in https://github.com/intel/auto-round/pull/424


**Full Changelog**: https://github.com/intel/auto-round/compare/v0.4.4...v0.4.5

0.4.4

**Highlights:**
1. Fixed the install issue in https://github.com/intel/auto-round/pull/387
2. Added support for exporting the GGUF q4_0 and q4_1 formats in https://github.com/intel/auto-round/pull/393
3. Fixed the LLM command-line seqlen issue in https://github.com/intel/auto-round/pull/399 (see the sketch after this list)
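
The sketch below ties highlights 2 and 3 together: an explicit calibration sequence length plus a q4_1 export. The `seqlen` keyword and the `"gguf:q4_1"` format string are assumptions inferred from the PR titles.

```python
# Hedged sketch combining the seqlen fix (PR #399) with the new GGUF
# q4_0/q4_1 export (PR #393). The seqlen kwarg and "gguf:q4_1" format id
# are assumptions; verify against your installed version.
from transformers import AutoModelForCausalLM, AutoTokenizer
from auto_round import AutoRound

model_name = "facebook/opt-125m"  # placeholder model for illustration
model = AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

autoround = AutoRound(model, tokenizer, bits=4, group_size=32, seqlen=2048)
autoround.quantize()
autoround.save_quantized("./opt-125m-gguf", format="gguf:q4_1")
```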

What's Changed
* fix a critical bug of static activation quantization by wenhuach21 in https://github.com/intel/auto-round/pull/392
* vlm 70B+ in single card by n1ck-guo in https://github.com/intel/auto-round/pull/395
* enhance calibration dataset and add awq pre quantization warning by wenhuach21 in https://github.com/intel/auto-round/pull/396
* support awq format for vlms by WeiweiZhang1 in https://github.com/intel/auto-round/pull/398
* [critical bug] fix llm example seqlen issue by WeiweiZhang1 in https://github.com/intel/auto-round/pull/399
* fix device auto issue by wenhuach21 in https://github.com/intel/auto-round/pull/400
* Fix auto-round install & bump into 0.4.4 by XuehaoSun in https://github.com/intel/auto-round/pull/387
* fix dtype converting issue by wenhuach21 in https://github.com/intel/auto-round/pull/403
* support for deepseek vl2 by n1ck-guo in https://github.com/intel/auto-round/pull/401
* llm_layer_config_bugfix by WeiweiZhang1 in https://github.com/intel/auto-round/pull/406
* support awq with qbits, only support sym by wenhuach21 in https://github.com/intel/auto-round/pull/402
* support to export gguf q4_0 and q4_1 format by n1ck-guo in https://github.com/intel/auto-round/pull/393


**Full Changelog**: https://github.com/intel/auto-round/compare/v0.4.3...v0.4.4

0.4.3

**Highlights:**
- Fixed incorrect device setting in AutoRound format inference by WeiweiZhang1 in https://github.com/intel/auto-round/pull/383
- Removed the dependency on AutoGPTQ by XuehaoSun in https://github.com/intel/auto-round/pull/380


What's Changed
* support_llava_hf_vlm_example by WeiweiZhang1 in https://github.com/intel/auto-round/pull/381
* fix block_name_to_quantize by WeiweiZhang1 in https://github.com/intel/auto-round/pull/382
* fix incorrect device setting in autoround format inference by WeiweiZhang1 in https://github.com/intel/auto-round/pull/383
* refine homepage, update model links by WeiweiZhang1 in https://github.com/intel/auto-round/pull/385
* update eval basic usage by n1ck-guo in https://github.com/intel/auto-round/pull/384
* refine error msg and dump more log in the tuning by wenhuach21 in https://github.com/intel/auto-round/pull/386
* remove the dependency on AutoGPTQ for CPU and bump to V0.4.3 by XuehaoSun in https://github.com/intel/auto-round/pull/380


**Full Changelog**: https://github.com/intel/auto-round/compare/v0.4.2...v0.4.3

0.4.2

**Highlights:**
1. Fixed the AutoAWQ exporting issue (see the sketch below)
2. Removed bias exporting when possible in the AutoGPTQ format
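
Both highlights concern the export step. A minimal sketch of exporting one quantized model to both formats, assuming the `"auto_awq"` and `"auto_gptq"` format identifiers:

```python
# Hedged sketch: exporting to the two formats touched by this release.
# The "auto_awq" / "auto_gptq" format identifiers are assumptions; confirm
# them against the save_quantized docs for v0.4.2.
from transformers import AutoModelForCausalLM, AutoTokenizer
from auto_round import AutoRound

model_name = "facebook/opt-125m"
model = AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

autoround = AutoRound(model, tokenizer, bits=4, group_size=128, sym=True)
autoround.quantize()
autoround.save_quantized("./opt-125m-awq", format="auto_awq")    # AWQ export fix
autoround.save_quantized("./opt-125m-gptq", format="auto_gptq")  # bias dropped when possible
```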

What's Changed
* bump version into v0.4.1 by XuehaoSun in https://github.com/intel/auto-round/pull/350
* Update docker user and remove baseline UT by XuehaoSun in https://github.com/intel/auto-round/pull/347
* delete llm example and refine readme by wenhuach21 in https://github.com/intel/auto-round/pull/354
* Simulated W4Afp8 Quantization by wenhuach21 in https://github.com/intel/auto-round/pull/331
* add QWQ-32B, VLM, Qwen2.5, Llama3.1 int4 models by wenhuach21 in https://github.com/intel/auto-round/pull/356
* fix awq exporting by wenhuach21 in https://github.com/intel/auto-round/pull/358
* Tensor reshape bugfix by WeiweiZhang1 in https://github.com/intel/auto-round/pull/364
* fix awq backend and fp_layers issue by wenhuach21 in https://github.com/intel/auto-round/pull/363
* fix awq exporting bugs by wenhuach21 in https://github.com/intel/auto-round/pull/365
* fix bug of only_text_test check due to inference issue on cpu by n1ck-guo in https://github.com/intel/auto-round/pull/362
* add gpu test by wenhuach21 in https://github.com/intel/auto-round/pull/367
* using multicard when device set to "auto" by n1ck-guo in https://github.com/intel/auto-round/pull/368
* quant_block_names enhancement by WeiweiZhang1 in https://github.com/intel/auto-round/pull/369
* [HPU] Add lazy mode back by yiliu30 in https://github.com/intel/auto-round/pull/371
* remove bias exporting if possible in autogptq format by wenhuach21 in https://github.com/intel/auto-round/pull/375
* save processor automatically by n1ck-guo in https://github.com/intel/auto-round/pull/372
* Add gpu ut by wenhuach21 in https://github.com/intel/auto-round/pull/370
* fix gpu ut by n1ck-guo in https://github.com/intel/auto-round/pull/376
* fix typos by wenhuach21 in https://github.com/intel/auto-round/pull/377


**Full Changelog**: https://github.com/intel/auto-round/compare/v0.4.1...v0.4.2

0.4.1

**Highlights:**
- Fixed the calibration infinite-loop issue for llava / llama-vision VLMs.
- Corrected the default value for the sym argument in the API configuration (see the sketch below).
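
Because the default for `sym` changed, pinning it explicitly keeps runs reproducible across versions. A minimal sketch:

```python
# Hedged sketch: passing sym explicitly so behavior does not depend on the
# default that this release corrected.
from transformers import AutoModelForCausalLM, AutoTokenizer
from auto_round import AutoRound

model_name = "facebook/opt-125m"
model = AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

autoround = AutoRound(model, tokenizer, bits=4, group_size=128, sym=True)
autoround.quantize()
```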

What's Changed
* fix typo by wenhuach21 in https://github.com/intel/auto-round/pull/342
* vllm/llama-vision llava calibration infinite loop fix by WeiweiZhang1 in https://github.com/intel/auto-round/pull/343
* [HPU]Enhance `numba` check by yiliu30 in https://github.com/intel/auto-round/pull/345
* [VLM]fix bs and grad reset by n1ck-guo in https://github.com/intel/auto-round/pull/344
* [HPU]Enhance installation check by yiliu30 in https://github.com/intel/auto-round/pull/346
* [Critical Bug]API use sym as default by wenhuach21 in https://github.com/intel/auto-round/pull/349
* triton backend requires < 3.0 by wenhuach21 in https://github.com/intel/auto-round/pull/348


**Full Changelog**: https://github.com/intel/auto-round/compare/v0.4...v0.4.1
