AutoAWQ

Latest version: v0.2.8

Page 3 of 5

0.1.8

What's Changed
* Fix MPT by casper-hansen in https://github.com/casper-hansen/AutoAWQ/pull/206
* Add config to Base model by casper-hansen in https://github.com/casper-hansen/AutoAWQ/pull/207
* Add Qwen model by Sanster in https://github.com/casper-hansen/AutoAWQ/pull/182
* Robust quantization for Catcher by casper-hansen in https://github.com/casper-hansen/AutoAWQ/pull/209
* New scaling to improve perplexity by casper-hansen in https://github.com/casper-hansen/AutoAWQ/pull/216
* Benchmark hf generate by casper-hansen in https://github.com/casper-hansen/AutoAWQ/pull/237
* Fix position ids by casper-hansen in https://github.com/casper-hansen/AutoAWQ/pull/215
* Pass `model_init_kwargs` to `check_and_get_model_type` function by rycont in https://github.com/casper-hansen/AutoAWQ/pull/232
* Fixed an issue where the Qwen model had too much error after quantization by jundolc in https://github.com/casper-hansen/AutoAWQ/pull/243
* Load on CPU to avoid OOM by casper-hansen in https://github.com/casper-hansen/AutoAWQ/pull/236
* Update README.md by casper-hansen in https://github.com/casper-hansen/AutoAWQ/pull/245
* [`core`] Make AutoAWQ fused modules compatible with HF transformers by younesbelkada in https://github.com/casper-hansen/AutoAWQ/pull/244
* [`core`] Fix quantization issues with transformers==4.36.0 by younesbelkada in https://github.com/casper-hansen/AutoAWQ/pull/249
* FEAT: Add possibility of skipping modules when quantizing by younesbelkada in https://github.com/casper-hansen/AutoAWQ/pull/248
* Fix quantization issue with transformers >= 4.36.0 by younesbelkada in https://github.com/casper-hansen/AutoAWQ/pull/264
* Mixtral: Mixture of Experts quantization by casper-hansen in https://github.com/casper-hansen/AutoAWQ/pull/251
* Fused rope theta by casper-hansen in https://github.com/casper-hansen/AutoAWQ/pull/270
* FEAT: add llava to autoawq by younesbelkada in https://github.com/casper-hansen/AutoAWQ/pull/250
* Add Baichuan2 Support by AoyuQC in https://github.com/casper-hansen/AutoAWQ/pull/247
* Set default rope_theta on LlamaLikeBlock by casper-hansen in https://github.com/casper-hansen/AutoAWQ/pull/271
* Update news and models supported by casper-hansen in https://github.com/casper-hansen/AutoAWQ/pull/272
* Add vLLM async example by casper-hansen in https://github.com/casper-hansen/AutoAWQ/pull/273
* Bump to v0.1.8 by casper-hansen in https://github.com/casper-hansen/AutoAWQ/pull/274
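
Several of the entries above (notably the new scaling in #216) build on AWQ's central idea: multiply salient weight channels by a per-channel factor and divide the matching activations by the same factor, so the exact result is unchanged while the scaled-up weights survive 4-bit rounding with less relative error. The sketch below is a framework-free illustration of that invariance only; `scale_preserves_output` is an invented name, not part of the AutoAWQ API.

```python
def scale_preserves_output(w, x, s):
    """Illustrate AWQ-style per-channel scaling on a single dot product.

    Multiplying weight channel i by s[i] while dividing the matching
    activation x[i] by s[i] leaves the exact result unchanged; the payoff
    is that the scaled-up salient weights lose less relative precision
    when they are later rounded to low-bit integers.
    """
    orig = sum(wi * xi for wi, xi in zip(w, x))
    scaled = sum((wi * si) * (xi / si) for wi, xi, si in zip(w, x, s))
    return orig, scaled
```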

New Contributors
* Sanster made their first contribution in https://github.com/casper-hansen/AutoAWQ/pull/182
* rycont made their first contribution in https://github.com/casper-hansen/AutoAWQ/pull/232
* jundolc made their first contribution in https://github.com/casper-hansen/AutoAWQ/pull/243
* AoyuQC made their first contribution in https://github.com/casper-hansen/AutoAWQ/pull/247

**Full Changelog**: https://github.com/casper-hansen/AutoAWQ/compare/v0.1.7...v0.1.8

0.1.7

What's Changed
* Build older cuda wheels by casper-hansen in https://github.com/casper-hansen/AutoAWQ/pull/158
* Exclude download of CUDA wheels by casper-hansen in https://github.com/casper-hansen/AutoAWQ/pull/159
* New benchmarks in README by casper-hansen in https://github.com/casper-hansen/AutoAWQ/pull/160
* Fix typo in benchmark command by casper-hansen in https://github.com/casper-hansen/AutoAWQ/pull/161
* Yi support by casper-hansen in https://github.com/casper-hansen/AutoAWQ/pull/167
* Make sure to delete dummy model by casper-hansen in https://github.com/casper-hansen/AutoAWQ/pull/180
* Fix CUDA error: invalid argument by casper-hansen in https://github.com/casper-hansen/AutoAWQ/pull/179
* New logic for passing past_key_value by younesbelkada in https://github.com/casper-hansen/AutoAWQ/pull/177
* Reset cache on new generation by casper-hansen in https://github.com/casper-hansen/AutoAWQ/pull/178
* Adaptive batch sizing by casper-hansen in https://github.com/casper-hansen/AutoAWQ/pull/181
* Pass arguments to AutoConfig by s4rduk4r in https://github.com/casper-hansen/AutoAWQ/pull/97
* Fix cache util logic by casper-hansen in https://github.com/casper-hansen/AutoAWQ/pull/186
* Fix multi-GPU loading and inference by casper-hansen in https://github.com/casper-hansen/AutoAWQ/pull/190
* [`core`] Replace `QuantLlamaMLP` with `QuantFusedMLP` by younesbelkada in https://github.com/casper-hansen/AutoAWQ/pull/188
* [`core`] Add `is_hf_transformers` flag by younesbelkada in https://github.com/casper-hansen/AutoAWQ/pull/195
* Fixed multi-GPU quantization by casper-hansen in https://github.com/casper-hansen/AutoAWQ/pull/196
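
The "Adaptive batch sizing" entry (#181) describes retrying with smaller batches when the GPU runs out of memory. A minimal sketch of that control flow, with `MemoryError` standing in for a CUDA out-of-memory error and `run_adaptive` an invented helper rather than AutoAWQ's actual code:

```python
def run_adaptive(process_batch, items, batch_size):
    """Process items in batches, halving the batch size whenever the
    backend signals out-of-memory (modelled here as MemoryError)."""
    out = []
    i = 0
    while i < len(items):
        while True:
            batch = items[i:i + batch_size]
            try:
                out.extend(process_batch(batch))
                break
            except MemoryError:
                if batch_size == 1:
                    raise  # cannot shrink further; give up
                batch_size //= 2
        i += len(batch)
    return out, batch_size
```

The surviving batch size is reused for later chunks, so the expensive shrink-by-retry only happens once per run.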


**Full Changelog**: https://github.com/casper-hansen/AutoAWQ/compare/v0.1.6...v0.1.7

0.1.6

What's Changed
* Pseudo dequantize function by casper-hansen in https://github.com/casper-hansen/AutoAWQ/pull/127
* CUDA 11.8.0 and 12.1.1 build by casper-hansen in https://github.com/casper-hansen/AutoAWQ/pull/128
* AwqConfig class by casper-hansen in https://github.com/casper-hansen/AutoAWQ/pull/132
* Fix init quant by casper-hansen in https://github.com/casper-hansen/AutoAWQ/pull/136
* Update readme by casper-hansen in https://github.com/casper-hansen/AutoAWQ/pull/137
* Benchmark info by casper-hansen in https://github.com/casper-hansen/AutoAWQ/pull/138
* Bump to v0.1.6 by casper-hansen in https://github.com/casper-hansen/AutoAWQ/pull/139
* CUDA 12 release by casper-hansen in https://github.com/casper-hansen/AutoAWQ/pull/140
* Revert to previous version by casper-hansen in https://github.com/casper-hansen/AutoAWQ/pull/141
* Fix performance regression by casper-hansen in https://github.com/casper-hansen/AutoAWQ/pull/148
* [`core` / `attention`] Fix fused attention generation with newest transformers version by younesbelkada in https://github.com/casper-hansen/AutoAWQ/pull/146
* Fix condition when rolling cache by casper-hansen in https://github.com/casper-hansen/AutoAWQ/pull/150
* Default to safetensors for quantized models by casper-hansen in https://github.com/casper-hansen/AutoAWQ/pull/151
* Create fused LlamaLikeModel by casper-hansen in https://github.com/casper-hansen/AutoAWQ/pull/152
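
The "Pseudo dequantize function" entry (#127) refers to quantizing weights and immediately dequantizing them back to floats, which makes the rounding error easy to inspect without custom kernels. The round trip can be sketched as group-wise asymmetric integer quantization; `quant_dequant` is an invented name and this is not AutoAWQ's implementation:

```python
def quant_dequant(weights, group_size=4, bits=4):
    """Round-trip INT-N quantization with a per-group scale and zero point,
    mimicking what a pseudo-dequantize step recovers for inspection."""
    qmax = 2 ** bits - 1
    out = []
    for g in range(0, len(weights), group_size):
        group = weights[g:g + group_size]
        lo, hi = min(group), max(group)
        scale = (hi - lo) / qmax or 1.0  # avoid div-by-zero for flat groups
        zero = round(-lo / scale)
        for w in group:
            q = max(0, min(qmax, round(w / scale) + zero))  # clamp to range
            out.append((q - zero) * scale)
    return out
```

Values that happen to lie on the quantization grid come back exactly; everything else lands within half a scale step of the original.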


**Full Changelog**: https://github.com/casper-hansen/AutoAWQ/compare/v0.1.5...v0.1.6

0.1.5

What's Changed
* Only apply attention mask if seqlen is greater than 1 by casper-hansen in https://github.com/casper-hansen/AutoAWQ/pull/96
* add gpt_neox support by twaka in https://github.com/casper-hansen/AutoAWQ/pull/113
* [`core`] Support fp32 / bf16 inference by younesbelkada in https://github.com/casper-hansen/AutoAWQ/pull/121
* Fix potential overflow by casper-hansen in https://github.com/casper-hansen/AutoAWQ/pull/102
* Fixing starcoder based models with 15B by SebastianBodza in https://github.com/casper-hansen/AutoAWQ/pull/118
* Support Aquila models. by ftgreat in https://github.com/casper-hansen/AutoAWQ/pull/123
* Add benchmark of Aquila2 34B AWQ in README.md. by ftgreat in https://github.com/casper-hansen/AutoAWQ/pull/126

New Contributors
* twaka made their first contribution in https://github.com/casper-hansen/AutoAWQ/pull/113
* younesbelkada made their first contribution in https://github.com/casper-hansen/AutoAWQ/pull/121
* SebastianBodza made their first contribution in https://github.com/casper-hansen/AutoAWQ/pull/118
* ftgreat made their first contribution in https://github.com/casper-hansen/AutoAWQ/pull/123

**Full Changelog**: https://github.com/casper-hansen/AutoAWQ/compare/v0.1.4...v0.1.5

0.1.4

What's Changed
* Refactor cache and embedding modules by casper-hansen in https://github.com/casper-hansen/AutoAWQ/pull/95
* Fix `TypeError: 'NoneType' object is not subscriptable`


**Full Changelog**: https://github.com/casper-hansen/AutoAWQ/compare/v0.1.3...v0.1.4

0.1.3

What's Changed
* Turing inference support (Colab+Kaggle working) by casper-hansen in https://github.com/casper-hansen/AutoAWQ/pull/92
* Fix memory bug (save 2GB VRAM)

**Full Changelog**: https://github.com/casper-hansen/AutoAWQ/compare/v0.1.2...v0.1.3

© 2025 Safety CLI Cybersecurity Inc. All Rights Reserved.