What's Changed (First Release since AutoGPTQ fork)
4 new models, plus `sym=False` (asymmetric quantization) and `lm_head` quantized inference support.
* ✨ [FEATURE/BUG] `sym=False` support by qwopqwop200, Qubitium, fxmarty
* ✨ [FEATURE] `lm_head` quantization inference by Qubitium
* 🚀 [MODEL] ChatGLM by LRL-ModelCloud, Qubitium
* 🚀 [MODEL] MiniCPM model support by LDLINGLINGLING, Qubitium in https://github.com/ModelCloud/GPTQModel/pull/18
* 🚀 [MODEL] Phi-3 model support by davidgxue, ZX-ModelCloud in https://github.com/ModelCloud/GPTQModel/pull/27
* 🚀 [MODEL] QwenMoE model support by bozheng-hit, LRL-ModelCloud in https://github.com/ModelCloud/GPTQModel/pull/24
* 🚀 [CORE] Faster quantization and better quantization quality (PPL) by Qubitium
* 👾 [BUG] H100 crash by Qubitium
* 👾 [BUG] Packing perf regression on high core-count systems by Qubitium
* 🚀 [REFACTOR] Major refactor and code debloat by Qubitium
* 🤖 [CI] Code quality by CSY-ModelCloud in https://github.com/ModelCloud/GPTQModel/pull/31
* 🤖 [CI] Add Perplexity regression test by LRL-ModelCloud in https://github.com/ModelCloud/GPTQModel/pull/1
* 🤖 [CI] Add Runner by CSY-ModelCloud in https://github.com/ModelCloud/GPTQModel/pull/3
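
For readers unfamiliar with the `sym=False` option listed above, here is a minimal pure-Python sketch of the difference between symmetric and asymmetric uniform quantization. It follows the common min/max scheme and is not GPTQModel's actual implementation. The function names `quantize_dequantize` and `max_abs_error` are hypothetical and exist only for this illustration.

```python
def quantize_dequantize(values, bits=4, sym=True):
    """Round-trip floats through a uniform integer grid (illustrative only)."""
    qmax = 2 ** bits - 1
    lo, hi = min(values), max(values)
    if sym:
        # Symmetric: force the range to be centered on zero.
        bound = max(abs(lo), abs(hi))
        lo, hi = -bound, bound
    scale = (hi - lo) / qmax
    if scale == 0:
        scale = 1.0  # guard against constant input
    # Symmetric uses a fixed mid-grid zero-point; asymmetric derives it from lo.
    zero = (qmax + 1) // 2 if sym else round(-lo / scale)
    out = []
    for v in values:
        q = min(max(round(v / scale) + zero, 0), qmax)  # clamp to [0, qmax]
        out.append((q - zero) * scale)
    return out


def max_abs_error(values, **kwargs):
    """Largest absolute round-trip error over the input values."""
    restored = quantize_dequantize(values, **kwargs)
    return max(abs(a - b) for a, b in zip(values, restored))


# Hypothetical, deliberately skewed (all non-negative) weights.
weights = [0.0, 0.1, 0.2, 0.5, 0.9, 1.0]
sym_err = max_abs_error(weights, bits=4, sym=True)
asym_err = max_abs_error(weights, bits=4, sym=False)
```

On skewed inputs like these, the symmetric grid spends half of its levels on negative values that never occur, so the asymmetric round-trip error is typically smaller. For weights that are roughly centered on zero the two schemes behave similarly.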
**Full Changelog**: https://github.com/ModelCloud/GPTQModel/commits/v0.9.0