QLLM

Latest version: v0.1.8




0.1.8

What's Changed
* Update README.md by wejoncy in https://github.com/wejoncy/QLLM/pull/102
* Bug fix by wejoncy in https://github.com/wejoncy/QLLM/pull/103
* Onnx fix qzeros odd-shape by wejoncy in https://github.com/wejoncy/QLLM/pull/104
* Refactor by wejoncy in https://github.com/wejoncy/QLLM/pull/105
* support `MARLIN` pack_mode by wejoncy in https://github.com/wejoncy/QLLM/pull/106
* support awq sym by wejoncy in https://github.com/wejoncy/QLLM/pull/107
* Refactor by wejoncy in https://github.com/wejoncy/QLLM/pull/108


**Full Changelog**: https://github.com/wejoncy/QLLM/compare/v0.1.7.1...v0.1.8
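
PR #107 above adds symmetric-quantization support for AWQ. As background only (this is the textbook scheme, not QLLM's actual implementation), a minimal sketch of the difference between symmetric and asymmetric int4 quantization:

```python
def quantize(values, bits=4, symmetric=True):
    """Toy per-tensor quantizer illustrating symmetric vs asymmetric
    modes. Illustrative sketch only -- not QLLM's implementation."""
    qmax = 2 ** bits - 1
    if symmetric:
        # Zero-point pinned to the middle of the integer range;
        # scale derived from the largest magnitude.
        scale = max(abs(v) for v in values) / (qmax / 2) or 1.0
        zero = (qmax + 1) // 2
    else:
        # Zero-point chosen so min(values) maps to integer 0,
        # using the full range for skewed distributions.
        lo, hi = min(values), max(values)
        scale = (hi - lo) / qmax or 1.0
        zero = round(-lo / scale)
    q = [max(0, min(qmax, round(v / scale) + zero)) for v in values]
    dq = [(x - zero) * scale for x in q]
    return q, dq
```

Symmetric mode keeps zero exactly representable (useful for sym kernels); asymmetric mode spends the whole integer range on skewed weight distributions.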

0.1.7.1

What's Changed
* fix "disable win in release" by wejoncy in https://github.com/wejoncy/QLLM/pull/98
* minor fix and dataset speed by wejoncy in https://github.com/wejoncy/QLLM/pull/99
* minor fix by wejoncy in https://github.com/wejoncy/QLLM/pull/100
* patch release v0.1.7.1 by wejoncy in https://github.com/wejoncy/QLLM/pull/101


**Full Changelog**: https://github.com/wejoncy/QLLM/compare/v0.1.7...v0.1.7.1

0.1.7

What's Changed
* ort ops support in main branch with act_order by wejoncy in https://github.com/wejoncy/QLLM/pull/92
* support export hqq to onnx by wejoncy in https://github.com/wejoncy/QLLM/pull/93
* Bump to 0.1.7 by wejoncy in https://github.com/wejoncy/QLLM/pull/94
* improve .cpu() with non_blocking by wejoncy in https://github.com/wejoncy/QLLM/pull/95
* disable win in release by wejoncy in https://github.com/wejoncy/QLLM/pull/96
* refactor args by wejoncy in https://github.com/wejoncy/QLLM/pull/97


**Full Changelog**: https://github.com/wejoncy/QLLM/compare/v0.1.6...v0.1.7
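
Several entries above (e.g. PR #92) concern `act_order` support. As a hedged sketch of the underlying idea (GPTQ-style `desc_act`, not QLLM's kernel code): columns are quantized in order of decreasing activation magnitude, and `g_idx` records which quantization group each original column landed in. The helper name `build_g_idx` is hypothetical:

```python
def build_g_idx(activation_norms, group_size):
    """Sketch of GPTQ-style act_order: process columns in order of
    decreasing activation magnitude, grouping group_size at a time.
    g_idx[col] = quantization group that column `col` belongs to.
    Illustrative only -- not QLLM's actual logic."""
    order = sorted(range(len(activation_norms)),
                   key=lambda c: -activation_norms[c])
    g_idx = [0] * len(activation_norms)
    for rank, col in enumerate(order):
        g_idx[col] = rank // group_size
    return g_idx
```

A kernel honoring `act_order` must look scales and zeros up through `g_idx` instead of assuming contiguous groups, which is why exporting it to ONNX/ORT ops needs dedicated handling.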

0.1.6

What's Changed
* illegal memory access by wejoncy in https://github.com/wejoncy/QLLM/pull/71
* Format by wejoncy in https://github.com/wejoncy/QLLM/pull/72
* Hqq support by wejoncy in https://github.com/wejoncy/QLLM/pull/73
* [feature] fast build support (build and install qllm in a few seconds) by wejoncy in https://github.com/wejoncy/QLLM/pull/74
* minor fix for detecting ort_ops and torch.compile by wejoncy in https://github.com/wejoncy/QLLM/pull/75
* static_groups by wejoncy in https://github.com/wejoncy/QLLM/pull/76
* minor Fix (cudaguard) by wejoncy in https://github.com/wejoncy/QLLM/pull/77
* fix cudaguard by wejoncy in https://github.com/wejoncy/QLLM/pull/78
* ruff Format by wejoncy in https://github.com/wejoncy/QLLM/pull/79
* move ops into qllm by wejoncy in https://github.com/wejoncy/QLLM/pull/80
* fix memory layout in QuantizeLinear by yufenglee in https://github.com/wejoncy/QLLM/pull/82
* add continuous check for ort kernel by wejoncy in https://github.com/wejoncy/QLLM/pull/84
* Ort fix by wejoncy in https://github.com/wejoncy/QLLM/pull/85
* more general Ort ops export by wejoncy in https://github.com/wejoncy/QLLM/pull/86
* 0.1.6.dev by wejoncy in https://github.com/wejoncy/QLLM/pull/87
* 0.1.6 by wejoncy in https://github.com/wejoncy/QLLM/pull/88
* speed up ort node packing by wejoncy in https://github.com/wejoncy/QLLM/pull/89
* fix attn_implementation by wejoncy in https://github.com/wejoncy/QLLM/pull/90

New Contributors
* yufenglee made their first contribution in https://github.com/wejoncy/QLLM/pull/82

**Full Changelog**: https://github.com/wejoncy/QLLM/compare/v0.1.5...v0.1.6
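
Several PRs above touch weight packing (e.g. "speed up ort node packing", and the later qzeros odd-shape fix). As background, a minimal sketch of the common GPTQ-style layout that packs eight 4-bit values into one 32-bit word, lowest nibble first; this is illustrative and not necessarily QLLM's exact layout:

```python
def pack_int4(values):
    """Pack 4-bit integers (0..15) into 32-bit words, eight per word,
    lowest nibble first. Illustrative sketch of a qweight/qzeros-style
    layout, not necessarily QLLM's exact one."""
    assert len(values) % 8 == 0, "pad to a multiple of 8 before packing"
    words = []
    for i in range(0, len(values), 8):
        word = 0
        for j, v in enumerate(values[i:i + 8]):
            word |= (v & 0xF) << (4 * j)
        words.append(word)
    return words

def unpack_int4(words, count):
    """Inverse of pack_int4; `count` trims the padding nibbles."""
    return [(w >> (4 * j)) & 0xF for w in words for j in range(8)][:count]
```

The multiple-of-8 constraint is exactly why shapes that are not a multiple of the packing factor (the "odd-shape" qzeros case) need padding or special-casing at export time.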

0.1.5

What's Changed
* works on Windows; setting dtype is important by wejoncy in https://github.com/wejoncy/QLLM/pull/54
* default use_heuristic=false for models with hard-to-predict unquantized layers, such as mixtral-8x7b, by wejoncy in https://github.com/wejoncy/QLLM/pull/55
* add mixtral in readme example by wejoncy in https://github.com/wejoncy/QLLM/pull/56
* bugfix when export 16bit model by wejoncy in https://github.com/wejoncy/QLLM/pull/57
* Fix build error: uint32_t is not defined (include <stdint.h>) by wejoncy in https://github.com/wejoncy/QLLM/pull/58
* dp kernel support g_idx by wejoncy in https://github.com/wejoncy/QLLM/pull/59
* [important] packing improve, faster by wejoncy in https://github.com/wejoncy/QLLM/pull/60
* [improve packing]fix for awq unpack by wejoncy in https://github.com/wejoncy/QLLM/pull/61
* 3bit support with g_idx in dq_kernel by wejoncy in https://github.com/wejoncy/QLLM/pull/63
* 3bit fix by wejoncy in https://github.com/wejoncy/QLLM/pull/64
* 0.1.5.dev by wejoncy in https://github.com/wejoncy/QLLM/pull/65
* onnx support Act_order && some onnx fix by wejoncy in https://github.com/wejoncy/QLLM/pull/66
* Support gemv with g_idx and some fix in exporter/dataloader by wejoncy in https://github.com/wejoncy/QLLM/pull/67
* support mixtral in gptq/awq by wejoncy in https://github.com/wejoncy/QLLM/pull/68
* minor fix for act_order detect by wejoncy in https://github.com/wejoncy/QLLM/pull/70
* Bump version to 0.1.5 by wejoncy in https://github.com/wejoncy/QLLM/pull/69


**Full Changelog**: https://github.com/wejoncy/QLLM/compare/v0.1....v0.1.5

0.1.4

What's Changed
* Support Phi, detect multiple blocks by wejoncy in https://github.com/wejoncy/QLLM/pull/43
* quick fix by wejoncy in https://github.com/wejoncy/QLLM/pull/44
* add colab example && turing support for awq && remove dependency of xbitops by wejoncy in https://github.com/wejoncy/QLLM/pull/46
* quick fix for meta device by wejoncy in https://github.com/wejoncy/QLLM/pull/47
* add trust code by wejoncy in https://github.com/wejoncy/QLLM/pull/48
* fix trust_code by wejoncy in https://github.com/wejoncy/QLLM/pull/49
* quick fix for turing awq 75 by wejoncy in https://github.com/wejoncy/QLLM/pull/50
* fix low_cpu_mem_usage by wejoncy in https://github.com/wejoncy/QLLM/pull/51
* fix model dtype, default to half by wejoncy in https://github.com/wejoncy/QLLM/pull/52


**Full Changelog**: https://github.com/wejoncy/QLLM/compare/v0.1.3...v0.1.4

