llmcompressor

Latest version: v0.4.1

0.4.1

What's Changed
* Remove version by dsikka in https://github.com/vllm-project/llm-compressor/pull/1077
* Require 'ready' label for transformers tests by dbarbuzzi in https://github.com/vllm-project/llm-compressor/pull/1079
* GPTQModifier Nits and Code Clarity by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/1068
* Also run on pushes to `main` by dbarbuzzi in https://github.com/vllm-project/llm-compressor/pull/1083
* VLM: Phi3 Vision Example by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/1032
* VLM: Qwen2_VL Example by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/1027
* Composability with sparse and quantization compressors by rahul-tuli in https://github.com/vllm-project/llm-compressor/pull/948
* Remove `TraceableMistralForCausalLM` by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/1052
* [Fix Test Failure]: Propagate name change to test by rahul-tuli in https://github.com/vllm-project/llm-compressor/pull/1088
* [Audio] Support Audio Datasets by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/1085
* [Test Fix] Add Quantization then finetune tests by horheynm in https://github.com/vllm-project/llm-compressor/pull/964
* [Smoothquant] Phi3 Vision Mappings by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/1089
* [VLM] Multimodal Data Collator by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/1087
* VLM: Model Tracing Guide by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/1030
* Turn off 2:4 sparse compression until supported in vllm by rahul-tuli in https://github.com/vllm-project/llm-compressor/pull/1092
* [Test Fix] Fix Consecutive oneshot by horheynm in https://github.com/vllm-project/llm-compressor/pull/971
* [Bug Fix] Fix test that requires GPU by horheynm in https://github.com/vllm-project/llm-compressor/pull/1096
* Add Idefics3/SmolVLM quant support via traceable class by leon-seidel in https://github.com/vllm-project/llm-compressor/pull/1095
* Traceability Guide: Clarity and typo by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/1099
* [VLM] Examples README by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/1057
* Raise warning for 2:4 compressed sparse-only models by rahul-tuli in https://github.com/vllm-project/llm-compressor/pull/1107
* Remove log_model_load by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/1016
* Return empty sparsity config if targets and ignores are empty by rahul-tuli in https://github.com/vllm-project/llm-compressor/pull/1115
* Remove uses of get_observer by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/939
* FSDP utils cleanup by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/854
* Update maintainers, add notice by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/1091
* Replace readme paths with urls by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/1097
* GPTQ add arXiv link, move file location by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/1100
* Extend `remove_hooks` to remove subsets by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/1021
* [Audio] Whisper Example and Readme by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/1106
* [Audio] Add whisper fp8 dynamic example by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/1111
* [VLM] Update pixtral data collator to reflect latest transformers changes by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/1116
* Use unique test names in `TestvLLM` by dbarbuzzi in https://github.com/vllm-project/llm-compressor/pull/1124
* Remove smoothquant from examples by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/1121
* Extend `disable_hooks` to keep subsets by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/1023
* Unpin `pynvml` to fix e2e test failures with vLLM by dsikka in https://github.com/vllm-project/llm-compressor/pull/1125
* Replace LayerCompressor with HooksMixin by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/1038
* [Oneshot Refactor] Rename get_shared_processor_src to get_processor_name_from_model by horheynm in https://github.com/vllm-project/llm-compressor/pull/1108
* Allow Shortcutting Min-max Observer by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/887
* [Polish] Remove unused code by horheynm in https://github.com/vllm-project/llm-compressor/pull/1128
* Properly restore training mode with `eval_context` by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/1126
* SQ and QM: Remove `torch.cuda.empty_cache`, use `calibration_forward_context` by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/1114
* [Oneshot Refactor] dataclass Arguments by horheynm in https://github.com/vllm-project/llm-compressor/pull/1103
* [Bugfix] SparseGPT, Pipelines by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/1130
* [Oneshot refactor] Refactor initialize_model_from_path by horheynm in https://github.com/vllm-project/llm-compressor/pull/1109
* [e2e] Update vllm tests with additional datasets by brian-dellabetta in https://github.com/vllm-project/llm-compressor/pull/1131
* Update: SparseGPT recipes by rahul-tuli in https://github.com/vllm-project/llm-compressor/pull/1142
* Add timer support for testing by dsikka in https://github.com/vllm-project/llm-compressor/pull/1137
* [Audio] Support Whisper V3 by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/1147
* Fix: Re-enable Sparse Compression for 2of4 Examples by rahul-tuli in https://github.com/vllm-project/llm-compressor/pull/1153
* [VLM] Add caption to flickr dataset by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/1138
* [VLM] Update mllama traceable definition by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/1140
* Fix CPU Offloading by dsikka in https://github.com/vllm-project/llm-compressor/pull/1159
* [TRL_SFT_Trainer] Fix and Update Examples code by horheynm in https://github.com/vllm-project/llm-compressor/pull/1161
* [TRL_SFT_Trainer] Fix TRL-SFT Distillation Training by horheynm in https://github.com/vllm-project/llm-compressor/pull/1163
* Bump version for patch release by dsikka in https://github.com/vllm-project/llm-compressor/pull/1166
* Update DeepSeek Examples by dsikka in https://github.com/vllm-project/llm-compressor/pull/1175
* Update gemma2 examples with a note about sample generation by dsikka in https://github.com/vllm-project/llm-compressor/pull/1176

New Contributors
* leon-seidel made their first contribution in https://github.com/vllm-project/llm-compressor/pull/1095

**Full Changelog**: https://github.com/vllm-project/llm-compressor/compare/0.4.0...0.4.1

0.4.0

What's Changed
* Record config file name as test suite property by dbarbuzzi in https://github.com/vllm-project/llm-compressor/pull/947
* Update setup.py by dsikka in https://github.com/vllm-project/llm-compressor/pull/975
* Deprecate OBCQ Helpers by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/977
* KV Cache, E2E Tests by horheynm in https://github.com/vllm-project/llm-compressor/pull/742
* Use 1 GPU for offloading examples by dsikka in https://github.com/vllm-project/llm-compressor/pull/979
* Replace tokenizer with processor by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/955
* Revert "KV Cache, E2E Tests (742)" by dsikka in https://github.com/vllm-project/llm-compressor/pull/989
* Fix SmoothQuant offload bug by dsikka in https://github.com/vllm-project/llm-compressor/pull/978
* Add LM Eval Configs by dsikka in https://github.com/vllm-project/llm-compressor/pull/980
* Fix `test_model_reload` test by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/1005
* Calibration and Compression Contexts by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/998
* Add info for clarity by dsikka in https://github.com/vllm-project/llm-compressor/pull/1009
* [Bugfix] Pass `trust_remote_code_model=True` for deepseek examples by dsikka in https://github.com/vllm-project/llm-compressor/pull/1012
* Vision Datasets by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/943
* Add example for fp8 kv cache of phi3.5 and gemma2 by mgoin in https://github.com/vllm-project/llm-compressor/pull/991
* Update ReadMe and test for cpu_offloading by dsikka in https://github.com/vllm-project/llm-compressor/pull/1013
* Adding amdsmi for AMD gpus by citrix123 in https://github.com/vllm-project/llm-compressor/pull/1018
* CompressionLogger add time units by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/1026
* patch_tied_tensors_bug: support malformed model definitions by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/1014
* Add: 2of4 example with/without fp8 quantization by rahul-tuli in https://github.com/vllm-project/llm-compressor/pull/1033
* Remove unnecessary step in 2of4 Example by dsikka in https://github.com/vllm-project/llm-compressor/pull/1034
* Remove Neural Magic copyright from files by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/992
* VLM Support via GPTQ Hooks and Data Pipelines by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/914
* [E2E Testing] KV-Cache by horheynm in https://github.com/vllm-project/llm-compressor/pull/1004
* [E2E Testing] Add recipe check vllm e2e by horheynm in https://github.com/vllm-project/llm-compressor/pull/929
* [MoE] GPTQ compress using callback not hook by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/1049
* Explicit dataset tokenizer `text` kwarg by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/1031
* Fix smoothquant ignore, Fix typing, Add glm mappings by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/1015
* [Test Fix] Quant model reload by horheynm in https://github.com/vllm-project/llm-compressor/pull/974
* Remove old examples by dsikka in https://github.com/vllm-project/llm-compressor/pull/1062
* VLM: Fix typo bug in TraceableLlavaForConditionalGeneration by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/1065
* Add tests for "examples/sparse_2of4_[...]" by dbarbuzzi in https://github.com/vllm-project/llm-compressor/pull/1067
* VLM Image Examples by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/1064
* Add quick warning for DeepSeek with transformers 4.48.0 by dsikka in https://github.com/vllm-project/llm-compressor/pull/1066
* [KV Cache] kv-cache end to end unit tests by horheynm in https://github.com/vllm-project/llm-compressor/pull/141
* [E2E Testing] Fix HF upload by horheynm in https://github.com/vllm-project/llm-compressor/pull/1061
* [Test Fix] Fix/update test_run_compressed by horheynm in https://github.com/vllm-project/llm-compressor/pull/970
* Revert "[Test Fix] Fix/update test_run_compressed" by mgoin in https://github.com/vllm-project/llm-compressor/pull/1071
* Sparse 2:4 + FP8 Quantization e2e vLLM tests by dsikka in https://github.com/vllm-project/llm-compressor/pull/1073
* [Test Patch] Remove redundant code for "Fix/update test_run_compressed" by horheynm in https://github.com/vllm-project/llm-compressor/pull/1072
* bump; set ct version by dsikka in https://github.com/vllm-project/llm-compressor/pull/1076

New Contributors
* citrix123 made their first contribution in https://github.com/vllm-project/llm-compressor/pull/1018

**Full Changelog**: https://github.com/vllm-project/llm-compressor/compare/0.3.1...0.4.0

0.3.1

What's Changed
* BLOOM Default Smoothquant Mappings by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/906
* [SparseAutoModelForCausalLM Deprecation] Feature change by horheynm in https://github.com/vllm-project/llm-compressor/pull/881
* Correct "dyanmic" typo by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/888
* Explicit defaults for QuantizationModifier targets by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/889
* [SparseAutoModelForCausalLM Deprecation] Update examples by horheynm in https://github.com/vllm-project/llm-compressor/pull/880
* Support pack_quantized format for nonuniform mixed-precision by mgoin in https://github.com/vllm-project/llm-compressor/pull/913
* Actually make the `run_compressed` test useful by dsikka in https://github.com/vllm-project/llm-compressor/pull/920
* Fix for e2e tests by horheynm in https://github.com/vllm-project/llm-compressor/pull/927
* [Bugfix] Correct metrics calculations by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/878
* Update kv_cache example by dsikka in https://github.com/vllm-project/llm-compressor/pull/921
* [1/2] Expand e2e testing to prepare for lm-eval by dsikka in https://github.com/vllm-project/llm-compressor/pull/922
* Update pytest command to capture results to file by dbarbuzzi in https://github.com/vllm-project/llm-compressor/pull/932
* [Bugfix] DisableKVCache Context by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/834
* Add helpful info to the marlin-24 example by dsikka in https://github.com/vllm-project/llm-compressor/pull/946
* Remove requires_torch by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/949
* Remove unused sparseml.export utilities by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/950
* Implement HooksMixin by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/917
* Add LM Eval Testing by dsikka in https://github.com/vllm-project/llm-compressor/pull/945
* update version by dsikka in https://github.com/vllm-project/llm-compressor/pull/969


**Full Changelog**: https://github.com/vllm-project/llm-compressor/compare/0.3.0...0.3.1

0.3.0

Key Features and Improvements
- **GPTQ Quantized-weight Sequential Updating** ([177](https://github.com/vllm-project/llm-compressor/pull/177)): Introduced an efficient sequential updating mechanism for GPTQ quantization, improving model compression performance and compatibility (see the first sketch after this list).
- **Auto-Infer Mappings for SmoothQuantModifier** ([119](https://github.com/vllm-project/llm-compressor/pull/119)): Automatically infers `mappings` based on model architecture, making SmoothQuant easier to apply across various models.
- **Improved Sparse Compression Usability** ([191](https://github.com/vllm-project/llm-compressor/pull/191)): Added support for targeted sparse compression with specific ignore rules during inference, allowing for more flexible model configurations.
- **Generic Wrapper for Any Hugging Face Model** ([185](https://github.com/vllm-project/llm-compressor/pull/185)): Added the `wrap_hf_model_class` utility, enabling better support and integration for Hugging Face models that are not based on `AutoModelForCausalLM` (see the second sketch after this list).
- **Observer Restructure** ([837](https://github.com/vllm-project/llm-compressor/pull/837)): Introduced calibration and frozen steps within `QuantizationModifier`, moving Observers from compressed-tensors to llm-compressor.
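
The sequential-updating GPTQ flow and the auto-inferred SmoothQuant mappings come together in a one-shot compression run. Below is a minimal sketch, assuming the 0.3.0-era `oneshot` entrypoint in `llmcompressor.transformers`; the model id, dataset, and calibration settings are placeholders, not values taken from this release.

```python
from llmcompressor.modifiers.quantization import GPTQModifier
from llmcompressor.modifiers.smoothquant import SmoothQuantModifier
from llmcompressor.transformers import oneshot

recipe = [
    # `mappings` is omitted, so it is auto-inferred from the model architecture
    SmoothQuantModifier(smoothing_strength=0.8),
    # as of 0.3.0, GPTQ applies quantized-weight updates sequentially by default
    GPTQModifier(targets="Linear", scheme="W8A8", ignore=["lm_head"]),
]

oneshot(
    model="meta-llama/Meta-Llama-3-8B-Instruct",  # placeholder model id
    dataset="open_platypus",                      # placeholder calibration dataset
    recipe=recipe,
    output_dir="Meta-Llama-3-8B-Instruct-W8A8",
    max_seq_length=2048,
    num_calibration_samples=512,
)
```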
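For models outside the `AutoModelForCausalLM` family (e.g. vision-language models), the generic wrapper produces a class with compression-aware save/load support. A hedged sketch, assuming the `wrap_hf_model_class` import path used by the 0.3.0-era examples; the Llava model id is illustrative.

```python
from transformers import LlavaForConditionalGeneration

from llmcompressor.transformers import wrap_hf_model_class

# Wrap a non-AutoModelForCausalLM class so it gains llm-compressor's
# compression-aware save_pretrained/from_pretrained behavior.
wrapped_cls = wrap_hf_model_class(LlavaForConditionalGeneration)
model = wrapped_cls.from_pretrained(
    "llava-hf/llava-1.5-7b-hf",  # illustrative model id
    device_map="auto",
    torch_dtype="auto",
)
```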

Bug Fixes
- **Fix Tied Tensors Bug** ([659](https://github.com/vllm-project/llm-compressor/pull/659))
- **Observer Initialization in GPTQ Wrapper** ([883](https://github.com/vllm-project/llm-compressor/pull/883))
- **Sparsity Reload Testing** ([882](https://github.com/vllm-project/llm-compressor/pull/882))

Documentation
- **Updated SmoothQuant Tutorial** ([115](https://github.com/vllm-project/llm-compressor/pull/115)): Expanded SmoothQuant documentation to include detailed mappings for easier implementation.
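
When auto-inference is not wanted, `mappings` can also be passed explicitly. A minimal sketch, assuming Llama-style module names; the regex patterns are illustrative and mirror the shape of the library's defaults rather than reproducing the tutorial verbatim.

```python
from llmcompressor.modifiers.smoothquant import SmoothQuantModifier

# Each mapping pairs the linear layers that consume a shared activation
# (given as regex patterns) with the norm layer that produces it.
modifier = SmoothQuantModifier(
    smoothing_strength=0.8,
    mappings=[
        [["re:.*q_proj", "re:.*k_proj", "re:.*v_proj"], "re:.*input_layernorm"],
        [["re:.*gate_proj", "re:.*up_proj"], "re:.*post_attention_layernorm"],
    ],
)
```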


What's Changed
* Fix "compresed" typo by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/188
* GPTQ Quantized-weight Sequential Updating by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/177
* Add: targets and ignore inference for sparse compression by rahul-tuli in https://github.com/vllm-project/llm-compressor/pull/191
* switch tests from weekly to nightly by dhuangnm in https://github.com/vllm-project/llm-compressor/pull/658
* Compression wrapper abstract methods by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/170
* Explicitly set sequential_update in examples by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/187
* Increase Sparsity Threshold for compressors by rahul-tuli in https://github.com/vllm-project/llm-compressor/pull/679
* Add a generic `wrap_hf_model_class` utility to support VLMs by mgoin in https://github.com/vllm-project/llm-compressor/pull/185
* Add tests for examples by dbarbuzzi in https://github.com/vllm-project/llm-compressor/pull/149
* Rename to quantization config by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/730
* Implement Missing Modifier Methods by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/166
* Fix 2/4 GPTQ Model Tests by dsikka in https://github.com/vllm-project/llm-compressor/pull/769
* SmoothQuant mappings tutorial by rahul-tuli in https://github.com/vllm-project/llm-compressor/pull/115
* Fix import of `ModelCompressor` by rahul-tuli in https://github.com/vllm-project/llm-compressor/pull/776
* update test by dsikka in https://github.com/vllm-project/llm-compressor/pull/773
* [Bugfix] Fix saving offloaded state dict by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/172
* Auto-Infer `mappings` Argument for `SmoothQuantModifier` Based on Model Architecture by rahul-tuli in https://github.com/vllm-project/llm-compressor/pull/119
* Update workflows/actions by dbarbuzzi in https://github.com/vllm-project/llm-compressor/pull/774
* [Bugfix] Prepare KD Models when Saving by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/174
* Set Sparse compression to save_compressed by rahul-tuli in https://github.com/vllm-project/llm-compressor/pull/821
* Install compressed-tensors after llm-compressor by dbarbuzzi in https://github.com/vllm-project/llm-compressor/pull/825
* Fix test typo by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/828
* Add `AutoModelForCausalLM` example by dsikka in https://github.com/vllm-project/llm-compressor/pull/698
* [Bugfix] Workaround tied tensors bug by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/659
* Only untie word embeddings by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/839
* Check for config hidden size by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/840
* Use float32 for Hessian dtype by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/847
* GPTQ: Deprecate non-sequential update option by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/762
* Typehint nits by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/826
* [ DOC ] Remove version restrictions in W8A8 example by miaojinc in https://github.com/vllm-project/llm-compressor/pull/849
* Fix inconsistency in example config of 2:4 sparse quantization by yzlnew in https://github.com/vllm-project/llm-compressor/pull/80
* Fix forward function pass call by dsikka in https://github.com/vllm-project/llm-compressor/pull/845
* [Bugfix] Use weight parameter of linear layer by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/836
* [Bugfix] Rename files to remove colons by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/846
* cover all 3.9-3.12 in commit testing by dhuangnm in https://github.com/vllm-project/llm-compressor/pull/864
* Add marlin-24 recipe/configs for e2e testing by dsikka in https://github.com/vllm-project/llm-compressor/pull/866
* [Bugfix] onload during sparsity calculation by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/862
* Fix HFTrainer overloads by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/869
* Support Model Offloading Tied Tensors Patch by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/872
* Add advice about dealing with non-invertible Hessians by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/875
* seed commit workflow by andy-neuma in https://github.com/vllm-project/llm-compressor/pull/877
* [Observer Restructure]: Add Observers; Add `calibration` and `frozen` steps to `QuantizationModifier` by dsikka in https://github.com/vllm-project/llm-compressor/pull/837
* Bugfix observer initialization in `gptq_wrapper` by rahul-tuli in https://github.com/vllm-project/llm-compressor/pull/883
* BugFix: Fix Sparsity Reload Testing by dsikka in https://github.com/vllm-project/llm-compressor/pull/882
* Use custom unique test names for e2e tests by dbarbuzzi in https://github.com/vllm-project/llm-compressor/pull/892
* Revert "Use custom unique test names for e2e tests (892)" by dsikka in https://github.com/vllm-project/llm-compressor/pull/893
* Move config["testconfig_path"] assignment by dbarbuzzi in https://github.com/vllm-project/llm-compressor/pull/895
* Cap accelerate version to avoid bug by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/897
* Fix observing offloaded weight by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/896
* Update image in README.md by mgoin in https://github.com/vllm-project/llm-compressor/pull/861
* update accelerate version by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/899
* [GPTQ] Iterative Parameter Updating by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/863
* Small fixes for release by dsikka in https://github.com/vllm-project/llm-compressor/pull/901
* use smaller portion of dataset by dsikka in https://github.com/vllm-project/llm-compressor/pull/902
* Update example to not fail hessian inversion by dsikka in https://github.com/vllm-project/llm-compressor/pull/904
* Bump version to 0.3.0 by dsikka in https://github.com/vllm-project/llm-compressor/pull/907

New Contributors
* miaojinc made their first contribution in https://github.com/vllm-project/llm-compressor/pull/849
* yzlnew made their first contribution in https://github.com/vllm-project/llm-compressor/pull/80
* andy-neuma made their first contribution in https://github.com/vllm-project/llm-compressor/pull/877

**Full Changelog**: https://github.com/vllm-project/llm-compressor/compare/0.2.0...0.3.0

0.2.0

What's Changed
* Correct Typo in SparseAutoModelForCausalLM docstring by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/56
* Disable Default Bitmask Compression by Satrat in https://github.com/vllm-project/llm-compressor/pull/60
* TRL Example fix by rahul-tuli in https://github.com/vllm-project/llm-compressor/pull/59
* Fix typo by rahul-tuli in https://github.com/vllm-project/llm-compressor/pull/63
* Correct typo by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/61
* correct import in README.md by zzc0430 in https://github.com/vllm-project/llm-compressor/pull/66
* Fix for issue 43 -- starcoder model by horheynm in https://github.com/vllm-project/llm-compressor/pull/71
* Update README.md by robertgshaw2-neuralmagic in https://github.com/vllm-project/llm-compressor/pull/74
* Layer by Layer Sequential GPTQ Updates by Satrat in https://github.com/vllm-project/llm-compressor/pull/47
* [ Docs ] Update main readme by robertgshaw2-neuralmagic in https://github.com/vllm-project/llm-compressor/pull/77
* [ Docs ] `gemma2` examples by robertgshaw2-neuralmagic in https://github.com/vllm-project/llm-compressor/pull/78
* [ Docs ] Update `FP8` example to use dynamic per token by robertgshaw2-neuralmagic in https://github.com/vllm-project/llm-compressor/pull/75
* [ Docs ] Overhaul `accelerate` user guide by robertgshaw2-neuralmagic in https://github.com/vllm-project/llm-compressor/pull/76
* Support `kv_cache_scheme` for quantizing KV Cache by mgoin in https://github.com/vllm-project/llm-compressor/pull/88
* Propagate `trust_remote_code` Argument by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/90
* Fix for issue 81 by horheynm in https://github.com/vllm-project/llm-compressor/pull/84
* Fix for issue 83 by horheynm in https://github.com/vllm-project/llm-compressor/pull/85
* [ DOC ] Big Model Example by robertgshaw2-neuralmagic in https://github.com/vllm-project/llm-compressor/pull/99
* Enable obcq/finetune integration tests with `commit` cadence by dsikka in https://github.com/vllm-project/llm-compressor/pull/101
* metric logging on GPTQ path by horheynm in https://github.com/vllm-project/llm-compressor/pull/65
* Update test config files by dsikka in https://github.com/vllm-project/llm-compressor/pull/97
* remove workflows + update runners by dsikka in https://github.com/vllm-project/llm-compressor/pull/103
* metrics by horheynm in https://github.com/vllm-project/llm-compressor/pull/104
* add debug by horheynm in https://github.com/vllm-project/llm-compressor/pull/108
* Add FP8 KV Cache quant example by mgoin in https://github.com/vllm-project/llm-compressor/pull/113
* Add vLLM e2e tests by dsikka in https://github.com/vllm-project/llm-compressor/pull/117
* Fix style, fix noqa by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/123
* GPTQ Algorithm Cleanup by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/120
* GPTQ Activation Ordering by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/94
* demote recipe string initialization to debug and make more descriptive by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/116
* compressed-tensors main dependency for base-tests by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/125
* Set `ready` label for transformer tests; add message reminder on PR opened by dsikka in https://github.com/vllm-project/llm-compressor/pull/126
* Fix markdown check test by dsikka in https://github.com/vllm-project/llm-compressor/pull/127
* Naive Run Compressed Pt. 2 by Satrat in https://github.com/vllm-project/llm-compressor/pull/62
* Fix transformer test conditions by dsikka in https://github.com/vllm-project/llm-compressor/pull/131
* Run Compressed Tests by Satrat in https://github.com/vllm-project/llm-compressor/pull/132
* Correct typo by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/124
* Activation Ordering Strategies by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/121
* Fix README Issue by robertgshaw2-neuralmagic in https://github.com/vllm-project/llm-compressor/pull/139
* update by dsikka in https://github.com/vllm-project/llm-compressor/pull/143
* Update finetune and oneshot tests by dsikka in https://github.com/vllm-project/llm-compressor/pull/114
* Validate Recipe Parsing Output by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/100
* fix build error for nightly by dhuangnm in https://github.com/vllm-project/llm-compressor/pull/145
* Fix recipe nested in configs by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/140
* MOE example with warning by rahul-tuli in https://github.com/vllm-project/llm-compressor/pull/87
* Bug Fix: recipe stages were not being concatenated by rahul-tuli in https://github.com/vllm-project/llm-compressor/pull/150
* fix package name bug for nightly by dhuangnm in https://github.com/vllm-project/llm-compressor/pull/155
* Add descriptions for pytest marks by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/156
* Fix Sparsity Unit Test by Satrat in https://github.com/vllm-project/llm-compressor/pull/153
* Fix: Error during model saving with shared tensors by rahul-tuli in https://github.com/vllm-project/llm-compressor/pull/158
* Update 2:4 Examples by dsikka in https://github.com/vllm-project/llm-compressor/pull/161
* DeepSeek: Fix Hessian Estimation by Satrat in https://github.com/vllm-project/llm-compressor/pull/157
* bump up main to 0.2.0 by dhuangnm in https://github.com/vllm-project/llm-compressor/pull/163
* Fix help dialogue by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/151
* Add MoE and Compressed Inference Examples by Satrat in https://github.com/vllm-project/llm-compressor/pull/160
* Separate `trust_remote_code` args by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/152
* Enable a skipped finetune test by dsikka in https://github.com/vllm-project/llm-compressor/pull/169
* Fix filename in example command by dbarbuzzi in https://github.com/vllm-project/llm-compressor/pull/173
* Add DeepSeek V2.5 Example by dsikka in https://github.com/vllm-project/llm-compressor/pull/171
* fix quality by dsikka in https://github.com/vllm-project/llm-compressor/pull/176
* Patch log function name in gptq by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/168
* README for Modifiers by Satrat in https://github.com/vllm-project/llm-compressor/pull/165
* Fix default for sequential updates by dsikka in https://github.com/vllm-project/llm-compressor/pull/186
* fix default test case by dsikka in https://github.com/vllm-project/llm-compressor/pull/193
* Fix "Initalize" typo by Imss27 in https://github.com/vllm-project/llm-compressor/pull/190
* Update MoE examples by mgoin in https://github.com/vllm-project/llm-compressor/pull/192

New Contributors
* zzc0430 made their first contribution in https://github.com/vllm-project/llm-compressor/pull/66
* horheynm made their first contribution in https://github.com/vllm-project/llm-compressor/pull/71
* dsikka made their first contribution in https://github.com/vllm-project/llm-compressor/pull/101
* dhuangnm made their first contribution in https://github.com/vllm-project/llm-compressor/pull/145
* Imss27 made their first contribution in https://github.com/vllm-project/llm-compressor/pull/190

**Full Changelog**: https://github.com/vllm-project/llm-compressor/compare/0.1.0...0.2.0

0.1.0

What's Changed
* Address Test Failures by Satrat in https://github.com/vllm-project/llm-compressor/pull/1
* Remove SparseZoo Usage by Satrat in https://github.com/vllm-project/llm-compressor/pull/2
* SparseML Cleanup by markurtz in https://github.com/vllm-project/llm-compressor/pull/6
* Remove all references to Neural Magic copyright within LLM Compressor by markurtz in https://github.com/vllm-project/llm-compressor/pull/7
* Add FP8 Support by Satrat in https://github.com/vllm-project/llm-compressor/pull/4
* Fix Weekly Test Failure by Satrat in https://github.com/vllm-project/llm-compressor/pull/8
* Add Scheme UX for QuantizationModifier by Satrat in https://github.com/vllm-project/llm-compressor/pull/9
* Add Group Quantization Test Case by Satrat in https://github.com/vllm-project/llm-compressor/pull/10
* Loguru logging standardization for LLM Compressor by markurtz in https://github.com/vllm-project/llm-compressor/pull/11
* Clarify Function Names for Logging by Satrat in https://github.com/vllm-project/llm-compressor/pull/12
* [ Examples ] E2E Examples by robertgshaw2-neuralmagic in https://github.com/vllm-project/llm-compressor/pull/5
* Update setup.py by robertgshaw2-neuralmagic in https://github.com/vllm-project/llm-compressor/pull/15
* SmoothQuant Mapping Defaults by Satrat in https://github.com/vllm-project/llm-compressor/pull/13
* Initial README by bfineran in https://github.com/vllm-project/llm-compressor/pull/3
* [Bug] Fix validation errors for smoothquant modifier + update examples by rahul-tuli in https://github.com/vllm-project/llm-compressor/pull/19
* [MOE Quantization] Warn against "undercalibrated" modules by dbogunowicz in https://github.com/vllm-project/llm-compressor/pull/20
* Port SparseML Remote Code Fix by Satrat in https://github.com/vllm-project/llm-compressor/pull/21
* Update Quantization Save Defaults by Satrat in https://github.com/vllm-project/llm-compressor/pull/22
* [Bugfix] Add fix to preserve modifier order when passed as a list by rahul-tuli in https://github.com/vllm-project/llm-compressor/pull/26
* GPTQ - move calibration of quantization params to after Hessian calibration by bfineran in https://github.com/vllm-project/llm-compressor/pull/25
* Fix typos by eldarkurtic in https://github.com/vllm-project/llm-compressor/pull/31
* Remove ceiling from `datasets` dep by mgoin in https://github.com/vllm-project/llm-compressor/pull/27
* Revert naive compression format by Satrat in https://github.com/vllm-project/llm-compressor/pull/32
* Fix layerwise targets by Satrat in https://github.com/vllm-project/llm-compressor/pull/36
* Move Weight Update Out Of Loop by Satrat in https://github.com/vllm-project/llm-compressor/pull/40
* Fix End Epoch Default by Satrat in https://github.com/vllm-project/llm-compressor/pull/39
* Fix typos in example for w8a8 quant by eldarkurtic in https://github.com/vllm-project/llm-compressor/pull/38
* Model Offloading Support Pt 2 by Satrat in https://github.com/vllm-project/llm-compressor/pull/34
* set version to 1.0.0 for release by bfineran in https://github.com/vllm-project/llm-compressor/pull/44
* Update version for first release by markurtz in https://github.com/vllm-project/llm-compressor/pull/50
* BugFix: Update TRL example scripts to point to the right SFTTrainer by rahul-tuli in https://github.com/vllm-project/llm-compressor/pull/51
* Update examples/quantization_24_sparse_w4a16 README by dbarbuzzi in https://github.com/vllm-project/llm-compressor/pull/52
* Fix Failing Transformers Tests by Satrat in https://github.com/vllm-project/llm-compressor/pull/53
* Offloading Bug Fix by Satrat in https://github.com/vllm-project/llm-compressor/pull/58

New Contributors
* markurtz made their first contribution in https://github.com/vllm-project/llm-compressor/pull/6
* bfineran made their first contribution in https://github.com/vllm-project/llm-compressor/pull/3
* dbogunowicz made their first contribution in https://github.com/vllm-project/llm-compressor/pull/20
* eldarkurtic made their first contribution in https://github.com/vllm-project/llm-compressor/pull/31
* mgoin made their first contribution in https://github.com/vllm-project/llm-compressor/pull/27
* dbarbuzzi made their first contribution in https://github.com/vllm-project/llm-compressor/pull/52

**Full Changelog**: https://github.com/vllm-project/llm-compressor/commits/0.1.0
