Llmcompressor

Latest version: v0.5.0

Safety actively analyzes 723882 Python packages for vulnerabilities to keep your Python projects secure.

Page 1 of 2

0.5.0

What's Changed
* re-add vllm e2e test now that bug is fixed by brian-dellabetta in https://github.com/vllm-project/llm-compressor/pull/1162
* Fix Readme Imports by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/1165
* Remove event_called by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/1155
* Update: Test name by rahul-tuli in https://github.com/vllm-project/llm-compressor/pull/1172
* Remove lifecycle initialized_structure attribute by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/1156
* [VLM] Qwen 2.5 VL by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/1113
* Revert bump by dsikka in https://github.com/vllm-project/llm-compressor/pull/1178
* Remove CLI by dsikka in https://github.com/vllm-project/llm-compressor/pull/1144
* Add group act order case to lm_eval test by dsikka in https://github.com/vllm-project/llm-compressor/pull/1080
* Update e2e test timings ouputs by dsikka in https://github.com/vllm-project/llm-compressor/pull/1179
* [Oneshot Refactor] Main refactor by horheynm in https://github.com/vllm-project/llm-compressor/pull/1110
* [StageRunner Removal] Remove Evalulate / validate pathway by horheynm in https://github.com/vllm-project/llm-compressor/pull/1145
* [StageRemoval] Remove Predict pathway by horheynm in https://github.com/vllm-project/llm-compressor/pull/1146
* Fix 2of4 Apply Example by dsikka in https://github.com/vllm-project/llm-compressor/pull/1181
* Fix Sparse2of4 Example by dsikka in https://github.com/vllm-project/llm-compressor/pull/1182
* Add qwen moe w4a16 example by mgoin in https://github.com/vllm-project/llm-compressor/pull/1186
* [Callbacks] Consolidate Saving Methods by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/1168
* lmeval tests multimodal by brian-dellabetta in https://github.com/vllm-project/llm-compressor/pull/1150
* [Dataset Performance] Add num workers on dataset processing - labels, tokenization by horheynm in https://github.com/vllm-project/llm-compressor/pull/1189
* Fix a minor typo by eldarkurtic in https://github.com/vllm-project/llm-compressor/pull/1191
* [Callbacks] Remove pre_initialize_structure by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/1160
* Make `transformers-tests` job conditional on files changed by dbarbuzzi in https://github.com/vllm-project/llm-compressor/pull/1197
* Update finetune tests to decrease execution time by dsikka in https://github.com/vllm-project/llm-compressor/pull/1208
* Update transformers tests to speed-up execution by dsikka in https://github.com/vllm-project/llm-compressor/pull/1211
* Fix logging bug in oneshot.py by aman2304 in https://github.com/vllm-project/llm-compressor/pull/1213
* [Training] Decouple Argument parser by horheynm in https://github.com/vllm-project/llm-compressor/pull/1207
* Remove MonkeyPatch for GPUs by dsikka in https://github.com/vllm-project/llm-compressor/pull/1227
* [Cosmetic] Rename data_args to dataset_args by horheynm in https://github.com/vllm-project/llm-compressor/pull/1206
* [Training] Datasets - update Module by horheynm in https://github.com/vllm-project/llm-compressor/pull/1209
* [BugFix] Fix logging disabling bug and add tests by aman2304 in https://github.com/vllm-project/llm-compressor/pull/1218
* [Training] Unifying Preprocess + Postprocessing logic for Train/Oneshot by horheynm in https://github.com/vllm-project/llm-compressor/pull/1212
* [Docs] Add info on when to use which PTQ/Sparsification by horheynm in https://github.com/vllm-project/llm-compressor/pull/1157
* [Callbacks] Remove `MagnitudePruningModifier.leave_enabled` by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/1198
* Replace Xenova model stub with nm-testing model stub by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/1239
* Offload Cache Support torch.dtype by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/1141
* Remove unused/duplicated/non-applicable utils from pytorch/utils/helpers by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/1174
* [Bugfix] Staged 2of4 example by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/1238
* wandb/tensorboard loggers set default init to False by brian-dellabetta in https://github.com/vllm-project/llm-compressor/pull/1235
* fixing reproducibility of lmeval tests by brian-dellabetta in https://github.com/vllm-project/llm-compressor/pull/1220
* [Audio] People's Speech dataset and tracer tool by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/1086
* Use KV cache constant names provided by compressed tensors by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/1200
* [Bugfix] Raise error for processor remote code by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/1184
* Remove missing weights silencers in favor of HFQuantizer solution by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/1017
* Fix run_compressed tests by dsikka in https://github.com/vllm-project/llm-compressor/pull/1246
* [Train] Training Pipeline by horheynm in https://github.com/vllm-project/llm-compressor/pull/1214
* [Tests] Increase maximum quantization error by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/1245
* [Callbacks] Remove EventLifecycle and on_start event by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/1170
* [Bugfix] Disable generation of deepseek models with transformers>=4.48 by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/1259
* Remove clear_ml by dsikka in https://github.com/vllm-project/llm-compressor/pull/1261
* [Tests] Remove clear_ml test from GHA by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/1265
* Remove click by dsikka in https://github.com/vllm-project/llm-compressor/pull/1262
* [Bugfix] Remove constant pruning from 2of4 examples by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/1267
* Addback: ConstantPruningModifier for finetuning cases by rahul-tuli in https://github.com/vllm-project/llm-compressor/pull/1272
* Remove docker by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/1255
* move failing mulitmodal lmeval tests to skipped folder by brian-dellabetta in https://github.com/vllm-project/llm-compressor/pull/1273
* Replace tj-action/changed-files by dbarbuzzi in https://github.com/vllm-project/llm-compressor/pull/1270
* [BugFix]: Sparse2of4 example sparsity-only case by rahul-tuli in https://github.com/vllm-project/llm-compressor/pull/1282
* Revert "update" by dsikka in https://github.com/vllm-project/llm-compressor/pull/1296
* Fix Multi-Context Manager Syntax for Python 3.9 Compatibility by rahul-tuli in https://github.com/vllm-project/llm-compressor/pull/1287
* Revert "Fix Multi-Context Manager Syntax for Python 3.9 Compatibility… by dsikka in https://github.com/vllm-project/llm-compressor/pull/1300
* [StageRunner] Stage Runner entrypoint and pipeline by horheynm in https://github.com/vllm-project/llm-compressor/pull/1202
* Bump: Min python version to 3.9 by rahul-tuli in https://github.com/vllm-project/llm-compressor/pull/1288
* Keep quantization enabled during calibration by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/1299
* [BugFix] TRL distillation bug fix by horheynm in https://github.com/vllm-project/llm-compressor/pull/1278
* Update: Readme for fp8 support by rahul-tuli in https://github.com/vllm-project/llm-compressor/pull/1304
* [GPTQ] Add inversion fallback by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/1283
* fix typo by eldarkurtic in https://github.com/vllm-project/llm-compressor/pull/1290
* [Tests] Fix oneshot + finetune test by passing splits to oneshot by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/1316
* [Tests] Remove the `compress` entrypoint by dsikka in https://github.com/vllm-project/llm-compressor/pull/1317
* Fix Multi-Context Manager Syntax for Python 3.9 Compatibility by rahul-tuli in https://github.com/vllm-project/llm-compressor/pull/1313
* [BugFix] Directly Convert Modifiers to Recipe Instance by rahul-tuli in https://github.com/vllm-project/llm-compressor/pull/1271
* bump version, tag ct by dsikka in https://github.com/vllm-project/llm-compressor/pull/1318

New Contributors
* aman2304 made their first contribution in https://github.com/vllm-project/llm-compressor/pull/1213

**Full Changelog**: https://github.com/vllm-project/llm-compressor/compare/0.4.1...0.5.0

0.4.1

What's Changed
* Remove version by dsikka in https://github.com/vllm-project/llm-compressor/pull/1077
* Require 'ready' label for transformers tests by dbarbuzzi in https://github.com/vllm-project/llm-compressor/pull/1079
* GPTQModifier Nits and Code Clarity by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/1068
* Also run on pushes to `main` by dbarbuzzi in https://github.com/vllm-project/llm-compressor/pull/1083
* VLM: Phi3 Vision Example by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/1032
* VLM: Qwen2_VL Example by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/1027
* Composability with sparse and quantization compressors by rahul-tuli in https://github.com/vllm-project/llm-compressor/pull/948
* Remove `TraceableMistralForCausalLM` by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/1052
* [Fix Test Failure]: Propagate name change to test by rahul-tuli in https://github.com/vllm-project/llm-compressor/pull/1088
* [Audio] Support Audio Datasets by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/1085
* [Test Fix] Add Quantization then finetune tests by horheynm in https://github.com/vllm-project/llm-compressor/pull/964
* [Smoothquant] Phi3 Vision Mappings by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/1089
* [VLM] Multimodal Data Collator by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/1087
* VLM: Model Tracing Guide by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/1030
* Turn off 2:4 sparse compression until supported in vllm by rahul-tuli in https://github.com/vllm-project/llm-compressor/pull/1092
* [Test Fix] Fix Consecutive oneshot by horheynm in https://github.com/vllm-project/llm-compressor/pull/971
* [Bug Fix] Fix test that requre GPU by horheynm in https://github.com/vllm-project/llm-compressor/pull/1096
* Add Idefics3/SmolVLM quant support via traceable class by leon-seidel in https://github.com/vllm-project/llm-compressor/pull/1095
* Traceability Guide: Clarity and typo by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/1099
* [VLM] Examples README by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/1057
* Raise warning for 24 compressed sparse-only models by rahul-tuli in https://github.com/vllm-project/llm-compressor/pull/1107
* Remove log_model_load by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/1016
* Return empty sparsity config if targets and ignores are empty by rahul-tuli in https://github.com/vllm-project/llm-compressor/pull/1115
* Remove uses of get_observer by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/939
* FSDP utils cleanup by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/854
* Update maintainers, add notice by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/1091
* Replace readme paths with urls by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/1097
* GPTQ add Arkiv link, move file location by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/1100
* Extend `remove_hooks` to remove subsets by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/1021
* [Audio] Whisper Example and Readme by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/1106
* [Audio] Add whisper fp8 dynamic example by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/1111
* [VLM] Update pixtral data collator to reflect latest transformers changes by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/1116
* Use unique test names in `TestvLLM` by dbarbuzzi in https://github.com/vllm-project/llm-compressor/pull/1124
* Remove smoothquant from examples by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/1121
* Extend `disable_hooks` to keep subsets by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/1023
* Unpin `pynvml` to fix e2e test failures with vLLM by dsikka in https://github.com/vllm-project/llm-compressor/pull/1125
* Replace LayerCompressor with HooksMixin by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/1038
* [Oneshot Refactor] Rename get_shared_processor_src to get_processor_name_from_model by horheynm in https://github.com/vllm-project/llm-compressor/pull/1108
* Allow Shortcutting Min-max Observer by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/887
* [Polish] Remove unused code by horheynm in https://github.com/vllm-project/llm-compressor/pull/1128
* Properly restore training mode with `eval_context` by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/1126
* SQ and QM: Remove `torch.cuda.empty_cache`, use `calibration_forward_context` by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/1114
* [Oneshot Refactor] dataclass Arguments by horheynm in https://github.com/vllm-project/llm-compressor/pull/1103
* [Bugfix] SparseGPT, Pipelines by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/1130
* [Oneshot refactor] Refactor initialize_model_from_path by horheynm in https://github.com/vllm-project/llm-compressor/pull/1109
* [e2e] Update vllm tests with additional datasets by brian-dellabetta in https://github.com/vllm-project/llm-compressor/pull/1131
* Update: SparseGPT recipes by rahul-tuli in https://github.com/vllm-project/llm-compressor/pull/1142
* Add timer support for testing by dsikka in https://github.com/vllm-project/llm-compressor/pull/1137
* [Audio] Support Whisper V3 by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/1147
* Fix: Re-enable Sparse Compression for 2of4 Examples by rahul-tuli in https://github.com/vllm-project/llm-compressor/pull/1153
* [VLM] Add caption to flickr dataset by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/1138
* [VLM] Update mllama traceable definition by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/1140
* Fix CPU Offloading by dsikka in https://github.com/vllm-project/llm-compressor/pull/1159
* [TRL_SFT_Trainer] Fix and Update Examples code by horheynm in https://github.com/vllm-project/llm-compressor/pull/1161
* [TRL_SFT_Trainer] Fix TRL-SFT Distillation Training by horheynm in https://github.com/vllm-project/llm-compressor/pull/1163
* Bump version for patch release by dsikka in https://github.com/vllm-project/llm-compressor/pull/1166
* Update DeepSeek Examples by dsikka in https://github.com/vllm-project/llm-compressor/pull/1175
* Update gemma2 examples with a note about sample generation by dsikka in https://github.com/vllm-project/llm-compressor/pull/1176

New Contributors
* leon-seidel made their first contribution in https://github.com/vllm-project/llm-compressor/pull/1095

**Full Changelog**: https://github.com/vllm-project/llm-compressor/compare/0.4.0...0.4.1

0.4.0

What's Changed
* Record config file name as test suite property by dbarbuzzi in https://github.com/vllm-project/llm-compressor/pull/947
* Update setup.py by dsikka in https://github.com/vllm-project/llm-compressor/pull/975
* Depreciate OBCQ Helpers by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/977
* KV Cache, E2E Tests by horheynm in https://github.com/vllm-project/llm-compressor/pull/742
* Use 1 GPU for offloading examples by dsikka in https://github.com/vllm-project/llm-compressor/pull/979
* Replace tokenizer with processor by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/955
* Revert "KV Cache, E2E Tests (742)" by dsikka in https://github.com/vllm-project/llm-compressor/pull/989
* Fix SmoothQuant offload bug by dsikka in https://github.com/vllm-project/llm-compressor/pull/978
* Add LM Eval Configs by dsikka in https://github.com/vllm-project/llm-compressor/pull/980
* Fix `test_model_reload` test by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/1005
* Calibration and Compression Contexts by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/998
* Add info for clarity by dsikka in https://github.com/vllm-project/llm-compressor/pull/1009
* [Bugfix] Pass `trust_remote_code_model=True` for deepseek examples by dsikka in https://github.com/vllm-project/llm-compressor/pull/1012
* Vision Datasets by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/943
* Add example for fp8 kv cache of phi3.5 and gemma2 by mgoin in https://github.com/vllm-project/llm-compressor/pull/991
* Update ReadMe and test for cpu_offloading by dsikka in https://github.com/vllm-project/llm-compressor/pull/1013
* Adding amdsmi for AMD gpus by citrix123 in https://github.com/vllm-project/llm-compressor/pull/1018
* CompressionLogger add time units by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/1026
* patch_tied_tensors_bug: support malformed model definitions by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/1014
* Add: 2of4 example with/without fp8 quantization by rahul-tuli in https://github.com/vllm-project/llm-compressor/pull/1033
* Remove unccessary step in 2of4 Example by dsikka in https://github.com/vllm-project/llm-compressor/pull/1034
* Remove Neural Magic copyright from files by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/992
* VLM Support via GPTQ Hooks and Data Pipelines by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/914
* [E2E Testing] KV-Cache by horheynm in https://github.com/vllm-project/llm-compressor/pull/1004
* [E2E Testing] Add recipe check vllm e2e by horheynm in https://github.com/vllm-project/llm-compressor/pull/929
* [MoE] GPTQ compress using callback not hook by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/1049
* Explicit dataset tokenizer `text` kwarg by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/1031
* Fix smoothquant ignore, Fix typing, Add glm mappings by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/1015
* [Test Fix] Quant model reload by horheynm in https://github.com/vllm-project/llm-compressor/pull/974
* Remove old examples by dsikka in https://github.com/vllm-project/llm-compressor/pull/1062
* VLM: Fix typo bug in TraceableLlavaForConditionalGeneration by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/1065
* Add tests for "examples/sparse_2of4_[...]" by dbarbuzzi in https://github.com/vllm-project/llm-compressor/pull/1067
* VLM Image Examples by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/1064
* Add quick warning for DeepSeek with transformers 4.48.0 by dsikka in https://github.com/vllm-project/llm-compressor/pull/1066
* [KV Cache] kv-cache end to end unit tests by horheynm in https://github.com/vllm-project/llm-compressor/pull/141
* [E2E Testing] Fix HF upload by horheynm in https://github.com/vllm-project/llm-compressor/pull/1061
* [Test Fix] Fix/update test_run_compressed by horheynm in https://github.com/vllm-project/llm-compressor/pull/970
* Revert "[Test Fix] Fix/update test_run_compressed" by mgoin in https://github.com/vllm-project/llm-compressor/pull/1071
* Sparse 2:4 + FP8 Quantization e2e vLLM tests by dsikka in https://github.com/vllm-project/llm-compressor/pull/1073
* [Test Patch] Remove redundant code for "Fix/update test_run_compressed" by horheynm in https://github.com/vllm-project/llm-compressor/pull/1072
* bump; set ct version by dsikka in https://github.com/vllm-project/llm-compressor/pull/1076

New Contributors
* citrix123 made their first contribution in https://github.com/vllm-project/llm-compressor/pull/1018

**Full Changelog**: https://github.com/vllm-project/llm-compressor/compare/0.3.1...0.4.0

0.3.1

What's Changed
* BLOOM Default Smoothquant Mappings by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/906
* [SparseAutoModelForCausalLM Deprecation] Feature change by horheynm in https://github.com/vllm-project/llm-compressor/pull/881
* Correct "dyanmic" typo by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/888
* Explicit defaults for QuantizationModifier targets by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/889
* [SparseAutoModelForCausalLM Deprecation] Update examples by horheynm in https://github.com/vllm-project/llm-compressor/pull/880
* Support pack_quantized format for nonuniform mixed-precision by mgoin in https://github.com/vllm-project/llm-compressor/pull/913
* Actually make the `run_compressed` test useful by dsikka in https://github.com/vllm-project/llm-compressor/pull/920
* Fix for e2e tests by horheynm in https://github.com/vllm-project/llm-compressor/pull/927
* [Bugfix] Correct metrics calculations by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/878
* Update kv_cache example by dsikka in https://github.com/vllm-project/llm-compressor/pull/921
* [1/2] Expand e2e testing to prepare for lm-eval by dsikka in https://github.com/vllm-project/llm-compressor/pull/922
* Update pytest command to capture results to file by dbarbuzzi in https://github.com/vllm-project/llm-compressor/pull/932
* [Bugfix] DisableKVCache Context by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/834
* Add helpful info to the marlin-24 example by dsikka in https://github.com/vllm-project/llm-compressor/pull/946
* Remove requires_torch by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/949
* Remove unused sparseml.export utilities by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/950
* Implement HooksMixin by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/917
* Add LM Eval Testing by dsikka in https://github.com/vllm-project/llm-compressor/pull/945
* update version by dsikka in https://github.com/vllm-project/llm-compressor/pull/969

**Full Changelog**: https://github.com/vllm-project/llm-compressor/compare/0.3.0...0.3.1

0.3.0

Key Features and Improvements
- **GPTQ Quantized-weight Sequential Updating** ([177](https://github.com/vllm-project/llm-compressor/pull/177)): Introduced an efficient sequential updating mechanism for GPTQ quantization, improving model compression performance and compatibility.
- **Auto-Infer Mappings for SmoothQuantModifier** ([119](https://github.com/vllm-project/llm-compressor/pull/119)): Automatically infers `mappings` based on model architecture, making SmoothQuant easier to apply across various models.
- **Improved Sparse Compression Usability** ([191](https://github.com/vllm-project/llm-compressor/pull/191)): Added support for targeted sparse compression with specific ignore rules during inference, allowing for more flexible model configurations.
- **Generic Wrapper for Any Hugging Face Model** ([185](https://github.com/vllm-project/llm-compressor/pull/185)): Added `wrap_hf_model_class` utility, enabling better support and integration for Hugging Face models i.e. not based on `AutoModelForCausalLM`.
- **Observer Restructure** ([837](https://github.com/vllm-project/llm-compressor/pull/837)): Introduced calibration and frozen steps within `QuantizationModifier`, moving Observers from compressed-tensors to llm-compressor.

Bug Fixes
- **Fix Tied Tensors Bug** ([659](https://github.com/vllm-project/llm-compressor/pull/659))
- **Observer Initialization in GPTQ Wrapper** ([883](https://github.com/vllm-project/llm-compressor/pull/883))
- **Sparsity Reload Testing** ([882](https://github.com/vllm-project/llm-compressor/pull/882))

Documentation
- **Updated SmoothQuant Tutorial** ([115](https://github.com/vllm-project/llm-compressor/pull/115)): Expanded SmoothQuant documentation to include detailed mappings for easier implementation.

What's Changed
* Fix compresed typo by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/188
* GPTQ Quantized-weight Sequential Updating by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/177
* Add: targets and ignore inference for sparse compression by rahul-tuli in https://github.com/vllm-project/llm-compressor/pull/191
* switch tests from weekly to nightly by dhuangnm in https://github.com/vllm-project/llm-compressor/pull/658
* Compression wrapper abstract methods by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/170
* Explicitly set sequential_update in examples by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/187
* Increase Sparsity Threshold for compressors by rahul-tuli in https://github.com/vllm-project/llm-compressor/pull/679
* Add a generic `wrap_hf_model_class` utility to support VLMs by mgoin in https://github.com/vllm-project/llm-compressor/pull/185
* Add tests for examples by dbarbuzzi in https://github.com/vllm-project/llm-compressor/pull/149
* Rename to quantization config by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/730
* Implement Missing Modifier Methods by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/166
* Fix 2/4 GPTQ Model Tests by dsikka in https://github.com/vllm-project/llm-compressor/pull/769
* SmoothQuant mappings tutorial by rahul-tuli in https://github.com/vllm-project/llm-compressor/pull/115
* Fix import of `ModelCompressor` by rahul-tuli in https://github.com/vllm-project/llm-compressor/pull/776
* update test by dsikka in https://github.com/vllm-project/llm-compressor/pull/773
* [Bugfix] Fix saving offloaded state dict by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/172
* Auto-Infer `mappings` Argument for `SmoothQuantModifier` Based on Model Architecture by rahul-tuli in https://github.com/vllm-project/llm-compressor/pull/119
* Update workflows/actions by dbarbuzzi in https://github.com/vllm-project/llm-compressor/pull/774
* [Bugfix] Prepare KD Models when Saving by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/174
* Set Sparse compression to save_compressed by rahul-tuli in https://github.com/vllm-project/llm-compressor/pull/821
* Install compressed-tensors after llm-compressor by dbarbuzzi in https://github.com/vllm-project/llm-compressor/pull/825
* Fix test typo by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/828
* Add `AutoModelForCausalLM` example by dsikka in https://github.com/vllm-project/llm-compressor/pull/698
* [Bugfix] Workaround tied tensors bug by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/659
* Only untie word embeddings by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/839
* Check for config hidden size by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/840
* Use float32 for Hessian dtype by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/847
* GPTQ: Depreciate non-sequential update option by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/762
* Typehint nits by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/826
* [ DOC ] Remove version restrictions in W8A8 exmaple by miaojinc in https://github.com/vllm-project/llm-compressor/pull/849
* Fix inconsistence in example config of 2:4 sparse quantization by yzlnew in https://github.com/vllm-project/llm-compressor/pull/80
* Fix forward function pass call by dsikka in https://github.com/vllm-project/llm-compressor/pull/845
* [Bugfix] Use weight parameter of linear layer by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/836
* [Bugfix] Rename files to remove colons by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/846
* cover all 3.9-3.12 in commit testing by dhuangnm in https://github.com/vllm-project/llm-compressor/pull/864
* Add marlin-24 recipe/configs for e2e testing by dsikka in https://github.com/vllm-project/llm-compressor/pull/866
* [Bugfix] onload during sparsity calculation by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/862
* Fix HFTrainer overloads by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/869
* Support Model Offloading Tied Tensors Patch by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/872
* Add advice about dealing with non-invertable hessians by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/875
* seed commit workflow by andy-neuma in https://github.com/vllm-project/llm-compressor/pull/877
* [Observer Restructure]: Add Observers; Add `calibration` and `frozen` steps to `QuantizationModifier` by dsikka in https://github.com/vllm-project/llm-compressor/pull/837
* Bugfix observer initialization in `gptq_wrapper` by rahul-tuli in https://github.com/vllm-project/llm-compressor/pull/883
* BugFix: Fix Sparsity Reload Testing by dsikka in https://github.com/vllm-project/llm-compressor/pull/882
* Use custom unique test names for e2e tests by dbarbuzzi in https://github.com/vllm-project/llm-compressor/pull/892
* Revert "Use custom unique test names for e2e tests (892)" by dsikka in https://github.com/vllm-project/llm-compressor/pull/893
* Move config["testconfig_path"] assignment by dbarbuzzi in https://github.com/vllm-project/llm-compressor/pull/895
* Cap accelerate version to avoid bug by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/897
* Fix observing offloaded weight by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/896
* Update image in README.md by mgoin in https://github.com/vllm-project/llm-compressor/pull/861
* update accelerate version by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/899
* [GPTQ] Iterative Parameter Updating by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/863
* Small fixes for release by dsikka in https://github.com/vllm-project/llm-compressor/pull/901
* use smaller portion of dataset by dsikka in https://github.com/vllm-project/llm-compressor/pull/902
* Update example to not fail hessian inversion by dsikka in https://github.com/vllm-project/llm-compressor/pull/904
* Bump version to 0.3.0 by dsikka in https://github.com/vllm-project/llm-compressor/pull/907

New Contributors
* miaojinc made their first contribution in https://github.com/vllm-project/llm-compressor/pull/849
* yzlnew made their first contribution in https://github.com/vllm-project/llm-compressor/pull/80
* andy-neuma made their first contribution in https://github.com/vllm-project/llm-compressor/pull/877

**Full Changelog**: https://github.com/vllm-project/llm-compressor/compare/0.2.0...0.3.0

0.2.0

What's Changed
* Correct Typo in SparseAutoModelForCausalLM docstring by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/56
* Disable Default Bitmask Compression by Satrat in https://github.com/vllm-project/llm-compressor/pull/60
* TRL Example fix by rahul-tuli in https://github.com/vllm-project/llm-compressor/pull/59
* Fix typo by rahul-tuli in https://github.com/vllm-project/llm-compressor/pull/63
* Correct typo by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/61
* correct import in README.md by zzc0430 in https://github.com/vllm-project/llm-compressor/pull/66
* Fix for issue 43 -- starcoder model by horheynm in https://github.com/vllm-project/llm-compressor/pull/71
* Update README.md by robertgshaw2-neuralmagic in https://github.com/vllm-project/llm-compressor/pull/74
* Layer by Layer Sequential GPTQ Updates by Satrat in https://github.com/vllm-project/llm-compressor/pull/47
* [ Docs ] Update main readme by robertgshaw2-neuralmagic in https://github.com/vllm-project/llm-compressor/pull/77
* [ Docs ] `gemma2` examples by robertgshaw2-neuralmagic in https://github.com/vllm-project/llm-compressor/pull/78
* [ Docs ] Update `FP8` example to use dynamic per token by robertgshaw2-neuralmagic in https://github.com/vllm-project/llm-compressor/pull/75
* [ Docs ] Overhaul `accelerate` user guide by robertgshaw2-neuralmagic in https://github.com/vllm-project/llm-compressor/pull/76
* Support `kv_cache_scheme` for quantizing KV Cache by mgoin in https://github.com/vllm-project/llm-compressor/pull/88
* Propagate `trust_remote_code` Argument by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/90
* Fix for issue 81 by horheynm in https://github.com/vllm-project/llm-compressor/pull/84
* Fix for issue 83 by horheynm in https://github.com/vllm-project/llm-compressor/pull/85
* [ DOC ] Big Model Example by robertgshaw2-neuralmagic in https://github.com/vllm-project/llm-compressor/pull/99
* Enable obcq/finetune integration tests with `commit` cadence by dsikka in https://github.com/vllm-project/llm-compressor/pull/101
* metric logging on GPTQ path by horheynm in https://github.com/vllm-project/llm-compressor/pull/65
* Update test config files by dsikka in https://github.com/vllm-project/llm-compressor/pull/97
* remove workflows + update runners by dsikka in https://github.com/vllm-project/llm-compressor/pull/103
* metrics by horheynm in https://github.com/vllm-project/llm-compressor/pull/104
* add debug by horheynm in https://github.com/vllm-project/llm-compressor/pull/108
* Add FP8 KV Cache quant example by mgoin in https://github.com/vllm-project/llm-compressor/pull/113
* Add vLLM e2e tests by dsikka in https://github.com/vllm-project/llm-compressor/pull/117
* Fix style, fix noqa by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/123
* GPTQ Algorithm Cleanup by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/120
* GPTQ Activation Ordering by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/94
* demote recipe string initialization to debug and make more descriptive by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/116
* compressed-tensors main dependency for base-tests by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/125
* Set `ready` label for transformer tests; add message reminder on PR opened by dsikka in https://github.com/vllm-project/llm-compressor/pull/126
* Fix markdown check test by dsikka in https://github.com/vllm-project/llm-compressor/pull/127
* Naive Run Compressed Pt. 2 by Satrat in https://github.com/vllm-project/llm-compressor/pull/62
* Fix transformer test conditions by dsikka in https://github.com/vllm-project/llm-compressor/pull/131
* Run Compressed Tests by Satrat in https://github.com/vllm-project/llm-compressor/pull/132
* Correct typo by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/124
* Activation Ordering Strategies by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/121
* Fix README Issue by robertgshaw2-neuralmagic in https://github.com/vllm-project/llm-compressor/pull/139
* update by dsikka in https://github.com/vllm-project/llm-compressor/pull/143
* Update finetune and oneshot tests by dsikka in https://github.com/vllm-project/llm-compressor/pull/114
* Validate Recipe Parsing Output by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/100
* fix build error for nightly by dhuangnm in https://github.com/vllm-project/llm-compressor/pull/145
* Fix recipe nested in configs by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/140
* MOE example with warning by rahul-tuli in https://github.com/vllm-project/llm-compressor/pull/87
* Bug Fix: recipe stages were not being concatenated by rahul-tuli in https://github.com/vllm-project/llm-compressor/pull/150
* fix package name bug for nightly by dhuangnm in https://github.com/vllm-project/llm-compressor/pull/155
* Add descriptions for pytest marks by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/156
* Fix Sparsity Unit Test by Satrat in https://github.com/vllm-project/llm-compressor/pull/153
* Fix: Error during model saving with shared tensors by rahul-tuli in https://github.com/vllm-project/llm-compressor/pull/158
* Update 2:4 Examples by dsikka in https://github.com/vllm-project/llm-compressor/pull/161
* DeepSeek: Fix Hessian Estimation by Satrat in https://github.com/vllm-project/llm-compressor/pull/157
* bump up main to 0.2.0 by dhuangnm in https://github.com/vllm-project/llm-compressor/pull/163
* Fix help dialogue by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/151
* Add MoE and Compressed Inference Examples by Satrat in https://github.com/vllm-project/llm-compressor/pull/160
* Separate `trust_remote_code` args by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/152
* Enable a skipped finetune test by dsikka in https://github.com/vllm-project/llm-compressor/pull/169
* Fix filename in example command by dbarbuzzi in https://github.com/vllm-project/llm-compressor/pull/173
* Add DeepSeek V2.5 Example by dsikka in https://github.com/vllm-project/llm-compressor/pull/171
* fix quality by dsikka in https://github.com/vllm-project/llm-compressor/pull/176
* Patch log function name in gptq by kylesayrs in https://github.com/vllm-project/llm-compressor/pull/168
* README for Modifiers by Satrat in https://github.com/vllm-project/llm-compressor/pull/165
* Fix default for sequential updates by dsikka in https://github.com/vllm-project/llm-compressor/pull/186
* fix default test case by dsikka in https://github.com/vllm-project/llm-compressor/pull/193
* Fix Initalize typo by Imss27 in https://github.com/vllm-project/llm-compressor/pull/190
* Update MoE examples by mgoin in https://github.com/vllm-project/llm-compressor/pull/192

New Contributors
* zzc0430 made their first contribution in https://github.com/vllm-project/llm-compressor/pull/66
* horheynm made their first contribution in https://github.com/vllm-project/llm-compressor/pull/71
* dsikka made their first contribution in https://github.com/vllm-project/llm-compressor/pull/101
* dhuangnm made their first contribution in https://github.com/vllm-project/llm-compressor/pull/145
* Imss27 made their first contribution in https://github.com/vllm-project/llm-compressor/pull/190

**Full Changelog**: https://github.com/vllm-project/llm-compressor/compare/0.1.0...0.2.0

Page 1 of 2

Releases

Has known vulnerabilities

Llmcompressor

Page 1 of 2

0.5.0

0.4.1

0.4.0

0.3.1

0.3.0

0.2.0

Page 1 of 2

Links

Releases