Peft

Latest version: v0.15.1



0.15.1

This patch includes a fix for #2450. In this bug, `modules_to_save` was not handled correctly when used in conjunction with DeepSpeed ZeRO stage 3, so those modules were saved as placeholder values in the checkpoints.

**Full Changelog**: https://github.com/huggingface/peft/compare/v0.15.0...v0.15.1

0.15.0

Highlights

![peft-v0 15 0](https://github.com/user-attachments/assets/4095edca-7269-403f-be2e-2ef95d6ed474)

New Methods

CorDA: Context-Oriented Decomposition Adaptation

iboing and 5eqn contributed [CorDA: Context-Oriented Decomposition Adaptation of Large Language Models for Task-Aware Parameter-Efficient Fine-tuning](https://arxiv.org/abs/2406.05223). This task-driven initialization method has [two modes](https://huggingface.co/docs/peft/main/en/developer_guides/lora#corda), knowledge-preservation and instruction-preservation, both using external data to select ranks intelligently. The former can be used to select those ranks that correspond to weights not affiliated with knowledge from, say, a QA dataset. The latter can be used to select those ranks that correspond most to the task at hand (e.g., a classification task). (2231)

Trainable Tokens: Selective token update
The new [Trainable Tokens](https://huggingface.co/docs/peft/main/en/package_reference/trainable_tokens) tuner allows for selective training of tokens without re-training the full embedding matrix, e.g. when adding support for reasoning / thinking tokens. This is a lot more memory efficient and the saved checkpoint is much smaller. It can be used standalone or [in conjunction with LoRA adapters](https://huggingface.co/docs/peft/main/en/developer_guides/lora#efficiently-train-tokens-alongside-lora) by passing `trainable_token_indices` to `LoraConfig`. (2376)
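To make the memory argument concrete, here is a toy sketch of the idea in plain Python. It is not PEFT's implementation: `ToyTrainableTokens`, its methods, and the list-based "tensors" are all hypothetical stand-ins. It only shows that the frozen table is never modified and that the checkpoint holds just the selected rows.

```python
# Toy illustration only: plain Python lists stand in for tensors.
class ToyTrainableTokens:
    """Frozen embedding table plus a small set of trainable replacement rows."""

    def __init__(self, frozen_table, token_indices):
        self.frozen_table = frozen_table  # never modified
        # Only these rows are trainable; start from a copy of the frozen values.
        self.trainable_rows = {i: list(frozen_table[i]) for i in token_indices}

    def lookup(self, token_id):
        # Trainable rows shadow the frozen ones for the selected tokens.
        return self.trainable_rows.get(token_id, self.frozen_table[token_id])

    def sgd_step(self, grads, lr=0.1):
        # Gradients for tokens outside the selected set are ignored.
        for token_id, grad in grads.items():
            if token_id in self.trainable_rows:
                row = self.trainable_rows[token_id]
                self.trainable_rows[token_id] = [w - lr * g for w, g in zip(row, grad)]

    def checkpoint(self):
        # Only the selected rows need to be saved.
        return {i: list(row) for i, row in self.trainable_rows.items()}


vocab = [[float(i), float(i)] for i in range(6)]  # 6 tokens, embedding dim 2
layer = ToyTrainableTokens(vocab, token_indices=[4, 5])
layer.sgd_step({0: [1.0, 1.0], 4: [1.0, 1.0]})  # the update for token 0 is dropped
```

In this sketch the checkpoint contains two rows instead of six, which mirrors why the saved adapter is much smaller than a full embedding matrix.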

Enhancements

LoRA now supports targeting multihead attention modules (but for now only those with `_qkv_same_embed_dim=True`). These modules were tricky as they may expose linear submodules but won't use their forward methods, therefore needing explicit support. (1324)

[Hotswapping](https://huggingface.co/docs/peft/main/en/package_reference/hotswap) now allows different alpha scalings and ranks without recompilation of the model when the model is prepared using a call to `prepare_model_for_compiled_hotswap()` before compiling the model. (#2177)

[GPTQModel](https://github.com/ModelCloud/GPTQModel) support was added in #2247 as a replacement for AutoGPTQ which is not maintained anymore.

Changes
- It's now possible to use `all-linear` as `target_modules` for custom (non-transformers) models (2267). With this change comes a bugfix where it was possible that non-linear layers were selected when they shared the same name with a linear layer (e.g., `bar.foo` and `baz.foo`).
- The internal tuner API was refactored to make method registration easier. Registering a new method previously required changes to numerous files; it now takes a single `register_peft_method()` call. (2282)
- `PEFT_TYPE_TO_MODEL_MAPPING` is now deprecated and should not be relied upon. Use `PEFT_TYPE_TO_TUNER_MAPPING` instead. (2282)
- Mixed adapter batches can now be used in conjunction with beam search. (2287)
- It was possible that `modules_to_save` keys wrongly matched parts of the state dict if the key was a substring of another key (e.g., `classifier` and `classifier2`). (2334)
- Auto-casting of the input dtype to the LoRA adapter dtype can now be disabled via `disable_input_dtype_casting=True`. (2353)
- The config parameters `rank_pattern` and `alpha_pattern` used by many adapters now support matching full paths as well by specifying the pattern with a caret in front, for example: `^foo` to target `model.foo` but not `model.bar.foo`. (2419)
- AutoPeftModels do not reduce the embedding size anymore if the tokenizer size differs from the embedding size. The matrix is resized only if the tokenizer has more tokens than the embedding matrix. This is to prevent resizing of embedding matrices in models that have 'spare' tokens built-in. (2427)
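The caret rule for `rank_pattern` and `alpha_pattern` can be illustrated with a small sketch. This is a simplified stand-in, not PEFT's matching code: `pattern_matches` is a hypothetical helper, and the module paths are assumed to be relative to the base model (so `foo` here plays the role of `model.foo` above).

```python
import re


def pattern_matches(pattern: str, module_path: str) -> bool:
    """Return True if `pattern` selects `module_path` (toy rule, see lead-in)."""
    if pattern.startswith("^"):
        # Anchored: the pattern must match the whole path from the root.
        return re.fullmatch(pattern[1:], module_path) is not None
    # Unanchored: the pattern may match the full path or any dotted suffix.
    return re.fullmatch(rf"(.*\.)?({pattern})", module_path) is not None


paths = ["foo", "bar.foo"]  # module paths relative to the base model

rank_pattern = {"^foo": 16}  # hypothetical per-module rank override
anchored = [p for p in paths if any(pattern_matches(k, p) for k in rank_pattern)]

# Without the caret, both modules are selected.
unanchored = [p for p in paths if pattern_matches("foo", p)]
```

Under these assumptions, `anchored` contains only the top-level `foo`, while `unanchored` also picks up the nested `bar.foo`.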

What's Changed
* FIX: Ensure Device Compatibility for BOFT Forward/Merging by d-kleine in https://github.com/huggingface/peft/pull/2242
* MNT: Bump version to 0.14.1.dev0 by BenjaminBossan in https://github.com/huggingface/peft/pull/2263
* ENH: fix library interface by bluenote10 in https://github.com/huggingface/peft/pull/2265
* FIX: Add warning for `adapter_name` conflict with tuner by pzdkn in https://github.com/huggingface/peft/pull/2254
* ENH: FIX: Allow `"all-linear"` to target custom models by BenjaminBossan in https://github.com/huggingface/peft/pull/2267
* MNT: apply sorting of exported symbols in `__all__` by bluenote10 in https://github.com/huggingface/peft/pull/2280
* MNT: apply sorting of imports by bluenote10 in https://github.com/huggingface/peft/pull/2279
* FIX: Adoption prompt: New way to obtain position embeddings by BenjaminBossan in https://github.com/huggingface/peft/pull/2276
* FIX: Int8 check for torchao v0.7.0 by BenjaminBossan in https://github.com/huggingface/peft/pull/2284
* FEAT: Adding CorDA as an optional initialization method of LoRA by iboing in https://github.com/huggingface/peft/pull/2231
* FIX: typo in lora `config.py` by innerlee in https://github.com/huggingface/peft/pull/2297
* DOC: Added information regarding freezing the base model in `prepare_model_for_kbit_training` docstring by NilBiescas in https://github.com/huggingface/peft/pull/2305
* DOC: add `resize_token_embeddings` to docs by bingwork in https://github.com/huggingface/peft/pull/2290
* FIX: Make CorDA example work by 5eqn in https://github.com/huggingface/peft/pull/2300
* FIX: 2295: Warn when user reloads modified model by githubnemo in https://github.com/huggingface/peft/pull/2306
* ENH: Extend usage for OLoRA finetune script by jiqing-feng in https://github.com/huggingface/peft/pull/2308
* CI: Add zizmor for CI (security) linting by githubnemo in https://github.com/huggingface/peft/pull/2288
* FEAT: Add LoRA multihead attention module by BenjaminBossan in https://github.com/huggingface/peft/pull/1324
* DOC: Updated documentation for `get_peft_model()` for in-place base model modification by d-kleine in https://github.com/huggingface/peft/pull/2313
* FIX: Prefix tuning test w/ rotary embedding on multi GPU by BenjaminBossan in https://github.com/huggingface/peft/pull/2311
* FIX: Adaption prompt errors after changes from transformers 35235 by BenjaminBossan in https://github.com/huggingface/peft/pull/2314
* FIX: Package checks for torchao, EETQ by BenjaminBossan in https://github.com/huggingface/peft/pull/2320
* Refactor: PEFT method registration function by BenjaminBossan in https://github.com/huggingface/peft/pull/2282
* FIX: `low_cpu_mem_usage=True` with 8bit bitsandbytes by BenjaminBossan in https://github.com/huggingface/peft/pull/2325
* FIX: Reinstate `PEFT_TYPE_TO_MODEL_MAPPING` variable with deprecation by BenjaminBossan in https://github.com/huggingface/peft/pull/2328
* FIX: reduce CorDA memory consumption + docs by 5eqn in https://github.com/huggingface/peft/pull/2324
* MNT: React on new zizmor version findings by githubnemo in https://github.com/huggingface/peft/pull/2331
* TST: make cuda-only tests device-agnostic by faaany in https://github.com/huggingface/peft/pull/2323
* FIX: Generating with mixed adapter batches and with beam search enabled by BenjaminBossan in https://github.com/huggingface/peft/pull/2287
* FIX: Bug with `modules_to_save` loading if substring by BenjaminBossan in https://github.com/huggingface/peft/pull/2334
* FIX: Add missing attributes to MultiheadAttention by BenjaminBossan in https://github.com/huggingface/peft/pull/2335
* FIX: for zizmor permission warnings by githubnemo in https://github.com/huggingface/peft/pull/2338
* CI: Attempt at adding a cache for models by githubnemo in https://github.com/huggingface/peft/pull/2327
* FIX: Avoid needless copy from `modules_to_save` by BenjaminBossan in https://github.com/huggingface/peft/pull/2220
* DOC: Add entry to solve unknown config argument by BenjaminBossan in https://github.com/huggingface/peft/pull/2340
* FEAT: add gptqmodel support by jiqing-feng in https://github.com/huggingface/peft/pull/2247
* MNT: Update ruff to v0.9.2 by BenjaminBossan in https://github.com/huggingface/peft/pull/2343
* TST: Update `torch.compile` tests and docs by BenjaminBossan in https://github.com/huggingface/peft/pull/2332
* FIX: Documentation & error checking for AdaLoRA timing by githubnemo in https://github.com/huggingface/peft/pull/2341
* DOC: Better document init_lora_weights=False option by BenjaminBossan in https://github.com/huggingface/peft/pull/2347
* ENH: Adding Lora implementation for `nn.Conv1d` by CCLDArjun in https://github.com/huggingface/peft/pull/2333
* FIX: Failing AdaLoRA GPU test by BenjaminBossan in https://github.com/huggingface/peft/pull/2349
* ENH: Improve invalid peft config error message by thedebugger in https://github.com/huggingface/peft/pull/2346
* TST: Use different diffusion model for testing by BenjaminBossan in https://github.com/huggingface/peft/pull/2345
* CI: Use locked install for zizmor by githubnemo in https://github.com/huggingface/peft/pull/2350
* DOC: fix links to PEFT guides by makelinux in https://github.com/huggingface/peft/pull/2357
* DOC: rename link to PEFT Quicktour by makelinux in https://github.com/huggingface/peft/pull/2358
* ENH: Allow disabling input dtype casting for LoRA by BenjaminBossan in https://github.com/huggingface/peft/pull/2353
* ENH: Hotswap allow different alpha scalings and ranks by BenjaminBossan in https://github.com/huggingface/peft/pull/2177
* DOC: Fix links to boft by makelinux in https://github.com/huggingface/peft/pull/2365
* DOC: Explain uninitialized weights warning by BenjaminBossan in https://github.com/huggingface/peft/pull/2369
* ENH: Optimization for ConvNd if dropout=0. by gslama12 in https://github.com/huggingface/peft/pull/2371
* FIX: Small fixes to hotswapping by BenjaminBossan in https://github.com/huggingface/peft/pull/2366
* ENH: `prepare_model_for_compiled_hotswap` raises when no adapter was found by BenjaminBossan in https://github.com/huggingface/peft/pull/2375
* FIX: Ensure `hf_hub_download` arguments are used when loading locally by henryzhengr in https://github.com/huggingface/peft/pull/2373
* FIX: Avoid caching in X-LoRA generate by BenjaminBossan in https://github.com/huggingface/peft/pull/2384
* CI: Skip audio test on single GPU CI by BenjaminBossan in https://github.com/huggingface/peft/pull/2380
* SEC: Bump transformers version used in examples by BenjaminBossan in https://github.com/huggingface/peft/pull/2374
* FIX: Failing single GPU tests related to hotswapping by BenjaminBossan in https://github.com/huggingface/peft/pull/2385
* ENH: Make hotswap error on compile optional by BenjaminBossan in https://github.com/huggingface/peft/pull/2393
* FEAT: Standalone Custom Tokens Tuner and integrated into LoRA by githubnemo in https://github.com/huggingface/peft/pull/2376
* FIX: GPTQModel LoRA Compat by Qubitium in https://github.com/huggingface/peft/pull/2404
* FIX: Model with nested `all-linear` target modules by BenjaminBossan in https://github.com/huggingface/peft/pull/2391
* FIX: Bug with `PeftConfig.from_pretrained` by BenjaminBossan in https://github.com/huggingface/peft/pull/2397
* ENH: Add simple script to estimate train memory by BenjaminBossan in https://github.com/huggingface/peft/pull/2378
* CI: Use new slack secret token name by githubnemo in https://github.com/huggingface/peft/pull/2409
* ENH: Trainable Tokens: Support for Weight Tying by githubnemo in https://github.com/huggingface/peft/pull/2399
* TST: enable BNB tests on XPU by faaany in https://github.com/huggingface/peft/pull/2396
* FIX: Reset the FP32 matmul precision in tests by BenjaminBossan in https://github.com/huggingface/peft/pull/2411
* TST: add the missing `.eval()` for inference by faaany in https://github.com/huggingface/peft/pull/2408
* FIX: Revert optimization for LoRA scaling == 1 by BenjaminBossan in https://github.com/huggingface/peft/pull/2416
* ENH: Extend the regex for rank/alpha pattern by BenjaminBossan in https://github.com/huggingface/peft/pull/2419
* FIX: AutoPeftModels never reduce embedding size by BenjaminBossan in https://github.com/huggingface/peft/pull/2427
* FIX: Minimal target module optimization bug with IA³ by BenjaminBossan in https://github.com/huggingface/peft/pull/2432
* FIX: 2422: Modules to save with multiple adapters by githubnemo in https://github.com/huggingface/peft/pull/2430

New Contributors
* bluenote10 made their first contribution in https://github.com/huggingface/peft/pull/2265
* pzdkn made their first contribution in https://github.com/huggingface/peft/pull/2254
* iboing made their first contribution in https://github.com/huggingface/peft/pull/2231
* innerlee made their first contribution in https://github.com/huggingface/peft/pull/2297
* NilBiescas made their first contribution in https://github.com/huggingface/peft/pull/2305
* bingwork made their first contribution in https://github.com/huggingface/peft/pull/2290
* 5eqn made their first contribution in https://github.com/huggingface/peft/pull/2300
* CCLDArjun made their first contribution in https://github.com/huggingface/peft/pull/2333
* thedebugger made their first contribution in https://github.com/huggingface/peft/pull/2346
* makelinux made their first contribution in https://github.com/huggingface/peft/pull/2357
* gslama12 made their first contribution in https://github.com/huggingface/peft/pull/2371
* henryzhengr made their first contribution in https://github.com/huggingface/peft/pull/2373
* Qubitium made their first contribution in https://github.com/huggingface/peft/pull/2404

**Full Changelog**: https://github.com/huggingface/peft/compare/v0.14.0...v0.15.0

0.14.0

Highlights

![peft-v0 14 0](https://github.com/user-attachments/assets/9994bc6d-f047-419f-9ab5-a60c6033d5b6)

New Methods

Context-aware Prompt Tuning
tsachiblau added a new soft prompt method called [Context-aware Prompt Tuning (CPT)](https://huggingface.co/docs/peft/main/en/conceptual_guides/prompting#context-aware-prompt-tuning-cpt) which is a combination of In-Context Learning and Prompt Tuning in the sense that, for each training sample, it builds a learnable context from training examples in addition to the single training sample. It allows for sample- and parameter-efficient few-shot classification and addresses recency bias.

Explained Variance Adaptation
sirluk contributed a new LoRA initialization method called [Explained Variance Adaptation (EVA)](https://huggingface.co/docs/peft/main/en/developer_guides/lora#eva). Instead of randomly initializing LoRA weights, this method uses SVD on minibatches of finetuning data to initialize the LoRA weights and is also able to re-allocate the ranks of the adapter based on the explained variance ratio (derived from SVD). Thus, this initialization method can yield better initial values and better rank distribution.
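The rank re-allocation step can be pictured with a toy example. The sketch below is a heavily simplified assumption, not EVA's actual procedure: the singular values are made-up numbers, `redistribute_ranks` is a hypothetical helper, and the explained variance ratio of a component is taken to be its squared singular value divided by the layer's total.

```python
def redistribute_ranks(singular_values, rank_budget):
    """Give each layer as many ranks as it has components in the global top `rank_budget`."""
    components = []
    for layer, values in singular_values.items():
        total = sum(s * s for s in values)
        for s in values:
            # Explained variance ratio of this component within its layer.
            components.append((s * s / total, layer))

    # Rank all components across layers, highest ratio first.
    components.sort(key=lambda c: c[0], reverse=True)

    ranks = {layer: 0 for layer in singular_values}
    for _, layer in components[:rank_budget]:
        ranks[layer] += 1
    return ranks


# Made-up singular values: layer_a is dominated by one direction,
# layer_b spreads its variance evenly over four directions.
singular_values = {
    "layer_a": [9.0, 1.0, 1.0, 1.0],
    "layer_b": [3.0, 3.0, 3.0, 3.0],
}
ranks = redistribute_ranks(singular_values, rank_budget=4)
```

With a budget of four ranks, this toy rule gives one rank to `layer_a` and three to `layer_b`, instead of a uniform two each.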

Bone
JL-er added an implementation for [Block Affine (Bone) Adaptation](https://huggingface.co/docs/peft/main/en/conceptual_guides/adapter#bone) which utilizes presumed sparsity in the base layer weights to divide them into multiple sub-spaces that share a single low-rank matrix for updates. Compared to LoRA, Bone has the potential to significantly reduce memory usage and achieve faster computation.

Enhancements
PEFT now supports LoRAs for `int8` torchao quantized models (check [this](https://github.com/huggingface/peft/blob/main/examples/sequence_classification/LoRA-torchao-8bit.ipynb) and [this](https://github.com/huggingface/peft/blob/main/examples/sequence_classification/LoRA-torchao-8bit-dynamic-activation.ipynb) notebook). In addition, VeRA can now be used with 4 and 8 bit bitsandbytes quantization thanks to ZiadHelal.

[Hot-swapping of LoRA adapters](https://huggingface.co/docs/peft/main/en/package_reference/hotswap) is now possible using the `hotswap_adapter` function. Now you are able to load one LoRA and replace its weights in-place with the LoRA weights of another adapter which, in general, should be faster than deleting one adapter and loading the other adapter in its place. The feature is built so that no re-compilation of the model is necessary if `torch.compile` was called on the model (right now, this requires ranks and alphas to be the same for the adapters).

LoRA and IA³ now support `Conv3d` layers thanks to jsilter, and JINO-ROHIT added a [notebook](https://github.com/huggingface/peft/blob/main/examples/evaluation/lora-lm-eval.ipynb) showcasing PEFT model evaluation using lm-eval-harness toolkit.

With the `target_modules` argument, you can specify which layers to target with the adapter (e.g. LoRA). Now you can also specify which modules *not* to target by using the `exclude_modules` parameter (thanks JINO-ROHIT).
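As a rough mental model of how the two parameters interact, consider the sketch below. It is an assumption-laden simplification rather than PEFT's selection logic: `select_modules` is a hypothetical helper that treats every entry as a name that must equal the module path or match a dotted suffix of it.

```python
def select_modules(module_names, target_modules, exclude_modules=()):
    """Keep modules matched by `target_modules` unless `exclude_modules` also matches."""

    def suffix_match(name, keys):
        return any(name == key or name.endswith("." + key) for key in keys)

    return [
        name
        for name in module_names
        if suffix_match(name, target_modules) and not suffix_match(name, exclude_modules)
    ]


module_names = [
    "encoder.layer.0.query",
    "encoder.layer.0.value",
    "encoder.layer.1.query",
    "classifier",
]
selected = select_modules(
    module_names,
    target_modules=["query", "value"],
    exclude_modules=["layer.1.query"],
)
```

Here the exclusion removes only the second layer's `query` module, while the other targeted modules are kept.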

Changes

- Several fixes were made to the OFT implementation, among other things to fix merging. As a result, adapter weights trained with PEFT versions prior to this release are incompatible (see 1996 for details).
- Adapter configs are now forward-compatible by accepting unknown keys.
- Prefix tuning was fitted to the `DynamicCache` caching infrastructure of transformers (see 2096). If you are using this PEFT version and a recent version of transformers with an old prefix tuning checkpoint, you should double check that it still works correctly and retrain it if it doesn't.
- Added `lora_bias` parameter to LoRA layers to enable bias on LoRA B matrix. This is useful when extracting LoRA weights from fully fine-tuned parameters with bias vectors so that these can be taken into account.
- #2180 provided a couple of bug fixes to LoKr (thanks yaswanth19). If you're using LoKr, your old checkpoints should still work but it's recommended to retrain your adapter.
- `from_pretrained` now warns the user if PEFT keys are missing.
- Attribute access to modules in `modules_to_save` is now properly and transparently handled.
- PEFT supports the changes to bitsandbytes 8bit quantization from the [recent v0.45.0 release](https://github.com/bitsandbytes-foundation/bitsandbytes/releases/tag/0.45.0). To benefit from these improvements, we recommend upgrading bitsandbytes if you're using QLoRA. Expect slight numerical differences in model outputs if you're using QLoRA with 8bit bitsandbytes quantization.
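The forward-compatibility change for adapter configs mentioned in the list above can be sketched as follows. This is only an illustration of the general technique, assuming a dataclass-based config: `ToyAdapterConfig` and `load_config` are hypothetical names, and the warning text is made up.

```python
import dataclasses
import warnings


@dataclasses.dataclass
class ToyAdapterConfig:
    r: int = 8
    alpha: int = 16


def load_config(raw: dict) -> ToyAdapterConfig:
    """Build a config from `raw`, dropping keys this version does not know."""
    known = {field.name for field in dataclasses.fields(ToyAdapterConfig)}
    unknown = sorted(set(raw) - known)
    if unknown:
        # A config written by a newer version may carry extra keys; warn and ignore them.
        warnings.warn(f"Ignoring unknown config keys: {unknown}")
    return ToyAdapterConfig(**{k: v for k, v in raw.items() if k in known})


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    config = load_config({"r": 4, "alpha": 8, "future_option": True})
```

The older code still loads the config; the price is that the behavior tied to the ignored key is silently absent, which is why a warning is worth emitting.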

What's Changed
* Bump version to 0.13.1.dev0 by BenjaminBossan in https://github.com/huggingface/peft/pull/2094
* Support Conv3d layer in LoRA and IA3 by jsilter in https://github.com/huggingface/peft/pull/2082
* Fix Inconsistent Missing Keys Warning for Adapter Weights in PEFT by yaswanth19 in https://github.com/huggingface/peft/pull/2084
* FIX: Change check if past_key_values is empty by BenjaminBossan in https://github.com/huggingface/peft/pull/2106
* Update install.md by Salehbigdeli in https://github.com/huggingface/peft/pull/2110
* Update OFT to fix merge bugs by Zeju1997 in https://github.com/huggingface/peft/pull/1996
* ENH: Improved attribute access for modules_to_save by BenjaminBossan in https://github.com/huggingface/peft/pull/2117
* FIX low_cpu_mem_usage consolidates devices by BenjaminBossan in https://github.com/huggingface/peft/pull/2113
* TST Mark flaky X-LoRA test as xfail by BenjaminBossan in https://github.com/huggingface/peft/pull/2114
* ENH: Warn when from_pretrained misses PEFT keys by BenjaminBossan in https://github.com/huggingface/peft/pull/2118
* FEAT: Adding exclude modules param(2044) by JINO-ROHIT in https://github.com/huggingface/peft/pull/2102
* fix merging bug / update boft conv2d scaling variable by Zeju1997 in https://github.com/huggingface/peft/pull/2127
* FEAT: Support quantization for VeRA using bitsandbytes (2070) by ZiadHelal in https://github.com/huggingface/peft/pull/2076
* Bump version to 0.13.2.dev0 by BenjaminBossan in https://github.com/huggingface/peft/pull/2137
* FEAT: Support torchao by BenjaminBossan in https://github.com/huggingface/peft/pull/2062
* FIX: Transpose weight matrix based on fan_in_fan_out condition in PiSSA initialization (2103) by suyang160 in https://github.com/huggingface/peft/pull/2104
* FIX Type annoations in vera/bnb.py by BenjaminBossan in https://github.com/huggingface/peft/pull/2139
* ENH Make PEFT configs forward compatible by BenjaminBossan in https://github.com/huggingface/peft/pull/2038
* FIX Raise an error when performing mixed adapter inference and passing non-existing adapter names by BenjaminBossan in https://github.com/huggingface/peft/pull/2090
* FIX Prompt learning with latest transformers error by BenjaminBossan in https://github.com/huggingface/peft/pull/2140
* adding peft lora example notebook for ner by JINO-ROHIT in https://github.com/huggingface/peft/pull/2126
* FIX TST: NaN issue with HQQ GPU test by BenjaminBossan in https://github.com/huggingface/peft/pull/2143
* FIX: Bug in target module optimization if child module name is suffix of parent module name by BenjaminBossan in https://github.com/huggingface/peft/pull/2144
* Bump version to 0.13.2.dev0 by BenjaminBossan in https://github.com/huggingface/peft/pull/2145
* FIX Don't assume past_key_valus for encoder models by BenjaminBossan in https://github.com/huggingface/peft/pull/2149
* Use `SFTConfig` instead of `SFTTrainer` keyword args by qgallouedec in https://github.com/huggingface/peft/pull/2150
* FIX: Sft train script FSDP QLoRA embedding mean resizing error by BenjaminBossan in https://github.com/huggingface/peft/pull/2151
* Optimize DoRA in `eval` and `no dropout` by ariG23498 in https://github.com/huggingface/peft/pull/2122
* FIX Missing low_cpu_mem_usage argument by BenjaminBossan in https://github.com/huggingface/peft/pull/2156
* MNT: Remove version pin of diffusers by BenjaminBossan in https://github.com/huggingface/peft/pull/2162
* DOC: Improve docs for layers_pattern argument by BenjaminBossan in https://github.com/huggingface/peft/pull/2157
* Update HRA by DaShenZi721 in https://github.com/huggingface/peft/pull/2160
* fix fsdp_auto_wrap_policy by eljandoubi in https://github.com/huggingface/peft/pull/2167
* MNT Remove Python 3.8 since it's end of life by BenjaminBossan in https://github.com/huggingface/peft/pull/2135
* Improving error message when users pass layers_to_transform and layers_pattern by JINO-ROHIT in https://github.com/huggingface/peft/pull/2169
* FEAT Add hotswapping functionality by BenjaminBossan in https://github.com/huggingface/peft/pull/2120
* Fix to prefix tuning to fit transformers by BenjaminBossan in https://github.com/huggingface/peft/pull/2096
* MNT: Enable Python 3.12 on CI by BenjaminBossan in https://github.com/huggingface/peft/pull/2173
* MNT: Update docker nvidia base image to 12.4.1 by BenjaminBossan in https://github.com/huggingface/peft/pull/2176
* DOC: Extend modules_to_save doc with pooler example by BenjaminBossan in https://github.com/huggingface/peft/pull/2175
* FIX VeRA failure on multiple GPUs by BenjaminBossan in https://github.com/huggingface/peft/pull/2163
* FIX: Import location of HF hub errors by BenjaminBossan in https://github.com/huggingface/peft/pull/2178
* DOC: fix broken link in the README of loftq by dennis2030 in https://github.com/huggingface/peft/pull/2183
* added checks for layers to transforms and layer pattern in lora by JINO-ROHIT in https://github.com/huggingface/peft/pull/2159
* ENH: Warn when loading PiSSA/OLoRA together with other adapters by BenjaminBossan in https://github.com/huggingface/peft/pull/2186
* TST: Skip AQLM test that is incompatible with torch 2.5 by BenjaminBossan in https://github.com/huggingface/peft/pull/2187
* FIX: Prefix tuning with model on multiple devices by BenjaminBossan in https://github.com/huggingface/peft/pull/2189
* FIX: Check for prefix tuning + gradient checkpointing fails by BenjaminBossan in https://github.com/huggingface/peft/pull/2191
* Dora_datacollector_updated by shirinyamani in https://github.com/huggingface/peft/pull/2197
* [BUG] Issue with using `rank_pattern` and `alpha_pattern` together in `LoraConfig` by sirluk in https://github.com/huggingface/peft/pull/2195
* evaluation of peft model using lm-eval-harness toolkit by JINO-ROHIT in https://github.com/huggingface/peft/pull/2190
* Support Bone by JL-er in https://github.com/huggingface/peft/pull/2172
* BUG🐛: Fixed scale related bugs in LoKr | Added rank_dropout_scale parameter by yaswanth19 in https://github.com/huggingface/peft/pull/2180
* update load_dataset for examples/feature_extraction by sinchir0 in https://github.com/huggingface/peft/pull/2207
* [FEAT] New LoRA Initialization Method: Explained Variance Adaptation by sirluk in https://github.com/huggingface/peft/pull/2142
* [FIX] EVA `meta` device check bug + add multi-gpu functionality by sirluk in https://github.com/huggingface/peft/pull/2218
* CPT Tuner by tsachiblau in https://github.com/huggingface/peft/pull/2168
* [FIX] Invalid `None` check for `loftq_config` attribute in `LoraConfig` by sirluk in https://github.com/huggingface/peft/pull/2215
* TST: Move slow compile tests to nightly CI by BenjaminBossan in https://github.com/huggingface/peft/pull/2223
* CI Update AutoAWQ version to fix CI by BenjaminBossan in https://github.com/huggingface/peft/pull/2222
* FIX Correctly set device of input data in bnb test by BenjaminBossan in https://github.com/huggingface/peft/pull/2227
* CI: Skip EETQ tests while broken by BenjaminBossan in https://github.com/huggingface/peft/pull/2226
* Add Validation for Invalid `task_type` in PEFT Configurations by d-kleine in https://github.com/huggingface/peft/pull/2210
* [FEAT] EVA: ensure deterministic behavior of SVD on multi gpu setups by sirluk in https://github.com/huggingface/peft/pull/2225
* TST: Eva: Speed up consistency tests by BenjaminBossan in https://github.com/huggingface/peft/pull/2224
* CI: Fix failing torchao test by BenjaminBossan in https://github.com/huggingface/peft/pull/2232
* TST: Update Llava model id in test by BenjaminBossan in https://github.com/huggingface/peft/pull/2236
* TST: Skip test on multi-GPU as DataParallel fails by BenjaminBossan in https://github.com/huggingface/peft/pull/2234
* Bump version of MacOS runners from 12 to 13 by githubnemo in https://github.com/huggingface/peft/pull/2235
* new version Bone by JL-er in https://github.com/huggingface/peft/pull/2233
* ENH Argument to enable bias for LoRA B by BenjaminBossan in https://github.com/huggingface/peft/pull/2237
* FIX: Small regression in BNB LoRA output by BenjaminBossan in https://github.com/huggingface/peft/pull/2238
* Update CPT documentation by tsachiblau in https://github.com/huggingface/peft/pull/2229
* FIX: Correctly pass low_cpu_mem_usage argument when initializing a PEFT model with task_type by BenjaminBossan in https://github.com/huggingface/peft/pull/2253
* FIX Correctly determine word embeddings on Deberta by BenjaminBossan in https://github.com/huggingface/peft/pull/2257
* FIX: Prevent CUDA context initialization due to AWQ by BenjaminBossan in https://github.com/huggingface/peft/pull/2230
* ENH: Updates for upcoming BNB Int8 release by matthewdouglas in https://github.com/huggingface/peft/pull/2245
* Prepare for PEFT release of v0.14.0 by BenjaminBossan in https://github.com/huggingface/peft/pull/2258

New Contributors
* jsilter made their first contribution in https://github.com/huggingface/peft/pull/2082
* yaswanth19 made their first contribution in https://github.com/huggingface/peft/pull/2084
* Salehbigdeli made their first contribution in https://github.com/huggingface/peft/pull/2110
* JINO-ROHIT made their first contribution in https://github.com/huggingface/peft/pull/2102
* ZiadHelal made their first contribution in https://github.com/huggingface/peft/pull/2076
* suyang160 made their first contribution in https://github.com/huggingface/peft/pull/2104
* qgallouedec made their first contribution in https://github.com/huggingface/peft/pull/2150
* eljandoubi made their first contribution in https://github.com/huggingface/peft/pull/2167
* dennis2030 made their first contribution in https://github.com/huggingface/peft/pull/2183
* sirluk made their first contribution in https://github.com/huggingface/peft/pull/2195
* JL-er made their first contribution in https://github.com/huggingface/peft/pull/2172
* sinchir0 made their first contribution in https://github.com/huggingface/peft/pull/2207
* tsachiblau made their first contribution in https://github.com/huggingface/peft/pull/2168
* d-kleine made their first contribution in https://github.com/huggingface/peft/pull/2210
* githubnemo made their first contribution in https://github.com/huggingface/peft/pull/2235
* matthewdouglas made their first contribution in https://github.com/huggingface/peft/pull/2245

**Full Changelog**: https://github.com/huggingface/peft/compare/v0.13.2...v0.14.0

0.13.2

This patch release contains a small bug fix for an issue that prevented some LoRA checkpoints from being loaded correctly (mostly concerning stable diffusion checkpoints not trained with PEFT when loaded in diffusers, 2144).

**Full Changelog**: https://github.com/huggingface/peft/compare/v0.13.1...v0.13.2

0.13.1

This patch release contains a small bug fix for the `low_cpu_mem_usage=True` option (2113).

**Full Changelog**: https://github.com/huggingface/peft/compare/v0.13.0...v0.13.1

0.13.0

![peft-v0 13 0](https://github.com/user-attachments/assets/0423db36-73ca-4eb4-af12-c21610a1b35c)

Highlights

New methods

LoRA+

kallewoof added [LoRA\+](https://arxiv.org/abs/2402.12354) to PEFT (#1915). This is a function that lets you [initialize an optimizer](https://huggingface.co/docs/peft/main/en/developer_guides/lora#lora-optimized-lora) with settings that are better suited for training a LoRA adapter.
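The core idea of LoRA+ is to train the LoRA B matrices with a larger learning rate than the A matrices. The sketch below shows that grouping on parameter names only; it is not PEFT's helper, `loraplus_param_groups` is a hypothetical function, and a real optimizer would receive tensors rather than names.

```python
def loraplus_param_groups(param_names, lr, lr_ratio):
    """Split parameters into two groups, with LoRA B matrices trained at lr * lr_ratio."""
    group_b = [name for name in param_names if "lora_B" in name]
    group_rest = [name for name in param_names if "lora_B" not in name]
    return [
        {"params": group_rest, "lr": lr},
        {"params": group_b, "lr": lr * lr_ratio},
    ]


param_names = [
    "layers.0.q_proj.lora_A.weight",
    "layers.0.q_proj.lora_B.weight",
    "layers.0.v_proj.lora_A.weight",
    "layers.0.v_proj.lora_B.weight",
]
groups = loraplus_param_groups(param_names, lr=1e-4, lr_ratio=16)
```

The resulting list has the same shape as the per-group options accepted by common optimizers, so the only LoRA-specific decision is which names land in the higher-rate group.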

VB-LoRA

leo-yangli added a new method to PEFT called [VB-LoRA](https://arxiv.org/abs/2405.15179) (#2039). The idea is to have LoRA layers be composed from a single vector bank (hence "VB") that is shared among all layers. This makes VB-LoRA extremely parameter efficient and the checkpoints especially small (comparable to the VeRA method), while still promising good fine-tuning performance. Check the [VB-LoRA docs](https://huggingface.co/docs/peft/main/en/package_reference/vblora) and [example](https://github.com/huggingface/peft/blob/main/examples/sequence_classification/VBLoRA.ipynb).

Enhancements

New Hugging Face team member ariG23498 added the helper function [`rescale_adapter_scale`](https://huggingface.co/docs/peft/main/en/package_reference/helpers#peft.helpers.rescale_adapter_scale) to PEFT (1951). Use this context manager to temporarily increase or decrease the scaling of the LoRA adapter of a model. It also works for PEFT adapters loaded directly into a transformers or diffusers model.
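The pattern behind such a helper is a context manager that changes the scaling on entry and restores it on exit, even if an error occurs inside the block. The following is a minimal sketch of that pattern, not the PEFT function itself: `ToyLoraLayer` and `toy_rescale` are hypothetical names.

```python
from contextlib import contextmanager


class ToyLoraLayer:
    """Stand-in for an adapter layer that exposes a scaling factor."""

    def __init__(self, scaling):
        self.scaling = scaling


@contextmanager
def toy_rescale(layers, multiplier):
    # Remember the original values so they can be restored exactly.
    original = [layer.scaling for layer in layers]
    try:
        for layer in layers:
            layer.scaling *= multiplier
        yield
    finally:
        # Restore even if the body of the `with` block raised.
        for layer, value in zip(layers, original):
            layer.scaling = value


layers = [ToyLoraLayer(2.0), ToyLoraLayer(0.5)]
with toy_rescale(layers, multiplier=0.5):
    inside = [layer.scaling for layer in layers]
after = [layer.scaling for layer in layers]
```

Restoring from saved values, rather than dividing by the multiplier again, avoids accumulating floating-point error across repeated uses.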

ariG23498 also added [DoRA](https://arxiv.org/abs/2402.09353) support for embedding layers (#2006). So if you're using the `use_dora=True` option in the `LoraConfig`, you can now also target embedding layers.

For some time now, PEFT has supported [inference with batches that are using different adapters](https://huggingface.co/docs/peft/v0.12.0/en/developer_guides/lora#inference-with-different-lora-adapters-in-the-same-batch) for different samples, so e.g. samples 1-5 use "adapter1" and samples 6-10 use "adapter2". However, this only worked for LoRA layers so far. saeid93 extended this to also work with layers targeted by `modules_to_save` (1990).
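Conceptually, a mixed batch carries one adapter name per sample, and each layer routes the samples accordingly. The sketch below shows only that bookkeeping step and makes no claim about PEFT's internals; `group_by_adapter` is a hypothetical helper.

```python
from collections import defaultdict


def group_by_adapter(adapter_names):
    """Map each adapter name to the batch indices of the samples that use it."""
    groups = defaultdict(list)
    for index, name in enumerate(adapter_names):
        groups[name].append(index)
    return dict(groups)


# One adapter name per sample in the batch.
adapter_names = ["adapter1"] * 3 + ["adapter2"] * 2
groups = group_by_adapter(adapter_names)
```

A layer can then apply each adapter to its own slice of the batch and write the results back in the original sample order.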

When loading a PEFT adapter, you now have the option to pass `low_cpu_mem_usage=True` (1961). This will initialize the adapter with empty weights ("meta" device) before loading the weights instead of initializing on CPU or GPU. This can speed up loading PEFT adapters. So use this option especially if you have a lot of adapters to load at the same time or if these adapters are very big. Please let us know if you encounter issues with this option, as we may make this the default in the future.

Changes

Safe loading of PyTorch weights

Unless indicated otherwise, PEFT adapters are saved and loaded using the secure `safetensors` format. However, we also support the [PyTorch format](https://pytorch.org/docs/stable/generated/torch.load.html) for checkpoints, which relies on Python's inherently insecure pickle protocol. In the future, PyTorch will be stricter when loading these files, improving security by making the option `weights_only=True` the default. This setting is generally recommended and should not cause any trouble with PEFT checkpoints, which is why PEFT enables it by default as of this release. Please open an issue if this causes trouble.
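
Why a restricted load is safer can be shown with the standard library alone. The sketch below is not PyTorch's implementation. It follows the "restricting globals" pattern from the Python `pickle` documentation: an unpickler that refuses to resolve any global not on an allowlist still loads plain data, but rejects a payload that would call a function on load. `AllowlistUnpickler`, `restricted_loads`, and `CallsFunctionOnLoad` are hypothetical names for this illustration.

```python
import io
import os
import pickle


class AllowlistUnpickler(pickle.Unpickler):
    """Unpickler that refuses to resolve any global not on an explicit allowlist."""

    ALLOWED = set()  # plain containers and numbers need no globals at all

    def find_class(self, module, name):
        if (module, name) in self.ALLOWED:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(f"blocked global: {module}.{name}")


def restricted_loads(data):
    """Like pickle.loads, but only allowlisted globals may be resolved."""
    return AllowlistUnpickler(io.BytesIO(data)).load()


class CallsFunctionOnLoad:
    """Hypothetical payload: a plain pickle.loads would call os.getcwd()."""

    def __reduce__(self):
        return (os.getcwd, ())


plain = pickle.dumps({"weight": [0.1, 0.2]})
payload = pickle.dumps(CallsFunctionOnLoad())

loaded = restricted_loads(plain)  # plain data loads normally
try:
    restricted_loads(payload)
    blocked = False
except pickle.UnpicklingError:
    blocked = True  # the function lookup was refused before anything ran
```

The payload here only calls the harmless `os.getcwd`, but an unrestricted unpickler would resolve and call any importable function in the same way, which is the risk a restricted load is meant to remove.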

What's Changed
* Bump version to 0.12.1.dev0 by BenjaminBossan in https://github.com/huggingface/peft/pull/1950
* CI Fix Windows permission error on merge test by BenjaminBossan in https://github.com/huggingface/peft/pull/1952
* Check if past_key_values is provided when using prefix_tuning in peft_model by Nidhogg-lyz in https://github.com/huggingface/peft/pull/1942
* Add lora+ implementation by kallewoof in https://github.com/huggingface/peft/pull/1915
* FIX: New bloom changes breaking prompt learning by BenjaminBossan in https://github.com/huggingface/peft/pull/1969
* ENH Update VeRA preconfigured models by BenjaminBossan in https://github.com/huggingface/peft/pull/1941
* fix: lora+: include lr in optimizer kwargs by kallewoof in https://github.com/huggingface/peft/pull/1973
* FIX active_adapters for transformers models by BenjaminBossan in https://github.com/huggingface/peft/pull/1975
* FIX Loading adapter honors offline mode by BenjaminBossan in https://github.com/huggingface/peft/pull/1976
* chore: Update CI configuration for workflows by XciD in https://github.com/huggingface/peft/pull/1985
* Cast to fp32 if using bf16 weights on cpu during `merge_and_unload` by snarayan21 in https://github.com/huggingface/peft/pull/1978
* AdaLora: Trigger warning when user uses 'r' inplace of 'init_r' by bhargavyagnik in https://github.com/huggingface/peft/pull/1981
* [Add] scaling LoRA adapter weights with a context manager by ariG23498 in https://github.com/huggingface/peft/pull/1951
* DOC Small fixes for HQQ and section title by BenjaminBossan in https://github.com/huggingface/peft/pull/1986
* Add docs and examples for X-LoRA by EricLBuehler in https://github.com/huggingface/peft/pull/1970
* fix: fix docker build gpus by XciD in https://github.com/huggingface/peft/pull/1987
* FIX: Adjust transformers version check for bloom by BenjaminBossan in https://github.com/huggingface/peft/pull/1992
* [Hotfix] Fix BOFT mixed precision by Edenzzzz in https://github.com/huggingface/peft/pull/1925
* [Suggestions] Updates suggested for `helper.rescale_adapter_scale` by ariG23498 in https://github.com/huggingface/peft/pull/1989
* MAINT: Default to loading weights only for torch.load by BenjaminBossan in https://github.com/huggingface/peft/pull/1993
* BOFT bug fix when saving by Zeju1997 in https://github.com/huggingface/peft/pull/1994
* FIX Import error in BOFT half precision test by BenjaminBossan in https://github.com/huggingface/peft/pull/1995
* Update lora.md (typos) by nir-sh-automat-it in https://github.com/huggingface/peft/pull/2003
* TST Add LNTuningConfig and LoKrConfig to tests by BenjaminBossan in https://github.com/huggingface/peft/pull/2005
* ENH: Warn when a user provided model name in the config renamed by BenjaminBossan in https://github.com/huggingface/peft/pull/2004
* FIX CI Correctly report outcome of bnb import test by BenjaminBossan in https://github.com/huggingface/peft/pull/2007
* Update docs for X-LoRA and some bugfixes by EricLBuehler in https://github.com/huggingface/peft/pull/2002
* TST: Potentially Skip 8bit bnb regression test if compute capability is too low by BenjaminBossan in https://github.com/huggingface/peft/pull/1998
* CI Activate single core multi backend bnb tests by BenjaminBossan in https://github.com/huggingface/peft/pull/2008
* Fix usage of deprecated parameters/functions in X-LoRA by EricLBuehler in https://github.com/huggingface/peft/pull/2010
* [tests] enable `test_vera_dtypes` on XPU by faaany in https://github.com/huggingface/peft/pull/2017
* CI Remove regression tests from BNB CI by BenjaminBossan in https://github.com/huggingface/peft/pull/2024
* [tests] enable regression tests on XPU by faaany in https://github.com/huggingface/peft/pull/2019
* ENH: Better error msg for replace_lora_weights_loftq when using a local model. by BenjaminBossan in https://github.com/huggingface/peft/pull/2022
* [tests] make cuda-only cases in `TestModelAndLayerStatus` device-agnostic by faaany in https://github.com/huggingface/peft/pull/2026
* [tests] enable `test_mixed_adapter_batches_lora_opt_timing` on XPU by faaany in https://github.com/huggingface/peft/pull/2021
* MAINT: Update ruff version to ~0.6.1 by BenjaminBossan in https://github.com/huggingface/peft/pull/1965
* ENH Raise error when applying modules_to_save on tuner layer by BenjaminBossan in https://github.com/huggingface/peft/pull/2028
* FIX: Don't target the classification head when using target_modules="all-linear" by BenjaminBossan in https://github.com/huggingface/peft/pull/2033
* [tests] enable cuda-only tests in `test_common_gpu.py` to work on XPU by faaany in https://github.com/huggingface/peft/pull/2031
* [Add] DoRA Embedding by ariG23498 in https://github.com/huggingface/peft/pull/2006
* [tests] enable `test_gpu_examples.py` on XPU by faaany in https://github.com/huggingface/peft/pull/2036
* Bug: set correct pre-commit-hooks version by ltoniazzi in https://github.com/huggingface/peft/pull/2034
* Warn if using tied target module with `tie_word_embeddings` by ltoniazzi in https://github.com/huggingface/peft/pull/2025
* ENH: Faster adapter loading if there are a lot of target modules by BenjaminBossan in https://github.com/huggingface/peft/pull/2045
* FIX: Error with OLoRA init when using bnb by BenjaminBossan in https://github.com/huggingface/peft/pull/2011
* FIX: Small numerical discrepancy for p-tuning after loading the model by BenjaminBossan in https://github.com/huggingface/peft/pull/2047
* Add VB-LoRA by leo-yangli in https://github.com/huggingface/peft/pull/2039
* Fixing scalings logging test by EricLBuehler in https://github.com/huggingface/peft/pull/2042
* TST: Fewer inference steps for stable diffusion tests by BenjaminBossan in https://github.com/huggingface/peft/pull/2051
* TST Speed up vision model tests by BenjaminBossan in https://github.com/huggingface/peft/pull/2058
* TST: Make X-LoRA tests faster by BenjaminBossan in https://github.com/huggingface/peft/pull/2059
* Update permissions for githubtoken stale.yml by glegendre01 in https://github.com/huggingface/peft/pull/2061
* MAINT: Give stale bot permissions for PRs too by BenjaminBossan in https://github.com/huggingface/peft/pull/2064
* avoid saving boft_P in adapter model by sywangyi in https://github.com/huggingface/peft/pull/2050
* fix arguments for PiSSA preprocess by keakon in https://github.com/huggingface/peft/pull/2053
* Apply deprecated `evaluation_strategy` by muellerzr in https://github.com/huggingface/peft/pull/1664
* fixing multiple LoRA in the same batch or vit by saeid93 in https://github.com/huggingface/peft/pull/1990
* FIX: Bug that prevents BOFT from loading multiple adapters by BenjaminBossan in https://github.com/huggingface/peft/pull/2068
* [tests] skip some tests for XPU devices by faaany in https://github.com/huggingface/peft/pull/2074
* ENH: PiSSA/OLoRA: Preserve original config on save by BenjaminBossan in https://github.com/huggingface/peft/pull/2077
* Expose bias to to ModulesToSaveWrapper by dengdifan in https://github.com/huggingface/peft/pull/2081
* Update setup.py to update contact info by sayakpaul in https://github.com/huggingface/peft/pull/2086
* ENH: Allow empty initialization of adapter weight by BenjaminBossan in https://github.com/huggingface/peft/pull/1961
* ENH: Add default target layers for gemma2 architecture by BenjaminBossan in https://github.com/huggingface/peft/pull/2078
* FIX: Bug in find_minimal_target_modules by BenjaminBossan in https://github.com/huggingface/peft/pull/2083
* Fix func docstring by kwonmha in https://github.com/huggingface/peft/pull/2087
* ENH: Better DoRA check in mixed adapter batch inference by BenjaminBossan in https://github.com/huggingface/peft/pull/2089

New Contributors
* Nidhogg-lyz made their first contribution in https://github.com/huggingface/peft/pull/1942
* XciD made their first contribution in https://github.com/huggingface/peft/pull/1985
* bhargavyagnik made their first contribution in https://github.com/huggingface/peft/pull/1981
* ariG23498 made their first contribution in https://github.com/huggingface/peft/pull/1951
* Edenzzzz made their first contribution in https://github.com/huggingface/peft/pull/1925
* Zeju1997 made their first contribution in https://github.com/huggingface/peft/pull/1994
* nir-sh-automat-it made their first contribution in https://github.com/huggingface/peft/pull/2003
* faaany made their first contribution in https://github.com/huggingface/peft/pull/2017
* ltoniazzi made their first contribution in https://github.com/huggingface/peft/pull/2034
* leo-yangli made their first contribution in https://github.com/huggingface/peft/pull/2039
* glegendre01 made their first contribution in https://github.com/huggingface/peft/pull/2061
* keakon made their first contribution in https://github.com/huggingface/peft/pull/2053
* muellerzr made their first contribution in https://github.com/huggingface/peft/pull/1664
* saeid93 made their first contribution in https://github.com/huggingface/peft/pull/1990
* dengdifan made their first contribution in https://github.com/huggingface/peft/pull/2081
* kwonmha made their first contribution in https://github.com/huggingface/peft/pull/2087

**Full Changelog**: https://github.com/huggingface/peft/compare/v0.12.0...v0.13.0
