PEFT

Latest version: v0.11.0

0.11.0

Highlights

![peft-v0 11 0](https://github.com/huggingface/peft/assets/6229650/ca652d10-c389-4163-ab62-1e0c821c9c5a)

New methods

BOFT

Thanks to yfeng95, Zeju1997, and YuliangXiu, PEFT was extended with BOFT: Parameter-Efficient Orthogonal Finetuning via Butterfly Factorization (1326, [BOFT paper link](https://huggingface.co/papers/2311.06243)). In PEFT v0.7.0, we already added [OFT](https://huggingface.co/papers/2306.07280), but BOFT is even more parameter efficient. Check out the included [BOFT controlnet](https://github.com/huggingface/peft/tree/main/examples/boft_controlnet) and [BOFT dreambooth](https://github.com/huggingface/peft/tree/main/examples/boft_dreambooth) examples.
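
As a rough sketch of how BOFT can be applied, here is a minimal example; the model name and the `BOFTConfig` values (`boft_block_size`, `boft_n_butterfly_factor`, the target modules) are illustrative assumptions, not recommendations:

```python
from transformers import AutoModelForCausalLM
from peft import BOFTConfig, get_peft_model

# Any causal LM works here; "facebook/opt-350m" is just a small placeholder.
base_model = AutoModelForCausalLM.from_pretrained("facebook/opt-350m")

# Butterfly-factorized orthogonal blocks on the attention projections.
config = BOFTConfig(
    boft_block_size=4,
    boft_n_butterfly_factor=2,
    target_modules=["q_proj", "v_proj"],
    boft_dropout=0.1,
)

model = get_peft_model(base_model, config)
model.print_trainable_parameters()
```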


VeRA

If the parameter reduction of LoRA is not enough for your use case, you should take a close look at VeRA: Vector-based Random Matrix Adaptation (1564, [VeRA paper link](https://huggingface.co/papers/2310.11454)). This method resembles LoRA but adds two learnable scaling vectors to the two LoRA weight matrices. Crucially, the LoRA weight matrices themselves are shared across all layers, considerably reducing the number of trainable parameters.

The bulk of this PR was implemented by contributor vvvm23 with the help of dkopi.
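
A minimal sketch of enabling VeRA; the model name, `r` value, and target modules are placeholders, and VeRA expects the targeted layers to share the same shape:

```python
from transformers import AutoModelForCausalLM
from peft import VeraConfig, get_peft_model

base_model = AutoModelForCausalLM.from_pretrained("facebook/opt-350m")

# VeRA shares frozen, randomly initialized low-rank matrices across layers and
# only trains small per-layer scaling vectors, so a larger r stays cheap.
config = VeraConfig(r=256, target_modules=["q_proj", "v_proj"])

model = get_peft_model(base_model, config)
model.print_trainable_parameters()
```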

PiSSA

PiSSA, Principal Singular values and Singular vectors Adaptation, is a new initialization method for LoRA, which was added by fxmeng (1626, [PiSSA paper link](https://huggingface.co/papers/2404.02948)). The improved initialization promises to speed up convergence and improve the final performance of LoRA models. When using models quantized with bitsandbytes, PiSSA initialization should reduce the quantization error, similar to LoftQ.
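
PiSSA is selected via the LoRA initialization option; a minimal sketch, with the model name and target modules as placeholders:

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base_model = AutoModelForCausalLM.from_pretrained("facebook/opt-350m")

# init_lora_weights="pissa" initializes the LoRA matrices from the principal
# singular values and vectors of the base weights instead of the default init.
config = LoraConfig(init_lora_weights="pissa", target_modules=["q_proj", "v_proj"])

model = get_peft_model(base_model, config)
```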

Quantization

HQQ

Thanks to fahadh4ilyas, PEFT LoRA linear layers now support Half-Quadratic Quantization, HQQ (1618, [HQQ repo](https://github.com/mobiusml/hqq/)). HQQ is fast and efficient (down to 2 bits), while not requiring calibration data.

EETQ

Another new quantization method supported in PEFT is Easy & Efficient Quantization for Transformers, EETQ (1675, [EETQ repo](https://github.com/NetEase-FuXi/EETQ)). This 8-bit quantization method works for LoRA linear layers and should be faster than bitsandbytes.
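
As a rough sketch of combining an EETQ-quantized model with LoRA; this assumes a transformers version that exposes `EetqConfig`, the `eetq` package being installed, and a placeholder model name:

```python
from transformers import AutoModelForCausalLM, EetqConfig
from peft import LoraConfig, get_peft_model

# Quantize the base model to 8-bit with EETQ at load time.
quant_config = EetqConfig("int8")
base_model = AutoModelForCausalLM.from_pretrained(
    "facebook/opt-350m", quantization_config=quant_config
)

# Attach regular LoRA adapters on top of the EETQ-quantized linear layers.
config = LoraConfig(target_modules=["q_proj", "v_proj"])
model = get_peft_model(base_model, config)
model.print_trainable_parameters()
```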

Show adapter layer and model status

We added a feature to show the adapter layer and model status of PEFT models in #1663. With the newly added methods, you can easily check which adapters exist on your model, whether they are enabled, which ones are active or merged, and whether gradients are active. You will also be informed if irregularities have been detected.

To use this new feature, call `model.get_layer_status()` for layer-level information, and `model.get_model_status()` for model-level information. For more details, check out our [docs on layer and model status](https://huggingface.co/docs/peft/main/en/developer_guides/troubleshooting#check-layer-and-model-status).
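
For illustration, a minimal sketch of calling the new methods on a freshly created LoRA model (the model name and target modules are placeholders):

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base_model = AutoModelForCausalLM.from_pretrained("facebook/opt-350m")
model = get_peft_model(base_model, LoraConfig(target_modules=["q_proj", "v_proj"]))

# Per-layer view: which adapters exist on each layer, whether they require
# gradients, and whether they are enabled, active, or merged.
for layer_status in model.get_layer_status():
    print(layer_status)

# Aggregated, model-level view of the same information.
print(model.get_model_status())
```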

Changes

Edge case of how we deal with `modules_to_save`

Previously, when using classes such as `PeftModelForSequenceClassification`, we implicitly added the classifier layers to `model.modules_to_save`. However, this only added a new `ModulesToSaveWrapper` instance for the first adapter being initialized; when initializing a second adapter via `model.add_adapter`, this information was ignored. Now, `peft_config.modules_to_save` is updated explicitly to add the classifier layers (1615). This is a departure from how this worked previously, but it better reflects the intended behavior.

Furthermore, when merging together multiple LoRA adapters using `model.add_weighted_adapter`, if these adapters had `modules_to_save`, the original parameters of these modules would be used. This is unexpected and will most likely result in bad outputs. As there is no clear way to merge these modules, we decided to raise an error in this case (1615).
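
To make the new behavior concrete, here is a small sketch; the base model, target modules, and adapter name are placeholders:

```python
from transformers import AutoModelForSequenceClassification
from peft import LoraConfig, get_peft_model

base_model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased", num_labels=2
)

# With task_type="SEQ_CLS", the classification head is treated as a module to save.
config = LoraConfig(task_type="SEQ_CLS", target_modules=["q_lin", "v_lin"])
model = get_peft_model(base_model, config)

# Adding a second adapter now explicitly records the head in that adapter's
# peft_config.modules_to_save instead of silently ignoring it.
model.add_adapter("second", LoraConfig(task_type="SEQ_CLS", target_modules=["q_lin", "v_lin"]))
print(model.peft_config["second"].modules_to_save)
```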

What's Changed
* Bump version to 0.10.1.dev0 by BenjaminBossan in https://github.com/huggingface/peft/pull/1578
* FIX Minor issues in docs, re-raising exception by BenjaminBossan in https://github.com/huggingface/peft/pull/1581
* FIX / Docs: Fix doc link for layer replication by younesbelkada in https://github.com/huggingface/peft/pull/1582
* DOC: Short section on using transformers pipeline by BenjaminBossan in https://github.com/huggingface/peft/pull/1587
* Extend PeftModel.from_pretrained() to models with disk-offloaded modules by blbadger in https://github.com/huggingface/peft/pull/1431
* [feat] Add `lru_cache` to `import_utils` calls that did not previously have it by tisles in https://github.com/huggingface/peft/pull/1584
* fix deepspeed zero3+prompt tuning bug. word_embeddings.weight shape i… by sywangyi in https://github.com/huggingface/peft/pull/1591
* MNT: Update GH bug report template by BenjaminBossan in https://github.com/huggingface/peft/pull/1600
* fix the torch_dtype and quant_storage_dtype by pacman100 in https://github.com/huggingface/peft/pull/1614
* FIX In the image classification example, Change the model to the LoRA… by changhwa in https://github.com/huggingface/peft/pull/1624
* Remove duplicated import by nzw0301 in https://github.com/huggingface/peft/pull/1622
* FIX: bnb config wrong argument names by BenjaminBossan in https://github.com/huggingface/peft/pull/1603
* FIX Make DoRA work with Conv1D layers by BenjaminBossan in https://github.com/huggingface/peft/pull/1588
* FIX: Send results to correct channel by younesbelkada in https://github.com/huggingface/peft/pull/1628
* FEAT: Allow ignoring mismatched sizes when loading by BenjaminBossan in https://github.com/huggingface/peft/pull/1620
* itemsize is torch>=2.1, use element_size() by winglian in https://github.com/huggingface/peft/pull/1630
* FIX Multiple adapters and modules_to_save by BenjaminBossan in https://github.com/huggingface/peft/pull/1615
* FIX Correctly call element_size by BenjaminBossan in https://github.com/huggingface/peft/pull/1635
* fix: allow load_adapter to use different device by yhZhai in https://github.com/huggingface/peft/pull/1631
* Adalora deepspeed by sywangyi in https://github.com/huggingface/peft/pull/1625
* Adding BOFT: Parameter-Efficient Orthogonal Finetuning via Butterfly Factorization by yfeng95 in https://github.com/huggingface/peft/pull/1326
* Don't use deprecated `Repository` anymore by Wauplin in https://github.com/huggingface/peft/pull/1641
* FIX Errors in the transformers integration docs by BenjaminBossan in https://github.com/huggingface/peft/pull/1629
* update figure assets of BOFT by YuliangXiu in https://github.com/huggingface/peft/pull/1642
* print_trainable_parameters - format `%` to be sensible by stas00 in https://github.com/huggingface/peft/pull/1648
* FIX: Bug with handling of active adapters by BenjaminBossan in https://github.com/huggingface/peft/pull/1659
* Remove `dreambooth` Git link by charliermarsh in https://github.com/huggingface/peft/pull/1660
* add safetensor load in multitask_prompt_tuning by sywangyi in https://github.com/huggingface/peft/pull/1662
* Adds Vera (Vector Based Random Matrix Adaption) 2 by BenjaminBossan in https://github.com/huggingface/peft/pull/1564
* Update deepspeed.md by sanghyuk-choi in https://github.com/huggingface/peft/pull/1679
* ENH: Add multi-backend tests for bnb by younesbelkada in https://github.com/huggingface/peft/pull/1667
* FIX / Workflow: Fix Mac-OS CI issues by younesbelkada in https://github.com/huggingface/peft/pull/1680
* FIX Use trl version of tiny random llama by BenjaminBossan in https://github.com/huggingface/peft/pull/1681
* FIX: Don't eagerly import bnb for LoftQ by BenjaminBossan in https://github.com/huggingface/peft/pull/1683
* FEAT: Add EETQ support in PEFT by younesbelkada in https://github.com/huggingface/peft/pull/1675
* FIX / Workflow: Always notify on slack for docker image workflows by younesbelkada in https://github.com/huggingface/peft/pull/1682
* FIX: upgrade autoawq to latest version by younesbelkada in https://github.com/huggingface/peft/pull/1684
* FIX: Initialize DoRA weights in float32 if float16 is being used by BenjaminBossan in https://github.com/huggingface/peft/pull/1653
* fix bf16 model type issue for ia3 by sywangyi in https://github.com/huggingface/peft/pull/1634
* FIX Issues with AdaLora initialization by BenjaminBossan in https://github.com/huggingface/peft/pull/1652
* FEAT Show adapter layer and model status by BenjaminBossan in https://github.com/huggingface/peft/pull/1663
* Fixing the example by providing correct tokenized seq length by jpodivin in https://github.com/huggingface/peft/pull/1686
* TST: Skiping AWQ tests for now .. by younesbelkada in https://github.com/huggingface/peft/pull/1690
* Add LayerNorm tuning model by DTennant in https://github.com/huggingface/peft/pull/1301
* FIX Use different doc builder docker image by BenjaminBossan in https://github.com/huggingface/peft/pull/1697
* Set experimental dynamo config for compile tests by BenjaminBossan in https://github.com/huggingface/peft/pull/1698
* fix the fsdp peft autowrap policy by pacman100 in https://github.com/huggingface/peft/pull/1694
* Add LoRA support to HQQ Quantization by fahadh4ilyas in https://github.com/huggingface/peft/pull/1618
* FEAT Helper to check if a model is a PEFT model by BenjaminBossan in https://github.com/huggingface/peft/pull/1713
* support Cambricon MLUs device by huismiling in https://github.com/huggingface/peft/pull/1687
* Some small cleanups in docstrings, copyright note by BenjaminBossan in https://github.com/huggingface/peft/pull/1714
* Fix docs typo by NielsRogge in https://github.com/huggingface/peft/pull/1719
* revise run_peft_multigpu.sh by abzb1 in https://github.com/huggingface/peft/pull/1722
* Workflow: Add slack messages workflow by younesbelkada in https://github.com/huggingface/peft/pull/1723
* DOC Document the PEFT checkpoint format by BenjaminBossan in https://github.com/huggingface/peft/pull/1717
* FIX Allow DoRA init on CPU when using BNB by BenjaminBossan in https://github.com/huggingface/peft/pull/1724
* Adding PiSSA as an optional initialization method of LoRA by fxmeng in https://github.com/huggingface/peft/pull/1626

New Contributors
* tisles made their first contribution in https://github.com/huggingface/peft/pull/1584
* changhwa made their first contribution in https://github.com/huggingface/peft/pull/1624
* yhZhai made their first contribution in https://github.com/huggingface/peft/pull/1631
* yfeng95 made their first contribution in https://github.com/huggingface/peft/pull/1326
* YuliangXiu made their first contribution in https://github.com/huggingface/peft/pull/1642
* charliermarsh made their first contribution in https://github.com/huggingface/peft/pull/1660
* sanghyuk-choi made their first contribution in https://github.com/huggingface/peft/pull/1679
* jpodivin made their first contribution in https://github.com/huggingface/peft/pull/1686
* DTennant made their first contribution in https://github.com/huggingface/peft/pull/1301
* fahadh4ilyas made their first contribution in https://github.com/huggingface/peft/pull/1618
* huismiling made their first contribution in https://github.com/huggingface/peft/pull/1687
* NielsRogge made their first contribution in https://github.com/huggingface/peft/pull/1719
* abzb1 made their first contribution in https://github.com/huggingface/peft/pull/1722
* fxmeng made their first contribution in https://github.com/huggingface/peft/pull/1626

**Full Changelog**: https://github.com/huggingface/peft/compare/v0.10.0...v0.11.0

0.10.0

Highlights

![image](https://github.com/huggingface/peft/assets/49240599/8274f36f-246f-4509-a6e4-804aba574566)

Support for QLoRA with DeepSpeed ZeRO3 and FSDP

We added a couple of changes to allow QLoRA to work with DeepSpeed ZeRO3 and Fully Sharded Data Parallel (FSDP). For instance, this allows you to fine-tune a 70B Llama model on two GPUs with 24GB memory each. Besides the latest version of PEFT, this requires `bitsandbytes>=0.43.0`, `accelerate>=0.28.0`, `transformers>4.38.2`, `trl>0.7.11`. Check out our docs on [DeepSpeed](https://huggingface.co/docs/peft/v0.10.0/en/accelerate/deepspeed) and [FSDP](https://huggingface.co/docs/peft/v0.10.0/en/accelerate/fsdp) with PEFT, as well as this [blogpost](https://www.answer.ai/posts/2024-03-06-fsdp-qlora.html) from answer.ai, for more details.
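
On the quantization side, the key ingredient is storing the 4-bit weights in the same dtype that is used for computation, so that FSDP or DeepSpeed ZeRO3 can shard them. A minimal sketch of such a bitsandbytes config follows; the model name is a placeholder and the accelerate/launcher configuration is omitted:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

# bnb_4bit_quant_storage must match the dtype used elsewhere so that the
# quantized weights can be sharded (requires bitsandbytes>=0.43.0).
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_quant_storage=torch.bfloat16,
)

base_model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-70b-hf",  # placeholder; any causal LM works
    quantization_config=bnb_config,
    torch_dtype=torch.bfloat16,
)
model = get_peft_model(base_model, LoraConfig(target_modules=["q_proj", "v_proj"]))
```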

Layer replication

First-time contributor siddartha-RE added support for layer replication with LoRA. This allows you to duplicate layers of a model and apply LoRA adapters to them. Since the base weights are shared, this costs very little extra memory, but it can lead to a nice improvement in model performance. Find out more in [our docs](https://huggingface.co/docs/peft/v0.10.0/en/developer_guides/lora#memory-efficient-layer-replication-with-lora).
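
A minimal sketch, assuming the `layer_replication` argument of `LoraConfig` takes a list of `(start, end)` layer ranges that are concatenated to build the expanded model; the model name and ranges are placeholders:

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base_model = AutoModelForCausalLM.from_pretrained("facebook/opt-350m")  # 24 layers

# Build a deeper model from overlapping slices of the original layer stack;
# the base weights are shared, only the per-replica LoRA weights are new.
config = LoraConfig(
    target_modules=["q_proj", "v_proj"],
    layer_replication=[(0, 16), (8, 24)],  # 16 + 16 = 32 layers after replication
)

model = get_peft_model(base_model, config)
model.print_trainable_parameters()
```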

Improving DoRA

Last release, we added the option to enable [DoRA](https://arxiv.org/abs/2402.09353) in PEFT by simply adding `use_dora=True` to your `LoraConfig`. However, this only worked for non-quantized linear layers. With this PEFT release, we now also support `Conv2d` layers, as well as linear layers quantized with bitsandbytes.
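
A minimal sketch of DoRA on a bitsandbytes-quantized base model, one of the newly supported cases; the model name is a placeholder, and a GPU plus bitsandbytes are assumed to be available:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

# DoRA now also works on linear layers quantized with bitsandbytes.
bnb_config = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_compute_dtype=torch.float16)
base_model = AutoModelForCausalLM.from_pretrained(
    "facebook/opt-350m", quantization_config=bnb_config
)

config = LoraConfig(use_dora=True, target_modules=["q_proj", "v_proj"])
model = get_peft_model(base_model, config)
model.print_trainable_parameters()
```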

Mixed LoRA adapter batches

If you have a PEFT model with multiple LoRA adapters attached to it, it's now possible to apply different adapters (or, in fact, no adapter) on different samples in the same batch. To do this, pass a list of adapter names as an additional argument. For example, if you have a batch of three samples:

```python
output = model(**inputs, adapter_names=["adapter1", "adapter2", "__base__"])
```


Here, `"adapter1"` and `"adapter2"` should be the same name as your corresponding LoRA adapters and `"__base__"` is a special name that refers to the base model without any adapter. Find more details in [our docs](https://huggingface.co/docs/peft/v0.10.0/en/developer_guides/lora#inference-with-different-lora-adapters-in-the-same-batch).

Without this feature, if you wanted to run inference with different LoRA adapters, you'd have to use single samples or try to group batches with the same adapter, then switch between adapters using `set_adapter` -- this is inefficient and inconvenient. Therefore, it is recommended to use this new, faster method from now on when encountering this scenario.

New LoftQ initialization function

We added an alternative way to initialize LoRA weights for a quantized model using the LoftQ method, which can be more convenient than the existing method. Right now, using LoftQ requires you to go through multiple steps as shown [here](https://github.com/huggingface/peft/blob/8e979fc73248ccb4c5b5a99c415f3e14a37daae6/examples/loftq_finetuning/README.md). Furthermore, it's necessary to keep a separate copy of the quantized weights, as those are not identical to the quantized weights from the default model.

Using the new `replace_lora_weights_loftq` function, it's now possible to apply LoftQ initialization in a single step and without the need for extra copies of the weights. Check out [the docs](https://huggingface.co/docs/peft/v0.10.0/en/developer_guides/lora#a-more-convienient-way) and this [example notebook](https://github.com/huggingface/peft/blob/main/examples/loftq_finetuning/LoftQ_weight_replacement.ipynb) to see how it works. Right now, this method only supports 4bit quantization with bitsandbytes, and the model has to be stored in the safetensors format.
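
A minimal sketch of the new function, assuming a base model stored as safetensors that is loaded in 4-bit with bitsandbytes; the model name and target modules are placeholders:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, replace_lora_weights_loftq

model_id = "facebook/opt-350m"  # placeholder; must be stored in safetensors format

bnb_config = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_compute_dtype=torch.float16)
base_model = AutoModelForCausalLM.from_pretrained(model_id, quantization_config=bnb_config)
model = get_peft_model(base_model, LoraConfig(target_modules=["q_proj", "v_proj"]))

# Replace the freshly initialized LoRA weights in place so that they compensate
# for the quantization error of the 4-bit base weights.
replace_lora_weights_loftq(model)
```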

Deprecations

The function `prepare_model_for_int8_training` had been deprecated for quite some time and has now been removed completely. Use `prepare_model_for_kbit_training` instead.

What's Changed

Besides these highlights, we added many small improvements and fixed a couple of bugs. All these changes are listed below. As always, we thank all the awesome contributors who helped us improve PEFT.

* Bump version to 0.9.1.dev0 by BenjaminBossan in https://github.com/huggingface/peft/pull/1517
* Fix for "leaf Variable that requires grad" Error in In-Place Operation by DopeorNope-Lee in https://github.com/huggingface/peft/pull/1372
* FIX [`CI` / `Docker`] Follow up from 1481 by younesbelkada in https://github.com/huggingface/peft/pull/1487
* CI: temporary disable workflow by younesbelkada in https://github.com/huggingface/peft/pull/1534
* FIX [`Docs`/ `bnb` / `DeepSpeed`] Add clarification on bnb + PEFT + DS compatibilities by younesbelkada in https://github.com/huggingface/peft/pull/1529
* Expose bias attribute on tuner layers by BenjaminBossan in https://github.com/huggingface/peft/pull/1530
* docs: highlight difference between `num_parameters()` and `get_nb_trainable_parameters()` in PEFT by kmehant in https://github.com/huggingface/peft/pull/1531
* fix: fail when required args not passed when `prompt_tuning_init==TEXT` by kmehant in https://github.com/huggingface/peft/pull/1519
* Fixed minor grammatical and code bugs by gremlin97 in https://github.com/huggingface/peft/pull/1542
* Optimize `levenshtein_distance` algorithm in `peft_lora_seq2seq_accelera…` by SUNGOD3 in https://github.com/huggingface/peft/pull/1527
* Update `prompt_based_methods.md` by insist93 in https://github.com/huggingface/peft/pull/1548
* FIX Allow AdaLoRA rank to be 0 by BenjaminBossan in https://github.com/huggingface/peft/pull/1540
* FIX: Make adaptation prompt CI happy for transformers 4.39.0 by younesbelkada in https://github.com/huggingface/peft/pull/1551
* MNT: Use `BitsAndBytesConfig` as `load_in_*` is deprecated by BenjaminBossan in https://github.com/huggingface/peft/pull/1552
* Add Support for Mistral Model in Llama-Adapter Method by PrakharSaxena24 in https://github.com/huggingface/peft/pull/1433
* Add support for layer replication in LoRA by siddartha-RE in https://github.com/huggingface/peft/pull/1368
* QDoRA: Support DoRA with BnB quantization by BenjaminBossan in https://github.com/huggingface/peft/pull/1518
* Feat: add support for Conv2D DoRA by sayakpaul in https://github.com/huggingface/peft/pull/1516
* TST Report slowest tests by BenjaminBossan in https://github.com/huggingface/peft/pull/1556
* Changes to support fsdp+qlora and dsz3+qlora by pacman100 in https://github.com/huggingface/peft/pull/1550
* Update style with ruff 0.2.2 by BenjaminBossan in https://github.com/huggingface/peft/pull/1565
* FEAT Mixing different LoRA adapters in same batch by BenjaminBossan in https://github.com/huggingface/peft/pull/1558
* FIX [`CI`] Fix test docker CI by younesbelkada in https://github.com/huggingface/peft/pull/1535
* Fix LoftQ docs and tests by BenjaminBossan in https://github.com/huggingface/peft/pull/1532
* More convenient way to initialize LoftQ by BenjaminBossan in https://github.com/huggingface/peft/pull/1543

New Contributors
* DopeorNope-Lee made their first contribution in https://github.com/huggingface/peft/pull/1372
* kmehant made their first contribution in https://github.com/huggingface/peft/pull/1531
* gremlin97 made their first contribution in https://github.com/huggingface/peft/pull/1542
* SUNGOD3 made their first contribution in https://github.com/huggingface/peft/pull/1527
* insist93 made their first contribution in https://github.com/huggingface/peft/pull/1548
* PrakharSaxena24 made their first contribution in https://github.com/huggingface/peft/pull/1433
* siddartha-RE made their first contribution in https://github.com/huggingface/peft/pull/1368

**Full Changelog**: https://github.com/huggingface/peft/compare/v0.9.0...v0.10.0

0.9.0

Highlights

New methods for merging LoRA weights together
![cat_teapot](https://github.com/huggingface/peft/assets/13534540/5329d4f8-fe17-448e-94dc-b97a8e621659)


With PR 1364, we added new methods for merging LoRA weights together. This is _not_ about merging LoRA weights into the base model. Instead, this is about merging the weights from _different LoRA adapters_ into a single adapter by calling `add_weighted_adapter`. This allows you to combine the strengths of multiple LoRA adapters into a single adapter, while being faster than activating each of these adapters individually.

Although this feature has existed in PEFT for some time, we have added new merging methods that promise much better results. The first is based on [TIES](https://arxiv.org/abs/2306.01708), the second on [DARE](https://arxiv.org/abs/2311.03099), and the third, **Magnitude Prune**, is inspired by both. If you haven't tried these new methods, or haven't touched the LoRA weight merging feature at all, you can find more information here (a short usage sketch follows the list below):

- [Blog post](https://huggingface.co/blog/peft_merging)
- [PEFT docs](https://huggingface.co/docs/peft/main/en/developer_guides/lora#merge-adapters)
- [Example notebook using diffusers](https://github.com/huggingface/peft/blob/main/examples/multi_adapter_examples/multi_adapter_weighted_inference_diffusers.ipynb)
- [Example notebook using an LLM](https://github.com/huggingface/peft/blob/main/examples/multi_adapter_examples/Lora_Merging.ipynb)
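
As referenced above, here is a rough usage sketch for the new merging methods; the adapter paths, names, weights, and `density` value are placeholders:

```python
from transformers import AutoModelForCausalLM
from peft import PeftModel

base_model = AutoModelForCausalLM.from_pretrained("facebook/opt-350m")

# Load two trained LoRA adapters onto the same base model (paths are placeholders).
model = PeftModel.from_pretrained(base_model, "path/to/adapter1", adapter_name="adapter1")
model.load_adapter("path/to/adapter2", adapter_name="adapter2")

# Merge them into a single new adapter with the TIES method; other options
# include "dare_linear", "dare_ties", and "magnitude_prune".
model.add_weighted_adapter(
    adapters=["adapter1", "adapter2"],
    weights=[0.7, 0.3],
    adapter_name="merged",
    combination_type="ties",
    density=0.5,
)
model.set_adapter("merged")
```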

AWQ and AQLM support for LoRA

Via #1399, we now support [AutoAWQ](https://github.com/casper-hansen/AutoAWQ) in PEFT. This is a new method for 4-bit quantization of model weights.

<img width="1197" alt="Screenshot 2024-02-28 at 09 41 40" src="https://github.com/huggingface/peft/assets/49240599/431d485b-c2b9-4e49-b407-89977875e6ef">

Similarly, we now support [AQLM](https://github.com/Vahe1994/AQLM) via #1476. This method allows quantizing weights to as low as 2 bits. Both methods support quantizing `nn.Linear` layers. To find out more about all the quantization options that work with PEFT, check out our docs [here](https://huggingface.co/docs/peft/developer_guides/quantization).

<img width="1197" alt="Screenshot 2024-02-28 at 09 42 22" src="https://github.com/huggingface/peft/assets/49240599/6f1e250b-8981-4e2a-9fa2-028d76150912">

Note that these integrations do not support `merge_and_unload()` yet, meaning that for inference you always need to keep the adapter weights attached to the base model.

DoRA support

We now support Weight-Decomposed Low-Rank Adaptation, aka [DoRA](https://arxiv.org/abs/2402.09353), via #1474. This new method builds on top of LoRA and has shown very promising results. Especially at lower ranks (e.g. `r=8`), it should perform much better than LoRA. Right now, only non-quantized `nn.Linear` layers are supported. If you'd like to give it a try, just pass `use_dora=True` to your `LoraConfig` and you're good to go.

Documentation

Thanks to stevhliu and many other contributors, there have been big improvements to the documentation. You should find it more organized and more up-to-date. Our [DeepSpeed](https://huggingface.co/docs/peft/accelerate/deepspeed) and [FSDP](https://huggingface.co/docs/peft/accelerate/fsdp) guides have also been much improved.

[Check out our improved docs](https://huggingface.co/docs/peft/index) if you haven't already!

Development

If you're implementing custom adapter layers, for instance a custom `LoraLayer`, note that all subclasses should now implement `update_layer` -- unless they want to use the default method by the parent class. In particular, this means you should no longer use different method names for the subclass, like `update_layer_embedding`. Also, we generally don't permit ranks (`r`) of 0 anymore. For more, see [this PR](https://github.com/huggingface/peft/pull/1268).

Developers should have an easier time now since we fully [embrace ruff](https://github.com/huggingface/peft/pull/1421). If you're the type of person who forgets to call `make style` before pushing to a PR, consider adding a [pre-commit hook](https://huggingface.co/docs/peft/developer_guides/contributing#tests-and-code-quality-checks). Tests are now a bit less verbose by using [plain asserts](https://github.com/huggingface/peft/pull/1448) and generally embracing pytest features more fully. All of this comes thanks to akx.

What's Changed

On top of these highlights, we have added a lot of small changes since the last release; check out the full list below. As always, we had a lot of support from many contributors. You're awesome!

* Release patch version 0.8.2 by pacman100 in https://github.com/huggingface/peft/pull/1428
* [docs] Polytropon API by stevhliu in https://github.com/huggingface/peft/pull/1422
* Fix `MatMul8bitLtBackward` view issue by younesbelkada in https://github.com/huggingface/peft/pull/1425
* Fix typos by szepeviktor in https://github.com/huggingface/peft/pull/1435
* Fixed saving for models that don't have _name_or_path in config by kovalexal in https://github.com/huggingface/peft/pull/1440
* [docs] README update by stevhliu in https://github.com/huggingface/peft/pull/1411
* [docs] Doc maintenance by stevhliu in https://github.com/huggingface/peft/pull/1394
* [`core`/`TPLinear`] Fix breaking change by younesbelkada in https://github.com/huggingface/peft/pull/1439
* Renovate quality tools by akx in https://github.com/huggingface/peft/pull/1421
* [Docs] call `set_adapters()` after add_weighted_adapter by sayakpaul in https://github.com/huggingface/peft/pull/1444
* MNT: Check only selected directories with ruff by BenjaminBossan in https://github.com/huggingface/peft/pull/1446
* TST: Improve test coverage by skipping fewer tests by BenjaminBossan in https://github.com/huggingface/peft/pull/1445
* Update Dockerfile to reflect how to compile bnb from source by younesbelkada in https://github.com/huggingface/peft/pull/1437
* [docs] Lora-like guides by stevhliu in https://github.com/huggingface/peft/pull/1371
* [docs] IA3 by stevhliu in https://github.com/huggingface/peft/pull/1373
* Add docstrings for set_adapter and keep frozen by EricLBuehler in https://github.com/huggingface/peft/pull/1447
* Add new merging methods by pacman100 in https://github.com/huggingface/peft/pull/1364
* FIX Loading with AutoPeftModel.from_pretrained by BenjaminBossan in https://github.com/huggingface/peft/pull/1449
* Support `modules_to_save` config option when using DeepSpeed ZeRO-3 with ZeRO init enabled. by pacman100 in https://github.com/huggingface/peft/pull/1450
* FIX Honor HF_HUB_OFFLINE mode if set by user by BenjaminBossan in https://github.com/huggingface/peft/pull/1454
* [docs] Remove iframe by stevhliu in https://github.com/huggingface/peft/pull/1456
* [docs] Docstring typo by stevhliu in https://github.com/huggingface/peft/pull/1455
* [`core` / `get_peft_state_dict`] Ignore all exceptions to avoid unexpected errors by younesbelkada in https://github.com/huggingface/peft/pull/1458
* [ `Adaptation Prompt`] Fix llama rotary embedding issue with transformers main by younesbelkada in https://github.com/huggingface/peft/pull/1459
* [`CI`] Add CI tests on transformers main to catch early bugs by younesbelkada in https://github.com/huggingface/peft/pull/1461
* Use plain asserts in tests by akx in https://github.com/huggingface/peft/pull/1448
* Add default IA3 target modules for Mixtral by arnavgarg1 in https://github.com/huggingface/peft/pull/1376
* add `magnitude_prune` merging method by pacman100 in https://github.com/huggingface/peft/pull/1466
* [docs] Model merging by stevhliu in https://github.com/huggingface/peft/pull/1423
* Adds an example notebook for showing multi-adapter weighted inference by sayakpaul in https://github.com/huggingface/peft/pull/1471
* Make tests succeed more on MPS by akx in https://github.com/huggingface/peft/pull/1463
* [`CI`] Fix adaptation prompt CI on transformers main by younesbelkada in https://github.com/huggingface/peft/pull/1465
* Update docstring at peft_types.py by eduardozamudio in https://github.com/huggingface/peft/pull/1475
* FEAT: add awq suppot in PEFT by younesbelkada in https://github.com/huggingface/peft/pull/1399
* Add pre-commit configuration by akx in https://github.com/huggingface/peft/pull/1467
* ENH [`CI`] Run tests only when relevant files are modified by younesbelkada in https://github.com/huggingface/peft/pull/1482
* FIX [`CI` / `bnb`] Fix failing bnb workflow by younesbelkada in https://github.com/huggingface/peft/pull/1480
* FIX [`PromptTuning`] Simple fix for transformers >= 4.38 by younesbelkada in https://github.com/huggingface/peft/pull/1484
* FIX: Multitask prompt tuning with other tuning init by BenjaminBossan in https://github.com/huggingface/peft/pull/1144
* previous_dtype is now inferred from F.linear's result output type. by MFajcik in https://github.com/huggingface/peft/pull/1010
* ENH: [`CI` / `Docker`]: Create a workflow to temporarly build docker images in case dockerfiles are modified by younesbelkada in https://github.com/huggingface/peft/pull/1481
* Fix issue with unloading double wrapped modules by BenjaminBossan in https://github.com/huggingface/peft/pull/1490
* FIX: [`CI` / `Adaptation Prompt`] Fix CI on transformers main by younesbelkada in https://github.com/huggingface/peft/pull/1493
* Update peft_bnb_whisper_large_v2_training.ipynb: Fix a typo by martin0258 in https://github.com/huggingface/peft/pull/1494
* covert SVDLinear dtype by PHOSPHENES8 in https://github.com/huggingface/peft/pull/1495
* Raise error on wrong type for to modules_to_save by BenjaminBossan in https://github.com/huggingface/peft/pull/1496
* AQLM support for LoRA by BlackSamorez in https://github.com/huggingface/peft/pull/1476
* Allow trust_remote_code for tokenizers when loading AutoPeftModels by OfficialDelta in https://github.com/huggingface/peft/pull/1477
* Add default LoRA and IA3 target modules for Gemma by arnavgarg1 in https://github.com/huggingface/peft/pull/1499
* FIX Bug in prompt learning after disabling adapter by BenjaminBossan in https://github.com/huggingface/peft/pull/1502
* add example and update deepspeed/FSDP docs by pacman100 in https://github.com/huggingface/peft/pull/1489
* FIX Safe merging with LoHa and LoKr by BenjaminBossan in https://github.com/huggingface/peft/pull/1505
* ENH: [`Docker`] Notify us when docker build pass or fail by younesbelkada in https://github.com/huggingface/peft/pull/1503
* Implement DoRA by BenjaminBossan in https://github.com/huggingface/peft/pull/1474

New Contributors
* szepeviktor made their first contribution in https://github.com/huggingface/peft/pull/1435
* akx made their first contribution in https://github.com/huggingface/peft/pull/1421
* EricLBuehler made their first contribution in https://github.com/huggingface/peft/pull/1447
* eduardozamudio made their first contribution in https://github.com/huggingface/peft/pull/1475
* MFajcik made their first contribution in https://github.com/huggingface/peft/pull/1010
* martin0258 made their first contribution in https://github.com/huggingface/peft/pull/1494
* PHOSPHENES8 made their first contribution in https://github.com/huggingface/peft/pull/1495
* BlackSamorez made their first contribution in https://github.com/huggingface/peft/pull/1476
* OfficialDelta made their first contribution in https://github.com/huggingface/peft/pull/1477

**Full Changelog**: https://github.com/huggingface/peft/compare/v0.8.2...v0.9.0

0.8.2

What's Changed

0.8.2.dev0

* Add IA3 Modules for Phi by arnavgarg1 in https://github.com/huggingface/peft/pull/1407
* Update custom_models.md by boyufan in https://github.com/huggingface/peft/pull/1409
* Add positional args to PeftModelForCausalLM.generate by SumanthRH in https://github.com/huggingface/peft/pull/1393
* [Hub] fix: subfolder existence check by sayakpaul in https://github.com/huggingface/peft/pull/1417
* FIX: Make merging of adapter weights idempotent by BenjaminBossan in https://github.com/huggingface/peft/pull/1355
* [`core`] fix critical bug in diffusers by younesbelkada in https://github.com/huggingface/peft/pull/1427

New Contributors
* boyufan made their first contribution in https://github.com/huggingface/peft/pull/1409

**Full Changelog**: https://github.com/huggingface/peft/compare/v0.8.1...v0.8.2

0.8.1

This is a small patch release of PEFT that:
* Fixes a breaking change related to support for saving resized embedding layers and Diffusers models. Contributed by younesbelkada in https://github.com/huggingface/peft/pull/1414

What's Changed
