Highlights
![peft-v0 11 0](https://github.com/huggingface/peft/assets/6229650/ca652d10-c389-4163-ab62-1e0c821c9c5a)
New methods
BOFT
Thanks to yfeng95, Zeju1997, and YuliangXiu, PEFT now supports BOFT: Parameter-Efficient Orthogonal Finetuning via Butterfly Factorization (#1326, [BOFT paper link](https://huggingface.co/papers/2311.06243)). PEFT v0.7.0 already added [OFT](https://huggingface.co/papers/2306.07280), but BOFT is even more parameter efficient. Check out the included [BOFT controlnet](https://github.com/huggingface/peft/tree/main/examples/boft_controlnet) and [BOFT dreambooth](https://github.com/huggingface/peft/tree/main/examples/boft_dreambooth) examples.
VeRA
If the parameter reduction of LoRA is not enough for your use case, you should take a close look at VeRA: Vector-based Random Matrix Adaptation (#1564, [VeRA paper link](https://huggingface.co/papers/2310.11454)). This method resembles LoRA but adds two learnable scaling vectors to the two LoRA weight matrices. The weight matrices themselves are frozen and shared across all layers, so only the scaling vectors are trained, which considerably reduces the number of trainable parameters.
The bulk of this PR was implemented by contributor vvvm23 with the help of dkopi.
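To make the parameter savings concrete, here is a rough back-of-the-envelope sketch. It assumes that LoRA trains two matrices per adapted layer, while VeRA trains one scaling vector of length `rank` and one of length `d_out` per adapted layer. The helper names and the layer sizes are hypothetical and chosen only for illustration; they are not part of the PEFT API.

```python
def lora_trainable_params(d_in: int, d_out: int, rank: int, num_layers: int) -> int:
    # LoRA trains a (rank x d_in) and a (d_out x rank) matrix in every adapted layer.
    return num_layers * rank * (d_in + d_out)


def vera_trainable_params(d_out: int, rank: int, num_layers: int) -> int:
    # VeRA keeps the shared matrices frozen and trains only two scaling
    # vectors per adapted layer: one of length rank, one of length d_out.
    return num_layers * (rank + d_out)


# Illustrative (assumed) setup: 64 adapted square layers of width 4096.
lora_count = lora_trainable_params(d_in=4096, d_out=4096, rank=16, num_layers=64)
vera_count = vera_trainable_params(d_out=4096, rank=256, num_layers=64)

print(f"LoRA (rank 16):  {lora_count:,} trainable parameters")
print(f"VeRA (rank 256): {vera_count:,} trainable parameters")
```

Because the shared matrices are not trained, VeRA can afford a much larger rank than LoRA and still end up with far fewer trainable parameters in this setup.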
PiSSA
PiSSA, Principal Singular values and Singular vectors Adaptation, is a new initialization method for LoRA, which was added by fxmeng (1626, [PiSSA paper link](https://huggingface.co/papers/2404.02948)). The improved initialization promises to speed up convergence and improve the final performance of LoRA models. When using models quantized with bitsandbytes, PiSSA initialization should reduce the quantization error, similar to LoftQ.
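The idea behind the initialization can be illustrated with a small, dependency-free sketch. It is not the PEFT implementation: it handles only rank 1 and uses power iteration as a stand-in for a full SVD. All function and variable names are hypothetical. The adapter factors are initialized from the leading singular triplet of the weight matrix, and the remainder is kept as the frozen residual weight.

```python
import math


def matvec(m, v):
    # Multiply matrix m (list of rows) by vector v.
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def transpose(m):
    return [list(col) for col in zip(*m)]


def normalize(v):
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def top_singular_triplet(w, iters=100):
    # Power iteration on W^T W to approximate the leading singular triplet.
    wt = transpose(w)
    v = normalize([1.0] * len(w[0]))
    for _ in range(iters):
        v = normalize(matvec(wt, matvec(w, v)))
    wv = matvec(w, v)
    sigma = math.sqrt(sum(x * x for x in wv))
    u = [x / sigma for x in wv]
    return sigma, u, v


def pissa_rank1_init(w):
    # Split W into a trainable rank-1 part (up * down^T) built from the
    # principal singular value and vectors, plus a frozen residual.
    sigma, u, v = top_singular_triplet(w)
    scale = math.sqrt(sigma)
    up = [scale * x for x in u]      # length d_out
    down = [scale * x for x in v]    # length d_in
    residual = [
        [w[i][j] - up[i] * down[j] for j in range(len(down))]
        for i in range(len(up))
    ]
    return up, down, residual


W = [[3.0, 0.0], [0.0, 1.0]]
up, down, residual = pissa_rank1_init(W)

# The adapter product plus the residual reproduces the original weight,
# so the model output is unchanged at initialization.
reconstructed = [
    [residual[i][j] + up[i] * down[j] for j in range(2)] for i in range(2)
]
```

In this toy case the adapter starts out holding the dominant direction of `W` (singular value 3), while the residual retains only the weaker component.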
Quantization
HQQ
Thanks to fahadh4ilyas, PEFT LoRA linear layers now support Half-Quadratic Quantization, HQQ (1618, [HQQ repo](https://github.com/mobiusml/hqq/)). HQQ is fast and efficient (down to 2 bits), while not requiring calibration data.
EETQ
Another new quantization method supported in PEFT is Easy & Efficient Quantization for Transformers, EETQ (#1675, [EETQ repo](https://github.com/NetEase-FuXi/EETQ)). This 8-bit quantization method works for LoRA linear layers and should be faster than bitsandbytes.
Show adapter layer and model status
In #1663, we added a feature to show the adapter layer and model status of PEFT models. With the newly added methods, you can easily check which adapters exist on your model, whether they are enabled, whether their gradients are active, and which ones are active or merged. You will also be informed if irregularities are detected.
To use this new feature, call `model.get_layer_status()` for layer-level information, and `model.get_model_status()` for model-level information. For more details, check out our [docs on layer and model status](https://huggingface.co/docs/peft/main/en/developer_guides/troubleshooting#check-layer-and-model-status).
Changes
Edge case of how we deal with `modules_to_save`
Previously, when using classes such as PeftModelForSequenceClassification, we implicitly added the classifier layers to `model.modules_to_save`. However, this only added a new `ModulesToSaveWrapper` instance for the first adapter being initialized. When initializing a second adapter via `model.add_adapter`, this information was ignored. Now, `peft_config.modules_to_save` is updated explicitly to add the classifier layers (#1615). This is a departure from how this worked previously, but it reflects the intended behavior better.
Furthermore, when merging together multiple LoRA adapters using `model.add_weighted_adapter`, if these adapters had `modules_to_save`, the original parameters of these modules would be used. This is unexpected and will most likely result in bad outputs. As there is no clear way to merge these modules, we decided to raise an error in this case (1615).
What's Changed
* Bump version to 0.10.1.dev0 by BenjaminBossan in https://github.com/huggingface/peft/pull/1578
* FIX Minor issues in docs, re-raising exception by BenjaminBossan in https://github.com/huggingface/peft/pull/1581
* FIX / Docs: Fix doc link for layer replication by younesbelkada in https://github.com/huggingface/peft/pull/1582
* DOC: Short section on using transformers pipeline by BenjaminBossan in https://github.com/huggingface/peft/pull/1587
* Extend PeftModel.from_pretrained() to models with disk-offloaded modules by blbadger in https://github.com/huggingface/peft/pull/1431
* [feat] Add `lru_cache` to `import_utils` calls that did not previously have it by tisles in https://github.com/huggingface/peft/pull/1584
* fix deepspeed zero3+prompt tuning bug. word_embeddings.weight shape i… by sywangyi in https://github.com/huggingface/peft/pull/1591
* MNT: Update GH bug report template by BenjaminBossan in https://github.com/huggingface/peft/pull/1600
* fix the torch_dtype and quant_storage_dtype by pacman100 in https://github.com/huggingface/peft/pull/1614
* FIX In the image classification example, Change the model to the LoRA… by changhwa in https://github.com/huggingface/peft/pull/1624
* Remove duplicated import by nzw0301 in https://github.com/huggingface/peft/pull/1622
* FIX: bnb config wrong argument names by BenjaminBossan in https://github.com/huggingface/peft/pull/1603
* FIX Make DoRA work with Conv1D layers by BenjaminBossan in https://github.com/huggingface/peft/pull/1588
* FIX: Send results to correct channel by younesbelkada in https://github.com/huggingface/peft/pull/1628
* FEAT: Allow ignoring mismatched sizes when loading by BenjaminBossan in https://github.com/huggingface/peft/pull/1620
* itemsize is torch>=2.1, use element_size() by winglian in https://github.com/huggingface/peft/pull/1630
* FIX Multiple adapters and modules_to_save by BenjaminBossan in https://github.com/huggingface/peft/pull/1615
* FIX Correctly call element_size by BenjaminBossan in https://github.com/huggingface/peft/pull/1635
* fix: allow load_adapter to use different device by yhZhai in https://github.com/huggingface/peft/pull/1631
* Adalora deepspeed by sywangyi in https://github.com/huggingface/peft/pull/1625
* Adding BOFT: Parameter-Efficient Orthogonal Finetuning via Butterfly Factorization by yfeng95 in https://github.com/huggingface/peft/pull/1326
* Don't use deprecated `Repository` anymore by Wauplin in https://github.com/huggingface/peft/pull/1641
* FIX Errors in the transformers integration docs by BenjaminBossan in https://github.com/huggingface/peft/pull/1629
* update figure assets of BOFT by YuliangXiu in https://github.com/huggingface/peft/pull/1642
* print_trainable_parameters - format `%` to be sensible by stas00 in https://github.com/huggingface/peft/pull/1648
* FIX: Bug with handling of active adapters by BenjaminBossan in https://github.com/huggingface/peft/pull/1659
* Remove `dreambooth` Git link by charliermarsh in https://github.com/huggingface/peft/pull/1660
* add safetensor load in multitask_prompt_tuning by sywangyi in https://github.com/huggingface/peft/pull/1662
* Adds Vera (Vector Based Random Matrix Adaption) 2 by BenjaminBossan in https://github.com/huggingface/peft/pull/1564
* Update deepspeed.md by sanghyuk-choi in https://github.com/huggingface/peft/pull/1679
* ENH: Add multi-backend tests for bnb by younesbelkada in https://github.com/huggingface/peft/pull/1667
* FIX / Workflow: Fix Mac-OS CI issues by younesbelkada in https://github.com/huggingface/peft/pull/1680
* FIX Use trl version of tiny random llama by BenjaminBossan in https://github.com/huggingface/peft/pull/1681
* FIX: Don't eagerly import bnb for LoftQ by BenjaminBossan in https://github.com/huggingface/peft/pull/1683
* FEAT: Add EETQ support in PEFT by younesbelkada in https://github.com/huggingface/peft/pull/1675
* FIX / Workflow: Always notify on slack for docker image workflows by younesbelkada in https://github.com/huggingface/peft/pull/1682
* FIX: upgrade autoawq to latest version by younesbelkada in https://github.com/huggingface/peft/pull/1684
* FIX: Initialize DoRA weights in float32 if float16 is being used by BenjaminBossan in https://github.com/huggingface/peft/pull/1653
* fix bf16 model type issue for ia3 by sywangyi in https://github.com/huggingface/peft/pull/1634
* FIX Issues with AdaLora initialization by BenjaminBossan in https://github.com/huggingface/peft/pull/1652
* FEAT Show adapter layer and model status by BenjaminBossan in https://github.com/huggingface/peft/pull/1663
* Fixing the example by providing correct tokenized seq length by jpodivin in https://github.com/huggingface/peft/pull/1686
* TST: Skiping AWQ tests for now .. by younesbelkada in https://github.com/huggingface/peft/pull/1690
* Add LayerNorm tuning model by DTennant in https://github.com/huggingface/peft/pull/1301
* FIX Use different doc builder docker image by BenjaminBossan in https://github.com/huggingface/peft/pull/1697
* Set experimental dynamo config for compile tests by BenjaminBossan in https://github.com/huggingface/peft/pull/1698
* fix the fsdp peft autowrap policy by pacman100 in https://github.com/huggingface/peft/pull/1694
* Add LoRA support to HQQ Quantization by fahadh4ilyas in https://github.com/huggingface/peft/pull/1618
* FEAT Helper to check if a model is a PEFT model by BenjaminBossan in https://github.com/huggingface/peft/pull/1713
* support Cambricon MLUs device by huismiling in https://github.com/huggingface/peft/pull/1687
* Some small cleanups in docstrings, copyright note by BenjaminBossan in https://github.com/huggingface/peft/pull/1714
* Fix docs typo by NielsRogge in https://github.com/huggingface/peft/pull/1719
* revise run_peft_multigpu.sh by abzb1 in https://github.com/huggingface/peft/pull/1722
* Workflow: Add slack messages workflow by younesbelkada in https://github.com/huggingface/peft/pull/1723
* DOC Document the PEFT checkpoint format by BenjaminBossan in https://github.com/huggingface/peft/pull/1717
* FIX Allow DoRA init on CPU when using BNB by BenjaminBossan in https://github.com/huggingface/peft/pull/1724
* Adding PiSSA as an optional initialization method of LoRA by fxmeng in https://github.com/huggingface/peft/pull/1626
New Contributors
* tisles made their first contribution in https://github.com/huggingface/peft/pull/1584
* changhwa made their first contribution in https://github.com/huggingface/peft/pull/1624
* yhZhai made their first contribution in https://github.com/huggingface/peft/pull/1631
* yfeng95 made their first contribution in https://github.com/huggingface/peft/pull/1326
* YuliangXiu made their first contribution in https://github.com/huggingface/peft/pull/1642
* charliermarsh made their first contribution in https://github.com/huggingface/peft/pull/1660
* sanghyuk-choi made their first contribution in https://github.com/huggingface/peft/pull/1679
* jpodivin made their first contribution in https://github.com/huggingface/peft/pull/1686
* DTennant made their first contribution in https://github.com/huggingface/peft/pull/1301
* fahadh4ilyas made their first contribution in https://github.com/huggingface/peft/pull/1618
* huismiling made their first contribution in https://github.com/huggingface/peft/pull/1687
* NielsRogge made their first contribution in https://github.com/huggingface/peft/pull/1719
* abzb1 made their first contribution in https://github.com/huggingface/peft/pull/1722
* fxmeng made their first contribution in https://github.com/huggingface/peft/pull/1626
**Full Changelog**: https://github.com/huggingface/peft/compare/v0.10.0...v0.11.0