## GPTQ Integration
You can now finetune GPTQ-quantized models using PEFT. For end-to-end examples, see this [colab notebook](https://colab.research.google.com/drive/1_TIrmuKOFhuRRiTWN94iLKUFu6ZX4ceb?usp=sharing) and this [finetuning script](https://gist.github.com/SunMarc/dcdb499ac16d355a8f265aa497645996); a short sketch follows the PR link below.
* GPTQ Integration by SunMarc in https://github.com/huggingface/peft/pull/771
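As a minimal sketch of the workflow (the quantized checkpoint and the LoRA hyperparameters below are illustrative, not taken from the linked examples), finetuning a GPTQ model looks like standard LoRA finetuning once the quantized weights are loaded:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

# Illustrative GPTQ checkpoint; any model with a GPTQ quantization_config
# in its config.json should work (requires the auto-gptq package).
model_id = "TheBloke/Llama-2-7b-Chat-GPTQ"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# Freeze the quantized base weights and prepare the model for k-bit training.
model = prepare_model_for_kbit_training(model)

config = LoraConfig(
    r=8,
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],  # illustrative target modules
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, config)
model.print_trainable_parameters()  # only the LoRA weights are trainable
```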
## Low-level API
This lets users and developers use PEFT as a utility library, at least for injectable adapters (LoRA, IA3, AdaLoRA): it exposes an API that modifies a model in place, injecting the adapter layers directly into it. See the sketch after the PRs below.
* [`core`] PEFT refactor + introducing inject_adapter_in_model public method by younesbelkada in https://github.com/huggingface/peft/pull/749
* [`Low-level-API`] Add docs about LLAPI by younesbelkada in https://github.com/huggingface/peft/pull/836
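A minimal sketch of the new entry point (the toy model here is our own illustration): `inject_adapter_in_model` takes a config and a model, injects the adapter layers in place, and the transformed model is then used like any `torch.nn.Module`, without a `PeftModel` wrapper.

```python
import torch
from peft import LoraConfig, inject_adapter_in_model

class DummyModel(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.embedding = torch.nn.Embedding(10, 10)
        self.linear = torch.nn.Linear(10, 10)
        self.lm_head = torch.nn.Linear(10, 10)

    def forward(self, input_ids):
        x = self.embedding(input_ids)
        x = self.linear(x)
        return self.lm_head(x)

lora_config = LoraConfig(
    r=64, lora_alpha=16, lora_dropout=0.1, bias="none", target_modules=["linear"]
)
model = DummyModel()
# The model is modified in place; only `linear` gets a LoRA layer injected.
model = inject_adapter_in_model(lora_config, model)

dummy_inputs = torch.LongTensor([[0, 1, 2, 3]])
dummy_outputs = model(dummy_inputs)
```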
## Support for XPU and NPU devices
PEFT adapters can now be loaded and fine-tuned on additional accelerators, namely Intel XPU and Ascend NPU devices; see the sketch after the PRs below.
* Support XPU adapter loading by abhilash1910 in https://github.com/huggingface/peft/pull/737
* Support Ascend NPU adapter loading by statelesshz in https://github.com/huggingface/peft/pull/772
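A sketch of adapter loading on these devices (the adapter repo is just an example, and the device strings assume the matching PyTorch extension is installed, e.g. intel_extension_for_pytorch for XPU or torch_npu for NPU):

```python
from transformers import AutoModelForCausalLM
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained("facebook/opt-350m")
# Example LoRA adapter repo; substitute your own.
model = PeftModel.from_pretrained(base, "ybelkada/opt-350m-lora")
model = model.to("xpu")  # or "npu:0" on Ascend hardware
```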
## Mix-and-match LoRAs
Stable support for, and new ways of, merging multiple LoRAs. Three combination types are currently supported: `linear`, `svd`, and `cat`; see the sketch below.
* Added additional parameters to mixing multiple LoRAs through SVD, added ability to mix LoRAs through concatenation by kovalexal in https://github.com/huggingface/peft/pull/817
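A sketch of merging, assuming two LoRA adapters trained for the same base model (the adapter paths and names below are placeholders): load both adapters, combine them with `add_weighted_adapter`, and activate the result.

```python
from transformers import AutoModelForCausalLM
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained("facebook/opt-350m")
model = PeftModel.from_pretrained(base, "path/to/adapter_a", adapter_name="adapter_a")
model.load_adapter("path/to/adapter_b", adapter_name="adapter_b")

# combination_type can be "linear", "svd", or "cat".
model.add_weighted_adapter(
    adapters=["adapter_a", "adapter_b"],
    weights=[0.7, 0.3],
    adapter_name="merged",
    combination_type="svd",
)
model.set_adapter("merged")
```

As a rough guide to the three modes: `linear` takes a weighted sum of same-rank adapters, `cat` concatenates the low-rank factors (so the merged rank is the sum of the input ranks), and `svd` decomposes the weighted delta and therefore also works for adapters of different ranks.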
## What's Changed