## New
### New model integrations
- Add BEiT integration (jannik-brinkmann via 428, 439)
- Add GPT-J integration (ChiragBSavani via 426)
- Add CLIP integration (calpt via 483)
- Add ALBERT integration (lenglaender via 488)
- Add BertGeneration integration (hSterz via 480)
### Misc
- Add support for adapter configuration strings (calpt via 465, 486)
Adapter configurations can now be specified as simple strings. For example, to create a Pfeiffer adapter with a reduction factor of 16, you can now use `pfeiffer[reduction_factor=16]`. This is especially handy for experiments with varying hyperparameters or for the example scripts; see the sketch below. [Learn more](https://docs.adapterhub.ml/overview.html#configuration-strings)
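A minimal sketch of how a configuration string might be used (the base model and adapter name are illustrative, and the import path assumes the `adapter-transformers` package):

```python
from transformers.adapters import AutoAdapterModel

# Load any supported base model with adapter support.
model = AutoAdapterModel.from_pretrained("bert-base-uncased")

# The config string is parsed into a full adapter config:
# a Pfeiffer-style bottleneck adapter with reduction factor 16.
model.add_adapter("my_task", config="pfeiffer[reduction_factor=16]")

# Freeze the base model and train only the new adapter.
model.train_adapter("my_task")
```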
- Add support for `Stack`, `Parallel` & `BatchSplit` composition to prefix tuning (calpt via 476)
In previous `adapter-transformers` versions, you could combine multiple bottleneck adapters, e.g. by stacking them or running them in parallel. This is now also possible for prefix-tuning adapters: add multiple prefixes to the same model to combine their functionality (`Stack`) or to perform several tasks simultaneously (`Parallel`, `BatchSplit`); see the sketch below. [Learn more](https://docs.adapterhub.ml/adapter_composition.html#stack)
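A rough sketch of combining two prefix-tuning adapters via composition (the base model and adapter names are placeholders):

```python
from transformers.adapters import AutoAdapterModel
from transformers.adapters.composition import Parallel, Stack

model = AutoAdapterModel.from_pretrained("bert-base-uncased")

# Two independent prefix-tuning adapters on the same model.
model.add_adapter("task_a", config="prefix_tuning")
model.add_adapter("task_b", config="prefix_tuning")

# Stack: both prefixes are applied on top of each other.
model.set_active_adapters(Stack("task_a", "task_b"))

# Parallel: the input is run through both prefixes in a single forward pass.
model.set_active_adapters(Parallel("task_a", "task_b"))
```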
- Enable parallel sequence generation with adapters (calpt via 436)
## Changed
- Removal of the `MultiLingAdapterArguments` class. Use the [`AdapterArguments`](https://docs.adapterhub.ml/classes/adapter_training.html#transformers.adapters.training.setup_adapter_training) class and the [`setup_adapter_training`](https://docs.adapterhub.ml/classes/adapter_training.html#transformers.adapters.training.setup_adapter_training) method instead (see the migration sketch below). [Learn more](https://docs.adapterhub.ml/training.html).
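A hedged migration sketch for a typical training script; the base model, the adapter name and the exact arguments your script parses are assumptions here:

```python
from transformers import HfArgumentParser
from transformers.adapters import AutoAdapterModel
from transformers.adapters.training import AdapterArguments, setup_adapter_training

# Parse the adapter-related command-line arguments
# (this replaces the removed MultiLingAdapterArguments).
parser = HfArgumentParser(AdapterArguments)
(adapter_args,) = parser.parse_args_into_dataclasses()

model = AutoAdapterModel.from_pretrained("bert-base-uncased")

# Add and activate an adapter (here named "my_task") according to the parsed arguments.
setup_adapter_training(model, adapter_args, "my_task")
```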
- Upgrade of underlying transformers version to 4.26.1 (calpt via 455, hSterz via 503)
## Fixed
- Fixes for GLUE & dependency parsing example script (calpt via 430, 454)
- Fix access to shared parameters of compacter (e.g. during sequence generation) (calpt via 440)
- Fix reference to adapter configs in `T5EncoderModel` (calpt via 437)
- Fix DeBERTa prefix tuning with enabled relative attention (calpt via 451)
- Fix gating for prefix tuning layers (calpt via 471)
- Fix input to T5 adapter layers (calpt via 479)
- Fix AdapterTrainer hyperparameter tuning (dtuit via 482)
- Move loading best adapter to AdapterTrainer class (MaBeHen via 487)
- Make HuggingFace Hub Mixin work with newer utilities (Helw150 via 473)
- Only compute fusion reg loss if fusion layer is trained (calpt via 505)