Large model training, Naive Pipeline Parallelism, `peft` Data Parallelism support, and distributed training bug fixes
This release includes a set of features and bug fixes to scale up your RLHF experiments to much larger models by leveraging `peft` and `bitsandbytes`.
Naive Pipeline Parallelism support
* Let's support naive Pipeline Parallelism by younesbelkada in https://github.com/lvwerra/trl/pull/210
We introduce a new paradigm in `trl`, termed Naive Pipeline Parallelism, to fit large-scale models on your training setup and apply RLHF to them. This feature uses `peft` to train adapters and `bitsandbytes` to reduce the memory footprint of your active model.
![image](https://huggingface.co/datasets/trl-internal-testing/example-images/resolve/main/images/trl-npp.png)
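To illustrate, here is a minimal sketch of how the pieces fit together; the checkpoint name and LoRA hyperparameters below are illustrative assumptions, not values shipped with this release:

```python
# Minimal sketch: load the active model in 8-bit with `bitsandbytes`, place its layers
# across the available GPUs, train only small LoRA adapters with `peft`, then wrap the
# result with a value head so it can be used for PPO in `trl`.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model
from trl import AutoModelForCausalLMWithValueHead

model_name = "EleutherAI/gpt-neox-20b"  # illustrative large checkpoint

base_model = AutoModelForCausalLM.from_pretrained(
    model_name,
    load_in_8bit=True,   # 8-bit weights via `bitsandbytes` to shrink the memory footprint
    device_map="auto",   # naive pipeline parallelism: layers are placed across devices
)

# Only the LoRA adapter weights are trained; the 8-bit base model stays frozen.
lora_config = LoraConfig(r=16, lora_alpha=32, lora_dropout=0.05, task_type="CAUSAL_LM")
peft_model = get_peft_model(base_model, lora_config)

# Add a value head on top so the model can be optimized with `PPOTrainer`.
model = AutoModelForCausalLMWithValueHead.from_pretrained(peft_model)
tokenizer = AutoTokenizer.from_pretrained(model_name)
```

The resulting `model` and `tokenizer` can then be passed to `PPOTrainer` as in the existing sentiment examples.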
`peft` Data Parallelism support
* [`peft`] Fix DP issues by younesbelkada in https://github.com/lvwerra/trl/pull/221
* [`core`] fix DP issue by younesbelkada in https://github.com/lvwerra/trl/pull/222
There were some bugs with respect to the `peft` integration and Data Parallelism (DP). This release includes the bug fixes needed to enable multi-GPU training using `accelerate` + DDP (Distributed Data Parallel).
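For reference, a minimal sketch of a training script that runs under DDP; the checkpoint and script names are illustrative assumptions:

```python
# Minimal sketch (assumed file name: ppo_ddp_example.py). `PPOTrainer` builds its own
# `accelerate.Accelerator` internally, so the script needs no DDP-specific code; launch
# it across GPUs with, for example:
#   accelerate launch --multi_gpu --num_processes 2 ppo_ddp_example.py
from transformers import AutoTokenizer
from trl import AutoModelForCausalLMWithValueHead, PPOConfig, PPOTrainer

model_name = "gpt2"  # small illustrative checkpoint
model = AutoModelForCausalLMWithValueHead.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token

config = PPOConfig()  # default hyperparameters; adjust for your setup
# Each process holds a model replica; gradients are synchronized by DDP under the hood.
ppo_trainer = PPOTrainer(config=config, model=model, tokenizer=tokenizer)
```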
Memory optimization
Your training runs can now be much more memory efficient thanks to a few tricks and bug fixes:
`PPOConfig` now also supports the `optimize_cuda_cache` flag (set to `False` by default) to help mitigate growing CUDA memory usage; see the sketch after the list below.
* Grad accumulation and memory bugfix by edbeeching in https://github.com/lvwerra/trl/pull/220
* adds a missing detach to the ratio by edbeeching in https://github.com/lvwerra/trl/pull/224
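As a usage note, here is a minimal sketch of turning the new flag on:

```python
# Minimal sketch: opt into freeing cached CUDA memory during PPO optimization steps.
from trl import PPOConfig

config = PPOConfig(
    optimize_cuda_cache=True,  # off by default; helps mitigate CUDA memory growth on long runs
)
```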
Pytorch 2.0 fixes
This release also includes minor fixes related to the PyTorch 2.0 release.
* [`test`] attempt to fix CI test for PT 2.0 by younesbelkada in https://github.com/lvwerra/trl/pull/225
What's Changed
* adds sentiment example for a 20b model by edbeeching in https://github.com/lvwerra/trl/pull/208
* Update README.md blog post link by TeamDman in https://github.com/lvwerra/trl/pull/212
* spell mistakes by k-for-code in https://github.com/lvwerra/trl/pull/213
* spell corrections by k-for-code in https://github.com/lvwerra/trl/pull/214
* Small changes when integrating into H4 by natolambert in https://github.com/lvwerra/trl/pull/216
New Contributors
* TeamDman made their first contribution in https://github.com/lvwerra/trl/pull/212
* k-for-code made their first contribution in https://github.com/lvwerra/trl/pull/213
**Full Changelog**: https://github.com/lvwerra/trl/compare/v0.4.0...v0.4.1