OpenRLHF

Latest version: v0.5.5.post2

Page 1 of 8

0.5.5.post2

What's Changed

- [revert transformer/deepspeed versions](https://github.com/OpenRLHF/OpenRLHF/commit/ce5d3cc8e902b4586851c0708f29e8606883ef36) by xiaoxigua999

**Full Changelog**: https://github.com/OpenRLHF/OpenRLHF/compare/v0.5.5.post1...v0.5.5.post2

0.5.5.post1

What's Changed

- [Fix _validate_args](https://github.com/OpenRLHF/OpenRLHF/commit/1f95a1707ee17513bcdab816a30ab3fa46c12ee9) by xiaoxigua999
- [process_experiences on gpu](https://github.com/OpenRLHF/OpenRLHF/commit/1e62a776f36c317ec40d608703ee0dfcfbfa7714) by xiaoxigua999
- [bump deepspeed version](https://github.com/OpenRLHF/OpenRLHF/commit/4c330de97be29293999e2f242232252982fca1fc) by xiaoxigua999

**Full Changelog**: https://github.com/OpenRLHF/OpenRLHF/compare/v0.5.5...v0.5.5.post1

0.5.5

Highlights
* Fix vLLM nccl sync in single node by xiaoxigua999 in https://github.com/OpenRLHF/OpenRLHF/pull/604

What's Changed
* Update batch_inference.py by xiaoxigua999 in https://github.com/OpenRLHF/OpenRLHF/pull/612
* Offload training experiences to CPU memory in RLHF by xiaoxigua999 in https://github.com/OpenRLHF/OpenRLHF/pull/620
* Fixing KL Divergence Precision and vllm generate timeout by xiaoxigua999 in https://github.com/OpenRLHF/OpenRLHF/pull/620
* [support overlap_comm option](https://github.com/OpenRLHF/OpenRLHF/commit/5fd51011b57784a835784a55fe4c00cf3fdace3c) by xiaoxigua999
* Fix 622: support string format in SFT template by Freder-chen in https://github.com/OpenRLHF/OpenRLHF/pull/623

New Contributors
* Freder-chen made their first contribution in https://github.com/OpenRLHF/OpenRLHF/pull/623

**Full Changelog**: https://github.com/OpenRLHF/OpenRLHF/compare/v0.5.4...v0.5.5

0.5.4

What's Changed
* Fixed typos: advatanges -> advantages by songxxzp in https://github.com/OpenRLHF/OpenRLHF/pull/570
* Only decode the queries once for multiple remote rm by zhuzilin in https://github.com/OpenRLHF/OpenRLHF/pull/572
* overlap vllm init and actor/reward model loading by zhuzilin in https://github.com/OpenRLHF/OpenRLHF/pull/575
* correct the order of multiplication in grad acc by zhuzilin in https://github.com/OpenRLHF/OpenRLHF/pull/577
* Support ring-attention during sft phase by UbeCc in https://github.com/OpenRLHF/OpenRLHF/pull/576
* Add better error message for empty datasets by frrad in https://github.com/OpenRLHF/OpenRLHF/pull/581
* Fix nan for sft-ring when labels are all IGNORE_INDEX by UbeCc in https://github.com/OpenRLHF/OpenRLHF/pull/583
* explicitly ignore attention_mask for packing_samples. by xiaoxigua999 in https://github.com/OpenRLHF/OpenRLHF/pull/588
* [Set default grad_accum_dtype to None](https://github.com/OpenRLHF/OpenRLHF/commit/47f7cd8fc76de6d057d053251c1b55c00421cc24) by xiaoxigua999
* update global batch size in eval model compatible to ring-attn-size by ShomyLiu in https://github.com/OpenRLHF/OpenRLHF/pull/590
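
The NaN fix listed above (labels all `IGNORE_INDEX`) concerns a common failure mode: a mean loss taken over zero valid tokens becomes 0/0. The sketch below illustrates the general guard in plain Python. It is not the code from the pull request; the function name `masked_mean_loss` is hypothetical, and the value `-100` for `IGNORE_INDEX` is an assumption based on the usual PyTorch convention.

```python
from typing import List

# Assumed sentinel value; -100 is the usual PyTorch default for ignored labels.
IGNORE_INDEX = -100


def masked_mean_loss(token_losses: List[float], labels: List[int]) -> float:
    """Average per-token losses over positions whose label is not IGNORE_INDEX.

    Hypothetical helper for illustration only.
    """
    valid = [
        loss for loss, label in zip(token_losses, labels)
        if label != IGNORE_INDEX
    ]
    if not valid:
        # Every label is ignored: return 0.0 instead of dividing by zero.
        return 0.0
    return sum(valid) / len(valid)


all_ignored = masked_mean_loss([0.5, 1.5], [IGNORE_INDEX, IGNORE_INDEX])
partly_masked = masked_mean_loss([0.5, 1.5, 2.5], [IGNORE_INDEX, 3, 7])
```

With sequence-parallel schemes such as ring attention, a single rank can plausibly receive a shard in which every label is masked, which is why the empty case needs explicit handling.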

New Contributors
* songxxzp made their first contribution in https://github.com/OpenRLHF/OpenRLHF/pull/570
* UbeCc made their first contribution in https://github.com/OpenRLHF/OpenRLHF/pull/576
* frrad made their first contribution in https://github.com/OpenRLHF/OpenRLHF/pull/581
* ShomyLiu made their first contribution in https://github.com/OpenRLHF/OpenRLHF/pull/590

**Full Changelog**: https://github.com/OpenRLHF/OpenRLHF/compare/v0.5.3...v0.5.4

0.5.3

Highlights
* Support using `AutoModelForSequenceClassification.from_pretrained` to load the reward model by xiaoxigua999 in https://github.com/OpenRLHF/OpenRLHF/commit/a9c482a3cdabc94f34f3c5670fc964d8a9f86c63
  - The default `value_head` name of the reward model has been changed to `score`
* Support RLOO with per-token KL penalty by zhuzilin in https://github.com/OpenRLHF/OpenRLHF/pull/515
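
For readers unfamiliar with RLOO (REINFORCE Leave-One-Out), the core idea can be sketched as follows: with k sampled responses per prompt, each response is scored against the mean reward of the other k - 1 responses. This is a minimal illustration in plain Python, not OpenRLHF's implementation; the function name `rloo_advantages` is hypothetical, and the per-token KL penalty mentioned in the highlight is omitted.

```python
from typing import List


def rloo_advantages(rewards: List[float]) -> List[float]:
    """Leave-one-out advantages for k responses sampled from the same prompt.

    Hypothetical helper for illustration only.
    """
    k = len(rewards)
    if k < 2:
        raise ValueError("RLOO needs at least two samples per prompt")
    total = sum(rewards)
    # Baseline for sample i is the mean reward of the other k - 1 samples.
    return [r - (total - r) / (k - 1) for r in rewards]


advantages = rloo_advantages([1.0, 0.0, 0.5, 0.5])
```

By construction the advantages within one prompt group sum to zero, so a response is reinforced only to the extent that it beats its siblings.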

What's Changed
* Fix bug on prm trainer w.r.t no packing samples and ring attn by zhuzilin in https://github.com/OpenRLHF/OpenRLHF/pull/551
* Fix issue 549: ZeroDivisionError during DPO/SFT/RM/KTO/KD training eval step by MarxistZ in https://github.com/OpenRLHF/OpenRLHF/pull/561 xiaoxigua999


New Contributors
* MarxistZ made their first contribution in https://github.com/OpenRLHF/OpenRLHF/pull/561

**Full Changelog**: https://github.com/OpenRLHF/OpenRLHF/compare/v0.5.2...v0.5.3

0.5.2.post1

Highlights
* Support vLLM NCCL weights sync for multi-nodes RLHF by xiaoxigua999 in https://github.com/OpenRLHF/OpenRLHF/pull/543

What's Changed
* Update docker container to 24.07 and vLLM to 0.6.4.post1 by xiaoxigua999 in https://github.com/OpenRLHF/OpenRLHF/pull/543

**Full Changelog**: https://github.com/OpenRLHF/OpenRLHF/compare/v0.5.2...v0.5.2.post1
