OpenRLHF

Latest version: v0.6.3.post2

Page 2 of 11

0.6.0.post2

What's Changed
* [Fix vLLM sleep RuntimeError: CUDA error: an illegal memory access was encountered](https://github.com/OpenRLHF/OpenRLHF/commit/1d02c896982ae733b803580f34287e7642f742b0) by xiaoxigua999
* fix deepspeed offload sync issue by Freder-chen in https://github.com/OpenRLHF/OpenRLHF/pull/819
* fix: load_from_disk by Zeyi-Lin in https://github.com/OpenRLHF/OpenRLHF/pull/816

New Contributors
* Zeyi-Lin made their first contribution in https://github.com/OpenRLHF/OpenRLHF/pull/816

**Full Changelog**: https://github.com/OpenRLHF/OpenRLHF/compare/v0.6.0.post1...v0.6.0.post2

0.6.0.post1

Highlights
* feat: add hybrid engine deepspeed offload integration by Freder-chen and xiaoxigua999 in https://github.com/OpenRLHF/OpenRLHF/pull/808

What's Changed
* Fix ring attn when n_samples_per_prompt > 1 by ChaosCodes in https://github.com/OpenRLHF/OpenRLHF/pull/803
* Add k2_loss by LYMDLUT in https://github.com/OpenRLHF/OpenRLHF/pull/797
* Make remote_rm_url type consistent by coding-famer in https://github.com/OpenRLHF/OpenRLHF/pull/807
* Fix vLLM tp=1 when RAY_EXPERIMENTAL_NOSET_*_VISIBLE_DEVICES is set by HollowMan6 in https://github.com/OpenRLHF/OpenRLHF/pull/800
* fix ring attn in create_vllm_engines by ChaosCodes in https://github.com/OpenRLHF/OpenRLHF/pull/809
* [Fix hybrid engine / vllm sleep with ring attention](https://github.com/OpenRLHF/OpenRLHF/commit/3e3b24958f939b1d4e3ca69ea8a2b0d252991d3f) by xiaoxigua999

New Contributors
* ChaosCodes made their first contribution in https://github.com/OpenRLHF/OpenRLHF/pull/803

**Full Changelog**: https://github.com/OpenRLHF/OpenRLHF/compare/v0.6.0...v0.6.0.post1

0.6.0

Highlights
* Support ring attention in ppo by Wraythh in https://github.com/OpenRLHF/OpenRLHF/pull/685

What's Changed
* Make sure vLLM driver (LLMRayActor) gets scheduled to the correct placement_group by HollowMan6 in https://github.com/OpenRLHF/OpenRLHF/pull/732
* args use_ms for modelscope by lxline in https://github.com/OpenRLHF/OpenRLHF/pull/731
* experimental feature: support reinforce++baseline by xiaoxigua999 in https://github.com/OpenRLHF/OpenRLHF/pull/730
* offload ref when kl=0 by richardodliu in https://github.com/OpenRLHF/OpenRLHF/pull/746
* add grpo training by richardodliu in https://github.com/OpenRLHF/OpenRLHF/pull/764
* add grpo scripts by richardodliu in https://github.com/OpenRLHF/OpenRLHF/pull/766
* fix grpo bug when non-packing samples by richardodliu in https://github.com/OpenRLHF/OpenRLHF/pull/769
* fix grpo bug in kl=0 by richardodliu in https://github.com/OpenRLHF/OpenRLHF/pull/773
* [support labels in prompt datasets for reinforced finetuning](https://github.com/OpenRLHF/OpenRLHF/commit/1eb14991ba863e6974f94dd6388a6ed71da43d24) by xiaoxigua999
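
Several of the entries above add GRPO training. The core of GRPO is a group-relative baseline: each prompt is sampled several times, and each sample's advantage is its reward normalized against the other samples in the same group, so no learned critic is needed. A minimal sketch of that normalization step (not OpenRLHF's actual code; the function name and epsilon are illustrative):

```python
import statistics

def group_relative_advantages(rewards, eps=1e-8):
    """Normalize one prompt's group of rewards to zero mean and
    unit standard deviation; the results act as per-sample advantages."""
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)
    return [(r - mean) / (std + eps) for r in rewards]

# Four completions of the same prompt, scored by a reward model:
advs = group_relative_advantages([1.0, 0.0, 0.5, 0.5])
```

A similar group-mean baseline idea underlies the reinforce++baseline feature listed above (#730).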

New Contributors
* lxline made their first contribution in https://github.com/OpenRLHF/OpenRLHF/pull/731
* richardodliu made their first contribution in https://github.com/OpenRLHF/OpenRLHF/pull/746
* Wraythh made their first contribution in https://github.com/OpenRLHF/OpenRLHF/pull/685

**Full Changelog**: https://github.com/OpenRLHF/OpenRLHF/compare/v0.5.9...v0.6.0

0.5.9.post1

What's Changed
* Make sure vLLM driver (LLMRayActor) gets scheduled to the correct placement_group by HollowMan6 in https://github.com/OpenRLHF/OpenRLHF/pull/732
* args use_ms for modelscope by lxline in https://github.com/OpenRLHF/OpenRLHF/pull/731

New Contributors
* lxline made their first contribution in https://github.com/OpenRLHF/OpenRLHF/pull/731

**Full Changelog**: https://github.com/OpenRLHF/OpenRLHF/compare/v0.5.9...v0.5.9.post1

0.5.9

What's Changed
* Fix hybrid engine when vLLM tp size is 1 by HollowMan6 in https://github.com/OpenRLHF/OpenRLHF/pull/725
* Make sure vLLM requests less resources than actors for hybrid engine by HollowMan6 in https://github.com/OpenRLHF/OpenRLHF/pull/728
* support cuda ipc in hybrid engine by xiaoxigua999 in https://github.com/OpenRLHF/OpenRLHF/pull/729
* [fix del CUDA_VISIBLE_DEVICES](https://github.com/OpenRLHF/OpenRLHF/commit/1c5c50081c8775b3464e296b80d22a29a222f4fe) by xiaoxigua999


**Full Changelog**: https://github.com/OpenRLHF/OpenRLHF/compare/v0.5.8...v0.5.9

0.5.8

Highlights
* Support hybrid engine and vllm rlhf API by xiaoxigua999 in https://github.com/OpenRLHF/OpenRLHF/pull/677

What's Changed
* custom reward func for reinforced finetuning by xiaoxigua999 in https://github.com/OpenRLHF/OpenRLHF/pull/707
* early_stopping conditional on beam search by cemiu in https://github.com/OpenRLHF/OpenRLHF/pull/714
* [small change] Use selective log-softmax to reduce peak vram consumption by tyler-romero in https://github.com/OpenRLHF/OpenRLHF/pull/718
* Fix vLLM instance scheduling when tp size is 1 by HollowMan6 in https://github.com/OpenRLHF/OpenRLHF/pull/721
* Send all prompts to vLLM engines by zhuzilin in https://github.com/OpenRLHF/OpenRLHF/pull/677
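
The selective log-softmax change in #718 rests on the identity log p(token) = logits[token] − logsumexp(logits): only the chosen token's log-probability is computed, instead of materializing the full log-softmax over the vocabulary and then gathering. A pure-Python sketch of the idea (OpenRLHF's implementation works on batched tensors):

```python
import math

def selective_log_softmax(logits, token_id):
    """Return log p(token_id) without building the full log-softmax:
    log p(t) = logits[t] - logsumexp(logits)."""
    m = max(logits)  # subtract the max for numerical stability
    lse = m + math.log(sum(math.exp(x - m) for x in logits))
    return logits[token_id] - lse
```

On real tensors this keeps peak memory proportional to the batch size rather than batch × vocabulary, because the (batch, vocab) log-softmax matrix is never allocated.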

New Contributors
* tyler-romero made their first contribution in https://github.com/OpenRLHF/OpenRLHF/pull/718

**Full Changelog**: https://github.com/OpenRLHF/OpenRLHF/compare/v0.5.7...v0.5.8

