OpenRLHF

Latest version: v0.6.3.post2

Page 1 of 11

0.6.3.post2

What's Changed
* [support Dr. GRPO](https://github.com/OpenRLHF/OpenRLHF/commit/6dcfcc3ddc385eb8037eb7a34f39025b3c46a64f) xiaoxigua999
* [fix vllm generate nccl timeout](https://github.com/OpenRLHF/OpenRLHF/commit/daafcac694cd0a1d57480fa5fb1215c1070c04cd) xiaoxigua999

**Full Changelog**: https://github.com/OpenRLHF/OpenRLHF/compare/v0.6.3...v0.6.3.post2
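
For context on the Dr. GRPO entry in this release: as the method is generally described, Dr. GRPO centers each reward on its group mean but does not divide by the group standard deviation (it also drops per-response length normalization in the loss, which is not shown here). The following is a minimal sketch of the advantage step only, in plain Python. The function names are hypothetical, the use of a population standard deviation is an assumption, and this is not OpenRLHF's implementation.

```python
from statistics import mean, pstdev


def grpo_advantages(rewards, eps=1e-9):
    """GRPO-style: subtract the group mean, then scale by the group std.

    Using the population std (pstdev) is an assumption made for this sketch.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


def dr_grpo_advantages(rewards):
    """Dr. GRPO-style: subtract the group mean only, with no std scaling."""
    mu = mean(rewards)
    return [r - mu for r in rewards]


# Rewards for one prompt's group of sampled responses (illustrative values).
group = [1.0, 0.0, 0.0, 1.0]
print(dr_grpo_advantages(group))
print(grpo_advantages(group))
```

Without the division by the standard deviation, groups whose rewards are nearly identical no longer have their small differences scaled up.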

0.6.3

What's Changed
* Support DeepSpeed universal checkpoints by HollowMan6 in https://github.com/OpenRLHF/OpenRLHF/pull/891
* Get datasets from ModelScope using `--use_ms` by lxline in https://github.com/OpenRLHF/OpenRLHF/pull/893
* Pop ROCR_VISIBLE_DEVICES as well when starting LLMRayActor by HollowMan6 in https://github.com/OpenRLHF/OpenRLHF/pull/895
* Refactor ppo_trainer.py and make_experience / Support vLLM 0.8.1 by xiaoxigua999 in https://github.com/OpenRLHF/OpenRLHF/pull/900
* refactor make_experience (batch forward) and advantage compute by xiaoxigua999 in https://github.com/OpenRLHF/OpenRLHF/pull/902
* Fix make experience when not using ring attention by HollowMan6 in https://github.com/OpenRLHF/OpenRLHF/pull/905
* fix: resolve UnboundLocalError when training without packed samples by mananshah99 in https://github.com/OpenRLHF/OpenRLHF/pull/906
* Fix: inconsistent vllm performance when tp > 1 by whksmo in https://github.com/OpenRLHF/OpenRLHF/pull/907

New Contributors
* mananshah99 made their first contribution in https://github.com/OpenRLHF/OpenRLHF/pull/906
* whksmo made their first contribution in https://github.com/OpenRLHF/OpenRLHF/pull/907

**Full Changelog**: https://github.com/OpenRLHF/OpenRLHF/compare/v0.6.2...v0.6.3
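
The `ROCR_VISIBLE_DEVICES` entry in this release concerns clearing device-visibility environment variables before an actor starts. Here is a minimal sketch of that idea in plain Python. The helper name and the exact set of variables are assumptions chosen for illustration, and the code is not taken from OpenRLHF.

```python
import os

# Device-visibility variables for NVIDIA and AMD runtimes.
# Which of these a given launcher actually clears is an assumption here.
VISIBILITY_VARS = (
    "CUDA_VISIBLE_DEVICES",
    "ROCR_VISIBLE_DEVICES",
    "HIP_VISIBLE_DEVICES",
)


def without_visibility_vars(env=None):
    """Return a copy of env with the device-visibility variables removed."""
    cleaned = dict(os.environ if env is None else env)
    for name in VISIBILITY_VARS:
        # pop with a default so a missing variable is not an error
        cleaned.pop(name, None)
    return cleaned


example = {"CUDA_VISIBLE_DEVICES": "0", "ROCR_VISIBLE_DEVICES": "1", "PATH": "/bin"}
print(without_visibility_vars(example))
```

Working on a copy keeps the caller's environment untouched; code that really needs to change the current process would call `os.environ.pop(name, None)` instead.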

0.6.2

What's Changed
* Use environ instead of args to store local_rank value by HollowMan6 in https://github.com/OpenRLHF/OpenRLHF/pull/838
* Fix SFT packing_samples by wangcho2k in https://github.com/OpenRLHF/OpenRLHF/pull/781
* fix typo by ji-huazhong in https://github.com/OpenRLHF/OpenRLHF/pull/854
* Fix temperature for rollout and forward by dingyuan-shi and xiaoxigua999 in https://github.com/OpenRLHF/OpenRLHF/pull/857
* Support full determinism option by HollowMan6 in https://github.com/OpenRLHF/OpenRLHF/pull/868
* CUDA synchronize before empty cache when make experience by HollowMan6 in https://github.com/OpenRLHF/OpenRLHF/pull/874
* optimize logprobs from logits temporary memory by gzpan in https://github.com/OpenRLHF/OpenRLHF/pull/878
* fix: resolve vLLM engine not found issue during resume by Freder-chen in https://github.com/OpenRLHF/OpenRLHF/pull/879
* fix spelling mistake by BearBiscuit05 in https://github.com/OpenRLHF/OpenRLHF/pull/880
* remove train_ppo.py by xiaoxigua999 in https://github.com/OpenRLHF/OpenRLHF/pull/882

New Contributors
* wangcho2k made their first contribution in https://github.com/OpenRLHF/OpenRLHF/pull/781
* gzpan made their first contribution in https://github.com/OpenRLHF/OpenRLHF/pull/878
* BearBiscuit05 made their first contribution in https://github.com/OpenRLHF/OpenRLHF/pull/880

**Full Changelog**: https://github.com/OpenRLHF/OpenRLHF/compare/v0.6.1...v0.6.2
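
The `local_rank` entry in this release moves that value from command-line arguments to the environment. Launchers such as `torchrun` and the DeepSpeed launcher export `LOCAL_RANK` to each worker process, so it can be read along these lines. The helper name and the `-1` fallback are assumptions for illustration, not OpenRLHF's code.

```python
import os


def get_local_rank(env=None, default=-1):
    """Read LOCAL_RANK from the environment, falling back to `default`.

    `env` can be any mapping; it defaults to os.environ. The -1 fallback
    (meaning "not launched in distributed mode") is an assumption.
    """
    env = os.environ if env is None else env
    return int(env.get("LOCAL_RANK", default))


print(get_local_rank({"LOCAL_RANK": "3"}))
print(get_local_rank({}))
```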

0.6.1.post1

What's Changed
* Use environ instead of args to store local_rank value by HollowMan6 in https://github.com/OpenRLHF/OpenRLHF/pull/838
* [fix ring attention vllm generate](https://github.com/OpenRLHF/OpenRLHF/commit/cdcabf3548ed67f7454eed4fb70905ac8faa8694) xiaoxigua999

**Full Changelog**: https://github.com/OpenRLHF/OpenRLHF/compare/v0.6.1...v0.6.1.post1

0.6.1

What's Changed
* Fix generation without vLLM by HollowMan6 in https://github.com/OpenRLHF/OpenRLHF/pull/837
* [fix ring attention vllm generate miss actor_rank](https://github.com/OpenRLHF/OpenRLHF/commit/c05a994d9bafb97d893d783ee3198ef1044ef526) xiaoxigua999
* [fix init_kl_coef == 0](https://github.com/OpenRLHF/OpenRLHF/commit/1451ba435cf41eb7c19fd879af3eff6cb8d8f5ee) xiaoxigua999

**Full Changelog**: https://github.com/OpenRLHF/OpenRLHF/compare/v0.6.0.post3...v0.6.1

0.6.0.post3

What's Changed
* Fix typo in README_zh.md by yuxinzuo in https://github.com/OpenRLHF/OpenRLHF/pull/824
* Pack vLLM engines together when tp>1 in distributed RLHF by HollowMan6 in https://github.com/OpenRLHF/OpenRLHF/pull/759
* [fix load checkpoint for vllm sleep](https://github.com/OpenRLHF/OpenRLHF/commit/c3f0776b76e078162cb973b9f814c20ecaee0248) xiaoxigua999
* [add torch.cuda.empty_cache() for offload_deepspeed_states](https://github.com/OpenRLHF/OpenRLHF/commit/b256377febe2ce61e4ff81cc118012a6c3dabf85) xiaoxigua999

New Contributors
* yuxinzuo made their first contribution in https://github.com/OpenRLHF/OpenRLHF/pull/824

**Full Changelog**: https://github.com/OpenRLHF/OpenRLHF/compare/v0.6.0.post2...v0.6.0.post3

© 2025 Safety CLI Cybersecurity Inc. All Rights Reserved.