OpenRLHF

Latest version: v0.6.4

0.6.4

What's Changed
* Upgrade vLLM to 0.8.2 (V1 engine) and DeepSpeed to 0.16.5 by xiaoxigua999
* fix generation attn_mask in ppo_train by gzpan in https://github.com/OpenRLHF/OpenRLHF/pull/913
* Add progress bar during forward batch when making experience by HollowMan6 in https://github.com/OpenRLHF/OpenRLHF/pull/925
* Replace deprecated vLLM generate API by HollowMan6 in https://github.com/OpenRLHF/OpenRLHF/pull/926
* Update for recent HIP_VISIBLE_DEVICES changes in ray by HollowMan6 in https://github.com/OpenRLHF/OpenRLHF/pull/933
* Fix full determinism mode when using vLLM V1 by HollowMan6 in https://github.com/OpenRLHF/OpenRLHF/pull/932

**Full Changelog**: https://github.com/OpenRLHF/OpenRLHF/compare/v0.6.3...v0.6.4

0.6.3.post2

What's Changed
* [support dr. grpo](https://github.com/OpenRLHF/OpenRLHF/commit/6dcfcc3ddc385eb8037eb7a34f39025b3c46a64f) by xiaoxigua999
* [fix vllm generate nccl timeout](https://github.com/OpenRLHF/OpenRLHF/commit/daafcac694cd0a1d57480fa5fb1215c1070c04cd) by xiaoxigua999

**Full Changelog**: https://github.com/OpenRLHF/OpenRLHF/compare/v0.6.3...v0.6.3.post2

0.6.3

What's Changed
* Support DeepSpeed universal checkpoints by HollowMan6 in https://github.com/OpenRLHF/OpenRLHF/pull/891
* Get datasets from ModelScope using `--use_ms` by lxline in https://github.com/OpenRLHF/OpenRLHF/pull/893
* Pop ROCR_VISIBLE_DEVICES as well when starting LLMRayActor by HollowMan6 in https://github.com/OpenRLHF/OpenRLHF/pull/895
* Refactor ppo_trainer.py and make_experience / Support vLLM 0.8.1 by xiaoxigua999 in https://github.com/OpenRLHF/OpenRLHF/pull/900
* refactor make_experience (batch forward) and advantage compute by xiaoxigua999 in https://github.com/OpenRLHF/OpenRLHF/pull/902
* Fix make experience when not using ring attention by HollowMan6 in https://github.com/OpenRLHF/OpenRLHF/pull/905
* fix: resolve UnboundLocalError when training without packed samples by mananshah99 in https://github.com/OpenRLHF/OpenRLHF/pull/906
* Fix: inconsistent vllm performance when tp > 1 by whksmo in https://github.com/OpenRLHF/OpenRLHF/pull/907

New Contributors
* mananshah99 made their first contribution in https://github.com/OpenRLHF/OpenRLHF/pull/906
* whksmo made their first contribution in https://github.com/OpenRLHF/OpenRLHF/pull/907

**Full Changelog**: https://github.com/OpenRLHF/OpenRLHF/compare/v0.6.2...v0.6.3

0.6.2

What's Changed
* Use environ instead of args to store local_rank value by HollowMan6 in https://github.com/OpenRLHF/OpenRLHF/pull/838
* Fix SFT packing_samples by wangcho2k in https://github.com/OpenRLHF/OpenRLHF/pull/781
* fix typo by ji-huazhong in https://github.com/OpenRLHF/OpenRLHF/pull/854
* Fix temperature for rollout and forward by dingyuan-shi and xiaoxigua999 in https://github.com/OpenRLHF/OpenRLHF/pull/857
* Support full determinism option by HollowMan6 in https://github.com/OpenRLHF/OpenRLHF/pull/868
* CUDA synchronize before empty cache when make experience by HollowMan6 in https://github.com/OpenRLHF/OpenRLHF/pull/874
* optimize logprobs from logits temporary memory by gzpan in https://github.com/OpenRLHF/OpenRLHF/pull/878
* fix: resolve vLLM engine not found issue during resume by Freder-chen in https://github.com/OpenRLHF/OpenRLHF/pull/879
* fix spelling mistake by BearBiscuit05 in https://github.com/OpenRLHF/OpenRLHF/pull/880
* remove train_ppo.py by xiaoxigua999 in https://github.com/OpenRLHF/OpenRLHF/pull/882

New Contributors
* wangcho2k made their first contribution in https://github.com/OpenRLHF/OpenRLHF/pull/781
* gzpan made their first contribution in https://github.com/OpenRLHF/OpenRLHF/pull/878
* BearBiscuit05 made their first contribution in https://github.com/OpenRLHF/OpenRLHF/pull/880

**Full Changelog**: https://github.com/OpenRLHF/OpenRLHF/compare/v0.6.1...v0.6.2

0.6.1.post1

What's Changed
* Use environ instead of args to store local_rank value by HollowMan6 in https://github.com/OpenRLHF/OpenRLHF/pull/838
* [fix ring attention vllm generate](https://github.com/OpenRLHF/OpenRLHF/commit/cdcabf3548ed67f7454eed4fb70905ac8faa8694) by xiaoxigua999

**Full Changelog**: https://github.com/OpenRLHF/OpenRLHF/compare/v0.6.1...v0.6.1.post1

0.6.1

What's Changed
* Fix generation without vLLM by HollowMan6 in https://github.com/OpenRLHF/OpenRLHF/pull/837
* [fix ring attention vllm generate miss actor_rank](https://github.com/OpenRLHF/OpenRLHF/commit/c05a994d9bafb97d893d783ee3198ef1044ef526) by xiaoxigua999
* [fix init_kl_coef == 0](https://github.com/OpenRLHF/OpenRLHF/commit/1451ba435cf41eb7c19fd879af3eff6cb8d8f5ee) by xiaoxigua999

**Full Changelog**: https://github.com/OpenRLHF/OpenRLHF/compare/v0.6.0.post3...v0.6.1
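
The releases above mix plain versions (0.6.3) with post-releases (0.6.3.post2), so sorting them as plain strings can give the wrong order. The sketch below is a minimal, standard-library-only illustration of how these version strings can be ordered; `version_key` is a hypothetical helper written for this example, not part of OpenRLHF, and it only handles the `X.Y.Z` and `X.Y.Z.postN` shapes that appear on this page. For full PEP 440 handling, `packaging.version.Version` is the usual choice.

```python
import re


def version_key(version: str) -> tuple:
    """Sort key for versions shaped like '0.6.3' or '0.6.3.post2'.

    Hypothetical helper for illustration only; an optional leading 'v'
    (as in the 'v0.6.4' tag names) is accepted and ignored.
    """
    match = re.fullmatch(r"v?(\d+(?:\.\d+)*)(?:\.post(\d+))?", version)
    if match is None:
        raise ValueError(f"unsupported version string: {version!r}")
    release = tuple(int(part) for part in match.group(1).split("."))
    # A plain release sorts before its own post-releases, so a missing
    # post segment is represented as -1.
    post = int(match.group(2)) if match.group(2) is not None else -1
    return release, post


# The six releases listed on this page, deliberately shuffled.
releases = ["0.6.1", "0.6.4", "0.6.3.post2", "0.6.2", "0.6.1.post1", "0.6.3"]
ordered = sorted(releases, key=version_key)
print(ordered)
```

With this key, 0.6.1.post1 sorts between 0.6.1 and 0.6.2, and 0.6.3.post2 between 0.6.3 and 0.6.4, which matches the order in which the releases are listed above (newest first).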
