OpenRLHF

Latest version: v0.5.3

0.2.6

Changes
- Upgraded vLLM to v0.4.1 mgerstgrasser wuxibin89 hijkzzz
- Upgraded Transformers to v4.40.1 and DeepSpeed to v0.14.0 hijkzzz
- Fixed typo in train_ppo_ray.py mickelliu
- Fixed size mismatch between output_state_dict (148) and state_dict (149) in model saving hijkzzz
- Added support for --colocate_actor_ref and --colocate_critic_reward in train_ppo_ray.py hijkzzz
- Added support for offloading the reward and reference models in Ray PPO hijkzzz

0.2.5

Changes
- Added Chinese README.md khazic
- Added KD Trainer and Loss ifromeast
- Fixed num_training_steps wuxibin89
- Updated requirements.txt kfertakis
- Fixed error due to 'margin' variable type being list in rm_trainer.py StwayneXG

0.2.4

Changes
- Fixed DPO masked loss function hijkzzz
- Fixed Yi-34B tokenizer (--disable_fast_tokenizer) #240 hijkzzz
- Added support for `wandb.login()` (--wandb True) #231 mgerstgrasser
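
For readers unfamiliar with what a masked DPO loss computes, here is a minimal pure-Python sketch of the general technique. It is not the project's implementation: the function names, the `beta=0.1` default, and the assumption that the mask excludes prompt and padding tokens are all illustrative.

```python
import math


def masked_sequence_logprob(token_logprobs, loss_mask):
    """Sum per-token log-probabilities over positions where loss_mask is 1.

    Assumption: masked-out positions (0) are prompt or padding tokens.
    """
    return sum(lp for lp, keep in zip(token_logprobs, loss_mask) if keep)


def dpo_loss(policy_chosen, policy_rejected, ref_chosen, ref_rejected, beta=0.1):
    """Return -log(sigmoid(beta * margin)) for a single preference pair."""
    margin = (policy_chosen - policy_rejected) - (ref_chosen - ref_rejected)
    x = beta * margin
    # Numerically stable softplus(-x), which equals -log(sigmoid(x)).
    if x >= 0:
        return math.log1p(math.exp(-x))
    return -x + math.log1p(math.exp(x))


# Only the last two tokens count toward the sequence log-probability.
chosen = masked_sequence_logprob([-0.5, -1.0, -2.0], [0, 1, 1])  # -3.0
```

When the policy and reference models agree exactly, the margin is zero and the loss is log 2; the loss shrinks as the policy prefers the chosen response more strongly than the reference does.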

0.2.3

Changes
- Fixed #191 "deepspeed.zero.Init causes very strange spikes in PPO policy_loss" hijkzzz
- Added dockerfile for vLLM hijkzzz

0.2.2

Changes
- Fixed [LlamaRotaryEmbedding](https://github.com/OpenLLMAI/OpenRLHF/commit/9ccbcddf51551db4dd9ba1edc08ff094f764928a) for Transformers v4.38.1 hijkzzz
- Switched to a lazy vLLM engine wuxibin89
- Added Chinese PR docs catqaq
- Fixed tensor shape docs Thecats-Jfm

0.2.1

Changes
- Fixed position_ids for left padding #217 hijkzzz
- Added support for input_key for custom datasets hijkzzz
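
To illustrate why position_ids need care under left padding, the sketch below derives them from an attention mask so that the first real token in each row gets position 0. This shows the common cumulative-sum approach, not the project's actual fix; the function name and the choice of 0 as the placeholder for padded slots are assumptions.

```python
from itertools import accumulate


def position_ids_from_mask(attention_mask):
    """Build position ids per row from a 0/1 attention mask.

    Real tokens are numbered 0, 1, 2, ... from the first unmasked slot.
    Padded slots get 0 as a placeholder (an arbitrary choice here, since
    they are masked out of attention anyway).
    """
    position_ids = []
    for row in attention_mask:
        running = accumulate(row)  # count of real tokens seen so far
        position_ids.append(
            [count - 1 if keep else 0 for count, keep in zip(running, row)]
        )
    return position_ids


# Two left pads followed by three real tokens.
print(position_ids_from_mask([[0, 0, 1, 1, 1]]))  # [[0, 0, 0, 1, 2]]
```

Using a plain `range(len(row))` instead would assign positions 2, 3, 4 to the real tokens of a left-padded row, so the same text would receive different positions depending on how much padding its batch required.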
