OpenRLHF

Latest version: v0.5.3

0.1.5

Changes
- Optimized the DeepSpeed configuration and improved performance by 30%+ with Adam Offload (see the config sketch after this list) hijkzzz
- Added support for QLoRA and LoRA in all stages hijkzzz
- Fixed Mixtral 8x7B balancing loss bugs hijkzzz
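
The Adam Offload speedup comes from moving optimizer states and updates onto the CPU, freeing GPU memory and compute for the model. A minimal sketch of the relevant DeepSpeed configuration keys, assuming ZeRO-2 with `DeepSpeedCPUAdam`; the key names are real DeepSpeed options, but the values and the config OpenRLHF actually builds may differ:

```python
# Illustrative ZeRO config with optimizer offload to the host.
ds_config = {
    "zero_optimization": {
        "stage": 2,
        "offload_optimizer": {
            "device": "cpu",     # keep Adam states on the host (DeepSpeedCPUAdam)
            "pin_memory": True,  # pinned host memory speeds up CPU<->GPU transfers
        },
    },
    "bf16": {"enabled": True},
    "gradient_accumulation_steps": 8,     # assumed value for illustration
    "train_micro_batch_size_per_gpu": 4,  # assumed value for illustration
}
```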

0.1.4

Changes
- Fixed reward model training when using the Hugging Face ZeRO-3 initialization API (for models with 70B+ parameters) wuxibin89
- Added support for the Mixtral 8x7B balancing loss (--balancing_loss_coef; see the sketch after this list) hijkzzz
- Fixed an issue with vllm_engine when tp=1 wuxibin89
- Fixed ZeRO-2 model-saving bugs hijkzzz
- Added the --grad_accum_dtype argument to reduce CPUAdam memory usage hijkzzz
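
`--balancing_loss_coef` scales the auxiliary load-balancing loss that Mixtral-style MoE routers add to keep tokens spread evenly across experts. A rough, illustrative sketch following the Switch Transformers formulation; the exact implementation in OpenRLHF/Transformers may normalize differently:

```python
import torch
import torch.nn.functional as F

def balancing_loss(router_logits: torch.Tensor, num_experts: int, top_k: int = 2) -> torch.Tensor:
    """Switch/Mixtral-style load-balancing auxiliary loss (illustrative sketch).

    router_logits: (num_tokens, num_experts) gate logits from an MoE layer.
    """
    probs = torch.softmax(router_logits, dim=-1)               # (T, E)
    _, selected = torch.topk(probs, top_k, dim=-1)             # (T, k) chosen experts
    expert_mask = F.one_hot(selected, num_experts).sum(dim=1)  # (T, E) 0/1 routing mask
    tokens_per_expert = expert_mask.float().mean(dim=0)        # load per expert
    router_prob_per_expert = probs.mean(dim=0)                 # mean gate prob per expert
    return num_experts * torch.sum(tokens_per_expert * router_prob_per_expert)
```

The total objective would then be roughly `lm_loss + balancing_loss_coef * balancing_loss(...)`, accumulated over the MoE layers.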

0.1.3

Changes
- Fixed Hugging Face reward model saving wuxibin89
- Improved `mask_mean` for the loss function (see the sketch after this list) hijkzzz
- Fixed `num_actions` and `action_mask` handling ZiyiLiubird
- Optimized PPO performance in the example scripts (set micro_batch_size=4) hijkzzz
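
`mask_mean` averages the PPO loss over only the generated (action) tokens, so prompt and padding positions do not dilute the gradient. A minimal sketch, assuming this signature rather than OpenRLHF's exact one:

```python
import torch

def masked_mean(tensor: torch.Tensor, mask: torch.Tensor, dim: int | None = None) -> torch.Tensor:
    """Mean over only the positions where mask == 1 (e.g. PPO action tokens)."""
    if dim is not None:
        return (tensor * mask).sum(dim=dim) / mask.sum(dim=dim)
    return (tensor * mask).sum() / mask.sum()

# Example: average a per-token loss over generated tokens only.
loss = torch.tensor([[0.5, 0.7, 0.2, 0.0]])
action_mask = torch.tensor([[0.0, 1.0, 1.0, 0.0]])  # prompt/padding masked out
print(masked_mean(loss, action_mask))  # tensor(0.4500)
```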

0.1.2

Changes
- Fixed the reward model hidden size and value_head initialization (see the sketch after this list) wuxibin89
- Fixed saving bugs hijkzzz
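
The value head is the scalar output layer a reward/critic model puts on top of the base LM's hidden states. A hypothetical sketch of such a head; the initialization std shown is an illustrative assumption, not necessarily OpenRLHF's exact choice:

```python
import torch
import torch.nn as nn

class ValueHead(nn.Module):
    """Hypothetical sketch of a scalar head over a causal LM's hidden states."""

    def __init__(self, hidden_size: int):
        super().__init__()
        self.value_head = nn.Linear(hidden_size, 1)
        # A small normal init keeps initial reward/value predictions near zero;
        # the exact std here is assumed for illustration.
        nn.init.normal_(self.value_head.weight, std=1.0 / (hidden_size + 1) ** 0.5)
        nn.init.zeros_(self.value_head.bias)

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        # (batch, seq, hidden) -> (batch, seq) per-token scalar values
        return self.value_head(hidden_states).squeeze(-1)
```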

0.1.1

Changes
- Switched to Hugging Face-format Actor and Reward/Critic models (see the loading example after this list) wuxibin89:
  https://huggingface.co/OpenLLMAI/Llama-2-7b-sft-model-ocra-500k
  https://huggingface.co/OpenLLMAI/Llama-2-7b-rm-anthropic_hh-lmsys-oasst-webgpt
- Upgraded the PyTorch NGC container to 23.12
- Upgraded FlashAttention2 to 2.4.2
- Added a continued pre-training script hijkzzz
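
Since the checkpoints above are in standard Hugging Face format, they load directly with `transformers`:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Repo ID taken from the links above.
model = AutoModelForCausalLM.from_pretrained("OpenLLMAI/Llama-2-7b-sft-model-ocra-500k")
tokenizer = AutoTokenizer.from_pretrained("OpenLLMAI/Llama-2-7b-sft-model-ocra-500k")
```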

0.1.0

Changes
- Added support for vLLM generation in RLHF wuxibin89
- Added 70B RLHF training scripts wuxibin89
- Optimized padding removal using `torch.argmax` (see the sketch after this list) li-plus
- Upgraded the container to NVIDIA PyTorch 23.10
- Upgraded Transformers and DeepSpeed
- Fixed FlashAttention2 hijkzzz
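
The `torch.argmax` trick exploits the fact that `argmax` returns the first index of the maximum, so on a 0/1 attention mask it locates the padding boundaries without a Python loop. An illustrative sketch; OpenRLHF's actual code may differ:

```python
import torch

# argmax finds the FIRST 1 in each row; flipping the mask gives the last one.
attention_mask = torch.tensor([[0, 0, 1, 1, 1],
                               [0, 1, 1, 1, 0]])
first_token = attention_mask.argmax(dim=1)
last_token = attention_mask.size(1) - 1 - attention_mask.flip(1).argmax(dim=1)
print(first_token)  # tensor([2, 1])
print(last_token)   # tensor([4, 3])
```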
