Changes
- Fixed save_models for named_buffer (wuxibin89)
- Fixed vLLM generation hang bug; requires vLLM<0.2.7 (hijkzzz)
0.1.9
Changes
- Added support for input_template (#203, rbao2018)
- Added support for KTO (#201, Dylancer1998)
- Upgraded HuggingFace Transformers to 4.37.1
0.1.8
Changes
- Upgraded transformers to version 4.37
- Fixed gradient checkpoint configuration in Ray RLHF (wuxibin89)
- Fixed loss coefficient for PPO-ptx (hijkzzz)
0.1.7
Changes
- Fixed LLaMA RoPE initialization bug for ZeRO3 (wuxibin89)
- Fixed a DPO training script bug (hijkzzz)
0.1.6
Changes
- Fixed DeepSpeed configs to improve PPO training stability (hijkzzz)