LLaMA-Factory

Latest version: v0.9.2

0.9.2

- Fix #1204 #3306 #3462 #5121 #5270 #5404 #5444 #5472 #5518 #5616 #5712 #5714 #5756 #5944 #5986 #6020 #6056 #6092 #6136 #6139 #6149 #6165 #6213 #6287 #6320 #6345 #6346 #6348 #6358 #6362 #6391 #6415 #6439 #6448 #6452 #6482 #6499 #6543 #6546 #6551 #6552 #6610 #6612 #6636 #6639 #6662 #6669 #6738 #6772 #6776 #6780 #6782 #6793 #6806 #6812 #6819 #6826 #6833 #6839 #6850 #6854 #6860 #6878 #6885 #6889 #6937 #6948 #6952 #6960 #6966 #6973 #6981 #7036 #7064 #7072 #7116 #7125 #7130 #7171 #7173 #7180 #7182 #7184 #7192 #7198 #7213 #7234 #7243

**Full Changelog**: https://github.com/hiyouga/LLaMA-Factory/compare/v0.9.1...v0.9.2

0.9.1

- Fix #3881 #4712 #5411 #5542 #5549 #5611 #5668 #5705 #5747 #5749 #5768 #5796 #5797 #5883 #5904 #5966 #5988 #6050 #6061

**Full Changelog**: https://github.com/hiyouga/LLaMA-Factory/compare/v0.9.0...v0.9.1

0.9.0

Congratulations on 30,000 stars 🎉 Follow us on *[X (Twitter)](https://twitter.com/llamafactory_ai)*

New features

- 🔥 Support fine-tuning the **[Qwen2-VL](https://github.com/QwenLM/Qwen2-VL)** model on multi-image datasets by simonJJJ in #5290
- 🔥 Support the time- and memory-efficient **[Liger-Kernel](https://github.com/linkedin/Liger-Kernel)** via the `enable_liger_kernel` argument by hiyouga (see the config sketch after this list)
- 🔥 Support the memory-efficient **[Adam-mini](https://github.com/zyushun/Adam-mini)** optimizer via the `use_adam_mini` argument by relic-yuexi in #5095
- Support fine-tuning the Qwen2-VL model on video datasets by hiyouga in #5365 and BUAADreamer in #4136 (needs patch https://github.com/huggingface/transformers/pull/33307)
- Support fine-tuning vision language models (VLMs) using the RLHF/DPO/ORPO/SimPO approaches by hiyouga
- Support [Unsloth](https://unsloth.ai/blog/long-context)'s asynchronous activation offloading method via the `use_unsloth_gc` argument
- Support [vLLM](https://github.com/vllm-project/vllm) 0.6.0
- Support MFU calculation by yzoaim in #5388
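
Below is a minimal, unofficial sketch of how the new switches might be combined in one LoRA SFT config: `enable_liger_kernel`, `use_adam_mini`, and `use_unsloth_gc` come from the notes above, while the remaining keys follow the project's standard example configs and the model/dataset names are placeholders.

```yaml
# Sketch of a LoRA SFT config using the new v0.9.0 switches (not an official example).
# Run with: llamafactory-cli train qwen2vl_lora_sft.yaml
model_name_or_path: Qwen/Qwen2-VL-7B-Instruct   # placeholder model
stage: sft
do_train: true
finetuning_type: lora
lora_target: all
dataset: mllm_demo                # placeholder multimodal dataset
template: qwen2_vl
cutoff_len: 2048
output_dir: saves/qwen2_vl-7b/lora/sft
per_device_train_batch_size: 1
gradient_accumulation_steps: 8
learning_rate: 1.0e-4
num_train_epochs: 3.0
bf16: true
enable_liger_kernel: true         # time- and memory-efficient Liger-Kernel
use_adam_mini: true               # memory-efficient Adam-mini optimizer
use_unsloth_gc: true              # Unsloth's asynchronous activation offloading
```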

New models

- Base models
  - Qwen2-Math (1.5B/7B/72B) 📄🔒
  - Yi-Coder (1.5B/9B) 📄🖥️
  - InternLM2.5 (1.8B/7B/20B) 📄
  - Gemma-2-2B 📄
  - Meta-Llama-3.1 (8B/70B) 📄
- Instruct/Chat models
  - MiniCPM/MiniCPM3 (1B/2B/4B) by LDLINGLINGLING in #4996 #5372 📄🤖
  - Qwen2-Math-Instruct (1.5B/7B/72B) 📄🤖🔒
  - Yi-Coder-Chat (1.5B/9B) 📄🤖🖥️
  - InternLM2.5-Chat (1.8B/7B/20B) 📄🤖
  - Qwen2-VL-Instruct (2B/7B) 📄🤖🖼️
  - Gemma-2-2B-it by codemayq in #5037 📄🤖
  - Meta-Llama-3.1-Instruct (8B/70B) 📄🤖
  - Mistral-Nemo-Instruct (12B) 📄🤖

New datasets

- Supervised fine-tuning datasets
  - Magpie-ultra-v0.1 (en) 📄
  - Pokemon-gpt4o-captions (en&zh) 📄🖼️
- Preference datasets
  - RLHF-V (en) 📄🖼️
  - VLFeedback (en) 📄🖼️

Changes

- For compatibility reasons, fine-tuning vision language models (VLMs) requires `transformers>=4.45.0.dev0`; install it with `pip install git+https://github.com/huggingface/transformers.git`.
- The `visual_inputs` argument has been deprecated; you no longer need to specify it.
- LlamaFactory now adopts lazy loading for multimodal inputs; see #5346 for details. Use `preprocessing_batch_size` to restrict the batch size during dataset pre-processing (supported by naem1023 in #5323, see the sketch after this list).
- LlamaFactory now supports `lmf` (equivalent to `llamafactory-cli`) as a shortcut command.
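
As a rough illustration of the lazy-loading change, only the pre-processing batch size now needs to be bounded explicitly; `preprocessing_batch_size` comes from the note above, `preprocessing_num_workers` is a pre-existing data argument, and the values are arbitrary.

```yaml
# Dataset pre-processing section of a training config (illustrative values).
# Multimodal inputs are loaded lazily at training time, so peak memory during
# pre-processing is governed by the batch size below.
preprocessing_batch_size: 128     # cap the batch size used in dataset pre-processing
preprocessing_num_workers: 16     # parallel workers for tokenization
```

With the new shortcut, `lmf train config.yaml` behaves the same as `llamafactory-cli train config.yaml`.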

Bug fix

- Fix LlamaBoard export by liuwwang in #4950
- Add ROCm Dockerfiles by HardAndHeavy in #4970
- Fix DeepSeek template by piamo in #4892
- Fix PiSSA save callback by codemayq in #4995
- Add Korean display language in LlamaBoard by Eruly in #5010
- Fix DeepSeek-Coder template by relic-yuexi in #5072
- Fix examples by codemayq in #5109
- Fix `mask_history` truncating from the last turn by YeQiuO in #5115
- Fix Jinja template by YeQiuO in #5156
- Fix PPO optimizer and LR scheduler by liu-zichen in #5163
- Add SailorLLM template by chenhuiyu in #5185
- Fix XPU device count by Zxilly in #5188
- Fix bf16 check on NPU by Ricardo-L-C in #5193
- Update NPU Docker image by MengqingCao in #5230
- Fix image input API by marko1616 in #5237
- Add Liger-Kernel link by ByronHsu in #5317
- Fix #4684 #4696 #4917 #4925 #4928 #4944 #4959 #4992 #5035 #5048 #5060 #5092 #5228 #5252 #5292 #5295 #5305 #5307 #5308 #5324 #5331 #5334 #5338 #5344 #5366 #5384

0.8.3

New features

- 🔥 Support [contamination-free packing](https://github.com/MeetKai/functionary/tree/main/functionary/train/packing) via the `neat_packing` argument by chuan298 in #4224 (see the config sketch after this list)
- 🔥 Support split evaluation via the `eval_dataset` argument by codemayq in #4691
- 🔥 Support HQQ/EETQ quantization via the `quantization_method` argument by hiyouga
- 🔥 Support ZeRO-3 when using BAdam by Ledzy in #4352
- Support training on the last turn only via the `mask_history` argument by aofengdaxia in #4878
- Add NPU Dockerfile by MengqingCao in #4355
- Support building FlashAttention2 in the Dockerfile by hzhaoy in #4461
- Support `batch_eval_metrics` at evaluation by hiyouga
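
The following unofficial sketch combines the new arguments above in a single SFT config; the dataset names are placeholders, and the remaining keys are taken from the notes in this list.

```yaml
# Sketch combining the new v0.8.3 arguments (not an official example).
stage: sft
do_train: true
finetuning_type: lora
dataset: alpaca_en_demo           # placeholder training dataset
eval_dataset: alpaca_en_demo      # split evaluation on an explicit eval dataset
neat_packing: true                # contamination-free sequence packing
mask_history: true                # train on the last turn only
quantization_method: hqq          # on-the-fly HQQ quantization (EETQ also supported)
quantization_bit: 4
batch_eval_metrics: true          # compute metrics batch by batch during evaluation
```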

New models

- Base models
  - InternLM2.5-7B 📄
  - Gemma2 (9B/27B) 📄
- Instruct/Chat models
  - TeleChat-1B-Chat by hzhaoy in #4651 📄🤖
  - InternLM2.5-7B-Chat 📄🤖
  - CodeGeeX4-9B-Chat 📄🤖
  - Gemma2-it (9B/27B) 📄🤖

Changes

- Fix the DPO cutoff length and deprecate the `reserved_label_len` argument
- Improve the loss function for reward modeling

Bug fix

- Fix NumPy version by MengqingCao in #4382
- Improve CLI by kno10 in #4409
- Add `tool_format` parameter to control the prompt by mMrBun in #4417
- Automatically label NPU issues by MengqingCao in #4445
- Fix `flash_attn` args by stceum in #4446
- Fix docker-compose path by MengqingCao in #4544
- Fix torch-npu dependency by hashstone in #4561
- Fix DeepSpeed + PiSSA by hzhaoy in #4580
- Improve CLI by injet-zhou in #4590
- Add project by wzh1994 in #4662
- Fix docstring by hzhaoy in #4673
- Fix Windows command preview in the WebUI by marko1616 in #4700
- Fix vLLM 0.5.1 by T-Atlas in #4706
- Fix save value head model callback by yzoaim in #4746
- Fix CUDA Dockerfile by hzhaoy in #4781
- Fix examples by codemayq in #4804
- Fix evaluation data split by codemayq in #4821
- Fix CI by codemayq in #4822
- Fix #2290 #3974 #4113 #4379 #4398 #4402 #4410 #4419 #4432 #4456 #4458 #4549 #4556 #4579 #4592 #4609 #4617 #4674 #4677 #4683 #4684 #4699 #4705 #4731 #4742 #4779 #4780 #4786 #4792 #4820 #4826

0.8.2

New features

- Support GLM-4 tools and parallel function calling by mMrBun in #4173
- Support **PiSSA** fine-tuning by hiyouga in #4307 (see the sketch after this list)
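
A minimal sketch of what enabling PiSSA might look like; the `pissa_init` and `pissa_convert` keys are assumptions based on the project's LoRA arguments rather than names confirmed by the notes above.

```yaml
# Sketch: PiSSA-initialized LoRA fine-tuning (argument names are assumptions).
finetuning_type: lora
lora_rank: 8
pissa_init: true      # assumed key: initialize the adapter from principal singular values/vectors
pissa_convert: true   # assumed key: convert the PiSSA adapter to a standard LoRA adapter on save
```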

New models

- Base models
  - DeepSeek-Coder-V2 (16B MoE/236B MoE) 📄
- Instruct/Chat models
  - MiniCPM-2B 📄🤖
  - DeepSeek-Coder-V2-Instruct (16B MoE/236B MoE) 📄🤖

New datasets

- Supervised fine-tuning datasets
  - Neo-sft (zh)
  - Magpie-Pro-300K-Filtered (en) by EliMCosta in #4309
  - WebInstruct (en) by EliMCosta in #4309

Bug fix

- Fix DPO + ZeRO-3 problem by hiyouga
- Add MANIFEST.in by iamthebot in #4191
- Fix `eos_token` in Llama-3 pretraining by dignfei in #4204
- Fix vLLM version by kimdwkimdw and hzhaoy in #4234 and #4246
- Fix Dockerfile by EliMCosta in #4314
- Fix pandas version by zzxzz12345 in #4334
- Fix #3162 #3196 #3778 #4198 #4209 #4221 #4227 #4238 #4242 #4271 #4292 #4295 #4326 #4346 #4357 #4362

0.8.1

- Fix #2666: Unsloth + DoRA
- Fix #4145: the PyTorch version in the Docker image does not match the vLLM requirement
- Fix #4160: a problem in the LongLoRA implementation, with the help of f-q23
- Fix #4167: an installation problem on Windows, by yzoaim
