Ms-swift

Latest version: v3.2.2

2.5.1

English Version

New Features:
1. Support for reward modeling (RM) for LLMs and MLLMs, as well as PPO for LLMs.
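
As a rough illustration of what reward modeling optimizes, the sketch below computes the standard pairwise (Bradley-Terry) loss over chosen/rejected reward scores. It is a minimal, dependency-free example and not ms-swift's implementation; the function name `pairwise_rm_loss` is hypothetical.

```python
import math


def pairwise_rm_loss(chosen_rewards, rejected_rewards):
    """Mean pairwise loss: -log(sigmoid(r_chosen - r_rejected)).

    Hypothetical helper for illustration only; not part of ms-swift.
    """
    if not chosen_rewards or len(chosen_rewards) != len(rejected_rewards):
        raise ValueError("need two non-empty reward lists of equal length")
    total = 0.0
    for r_chosen, r_rejected in zip(chosen_rewards, rejected_rewards):
        margin = r_chosen - r_rejected
        # -log(sigmoid(m)) equals softplus(-m); this form avoids overflow.
        total += max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))
    return total / len(chosen_rewards)


if __name__ == "__main__":
    # The loss shrinks as the chosen response is scored further above the rejected one.
    print(pairwise_rm_loss([0.0], [0.0]))
    print(pairwise_rm_loss([3.0], [0.0]))
```

The loss is small when the reward model ranks the chosen response well above the rejected one, and grows roughly linearly when the ranking is reversed.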

New Models:
1. molmo series
2. mplug-owl3 1b/2b
3. llama3.1-nemotron-70b-instruct
4. deepseek-janus

What's Changed
* support reward modeling and ppo by hjh0119 in https://github.com/modelscope/ms-swift/pull/2093
* fix rescale_image by tastelikefeet in https://github.com/modelscope/ms-swift/pull/2223
* fix deploy timeout by Jintao-Huang in https://github.com/modelscope/ms-swift/pull/2230
* Fix qwen2 vl batch size by Jintao-Huang in https://github.com/modelscope/ms-swift/pull/2239
* Fix ovis1.6 infer by Jintao-Huang in https://github.com/modelscope/ms-swift/pull/2242
* fix publish by Jintao-Huang in https://github.com/modelscope/ms-swift/pull/2245
* fix qwen2vl video args by Jintao-Huang in https://github.com/modelscope/ms-swift/pull/2251
* Update FAQ by slin000111 in https://github.com/modelscope/ms-swift/pull/2252
* Support molmo series vlm by mi804 in https://github.com/modelscope/ms-swift/pull/2260
* fix sft system by Jintao-Huang in https://github.com/modelscope/ms-swift/pull/2262
* support mplug3 1b/2b by Jintao-Huang in https://github.com/modelscope/ms-swift/pull/2271
* Fix deploy openai by Jintao-Huang in https://github.com/modelscope/ms-swift/pull/2278
* fix vllm ignore suffix by Jintao-Huang in https://github.com/modelscope/ms-swift/pull/2287
* fix lora_target_modules in PPO by hjh0119 in https://github.com/modelscope/ms-swift/pull/2274
* fix quant blocks by Jintao-Huang in https://github.com/modelscope/ms-swift/pull/2292
* Support Llama3.1-nemotron-70b-inst-hf by DaozeZhang in https://github.com/modelscope/ms-swift/pull/2299
* fix ppo citest by hjh0119 in https://github.com/modelscope/ms-swift/pull/2302
* support deepseek-janus by Jintao-Huang in https://github.com/modelscope/ms-swift/pull/2300
* update molmo by Jintao-Huang in https://github.com/modelscope/ms-swift/pull/2305

New Contributors
* mi804 made their first contribution in https://github.com/modelscope/ms-swift/pull/2260

**Full Changelog**: https://github.com/modelscope/ms-swift/compare/v2.5.0...v2.5.1

2.5.0

English Version

New Features:
1. Support for GPTQ & AWQ quantization of multimodal LLMs.
2. Support for dynamically enabling gradient checkpointing in the ViT component to reduce memory consumption.
3. Support for multimodal model pre-training.

New Models:
1. llama3.2, llama3.2-vision series
2. got-ocr2
3. llama3.1-omni
4. ovis1.6-gemma2
5. pixtral-12b
6. telechat2-115b
7. mistral-small-inst-2409

New Datasets:
1. egoschema

What's Changed
* fix win32 quote by tastelikefeet in https://github.com/modelscope/ms-swift/pull/2065
* Fix yi template by Jintao-Huang in https://github.com/modelscope/ms-swift/pull/2067
* fix rlhf zero3 by Jintao-Huang in https://github.com/modelscope/ms-swift/pull/2072
* Update qwen2-vl最佳实践.md by Digital2Slave in https://github.com/modelscope/ms-swift/pull/2058
* fix RLHF & max_length by Jintao-Huang in https://github.com/modelscope/ms-swift/pull/2075
* Support Mistral-small-inst-2409 by DaozeZhang in https://github.com/modelscope/ms-swift/pull/2077
* dynamic vit gradient_checkpointing by Jintao-Huang in https://github.com/modelscope/ms-swift/pull/2071
* fix qwen2.5 template by Jintao-Huang in https://github.com/modelscope/ms-swift/pull/2081
* fix multiprocess remove_columns by Jintao-Huang in https://github.com/modelscope/ms-swift/pull/2088
* Support for fine-tuning Pixtral-12B. by Jintao-Huang in https://github.com/modelscope/ms-swift/pull/2090
* fix vllm tokenizer by Jintao-Huang in https://github.com/modelscope/ms-swift/pull/2099
* Fix the issue with media_offset in owl3 when batch_size > 1. by LukeForeverYoung in https://github.com/modelscope/ms-swift/pull/2100
* fix deploy openai compat by Jintao-Huang in https://github.com/modelscope/ms-swift/pull/2101
* fix dataset preprocess by Jintao-Huang in https://github.com/modelscope/ms-swift/pull/2102
* fix cpu infer device_map by Jintao-Huang in https://github.com/modelscope/ms-swift/pull/2103
* fix infer device_map by Jintao-Huang in https://github.com/modelscope/ms-swift/pull/2105
* Support for fine-tuning Llama 3.1 Omni. by Jintao-Huang in https://github.com/modelscope/ms-swift/pull/2106
* support vllm & qwen2-vl video by Jintao-Huang in https://github.com/modelscope/ms-swift/pull/2110
* Fix qwen2-vl zero2/3 by Jintao-Huang in https://github.com/modelscope/ms-swift/pull/2114
* fix qwen2-audio by Jintao-Huang in https://github.com/modelscope/ms-swift/pull/2116
* [TorchAcc] fix: fix find_labels and can_return_loss by baoleai in https://github.com/modelscope/ms-swift/pull/2120
* support got-ocr2 by Jintao-Huang in https://github.com/modelscope/ms-swift/pull/2123
* Support for fine-tuning and deployment of the Llama 3.2 series models. by Jintao-Huang in https://github.com/modelscope/ms-swift/pull/2130
* Support fine-tuning MLLama. by Jintao-Huang in https://github.com/modelscope/ms-swift/pull/2132
* fix not impl bug by Jintao-Huang in https://github.com/modelscope/ms-swift/pull/2134
* Compat vllm & qwen2-vl by Jintao-Huang in https://github.com/modelscope/ms-swift/pull/2136
* fix requirements by Jintao-Huang in https://github.com/modelscope/ms-swift/pull/2137
* fix model_type by Jintao-Huang in https://github.com/modelscope/ms-swift/pull/2138
* fix deploy vllm by Jintao-Huang in https://github.com/modelscope/ms-swift/pull/2141
* fix docs by Jintao-Huang in https://github.com/modelscope/ms-swift/pull/2142
* Fix VLM lora by tastelikefeet in https://github.com/modelscope/ms-swift/pull/2140
* support mllm pt by Jintao-Huang in https://github.com/modelscope/ms-swift/pull/2146
* [TorchAcc] fix: fix save config and additional file for swift and peft by baoleai in https://github.com/modelscope/ms-swift/pull/2149
* update quant_device_map by Jintao-Huang in https://github.com/modelscope/ms-swift/pull/2154
* fix qwen2-audio by Jintao-Huang in https://github.com/modelscope/ms-swift/pull/2157
* fix template by Jintao-Huang in https://github.com/modelscope/ms-swift/pull/2160
* compat trl==0.11 by Jintao-Huang in https://github.com/modelscope/ms-swift/pull/2169
* Support for Egoschema, a new video dataset by DaozeZhang in https://github.com/modelscope/ms-swift/pull/2173
* Update FAQ by slin000111 in https://github.com/modelscope/ms-swift/pull/2165
* fix mplug-owl3 infer by Jintao-Huang in https://github.com/modelscope/ms-swift/pull/2175
* Support quant mllm by Jintao-Huang in https://github.com/modelscope/ms-swift/pull/2177
* update setup.py by Jintao-Huang in https://github.com/modelscope/ms-swift/pull/2205
* fix bugs by Jintao-Huang in https://github.com/modelscope/ms-swift/pull/2207
* support telechat2 by Jintao-Huang in https://github.com/modelscope/ms-swift/pull/2210
* Support ovis 1.6 by Jintao-Huang in https://github.com/modelscope/ms-swift/pull/2211

New Contributors
* Digital2Slave made their first contribution in https://github.com/modelscope/ms-swift/pull/2058
* LukeForeverYoung made their first contribution in https://github.com/modelscope/ms-swift/pull/2100

**Full Changelog**: https://github.com/modelscope/ms-swift/compare/v2.4.2...v2.5.0

2.4.2

English Version

New Features:
1. RLHF has been refactored: it now supports all integrated multimodal models, is compatible with DeepSpeed ZeRO-2/ZeRO-3, and supports lazy_tokenize.
2. With infer_backend vllm, inference and deployment of multimodal large models support multiple images.
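
The idea behind lazy tokenization can be sketched as a dataset wrapper that tokenizes a sample only when it is accessed, rather than preprocessing the whole dataset up front. This is an illustrative sketch under that assumption, not ms-swift's code; `LazyTokenizedDataset` and `toy_tokenize` are hypothetical names, and the whitespace tokenizer stands in for a real one.

```python
class LazyTokenizedDataset:
    """Tokenizes each sample on access instead of ahead of time.

    Hypothetical class for illustration only; not part of ms-swift.
    """

    def __init__(self, texts, tokenize):
        self._texts = list(texts)
        self._tokenize = tokenize
        self.calls = 0  # counts how many samples were actually tokenized

    def __len__(self):
        return len(self._texts)

    def __getitem__(self, index):
        self.calls += 1
        return self._tokenize(self._texts[index])


def toy_tokenize(text):
    # Stand-in tokenizer: lowercase and split on whitespace.
    return text.lower().split()


if __name__ == "__main__":
    dataset = LazyTokenizedDataset(["Hello World", "a b c"], toy_tokenize)
    print(dataset.calls)  # nothing has been tokenized at construction time
    print(dataset[1], dataset.calls)
```

Deferring the work this way trades some per-step overhead for a faster start and lower memory use, which matters most when samples carry images or video.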

New Models:
1. Qwen2.5 series, Qwen2-vl-72b series (base/instruct/gptq-int4/gptq-int8/awq)
2. Qwen2.5-math, Qwen2.5-coder series (base/instruct)
3. Deepseek-v2.5

New Datasets:
1. longwriter-6k-filtered

What's Changed
* fix model_mapping by Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1982
* fix patch by Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1997
* fix by tastelikefeet in https://github.com/modelscope/ms-swift/pull/1995
* Support Deepseek 2.5 by DaozeZhang in https://github.com/modelscope/ms-swift/pull/1992
* fix EngineGenerationConfig importError of lmdeploy by irexyc in https://github.com/modelscope/ms-swift/pull/1990
* compat lmdeploy==0.6 by Jintao-Huang in https://github.com/modelscope/ms-swift/pull/2001
* Fix rlhf ref model by Jintao-Huang in https://github.com/modelscope/ms-swift/pull/2003
* Support llava1.6-llama3.1-8b-instruct by DaozeZhang in https://github.com/modelscope/ms-swift/pull/2005
* fix lmdeploy qwen_vl by Jintao-Huang in https://github.com/modelscope/ms-swift/pull/2009
* Add FAQ Document by slin000111 in https://github.com/modelscope/ms-swift/pull/2013
* Florence use _post_encode & template support encoder-decoder by Jintao-Huang in https://github.com/modelscope/ms-swift/pull/2019
* refactor rlhf by Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1975
* update code by Jintao-Huang in https://github.com/modelscope/ms-swift/pull/2028
* fix deploy eval kill by Jintao-Huang in https://github.com/modelscope/ms-swift/pull/2029
* Fix olora and pissa saving files which will cause the second saving failed by tastelikefeet in https://github.com/modelscope/ms-swift/pull/2032
* fix rlhf & zero3 by Jintao-Huang in https://github.com/modelscope/ms-swift/pull/2034
* Add longwriter filtered dataset by wangxingjun778 in https://github.com/modelscope/ms-swift/pull/2037
* fix mplug-owl3 by Jintao-Huang in https://github.com/modelscope/ms-swift/pull/2042
* support multi bbox grounding by tastelikefeet in https://github.com/modelscope/ms-swift/pull/2045
* Fix multi coordinate grounding by tastelikefeet in https://github.com/modelscope/ms-swift/pull/2047
* llama3 tool calling by tastelikefeet in https://github.com/modelscope/ms-swift/pull/2048
* update docs by Jintao-Huang in https://github.com/modelscope/ms-swift/pull/2050
* fix qwen2vl position_ids by Jintao-Huang in https://github.com/modelscope/ms-swift/pull/2051
* support qwen2-vl-base by Jintao-Huang in https://github.com/modelscope/ms-swift/pull/2052
* Support qwen2.5 by Jintao-Huang in https://github.com/modelscope/ms-swift/pull/2054
* support qwen2-vl-72b/qwen2.5-math/qwen2.5-coder by Jintao-Huang in https://github.com/modelscope/ms-swift/pull/2056
* vllm support multi image by Jintao-Huang in https://github.com/modelscope/ms-swift/pull/2059
* support qwen2.5-coder by Jintao-Huang in https://github.com/modelscope/ms-swift/pull/2061
* fix notebook gradio by Jintao-Huang in https://github.com/modelscope/ms-swift/pull/2062
* update qwen2-vl docs by Jintao-Huang in https://github.com/modelscope/ms-swift/pull/2063

New Contributors
* irexyc made their first contribution in https://github.com/modelscope/ms-swift/pull/1990
* wangxingjun778 made their first contribution in https://github.com/modelscope/ms-swift/pull/2037

**Full Changelog**: https://github.com/modelscope/ms-swift/compare/v2.4.1...v2.4.2

2.4.1

English Version

New Features:
1. Inference and deployment support for logprobs.
2. RLHF support for lazy_tokenize.
3. Multimodal model support for neftune.
4. dynamic_eos compatibility with glm4 series and other models.
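
For readers unfamiliar with logprobs, the sketch below shows how per-token log probabilities are derived from raw logits with a numerically stable log-softmax, and how the top-k candidates are selected. It is a generic illustration rather than ms-swift's implementation; `log_softmax` and `top_logprobs` are hypothetical helper names and the vocabulary is made up.

```python
import math


def log_softmax(logits):
    """Convert raw logits to log probabilities (numerically stable)."""
    peak = max(logits)
    log_z = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return [x - log_z for x in logits]


def top_logprobs(logits, vocab, k=2):
    """Return the k most likely (token, logprob) pairs, best first."""
    pairs = zip(vocab, log_softmax(logits))
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)[:k]


if __name__ == "__main__":
    vocab = ["cat", "dog", "fish"]  # made-up three-token vocabulary
    for token, logprob in top_logprobs([1.0, 2.0, 3.0], vocab, k=2):
        print(token, round(logprob, 4))
```

Exponentiating a returned log probability recovers the model's probability for that token, which is why the values are always less than or equal to zero.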

New Models:
1. mplug-owl3, best practices can be found [here](https://github.com/modelscope/ms-swift/issues/1969).
2. yi-coder: base and chat models at 1.5b and 9b.
3. minicpm3-4b.
4. reflection-llama3.1-70b.

What's Changed
* Fix push_to_hub when last-checkpoint by tastelikefeet in https://github.com/modelscope/ms-swift/pull/1897
* support custom quantized dataset by tastelikefeet in https://github.com/modelscope/ms-swift/pull/1893
* fix push_to_ms by tastelikefeet in https://github.com/modelscope/ms-swift/pull/1901
* support logprobs by Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1900
* deepspeed use cosine lr_scheduler by Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1907
* update docs by Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1908
* fix web-ui push to hub strategy by tastelikefeet in https://github.com/modelscope/ms-swift/pull/1909
* Refactor docs by tastelikefeet in https://github.com/modelscope/ms-swift/pull/1912
* refactor docs by tastelikefeet in https://github.com/modelscope/ms-swift/pull/1915
* [TorchAcc] perf: use xm.save instead of torch.save by baoleai in https://github.com/modelscope/ms-swift/pull/1916
* update wechat by tastelikefeet in https://github.com/modelscope/ms-swift/pull/1925
* update docs & fix bug by Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1926
* [TorchAcc] fix: fix the judgement of fsdp_num by baoleai in https://github.com/modelscope/ms-swift/pull/1903
* Support deploy & logprobs by Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1833
* fix typing by Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1933
* fix swift deploy by Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1936
* update yi-coder by Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1942
* fix lmdeploy seed by Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1945
* fix do_sample by Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1946
* refactor rlhf by hjh0119 in https://github.com/modelscope/ms-swift/pull/1885
* fix file rename error in megatron when there are multi process by Zhikaiiii in https://github.com/modelscope/ms-swift/pull/1948
* fix qwen2-vl & video by Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1950
* support dynamic_eos by Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1947
* fix rlhf by hjh0119 in https://github.com/modelscope/ms-swift/pull/1949
* Support minicpm 3 by Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1952
* Add lazy_tokenize to RLHF by tastelikefeet in https://github.com/modelscope/ms-swift/pull/1956
* Fix data info print in rlhf by tastelikefeet in https://github.com/modelscope/ms-swift/pull/1964
* Fix the lora hook by tastelikefeet in https://github.com/modelscope/ms-swift/pull/1963
* fix bugs by Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1959
* support mplug_owl3 by Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1957
* update docs by Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1970
* Add reflection model by tastelikefeet in https://github.com/modelscope/ms-swift/pull/1973
* fix typo by Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1980


**Full Changelog**: https://github.com/modelscope/ms-swift/compare/v2.4.0...v2.4.1

2.4.0

English Version

New Features:
1. Support for Liger when training models such as LLaMA, Qwen, and Mistral, reducing memory usage by 10% to 60%.
2. Support for custom loss function training using a registration mechanism.
3. Training now supports pushing models to ModelScope and HuggingFace.
4. Support for the `freeze_vit` parameter to control the behavior of full parameter training for multimodal models.
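
A registration mechanism for custom losses generally means a name-to-function registry filled by a decorator, so a training run can select a loss by name. The sketch below shows that general pattern only; `LOSS_REGISTRY`, `register_loss`, and `get_loss` are hypothetical names and do not reflect ms-swift's actual API.

```python
LOSS_REGISTRY = {}  # hypothetical registry: loss name -> loss function


def register_loss(name):
    """Decorator that records a loss function under the given name."""
    def decorator(fn):
        if name in LOSS_REGISTRY:
            raise KeyError(f"loss '{name}' is already registered")
        LOSS_REGISTRY[name] = fn
        return fn
    return decorator


def get_loss(name):
    """Look up a registered loss, with a readable error if it is missing."""
    try:
        return LOSS_REGISTRY[name]
    except KeyError:
        known = ", ".join(sorted(LOSS_REGISTRY)) or "none"
        raise KeyError(f"unknown loss '{name}' (registered: {known})") from None


@register_loss("mse")
def mean_squared_error(preds, targets):
    return sum((p - t) ** 2 for p, t in zip(preds, targets)) / len(preds)


@register_loss("mae")
def mean_absolute_error(preds, targets):
    return sum(abs(p - t) for p, t in zip(preds, targets)) / len(preds)


if __name__ == "__main__":
    loss_fn = get_loss("mse")  # selected by name, e.g. from a config value
    print(loss_fn([1.0, 2.0], [1.0, 4.0]))
```

The benefit of this design is that new losses can be added in user code without modifying the training loop, which only needs the registered name.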

New Models:
1. Qwen2-VL series, including GPTQ/AWQ quantized models. For best practices, see [here](https://github.com/modelscope/ms-swift/blob/main/docs/source_en/Multi-Modal/qwen2-vl-best-practice.md).
2. InternVL2 AWQ quantized models.

New Datasets:
1. qwen2-pro series

What's Changed
* compat with vllm==0.5.5 by Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1812
* Support zero2 offload by Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1814
* fix mp+ddp & resume_from_checkpoint by Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1815
* fix preprocess_num_proc by Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1818
* Support liger by tastelikefeet in https://github.com/modelscope/ms-swift/pull/1819
* fix dora deployment by tastelikefeet in https://github.com/modelscope/ms-swift/pull/1821
* Support register loss func by Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1822
* use default-lora by Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1823
* fix minicpm-v 2.6 infer device_map by Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1832
* Fix code by tastelikefeet in https://github.com/modelscope/ms-swift/pull/1824
* fix inject by tastelikefeet in https://github.com/modelscope/ms-swift/pull/1835
* support qwen2-pro dataset by Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1834
* add ddp_timeout parameter by tastelikefeet in https://github.com/modelscope/ms-swift/pull/1836
* fix internlm-xcomposer rlhf by hjh0119 in https://github.com/modelscope/ms-swift/pull/1838
* Support eval_nproc by tastelikefeet in https://github.com/modelscope/ms-swift/pull/1843
* support qwen2-vl by Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1842
* Add internvl2 awq models by tastelikefeet in https://github.com/modelscope/ms-swift/pull/1846
* Fix some datasets for streaming by tastelikefeet in https://github.com/modelscope/ms-swift/pull/1848
* Fix Pissa and OLoRA by tastelikefeet in https://github.com/modelscope/ms-swift/pull/1852
* Support qwen2 vl grounding by tastelikefeet in https://github.com/modelscope/ms-swift/pull/1854
* support qwen2-vl & video finetune by Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1849
* Update new datasets by tastelikefeet in https://github.com/modelscope/ms-swift/pull/1855
* update qwen2-vl docs by Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1856
* update qwen2-vl docs by Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1858
* fix qwen2-vl docs by Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1861
* fix requirements by Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1864
* update docs qwen2-vl by Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1869
* Support faster data map by tastelikefeet in https://github.com/modelscope/ms-swift/pull/1871
* [TorchAcc] fix several bugs for torchacc FSDP. by baoleai in https://github.com/modelscope/ms-swift/pull/1872
* Add train record by tastelikefeet in https://github.com/modelscope/ms-swift/pull/1873
* Fix num_proc by Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1874
* Fix neftune doc by tastelikefeet in https://github.com/modelscope/ms-swift/pull/1875
* add duet by tastelikefeet in https://github.com/modelscope/ms-swift/pull/1877
* use model.generation_config by Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1850
* Support freeze vit by Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1880
* support qwen2-vl gptq awq by Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1884
* Refactor push_to_hub by tastelikefeet in https://github.com/modelscope/ms-swift/pull/1883
* Fix push to hub logic by tastelikefeet in https://github.com/modelscope/ms-swift/pull/1888
* add vllm lmdeploy benchmark by Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1889
* Add some warnings and fix RLHF by tastelikefeet in https://github.com/modelscope/ms-swift/pull/1890


**Full Changelog**: https://github.com/modelscope/ms-swift/compare/v2.3.2...v2.4.0

2.3.2

English Version

New Features:
1. ReFT support: achieves parameter efficiency that is 15× to 65× greater than LoRA.
2. Multimodal models support DeepSpeed ZeRO-3.
3. Supports using environment variables to control model-specific parameters such as hd_num, max_num, and video_segments.
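
Controlling such parameters through environment variables usually comes down to reading an integer with a fallback default. The sketch below shows one way to do that in plain Python. The upper-case variable names (`HD_NUM`, `MAX_NUM`, `VIDEO_SEGMENTS`) and the default values are assumptions made for illustration; check the ms-swift documentation for the names and defaults it actually uses.

```python
import os


def read_int_env(name, default, env=None):
    """Read an integer setting from the environment, falling back to a default.

    Hypothetical helper for illustration only; not part of ms-swift.
    """
    env = os.environ if env is None else env
    raw = env.get(name)
    if raw is None or raw.strip() == "":
        return default
    try:
        return int(raw)
    except ValueError:
        raise ValueError(f"{name} must be an integer, got {raw!r}") from None


if __name__ == "__main__":
    # Variable names and defaults below are assumed, not taken from ms-swift.
    settings = {
        "hd_num": read_int_env("HD_NUM", 4),
        "max_num": read_int_env("MAX_NUM", 12),
        "video_segments": read_int_env("VIDEO_SEGMENTS", 8),
    }
    print(settings)
```

Under these assumed names, a run could be launched with something like `HD_NUM=8 python train.py` to override a single value without touching the code.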

New Models:
1. longwriter-glm4-9b, longwriter-llama3_1-8b
2. phi3_5-mini-instruct, phi3_5-moe-instruct, phi3_5-vision-instruct
3. llava-onevision-qwen2-0_5b-ov, llava-onevision-qwen2-7b-ov, llava-onevision-qwen2-72b-ov

New Datasets:
1. longwriter-6k
2. rlaif-v
3. latex-ocr-print, latex-ocr-handwrite

What's Changed
* fix imports by tastelikefeet in https://github.com/modelscope/ms-swift/pull/1748
* compat with torch=1.12/1.13 by Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1752
* update rlaif-v hf dataset by hjh0119 in https://github.com/modelscope/ms-swift/pull/1755
* fix lmdeploy: AssertionError: failed to match chat template, please explicit set chat_template_config by Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1759
* use eager -> sdpa by Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1764
* Fix GLM4 agent toolcall by tastelikefeet in https://github.com/modelscope/ms-swift/pull/1767
* Support LongWriter-llama3.1-8b and LongWriter-glm4-9b. by DaozeZhang in https://github.com/modelscope/ms-swift/pull/1762
* Support llava onevision by Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1761
* [TorchAcc] fix: fix saving and loading checkpoint for full sft FSDP by baoleai in https://github.com/modelscope/ms-swift/pull/1765
* Fix deepseek-coder-v2-lite template by Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1771
* Fix qwen2-audio & zero3 by Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1774
* Fix zero3 & minicpm-v/internvl2/xcomposer by Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1772
* fix infer dataset_test_ratio by Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1779
* fix moe & gradient_checkpointing by Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1782
* support phi3.5-vision by Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1780
* ReFT by tastelikefeet in https://github.com/modelscope/ms-swift/pull/1785
* update doc by tastelikefeet in https://github.com/modelscope/ms-swift/pull/1789
* support qwen-vl & base64 by Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1790
* fix yi-vl template by Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1793
* fix bugs by Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1794
* fix imports by tastelikefeet in https://github.com/modelscope/ms-swift/pull/1796
* fix history_roles by Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1798
* fix mllm rlhf with full sft type by hjh0119 in https://github.com/modelscope/ms-swift/pull/1800
* fix CI by tastelikefeet in https://github.com/modelscope/ms-swift/pull/1797
* fix megatron_patch_path by wning13 in https://github.com/modelscope/ms-swift/pull/1804
* Support hd num by Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1801
* Support Latex OCR dataset by Jintao-Huang in https://github.com/modelscope/ms-swift/pull/1810
* fix offline export by wning13 in https://github.com/modelscope/ms-swift/pull/1805
* fix by tastelikefeet in https://github.com/modelscope/ms-swift/pull/1811

New Contributors
* wning13 made their first contribution in https://github.com/modelscope/ms-swift/pull/1804

**Full Changelog**: https://github.com/modelscope/ms-swift/compare/v2.3.1...v2.3.2
