lmdeploy

Latest version: v0.7.2.post1


Page 2 of 8

0.7.0


What's Changed
πŸš€ Features
* Support moe w8a8 in pytorch engine by grimoire in https://github.com/InternLM/lmdeploy/pull/2894
* Support DeepseekV3 fp8 by grimoire in https://github.com/InternLM/lmdeploy/pull/2967
* support new backend cambricon by JackWeiw in https://github.com/InternLM/lmdeploy/pull/3002
* support-moe-fp8 by RunningLeon in https://github.com/InternLM/lmdeploy/pull/3007
* add internlm3-dense(turbomind) & chat template by irexyc in https://github.com/InternLM/lmdeploy/pull/3024
* support internlm3 on pt by RunningLeon in https://github.com/InternLM/lmdeploy/pull/3026
* Support internlm3 quantization by AllentDan in https://github.com/InternLM/lmdeploy/pull/3027
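Several of the items above add w8a8 support, i.e. quantizing both weights and activations to 8 bits. As a rough, self-contained illustration of the weight half, symmetric per-channel int8 quantization can be sketched like this (plain NumPy; the function names are ours, not lmdeploy's kernels):

```python
import numpy as np

def quantize_w8a8(w: np.ndarray):
    """Symmetric per-output-channel int8 quantization of a weight matrix."""
    scale = np.abs(w).max(axis=1, keepdims=True) / 127.0  # one scale per row
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: np.ndarray) -> np.ndarray:
    """Recover an approximate float weight from int8 values and scales."""
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.standard_normal((4, 8)).astype(np.float32)
q, scale = quantize_w8a8(w)
w_hat = dequantize(q, scale)  # reconstruction error is bounded by half a scale step
```

The activation half works the same way but computes scales at runtime per tensor or per token.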
πŸ’₯ Improvements
* Optimize awq kernel in pytorch engine by grimoire in https://github.com/InternLM/lmdeploy/pull/2965
* Support fp8 w8a8 for pt backend by RunningLeon in https://github.com/InternLM/lmdeploy/pull/2959
* Optimize lora kernel by grimoire in https://github.com/InternLM/lmdeploy/pull/2975
* Remove threadsafe by grimoire in https://github.com/InternLM/lmdeploy/pull/2907
* Refactor async engine & turbomind IO by lzhangzz in https://github.com/InternLM/lmdeploy/pull/2968
* [dlinfer]rope refine by JackWeiw in https://github.com/InternLM/lmdeploy/pull/2984
* Expose spaces_between_special_tokens by AllentDan in https://github.com/InternLM/lmdeploy/pull/2991
* [dlinfer]change llm op interface of paged_prefill_attention. by JackWeiw in https://github.com/InternLM/lmdeploy/pull/2977
* Update request logger by lvhan028 in https://github.com/InternLM/lmdeploy/pull/2981
* remove decoding by grimoire in https://github.com/InternLM/lmdeploy/pull/3016
🐞 Bug fixes
* Fix build crash in nvcr.io/nvidia/pytorch:24.06-py3 image by zgjja in https://github.com/InternLM/lmdeploy/pull/2964
* add tool role in BaseChatTemplate as tool response in messages by AllentDan in https://github.com/InternLM/lmdeploy/pull/2979
* Fix ascend dockerfile by jinminxi104 in https://github.com/InternLM/lmdeploy/pull/2989
* fix internvl2 qk norm by grimoire in https://github.com/InternLM/lmdeploy/pull/2987
* fix xcomposer2 when transformers is upgraded greater than 4.46 by irexyc in https://github.com/InternLM/lmdeploy/pull/3001
* Fix get_ppl & get_logits by lvhan028 in https://github.com/InternLM/lmdeploy/pull/3008
* Fix typo in w4a16 guide by Yan-Xiangjun in https://github.com/InternLM/lmdeploy/pull/3018
* fix blocked fp8 moe kernel by grimoire in https://github.com/InternLM/lmdeploy/pull/3009
* Fix async engine by lzhangzz in https://github.com/InternLM/lmdeploy/pull/3029
* [hotfix] Fix get_ppl by lvhan028 in https://github.com/InternLM/lmdeploy/pull/3023
* Fix MoE gating for DeepSeek V2 by lzhangzz in https://github.com/InternLM/lmdeploy/pull/3030
* Fix empty response for pipeline by lzhangzz in https://github.com/InternLM/lmdeploy/pull/3034
* Fix potential hang during TP model initialization by lzhangzz in https://github.com/InternLM/lmdeploy/pull/3033
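The DeepSeek V2 gating fix above concerns MoE routing: each token's router logits select the top-k experts and the softmax weights used to mix their outputs. A minimal top-k softmax gate, sketched in NumPy (illustrative only, not the fixed kernel):

```python
import numpy as np

def topk_gate(logits: np.ndarray, k: int = 2):
    """Top-k softmax gating: pick k experts per token, renormalize their weights."""
    idx = np.argsort(logits, axis=-1)[:, ::-1][:, :k]   # top-k expert ids per token
    top = np.take_along_axis(logits, idx, axis=-1)
    w = np.exp(top - top.max(axis=-1, keepdims=True))   # stable softmax over selected experts
    w = w / w.sum(axis=-1, keepdims=True)
    return idx, w

logits = np.array([[0.1, 2.0, -1.0, 0.5]])  # router scores for one token, 4 experts
idx, w = topk_gate(logits, k=2)
```

Variants differ in whether softmax is taken before or after the top-k selection; getting that order wrong is exactly the kind of bug such a fix targets.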
🌐 Other
* [ci] add w8a8 and internvl2.5 models into testcase by zhulinJulia24 in https://github.com/InternLM/lmdeploy/pull/2949
* bump version to v0.7.0 by lvhan028 in https://github.com/InternLM/lmdeploy/pull/3010

New Contributors
* zgjja made their first contribution in https://github.com/InternLM/lmdeploy/pull/2964
* Yan-Xiangjun made their first contribution in https://github.com/InternLM/lmdeploy/pull/3018

**Full Changelog**: https://github.com/InternLM/lmdeploy/compare/0.6.5...v0.7.0

0.6.5


What's Changed
πŸš€ Features
* [dlinfer] feat: add DlinferFlashAttention to support qwen vl. by Reinerzhou in https://github.com/InternLM/lmdeploy/pull/2952
πŸ’₯ Improvements
* refactor PyTorchEngine check env by grimoire in https://github.com/InternLM/lmdeploy/pull/2870
* refine multi-backend setup.py by jinminxi104 in https://github.com/InternLM/lmdeploy/pull/2880
* Refactor VLM modules by lvhan028 in https://github.com/InternLM/lmdeploy/pull/2810
* [dlinfer] only compile the language model in vl models by tangzhiyi11 in https://github.com/InternLM/lmdeploy/pull/2893
* Optimize tp broadcast by grimoire in https://github.com/InternLM/lmdeploy/pull/2889
* unfreeze torch version in dockerfile by RunningLeon in https://github.com/InternLM/lmdeploy/pull/2906
* support tp > n_kv_heads for pt engine by RunningLeon in https://github.com/InternLM/lmdeploy/pull/2872
* replicate kv for some models when tp is divisible by kv_head_num by irexyc in https://github.com/InternLM/lmdeploy/pull/2874
* Fallback to pytorch engine when the model is quantized by smooth quant by lvhan028 in https://github.com/InternLM/lmdeploy/pull/2953
* Torchrun launching multiple api_server by AllentDan in https://github.com/InternLM/lmdeploy/pull/2402
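The two items on tp vs. kv heads address tensor parallelism with grouped-query attention: when TP ranks outnumber kv heads, each kv head must be replicated so every rank owns one. A toy sketch of that replication (our own helper, assuming tp is a multiple of n_kv_heads):

```python
import numpy as np

def replicate_kv_heads(kv: np.ndarray, n_kv_heads: int, tp: int) -> np.ndarray:
    """Replicate kv-head weights so each of `tp` ranks gets exactly one head.
    kv: (n_kv_heads, head_dim) stacked per-head weights."""
    assert tp % n_kv_heads == 0, "tp must be divisible by n_kv_heads"
    reps = tp // n_kv_heads
    return np.repeat(kv, reps, axis=0)  # (tp, head_dim); rank i takes row i

kv = np.arange(8).reshape(2, 4)            # 2 kv heads, head_dim 4
out = replicate_kv_heads(kv, n_kv_heads=2, tp=4)  # shape (4, 4)
```

The cost is duplicated kv-cache memory per rank, which is why replication is only applied when the head count would otherwise not divide evenly.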
🐞 Bug fixes
* [Feature] Support for loading lora adapter weights in safetensors format by Galaxy-Husky in https://github.com/InternLM/lmdeploy/pull/2860
* fix cpu cache by grimoire in https://github.com/InternLM/lmdeploy/pull/2881
* Fix args type in docstring by Galaxy-Husky in https://github.com/InternLM/lmdeploy/pull/2888
* Fix llama3.1 chat template by fzyzcjy in https://github.com/InternLM/lmdeploy/pull/2862
* Fix typo by ghntd in https://github.com/InternLM/lmdeploy/pull/2916
* fix: Incorrect stats size during inference of throughput benchmark when concurrency > num_prompts by pancak3 in https://github.com/InternLM/lmdeploy/pull/2928
* fix lora name and rearrange wqkv for internlm2 by RunningLeon in https://github.com/InternLM/lmdeploy/pull/2912
* [dlinfer] fix moe op for dlinfer. by Reinerzhou in https://github.com/InternLM/lmdeploy/pull/2917
* [side effect] fix vlm quant failed by lvhan028 in https://github.com/InternLM/lmdeploy/pull/2914
* fix torch_dtype by RunningLeon in https://github.com/InternLM/lmdeploy/pull/2933
* support unaligned qkv heads by grimoire in https://github.com/InternLM/lmdeploy/pull/2930
* fix mllama inference without image by RunningLeon in https://github.com/InternLM/lmdeploy/pull/2947
* Support torch_dtype modification and update FAQs for AWQ quantization by AllentDan in https://github.com/InternLM/lmdeploy/pull/2898
* Fix exception handler for proxy server by AllentDan in https://github.com/InternLM/lmdeploy/pull/2901
* Fix torch_dtype in lite by AllentDan in https://github.com/InternLM/lmdeploy/pull/2956
* [side-effect] bring back quantization of qwen2-vl, glm4v and etc. by lvhan028 in https://github.com/InternLM/lmdeploy/pull/2954
* add a thread pool executor to control the vl engine traffic by lvhan028 in https://github.com/InternLM/lmdeploy/pull/2970
* [side-effect] fix gradio demo error by lvhan028 in https://github.com/InternLM/lmdeploy/pull/2976
🌐 Other
* [dlinfer] fix engine checker by tangzhiyi11 in https://github.com/InternLM/lmdeploy/pull/2891
* Bump version to v0.6.5 by lvhan028 in https://github.com/InternLM/lmdeploy/pull/2955

New Contributors
* Galaxy-Husky made their first contribution in https://github.com/InternLM/lmdeploy/pull/2860
* fzyzcjy made their first contribution in https://github.com/InternLM/lmdeploy/pull/2862
* ghntd made their first contribution in https://github.com/InternLM/lmdeploy/pull/2916
* pancak3 made their first contribution in https://github.com/InternLM/lmdeploy/pull/2928

**Full Changelog**: https://github.com/InternLM/lmdeploy/compare/v0.6.4...0.6.5

0.6.4


What's Changed
πŸš€ Features
* feature: support qwen2.5 function_call by akai-shuuichi in https://github.com/InternLM/lmdeploy/pull/2737
* [Feature] support minicpm-v_2_6 for pytorch engine. by Reinerzhou in https://github.com/InternLM/lmdeploy/pull/2767
* Support qwen2-vl AWQ quantization by AllentDan in https://github.com/InternLM/lmdeploy/pull/2787
* Add DeepSeek-V2 support by lzhangzz in https://github.com/InternLM/lmdeploy/pull/2763
* [ascend]feat: support kv int8 by yao-fengchen in https://github.com/InternLM/lmdeploy/pull/2736
πŸ’₯ Improvements
* Optimize update_step_ctx on Ascend by jinminxi104 in https://github.com/InternLM/lmdeploy/pull/2804
* Add Ascend installation adapter by zhabuye in https://github.com/InternLM/lmdeploy/pull/2817
* Refactor turbomind (2/N) by lzhangzz in https://github.com/InternLM/lmdeploy/pull/2818
* add openssh-server installation in dockerfile by lvhan028 in https://github.com/InternLM/lmdeploy/pull/2830
* Add version restrictions in runtime_ascend.txt to ensure functionality by zhabuye in https://github.com/InternLM/lmdeploy/pull/2836
* better kv allocate by grimoire in https://github.com/InternLM/lmdeploy/pull/2814
* Update internvl chat template by AllentDan in https://github.com/InternLM/lmdeploy/pull/2832
* profile throughput without new threads by grimoire in https://github.com/InternLM/lmdeploy/pull/2826
* [dlinfer] change dlinfer kv_cache layout and adjust paged_prefill_attention api. by Reinerzhou in https://github.com/InternLM/lmdeploy/pull/2847
* [maca] add env to support different mm layout on maca. by Reinerzhou in https://github.com/InternLM/lmdeploy/pull/2835
* Supports W8A8 quantization for more models by AllentDan in https://github.com/InternLM/lmdeploy/pull/2850
🐞 Bug fixes
* disable prefix-caching for vl model by grimoire in https://github.com/InternLM/lmdeploy/pull/2825
* Fix gemma2 accuracy through the correct softcapping logic by AllentDan in https://github.com/InternLM/lmdeploy/pull/2842
* fix accessing before initialization by lvhan028 in https://github.com/InternLM/lmdeploy/pull/2845
* fix the logic to verify whether AutoAWQ has been successfully installed by grimoire in https://github.com/InternLM/lmdeploy/pull/2844
* check whether backend_config is None or not before accessing its attr by lvhan028 in https://github.com/InternLM/lmdeploy/pull/2848
* [ascend] convert kv cache to nd format in ascend graph mode by tangzhiyi11 in https://github.com/InternLM/lmdeploy/pull/2853
πŸ“š Documentations
* Update supported models & Ascend doc by jinminxi104 in https://github.com/InternLM/lmdeploy/pull/2765
* update supported models by lvhan028 in https://github.com/InternLM/lmdeploy/pull/2849
🌐 Other
* [CI] Split vl testcases into turbomind and pytorch backend by zhulinJulia24 in https://github.com/InternLM/lmdeploy/pull/2751
* [dlinfer] Fix qwenvl rope error for dlinfer backend by JackWeiw in https://github.com/InternLM/lmdeploy/pull/2795
* [CI] add more testcase for mllm models by zhulinJulia24 in https://github.com/InternLM/lmdeploy/pull/2791
* Update dlinfer-ascend version in runtime_ascend.txt by jinminxi104 in https://github.com/InternLM/lmdeploy/pull/2865
* bump version to v0.6.4 by lvhan028 in https://github.com/InternLM/lmdeploy/pull/2864

New Contributors
* akai-shuuichi made their first contribution in https://github.com/InternLM/lmdeploy/pull/2737
* JackWeiw made their first contribution in https://github.com/InternLM/lmdeploy/pull/2795
* zhabuye made their first contribution in https://github.com/InternLM/lmdeploy/pull/2817

**Full Changelog**: https://github.com/InternLM/lmdeploy/compare/v0.6.3...v0.6.4

0.6.3


What's Changed
πŸš€ Features
* support yarn in turbomind backend by irexyc in https://github.com/InternLM/lmdeploy/pull/2519
* add linear op on dlinfer platform by yao-fengchen in https://github.com/InternLM/lmdeploy/pull/2627
* support turbomind head_dim 64 by irexyc in https://github.com/InternLM/lmdeploy/pull/2715
* [Feature]: support LlavaForConditionalGeneration with turbomind inference by deepindeed2022 in https://github.com/InternLM/lmdeploy/pull/2710
* Support Mono-InternVL with PyTorch backend by wzk1015 in https://github.com/InternLM/lmdeploy/pull/2727
* Support Qwen2-MoE models by lzhangzz in https://github.com/InternLM/lmdeploy/pull/2723
* Support mixtral moe AWQ quantization. by AllentDan in https://github.com/InternLM/lmdeploy/pull/2725
* Support chemvlm by RunningLeon in https://github.com/InternLM/lmdeploy/pull/2738
* Support molmo in turbomind by lvhan028 in https://github.com/InternLM/lmdeploy/pull/2716
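YaRN, supported in the turbomind backend above, is a long-context extension of rotary position embedding (RoPE). As background, basic RoPE with the simplest position-stretch looks like this (a sketch; YaRN additionally interpolates per frequency band and rescales attention):

```python
import numpy as np

def apply_rope(x: np.ndarray, pos: float, base: float = 10000.0,
               scale: float = 1.0) -> np.ndarray:
    """Rotate feature pairs of x by position-dependent angles (basic RoPE).
    `scale` divides positions -- the crudest long-context stretch; YaRN
    refines this with per-band interpolation plus an attention temperature."""
    d = x.shape[-1]
    inv_freq = base ** (-np.arange(0, d, 2) / d)   # one frequency per pair
    theta = (pos / scale) * inv_freq
    cos, sin = np.cos(theta), np.sin(theta)
    x1, x2 = x[..., 0::2], x[..., 1::2]
    out = np.empty_like(x)
    out[..., 0::2] = x1 * cos - x2 * sin           # 2D rotation of each pair
    out[..., 1::2] = x1 * sin + x2 * cos
    return out

x = np.random.default_rng(2).standard_normal(8)
y = apply_rope(x, pos=5.0)
```

Because it is a pure rotation, RoPE preserves vector norms; position 0 is the identity, which makes both properties easy to sanity-check.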
πŸ’₯ Improvements
* Call cuda empty_cache to prevent OOM when quantizing model by AllentDan in https://github.com/InternLM/lmdeploy/pull/2671
* feat: support dynamic/llama3 rotary embedding in ascend graph mode by tangzhiyi11 in https://github.com/InternLM/lmdeploy/pull/2670
* Add ensure_ascii = False for json.dumps by AllentDan in https://github.com/InternLM/lmdeploy/pull/2707
* Flatten cache and add flashattention by grimoire in https://github.com/InternLM/lmdeploy/pull/2676
* Support ep, column major moe kernel. by grimoire in https://github.com/InternLM/lmdeploy/pull/2690
* Remove one of the duplicate bos tokens by AllentDan in https://github.com/InternLM/lmdeploy/pull/2708
* Check server input by irexyc in https://github.com/InternLM/lmdeploy/pull/2719
* optimize dlinfer moe by tangzhiyi11 in https://github.com/InternLM/lmdeploy/pull/2741
🐞 Bug fixes
* Support min_tokens, min_p parameters for api_server by AllentDan in https://github.com/InternLM/lmdeploy/pull/2681
* fix index error when computing ppl on long-text prompt by lvhan028 in https://github.com/InternLM/lmdeploy/pull/2697
* Better tp exit log. by grimoire in https://github.com/InternLM/lmdeploy/pull/2677
* miss to read moe_ffn weights from converted tm model by lvhan028 in https://github.com/InternLM/lmdeploy/pull/2698
* Fix turbomind TP by lzhangzz in https://github.com/InternLM/lmdeploy/pull/2706
* fix decoding kernel for deepseekv2 by grimoire in https://github.com/InternLM/lmdeploy/pull/2688
* fix tp exit code for pytorch engine by RunningLeon in https://github.com/InternLM/lmdeploy/pull/2718
* fix assert pad >= 0 failed when inter_size is not a multiple of group… by Vinkle-hzt in https://github.com/InternLM/lmdeploy/pull/2740
* fix issue that mono-internvl failed to fallback pytorch engine by lvhan028 in https://github.com/InternLM/lmdeploy/pull/2744
* Remove use_fast=True when loading tokenizer for lite auto_awq by AllentDan in https://github.com/InternLM/lmdeploy/pull/2758
* fix wrongly set head_dim for mistral-nemo by lvhan028 in https://github.com/InternLM/lmdeploy/pull/2761
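min_p, added to api_server above, filters the sampling distribution relative to the most likely token: anything below min_p times the top probability is dropped before renormalizing. A standalone sketch (our own function, not the server code):

```python
import numpy as np

def min_p_filter(probs: np.ndarray, min_p: float = 0.05) -> np.ndarray:
    """Zero out tokens with probability below min_p * max(probs), renormalize."""
    keep = probs >= min_p * probs.max()
    out = np.where(keep, probs, 0.0)
    return out / out.sum()

p = np.array([0.5, 0.3, 0.15, 0.05])
filtered = min_p_filter(p, min_p=0.2)  # threshold is 0.2 * 0.5 = 0.1
```

Unlike top-p, the cutoff adapts to the distribution's peak: a confident model prunes aggressively, a flat distribution keeps more candidates.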
πŸ“š Documentations
* Update ascend readme by jinminxi104 in https://github.com/InternLM/lmdeploy/pull/2756
* fix ascend get_started.md link by CyCle1024 in https://github.com/InternLM/lmdeploy/pull/2696
* Fix llama3.2 VL vision in "Supported Modals" documents by blankanswer in https://github.com/InternLM/lmdeploy/pull/2703
🌐 Other
* [ci] support v100 dailytest by zhulinJulia24 in https://github.com/InternLM/lmdeploy/pull/2665
* [ci] add more testcase into evaluation and daily test by zhulinJulia24 in https://github.com/InternLM/lmdeploy/pull/2721
* feat: support multi cards in ascend graph mode by tangzhiyi11 in https://github.com/InternLM/lmdeploy/pull/2755
* bump version to v0.6.3 by lvhan028 in https://github.com/InternLM/lmdeploy/pull/2754

New Contributors
* blankanswer made their first contribution in https://github.com/InternLM/lmdeploy/pull/2703
* tangzhiyi11 made their first contribution in https://github.com/InternLM/lmdeploy/pull/2670
* wzk1015 made their first contribution in https://github.com/InternLM/lmdeploy/pull/2727
* Vinkle-hzt made their first contribution in https://github.com/InternLM/lmdeploy/pull/2740

**Full Changelog**: https://github.com/InternLM/lmdeploy/compare/v0.6.2...v0.6.3

0.6.2.post1


What's Changed
🐞 Bug fixes
* Fix llama3.2 VL vision in "Supported Modals" documents by blankanswer in https://github.com/InternLM/lmdeploy/pull/2703
* miss to read moe_ffn weights from converted tm model by lvhan028 in https://github.com/InternLM/lmdeploy/pull/2698
* Better tp exit log by grimoire in https://github.com/InternLM/lmdeploy/pull/2677
* fix index error when computing ppl on long-text prompt by lvhan028 in https://github.com/InternLM/lmdeploy/pull/2697
* Support min_tokens, min_p parameters for api_server by AllentDan in https://github.com/InternLM/lmdeploy/pull/2681
* fix ascend get_started.md link by CyCle1024 in https://github.com/InternLM/lmdeploy/pull/2696
* Call cuda empty_cache to prevent OOM when quantizing model by AllentDan in https://github.com/InternLM/lmdeploy/pull/2671
* Fix turbomind TP for v0.6.2 by lzhangzz in https://github.com/InternLM/lmdeploy/pull/2713
🌐 Other
* [ci] support v100 dailytest by zhulinJulia24 in https://github.com/InternLM/lmdeploy/pull/2665
* bump version to 0.6.2.post1 by lvhan028 in https://github.com/InternLM/lmdeploy/pull/2717


**Full Changelog**: https://github.com/InternLM/lmdeploy/compare/v0.6.2...v0.6.2.post1

0.6.2


Highlights

- The PyTorch engine supports graph mode on the Ascend platform, doubling inference speed
- Support for llama3.2-vision models in the PyTorch engine
- Support for Mixtral in the TurboMind engine, achieving 20+ RPS on the ShareGPT dataset with 2 A100-80G GPUs


What's Changed
πŸš€ Features
* support downloading models from openmind_hub by cookieyyds in https://github.com/InternLM/lmdeploy/pull/2563
* Support pytorch engine kv int4/int8 quantization by AllentDan in https://github.com/InternLM/lmdeploy/pull/2438
* feat(ascend): support w4a16 by yao-fengchen in https://github.com/InternLM/lmdeploy/pull/2587
* [maca] add maca backend support. by Reinerzhou in https://github.com/InternLM/lmdeploy/pull/2636
* Support mllama for pytorch engine by AllentDan in https://github.com/InternLM/lmdeploy/pull/2605
* add --eager-mode to cli by RunningLeon in https://github.com/InternLM/lmdeploy/pull/2645
* [ascend] add ascend graph mode by CyCle1024 in https://github.com/InternLM/lmdeploy/pull/2647
* MoE support for turbomind by lzhangzz in https://github.com/InternLM/lmdeploy/pull/2621
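The kv int4/int8 items above shrink the kv cache by storing keys and values in low precision. A conceptual sketch of asymmetric per-vector uint8 quantization (illustrative only; lmdeploy's actual kernels and the int4 packing differ):

```python
import numpy as np

def quant_kv_int8(kv: np.ndarray):
    """Asymmetric per-vector uint8 quantization: map each vector's [min, max]
    range onto [0, 255] with a scale and zero point."""
    lo = kv.min(axis=-1, keepdims=True)
    hi = kv.max(axis=-1, keepdims=True)
    scale = (hi - lo) / 255.0
    q = np.round((kv - lo) / scale).astype(np.uint8)
    return q, scale, lo

def dequant_kv(q: np.ndarray, scale: np.ndarray, lo: np.ndarray) -> np.ndarray:
    """Reverse the mapping to approximate the original float values."""
    return q.astype(np.float32) * scale + lo

rng = np.random.default_rng(1)
kv = rng.standard_normal((3, 16)).astype(np.float32)  # e.g. 3 cached key vectors
q, scale, lo = quant_kv_int8(kv)
```

At int8 this halves kv-cache memory versus fp16 (int4 quarters it), directly raising the batch size an engine can cache.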
πŸ’₯ Improvements
* [Feature] Add argument to disable FastAPI docs by mouweng in https://github.com/InternLM/lmdeploy/pull/2540
* add check for device with cap 7.x by grimoire in https://github.com/InternLM/lmdeploy/pull/2535
* Add tool role for langchain usage by AllentDan in https://github.com/InternLM/lmdeploy/pull/2558
* Fix llama3.2-1b inference error by handling tie_word_embedding by grimoire in https://github.com/InternLM/lmdeploy/pull/2568
* Add a workaround for saving internvl2 with latest transformers by AllentDan in https://github.com/InternLM/lmdeploy/pull/2583
* optimize paged attention on triton3 by grimoire in https://github.com/InternLM/lmdeploy/pull/2553
* refactor for multi backends in dlinfer by CyCle1024 in https://github.com/InternLM/lmdeploy/pull/2619
* Copy sglang/bench_serving.py to lmdeploy as serving benchmark script by lvhan028 in https://github.com/InternLM/lmdeploy/pull/2620
* Add barrier to prevent TP nccl kernel waiting. by grimoire in https://github.com/InternLM/lmdeploy/pull/2607
* [ascend] refactor fused_moe on ascend platform by yao-fengchen in https://github.com/InternLM/lmdeploy/pull/2613
* [ascend] support paged_prefill_attn when batch > 1 by yao-fengchen in https://github.com/InternLM/lmdeploy/pull/2612
* Raise an error for the wrong chat template by AllentDan in https://github.com/InternLM/lmdeploy/pull/2618
* refine pre-post-process by jinminxi104 in https://github.com/InternLM/lmdeploy/pull/2632
* small block_m for sm7.x by grimoire in https://github.com/InternLM/lmdeploy/pull/2626
* update check for triton by grimoire in https://github.com/InternLM/lmdeploy/pull/2641
* Support llama3.2 LLM models in turbomind engine by lvhan028 in https://github.com/InternLM/lmdeploy/pull/2596
* Check whether device support bfloat16 by lvhan028 in https://github.com/InternLM/lmdeploy/pull/2653
* Add warning message about `do_sample` to alert BC by lvhan028 in https://github.com/InternLM/lmdeploy/pull/2654
* update ascend dockerfile by CyCle1024 in https://github.com/InternLM/lmdeploy/pull/2661
* fix supported model list in ascend graph mode by jinminxi104 in https://github.com/InternLM/lmdeploy/pull/2669
* remove dlinfer version by CyCle1024 in https://github.com/InternLM/lmdeploy/pull/2672
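The llama3.2-1b fix above deals with tie_word_embeddings: such models reuse the input embedding matrix as the output projection instead of carrying a separate lm_head weight, and an engine that expects a distinct lm_head fails to load them. A toy model showing the sharing (our own class, purely illustrative):

```python
import numpy as np

class TiedLM:
    """Minimal model with tied input/output embeddings: the same matrix
    embeds token ids and projects hidden states back to vocab logits."""
    def __init__(self, vocab: int, dim: int, rng: np.random.Generator):
        self.emb = rng.standard_normal((vocab, dim)).astype(np.float32)

    def embed(self, ids):
        return self.emb[ids]              # (n, dim)

    def logits(self, h: np.ndarray) -> np.ndarray:
        return h @ self.emb.T             # (n, vocab); no separate lm_head

m = TiedLM(vocab=10, dim=4, rng=np.random.default_rng(3))
lg = m.logits(m.embed([1, 2]))
```

Handling this case means mapping the missing lm_head tensor back to the embedding weight at load time rather than erroring out.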
🐞 Bug fixes
* set outlines<0.1.0 by AllentDan in https://github.com/InternLM/lmdeploy/pull/2559
* fix: make exit_flag verification for ascend more general by CyCle1024 in https://github.com/InternLM/lmdeploy/pull/2588
* set capture mode thread_local by grimoire in https://github.com/InternLM/lmdeploy/pull/2560
* Add distributed context in pytorch engine to support torchrun by grimoire in https://github.com/InternLM/lmdeploy/pull/2615
* Fix error in python3.8. by Reinerzhou in https://github.com/InternLM/lmdeploy/pull/2646
* Align UT with triton fill_kv_cache_quant kernel by AllentDan in https://github.com/InternLM/lmdeploy/pull/2644
* miss device_type when checking is_bf16_supported on ascend platform by lvhan028 in https://github.com/InternLM/lmdeploy/pull/2663
* fix syntax in Dockerfile_aarch64_ascend by CyCle1024 in https://github.com/InternLM/lmdeploy/pull/2664
* Set history_cross_kv_seqlens to 0 by default by AllentDan in https://github.com/InternLM/lmdeploy/pull/2666
* fix build error in ascend dockerfile by CyCle1024 in https://github.com/InternLM/lmdeploy/pull/2667
* bugfix: llava-hf/llava-interleave-qwen-7b-hf (2497) by deepindeed2022 in https://github.com/InternLM/lmdeploy/pull/2657
* fix inference mode error for qwen2-vl by irexyc in https://github.com/InternLM/lmdeploy/pull/2668
πŸ“š Documentations
* Add instruction for downloading models from openmind hub by cookieyyds in https://github.com/InternLM/lmdeploy/pull/2577
* Fix spacing in ascend user guide by Superskyyy in https://github.com/InternLM/lmdeploy/pull/2601
* Update get_started tutorial about deploying on ascend platform by jinminxi104 in https://github.com/InternLM/lmdeploy/pull/2655
* Update ascend get_started tutorial about installing nnal by jinminxi104 in https://github.com/InternLM/lmdeploy/pull/2662
🌐 Other
* [ci] add oc infer test in stable test by zhulinJulia24 in https://github.com/InternLM/lmdeploy/pull/2523
* update copyright by lvhan028 in https://github.com/InternLM/lmdeploy/pull/2579
* [Doc]: Lock sphinx version by RunningLeon in https://github.com/InternLM/lmdeploy/pull/2594
* [ci] use local requirements for test workflow by zhulinJulia24 in https://github.com/InternLM/lmdeploy/pull/2569
* [ci] add pytorch kvint testcase into function regression by zhulinJulia24 in https://github.com/InternLM/lmdeploy/pull/2584
* [ci] React dailytest workflow by zhulinJulia24 in https://github.com/InternLM/lmdeploy/pull/2617
* [ci] fix restful script by zhulinJulia24 in https://github.com/InternLM/lmdeploy/pull/2635
* [ci] add internlm2_5_7b_batch_1 into evaluation testcase by zhulinJulia24 in https://github.com/InternLM/lmdeploy/pull/2631
* match torch and torch_vision version by grimoire in https://github.com/InternLM/lmdeploy/pull/2649
* Bump version to v0.6.2 by lvhan028 in https://github.com/InternLM/lmdeploy/pull/2659

New Contributors
* mouweng made their first contribution in https://github.com/InternLM/lmdeploy/pull/2540
* cookieyyds made their first contribution in https://github.com/InternLM/lmdeploy/pull/2563
* Superskyyy made their first contribution in https://github.com/InternLM/lmdeploy/pull/2601
* Reinerzhou made their first contribution in https://github.com/InternLM/lmdeploy/pull/2636
* deepindeed2022 made their first contribution in https://github.com/InternLM/lmdeploy/pull/2657

**Full Changelog**: https://github.com/InternLM/lmdeploy/compare/v0.6.1...v0.6.2

