Modelscope

Latest version: v1.24.1


1.7.0

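Chinese Version

Recommended New Models
| No. | Model Name & Quick Link |
| --- | --- |
| 1 | [Duguang - Text Recognition - Lightweight On-Device Recognition Model - Chinese and English - General Domain](https://modelscope.cn/models/damo/cv_LightweightEdge_ocr-recognitoin-general_damo/summary) |
| 2 | [Duguang - Text Detection - Lightweight On-Device DBNet Line Detection Model - Chinese and English - General Domain](https://modelscope.cn/models/damo/cv_proxylessnas_ocr-detection-db-line-level_damo/summary) |
| 3 | [CAM++ Speaker Change Point Detection - Two Speakers - Chinese](https://modelscope.cn/models/damo/speech_campplus-transformer_scl_zh-cn_16k-common/summary) |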



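Highlights

- Add lightweight on-device recognition model LightweightEdge
- Add lightweight on-device DBNet line detection model
- Add CAM++ speaker change point detection
- Support finetune and deepspeed for llama models
- Support lora for llama models
- Support device_map for transformers models
- Support jsonl format in datasets
- Support parallel download of large model files (dsw or eais environments)
- Improve the download experience for the very large youku dataset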

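Features
- Add lightweight on-device recognition model LightweightEdge
- Add lightweight on-device DBNet line detection model
- Add flextrain ner example
- Add model revision to the training args for text classification finetune
- Add StreamingMixin
- Support torch extension
- Support llama model finetuning and deepspeed
- Support third-party keys in pipeline
- Support speaker diarization pipeline
- Add eres2net_aug v2 model
- Support transformers model device_map
- Support model weight diff
- Support jsonl format in datasets
- Add Lora/Adapter/Prompt/Chatglm6b
- Add teardown to some tests
- Remove the version restriction on the datasets package
- Support loading from model id
- Support lora for llama models
- Support parallel download of large model files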


Improvements

- Improve the download experience for the large mPLUG-youku dataset


BugFix
- Fix DeepspeedHook.register_processor
- Dockerfile compatibility changes (py37 and py38)
- Fix extra_args
- Fix ngpu bug and remove easyasr
- Fix issues with downloading the very large mplug-youku dataset
- Fix gpt3 finetune nan issue
- Fix torch extension CI hang
- Fix easycv lr hook error
- Fix torch 2.x compatibility issues
- Fix diffusers version conflict
- Fix eval RecursionError
- Fix device_map issue for DiffusionForTextToImageSynthesis
- Fix CPU inference for stable diffusion pipeline
- Fix llama lora issue

English Version

New Model List and Quick Access

| No | Model Name & Link |
| --- | --- |
| 1 | [cv_LightweightEdge_ocr-recognitoin-general_damo](https://modelscope.cn/models/damo/cv_LightweightEdge_ocr-recognitoin-general_damo/summary) |
| 2 | [cv_proxylessnas_ocr-detection-db-line-level_damo](https://modelscope.cn/models/damo/cv_proxylessnas_ocr-detection-db-line-level_damo/summary) |
| 3 | [speech_campplus-transformer_scl_zh-cn_16k-common](https://modelscope.cn/models/damo/speech_campplus-transformer_scl_zh-cn_16k-common/summary) |


Highlight
- Add new OCR recognition model (LightweightEdge) and some functions
- Add new OCR detection model (db-nas)
- Add CAM++ model
- Support llama model finetune and deepspeed
- Support lora for llama model
- Support device_map for transformers
- Support jsonl format in datasets
- Support parallel download of large model files
- Improve mPLUG-YOUKU dataset downloading experience
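
For readers unfamiliar with the jsonl format mentioned in the highlights: each line of the file is one standalone JSON object. The sketch below is a minimal standard-library reader, not ModelScope's dataset loader. The helper name `iter_jsonl` and the sample fields `text` and `label` are assumptions made for illustration.

```python
import json
from typing import Any, Iterator


def iter_jsonl(text: str) -> Iterator[Any]:
    """Yield one parsed record per non-empty line of JSON Lines text."""
    for line_no, line in enumerate(text.splitlines(), start=1):
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        try:
            yield json.loads(line)
        except json.JSONDecodeError as exc:
            # Report the offending line so a bad record is easy to locate.
            raise ValueError(f"invalid JSON on line {line_no}: {exc}") from exc


# Hypothetical sample data: two records separated by a blank line.
sample = '{"text": "hello", "label": 0}\n\n{"text": "world", "label": 1}\n'
records = list(iter_jsonl(sample))
```

Because every line parses independently, a reader like this can stream a large file record by record instead of loading it whole.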

Breaking changes


Feature
- Add new OCR recognition model (LightweightEdge) and some functions
- Add new OCR detection model (db-nas)
- Add ner example for flextrain
- Add model revision in training_args and modify dataset loading in finetune text classification
- Add StreamingMixin
- Support pre-building torch extensions in the build image; the first extension is megatron_util
- Add llama finetune + deepspeed
- Support third_party key in pipeline
- Add speaker diarization pipeline and improve some speaker pipelines
- Add eres2net_aug v2
- Support device_map for transformers model
- Add diff creation and recovery for model weights
- Support jsonl format in metadata
- Add Lora/Adapter/Prompt/Chatglm6b
- Add teardown for tests
- Unfreeze datasets version setting
- Support load from model id
- Support lora for llama
- Support parallel download of large model files
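
One common way to download a large file in parallel is to split it into byte ranges and fetch the ranges concurrently. The following is a minimal sketch of that idea under stated assumptions, not ModelScope's downloader: `split_ranges`, `parallel_fetch`, and the in-memory `fake_fetch` stand-in are hypothetical names, and a real client would issue HTTP Range requests instead.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Tuple


def split_ranges(total_size: int, parts: int) -> List[Tuple[int, int]]:
    """Split [0, total_size) into at most `parts` inclusive byte ranges."""
    if total_size <= 0 or parts <= 0:
        return []
    chunk = -(-total_size // parts)  # ceiling division
    return [(start, min(start + chunk, total_size) - 1)
            for start in range(0, total_size, chunk)]


def parallel_fetch(fetch_range: Callable[[int, int], bytes],
                   total_size: int, parts: int = 4) -> bytes:
    """Fetch every range concurrently and join the chunks in order."""
    ranges = split_ranges(total_size, parts)
    if not ranges:
        return b""
    with ThreadPoolExecutor(max_workers=len(ranges)) as pool:
        # pool.map returns results in input order, so chunks stay aligned.
        chunks = pool.map(lambda r: fetch_range(*r), ranges)
        return b"".join(chunks)


# In-memory stand-in for a remote file (assumption, for illustration only).
payload = bytes(range(256)) * 10


def fake_fetch(start: int, end: int) -> bytes:
    """Return the inclusive byte range [start, end] of the payload."""
    return payload[start:end + 1]


result = parallel_fetch(fake_fetch, len(payload), parts=3)
```

Keeping the range arithmetic in its own function makes the split testable without any network access.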

Improvements

- Support using tuned llama models in pipeline
- Improve youku dataset downloading experience

BugFix

- Fix bug for DeepspeedHook.register_processor
- Merge Dockerfile changes for py37 and py38 compatibility
- Fix extra_args
- Fix ngpu bug and remove easyasr
- Fix issues for downloading mplug-youku dataset
- Fix gpt3 finetune nan
- Fix CI hang when building torch extension
- Fix easycv lr hook error
- Fix torch 2.x compatibility issue
- Fix diffusers version conflict between cv and multi-modal
- Fix eval RecursionError
- Fix device_map for DiffusionForTextToImageSynthesis
- Fix cpu inference for stable diffusion pipeline
- Fix llama lora bug

1.6.1

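Chinese Version


Features

- Support skipping the import of easycv third-party dependencies
- Support Flextrain training args and push_to_hub
- Support onnx export for domain_specific_object_detection


BugFix

- Fix test_cli CI error
- Fix merge hook
- Fix NER tokenizer not accepting kwargs
- Fix lineless_table_recognition crash on blank images
- Fix private dataset authentication failures in some cases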

English Version



Feature
- Add pattern to skip easycv.thirdparty
- Support flex train feature (training args and push_to_hub adaptions)
- Support onnx export for domain_specific_object_detection



BugFix

- Fix CI: test merge dataset failed
- Fix merge_hook
- Fix NER tokenizer which won't accept kwargs
- Fix lineless_table_recognition crash on blank input images
- Fix private dataset auth issue

1.6.0

Chinese Version

This release adds 5 new models.


New Model List and Quick Access

| **Contributing Organization** | **Model Name** | **Finetune Supported** |
| --- | --- | --- |
| **DAMO Academy** | [ERes2Net Speaker Verification - English - VoxCeleb-16k - Offline - pytorch](https://modelscope.cn/models/damo/speech_eres2net_sv_en_voxceleb_16k/summary) | No |
| **DAMO Academy** | [mPLUG-Owl - Multimodal Dialogue - English - 7B](https://modelscope.cn/models/damo/multi-modal_mplug_owl_multimodal-dialogue_7b/summary) | No |
| **DAMO Academy** | [FastInst Fast Instance Segmentation](https://modelscope.cn/models/damo/cv_resnet50_fast-instance-segmentation_coco/summary) | No |
| **DAMO Academy** | [TransFace Face Recognition Model](https://modelscope.cn/models/damo/cv_vit_face-recognition/summary) | No |
| **DAMO Academy** | [Regularized DINO Speaker Verification - English - VoxCeleb-16k - Offline - pytorch](https://modelscope.cn/models/damo/speech_rdino_ecapa_tdnn_sv_en_voxceleb_16k/summary) | No |



Breaking changes
* Support Python3.8
* Remove demo check



English Version
Highlight
- Support Python3.8
- Add mPLUG-Owl model
- Add CVPR23 FastInst model


Breaking changes
- Support Python3.8
- Remove demo check


Feature
- Add ERes2Net for speaker verification
- Add mPLUG-Owl model
- Support FlexTrain and update the structure of trainer
- Add CVPR23 FastInst model
- Support Virgo MaxCompute datasource for Ali-cloud inner applications
- Add clip_interrogator
- Add gpt3 example
- Add convert megatron ckpt script
- Add trainer for UniTE
- Add transface model
- Add verification of whether the whl is installed
- Support python3.8
- Add ONNX exporter for ans dfsmn
- Add rdino model



Improvements

- Update multi_modal_embedding example
- Refine easyasr
- Pipeline input, output and parameter normalization.
- Display hub error message
- Remove easycv codes, plugin access



BugFix

- Fix duplicated **kwargs bug in audio module
- Fix distributed hook to use lazy import, and fix an import bug
- Fix transformer examples
- Add pop for base class parameters
- Fix func update_local_model; change funasr version
- Remove pai-easycv requirement
- Fix hypotheses not being initialized on CPU devices; make fid_dialogue_test available

1.5.0

Chinese Version

Recommended New Models
| No. | Model Name & Quick Link |
| --- | --- |
| 1 | [ResNet50 Pedestrian Structured Attribute Recognition Model](https://modelscope.cn/models/damo/cv_resnet50_pedestrian-attribute-recognition_image/summary) |
| 2 | [DamoFD Face Detection and Keypoint Model - 0.5G](https://modelscope.cn/models/damo/cv_ddsar_face-detection_iclr23-damofd/summary) |
| 3 | [CAM++ Speaker Verification - English - VoxCeleb-16k](https://modelscope.cn/models/damo/speech_campplus_sv_en_voxceleb_16k/summary) |
| 4 | [Machine Translation with Self-Evaluation Capability - Chinese to English - General Domain - large](https://modelscope.cn/models/damo/nlp_canmt_translation_zh2en_large/summary) |

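Highlights

- Support efficient lora tuning for generative diffusion models
- Add llama model
- Support pushing to the hub
- Support the chat task for chatglm-6B-style models
- Add cli call examples for common models and tasks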

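Features

- Support splitting and merging checkpoints saved by megatron tensor-parallel models
- Support efficient lora tuning for generative diffusion models
- Add pedestrian attribute recognition model
- Add damofd model series
- Add llama model
- Support pushing to the hub
- Add speaker cam++ model
- Add head support for XlmRoberta model
- Add canmt translation model
- Support the chat task for chatglm-6B-style models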

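Improvements

- Update funasr to version 0.4.0, with support for running on Mac
- Plugin supports trainer
- Add 3.7B model for fid_dialouge_pipeline
- Add training example for the token classification task with the Mgeo model
- Add training example for the text generation task with the PALM model
- Add training example for the multi-modal embedding task with the CLIP model
- Add gradient accumulation config for speech kws nearfield training
- Refactor and optimize the face reconstruction model code
- Update image colorization metric
- Update github issue templates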

BugFix

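- Fix generate error in text generation task models
- Fix face reconstruction model pipeline error
- Fix repeated warning output in pipeline
- Fix error when a plugin package import fails
- Fix speech kws nearfield multi-GPU training error
- Fix missing spaces in English output of generative models
- Fix jsonplus not supporting ndarray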

English Version

New Model List and Quick Access

| No | Model Name & Link |
| --- | --- |
| 1 | [ResNet50 pedestrian-attribute-recognition image](https://modelscope.cn/models/damo/cv_resnet50_pedestrian-attribute-recognition_image/summary) |
| 2 | [DamoFD face-detection 0.5G](https://modelscope.cn/models/damo/cv_ddsar_face-detection_iclr23-damofd/summary) |
| 3 | [Speech cam++ English-VoxCeleb-16k](https://modelscope.cn/models/damo/speech_campplus_sv_en_voxceleb_16k/summary) |
| 4 | [Canmt translation with self evaluation zh2en-large](https://modelscope.cn/models/damo/nlp_canmt_translation_zh2en_large/summary) |

Highlight

- Add efficient tuner modules
- Add llama to mslib from hf
- Support the ability to push to hub
- Add task chat for all chat models, like chatglm-6B
- Add common models and tasks cli call example

Breaking changes

Feature
- Support split and merge for megatron_base model
- Add efficient tuner modules
- Add pedestrian attribute recognition model
- Add damofd model
- Add llama to mslib from hf
- Support the ability to push to hub
- Add speaker model cam++ for speaker verification task
- New head support for XlmRoberta model
- Add canmt translation model
- Add task chat for all chat models, like chatglm-6B

Improvements

- Support funasr on Mac
- Plugin support trainer
- Add 3.7B size model for fid_dialouge_pipeline
- Add token classification example for MGeo
- Add PALM finetune example
- Add multi-modal embedding example for CLIP
- Add gradient accumulation config for speech kws nearfield training
- Update face reconstruction to HRN(CVPR2023)
- Update image colorization metric
- Update issue templates

BugFix

- Fix generate for ModelForTextGeneration
- Fix issues for face pipeline
- Fix repeated warning output in pipeline
- Fix error when a plugin package import fails
- Fix speech kws nearfield training with multi-gpu
- Fix missing spaces between English words in generated output
- Fix jsonplus, support ndarray

1.4.1

Chinese Version

Recommended New Models
| No. | Model Name & Quick Link | Contributing Organization | Finetune Supported |
| --- | --- | --- | --- |
| 1 | [ChatGLM - Chinese-English Dialogue LLM - 6B](https://modelscope.cn/models/ZhipuAI/ChatGLM-6B/summary) | Zhipu.AI | |
| 2 | [GLM130B - Chinese-English LLM](https://modelscope.cn/models/ZhipuAI/GLM130B/summary) | Zhipu.AI | |
| 3 | [unidiffuser-v1](https://modelscope.cn/models/thu-ml/unidiffuser-v1/summary) | Tsinghua TSAIL | |
| 4 | Yuanyu Functional Dialogue Large Model v2 | YuanYu | |
| 5 | [Pangu-α 2.6B](https://www.modelscope.cn/models/OpenICommunity/pangu_2_6B/summary) | Peng Cheng Laboratory | |
| 6 | [openjourney](https://modelscope.cn/models/dienstag/openjourney/summary) | Individual developer - dienstag | |
| 7 | [Rwkv-4-pile-14b](https://modelscope.cn/models/Blink_DL/rwkv-4-pile-14b/summary) | Individual developer - Blink\_DL | |
| 8 | [SiameseUIE Universal Information Extraction - Chinese - base](https://modelscope.cn/models/damo/nlp_structbert_siamese-uie_chinese-base/summary) | | |
| 9 | [SiameseUniNLU Zero-Shot Universal Natural Language Understanding - Chinese - base](https://modelscope.cn/models/damo/nlp_structbert_siamese-uninlu_chinese-base/summary) | | |

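Highlights

- External repos can work with the modelscope library as plugins
- Add onnx export for the SCRFD model
- Add onnx export for the damoyolo model
- Support onnx/torchscript export for sequence labeling models
- Support pb file export for the cartoon model
- Refactor the taskdataset module; users can now customize their own dataset logic
- Add examples for the text-generation task, also applicable to GPT3
- Siamese uie model supports finetune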

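Features
- Support torch2.0 compile in inference and training; testing is not yet thorough, so some models may hit errors
- Add trainer support for the adadet library
- Support training for ddcolor image colorization
- Add video_instance_segmentation inference
- Add plugin capability to the CLI tool
- Add human reconstruction task
- Add vidt model
- Add speech_timestamp task
- Add disco guided diffusion model
- Support training for ocr_reco_crnn
- Add training for action detection
- Add training for ocr_detection_db
- Add lore lineless table recognition task
- Add PEER model
- Add damoyolo smoke detection model
- Add RLEG model
- Add ProContEXT model for video instance tracking
- Add video perception model longshortnet
- Add dingding denoise model
- Support vision efficient tuning
- Support text-to-video-synthesis task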

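Improvements

- text-generation inference supports args input
- Support soonet in video temporal grounding
- Trainer supports DDPHook
- Kws supports resuming training
- Support correct DDIM sampling on GPU
- Add more CLI tools
- Modify the inputs and outputs of speech inference
- Optimize kws configuration
- ImagePaintbyexamplePipeline supports demoservice
- Support load_from for easycv trainer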


BugFix

- Fix error when installing detectron2
- Fix error in the generate_scp_from_url method
- Fix speaker_verification_pipeline and speaker_diarization_pipeline
- Fix failing data-related cases
- Fix ast scan failure
- Fix bug in the Word alignment preprocessor

English Version

New Model List and Quick Access

| No | Model Name & Link | Org | Finetune supported |
| --- | --- | --- | --- |
| 1 | [ChatGLM-English&Chinese-6B](https://modelscope.cn/models/ZhipuAI/ChatGLM-6B/summary) | ZhiPu.AI | |
| 2 | [GLM130B-LLM English&Chinese](https://modelscope.cn/models/ZhipuAI/GLM130B/summary) | ZhiPu.AI | |
| 3 | [unidiffuser-v1](https://modelscope.cn/models/thu-ml/unidiffuser-v1/summary) | Tsinghua TSAIL | |
| 4 | ChatYuan-large-v2 | YuanYu | |
| 5 | [OpenICommunity/pangu_2_6B](https://www.modelscope.cn/models/OpenICommunity/pangu_2_6B/summary) | PengCheng Lab | |
| 6 | [openjourney](https://modelscope.cn/models/dienstag/openjourney/summary) | personal-dienstag | |
| 7 | [Rwkv-4-pile-14b](https://modelscope.cn/models/Blink_DL/rwkv-4-pile-14b/summary) | personal-Blink\_DL | |
| 8 | [SiameseUIE information extraction-Chinese-base](https://modelscope.cn/models/damo/nlp_structbert_siamese-uie_chinese-base/summary) | | |
| 9 | [SiameseUniNLU zero-shot NLU Chinese base model](https://modelscope.cn/models/damo/nlp_structbert_siamese-uninlu_chinese-base/summary) | | |

Highlight

- Support repos work with modelscope library via plugin
- Support onnx export for SCRFD model
- Add onnx exporter for damoyolo
- Add onnx/torchscript exporter for token classification models
- Add frozen graph def exporter for cartoon model
- Refactor taskdataset module; users can now write datasets with custom logic
- Add example for text-generation finetuning, also available for GPT3
- Siamese uie finetune support


Breaking changes


Feature
- Support torch2.0 compile in inference and training, this feature is not stable on all models
- Add ADADET and thirdparty arg for damoyolo trainer
- Add finetune for ddcolor image colorization
- Add video_instance_segmentation pipeline
- Add plugin with cli tool
- Add human reconstruction task
- Add vidt model
- Add task: speech_timestamp
- Add disco guided diffusion
- Add training support for ocr_reco_crnn
- Add action detection finetune
- Add ocr_detection_db training module
- Add lore lineless table recognition
- Add PEER model
- Add smoke and fire detection model using damoyolo
- Add generative multimodal embedding model RLEG
- Add vop_se for text video retrieval
- Add ProContEXT model for video single object tracking
- Add video streaming perception models longshortnet
- Add dingding denoise model
- Support vision efficient tuning finetune
- Add text-to-video-synthesis
- Add MAN for image-quality-assessment



Improvements

- Support running text generation pipeline with args
- Add soonet for video temporal grounding
- Trainer support parallel_groups setting and DDP hook
- Kws support continue training from a checkpoint
- Correct DDIM sampling on GPU
- Add more cli tools
- Modify audio input types and punc postprocess
- Optimize kws pipeline and training conf
- Support ImagePaintbyexamplePipeline demo service
- Support load_from for easycv trainer


BugFix

- Fix bug when installing detectron2
- Fix bug in function generate_scp_from_url
- Fix bug for speaker_verification_pipeline and speaker_diarization_pipeline: re-write the default config with configure.json
- Fix failing data-related cases
- Fix bug in ast scan of function definitions
- Fix Word alignment preprocessor

1.3.2

Chinese Version

New Model List and Quick Access

This minor version adds 6 new models, 2 of which support finetune.

| No. | Model Name & Link | Finetune Supported |
| --- | --- | --- |
| 1 | [ControlNet Controllable Image Generation](https://modelscope.cn/models/dienstag/cv_controlnet_controllable-image-generation_nine-annotators/summary) | |
| 2 | [Landing Cervical Cell AI-Assisted Diagnosis Model](https://modelscope.cn/models/landingAI/LD_CytoBrainCerv/summary) | |
| 3 | [Duguang - Text Detection - DB Line Detection Model - Chinese and English - General Domain](https://modelscope.cn/models/damo/cv_resnet18_ocr-detection-db-line-level_damo/summary) | |
| 4 | [SOND Speaker Diarization - Chinese - alimeeting-16k - Offline - pytorch](https://www.modelscope.cn/models/damo/speech_diarization_sond-zh-cn-alimeeting-16k-n16k4-pytorch/summary) | |
| 5 | [NeRF Fast 3D Reconstruction Model](https://www.modelscope.cn/models/damo/cv_nerf-3d-reconstruction-accelerate_damo/summary) | **√** |
| 6 | [DCT-Net Portrait Cartoonization](https://modelscope.cn/models/damo/cv_unet_person-image-cartoon_compound-models/summary) | **√** |


Feature
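* Improve GPT3 finetune: support DDP+tensor parallel, and streamline the handoff from finetuning to inference
* Optimize checkpoint saving logic so that periodically saved and best-saved files can be used directly for inference
* Refactor the hooks scheme: decouple the functional hooks and support interaction between hooks
* Support ImagePaintbyExamplePipeline demo service
* Support multiple audio types
* Support Petr3D CPU inference and compatibility with newer mmcv
* Update deberta v2 preprocessor
* Support initializing NLP downstream task models with only the backbone pretrained weights loaded
* Update librosa.resample() parameters to support the latest version
* Add call tracking statistics for downstream toolboxes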

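Incompatibilities
* Checkpoint saving now splits model parameters from training state parameters; model parameters from older versions must be converted before loading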

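Bug fixes:

* Fix asr vad/lm/punc input processing
* Fix gpt moe finetune checkpoint path error
* Fix args lm_train_conf is invalid
* Fix CI test error when deleting existing files
* Fix OCR recognition bug
* Remove the image resolution limit in the preprocessing stage
* Fix output wav file being 32-bit float instead of the expected 16-bit int
* Set num_workers=0 to prevent subprocess creation in demo-service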



English Version

New Model List and Quick Access
This minor version adds a total of six new models, including two models with finetuning capability.

| **No.** | **Model Name & Link** | **Finetuning Supported** |
| --- | --- | --- |
| 1 | [ControlNet Controllable Image Generation](https://modelscope.cn/models/dienstag/cv_controlnet_controllable-image-generation_nine-annotators/summary) | |
| 2 | [Landing AI Cervical Cell AI-assisted Diagnosis Model](https://modelscope.cn/models/landingAI/LD_CytoBrainCerv/summary) | |
| 3 | [Duguang - Text Detection - DB Line Detection Model - Chinese and English - General Domain](https://modelscope.cn/models/damo/cv_resnet18_ocr-detection-db-line-level_damo/summary) | |
| 4 | [SOND Speaker Diarization - Chinese - Alimeeting-16k - Offline - PyTorch](https://www.modelscope.cn/models/damo/speech_diarization_sond-zh-cn-alimeeting-16k-n16k4-pytorch/summary) | |
| 5 | [NeRF Fast 3D Reconstruction Model](https://www.modelscope.cn/models/damo/cv_nerf-3d-reconstruction-accelerate_damo/summary) | **√** |
| 6 | [DCT-Net Person Image Cartoonization](https://modelscope.cn/models/damo/cv_unet_person-image-cartoon_compound-models/summary) | **√** |

Features
- GPT-3 finetune has been improved to support DDP+tensor parallel
- Checkpoint saving logic has been optimized to ensure that files saved periodically and those saved as the best can be used directly by pipeline
- The Hooks scheme has been refactored to decouple various functional hooks and support interaction between hooks.
- Supports ImagePaintbyExamplePipeline demo service
- Supports multi-machine data and tensor parallel finetuning for cartoon task
- Supports various audio types
- Supports Petr3D CPU inference with compatibility for the latest version of mmcv
- Updates deberta v2 preprocessor
- Supports initialization of downstream NLP task models with only backbone pre-training weights loaded
- Updates librosa.resample() parameter support to the latest version
- Adds downstream toolbox call tracking function

Breaking changes
- Model parameters and training state are now saved separately, so previously trained checkpoints should be converted before resuming training
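
To make the conversion concrete, here is a minimal sketch of splitting a combined legacy checkpoint into a model part and a training-state part. It is not ModelScope's converter: the function name `split_checkpoint` and the key names `state_dict`, `optimizer`, `lr_scheduler`, and `meta` are assumptions for illustration, and a real legacy checkpoint may use a different layout.

```python
from typing import Any, Dict, Tuple

# Hypothetical key names; a real legacy checkpoint may use different ones.
TRAINER_STATE_KEYS = ("optimizer", "lr_scheduler", "meta")


def split_checkpoint(
    legacy: Dict[str, Any],
) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Separate model weights from training state in a combined checkpoint."""
    trainer_state = {k: legacy[k] for k in TRAINER_STATE_KEYS if k in legacy}
    model_state = legacy.get("state_dict")
    if model_state is None:
        # Flat layout: everything that is not training state is a weight.
        model_state = {
            k: v for k, v in legacy.items() if k not in TRAINER_STATE_KEYS
        }
    return dict(model_state), trainer_state


# Plain Python values stand in for tensors in this illustration.
legacy_ckpt = {
    "state_dict": {"linear.weight": [0.1, 0.2]},
    "optimizer": {"lr": 0.001},
    "meta": {"epoch": 3},
}
model_state, trainer_state = split_checkpoint(legacy_ckpt)
```

After such a split, the two dictionaries can be written to separate files, so inference only needs to load the model part.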

Bug Fixes:

- Fixes asr vad/lm/punc input processing
- Fixes gpt moe finetune checkpoint path error
- Fixes args lm_train_conf is invalid
- Fixes ci test errors when deleting existing files
- Fixes OCR recognition bugs
- Removes image resolution restrictions in preprocessing stage
- Fixes output wav file being 32-bit float instead of expected 16-bit int
- Sets num_workers=0 to prevent creating sub-processes in demo-service.
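
The wav fix above concerns sample format: audio models often produce float samples, while many consumers expect 16-bit integer PCM. The sketch below shows one way to do that conversion with the Python standard library only. It is an illustration, not the code used in ModelScope; the helper names `float_to_pcm16` and `write_wav16` are assumptions, as are the mono channel layout and the 16 kHz default sample rate.

```python
import io
import struct
import wave
from typing import Sequence


def float_to_pcm16(samples: Sequence[float]) -> bytes:
    """Convert float samples in [-1.0, 1.0] to little-endian 16-bit PCM."""
    # Clip first so out-of-range values cannot overflow the int16 range.
    ints = [int(round(max(-1.0, min(1.0, s)) * 32767)) for s in samples]
    return struct.pack(f"<{len(ints)}h", *ints)


def write_wav16(samples: Sequence[float], sample_rate: int = 16000) -> bytes:
    """Return the bytes of a mono 16-bit WAV file holding the samples."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)   # mono
        wav.setsampwidth(2)   # 2 bytes per sample = 16-bit
        wav.setframerate(sample_rate)
        wav.writeframes(float_to_pcm16(samples))
    return buf.getvalue()


wav_bytes = write_wav16([0.0, 1.0, -1.0])
```

Reading the result back with `wave.open` and checking `getsampwidth()` is a quick way to confirm the file really is 16-bit.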
