Paddle-rec

Latest version: v1.8.5.1

1.8.5

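Important Updates
- This release targets PaddlePaddle v1.8.5
- Framework upgrade: more flexible reader and model adaptation, plus more flexible definitions of training modes and data reading
- Added 9 models and optimized several already-supported models
- Removed the built-in configuration shortcuts for bundled models such as `paddlerec.models.rank.`; users now configure every model through the path to its yaml file
- One-step submission of PaddlePaddle distributed training jobs to Kubernetes and PaddleCloud
- PaddlePaddle distributed training on both CPU and GPU: collective mode on GPU, and parameter server mode on GPU and on CPU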

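New Features and Fixes
- Added GPU multi-card training in collective mode, GPU-PS training in parameter server mode, single-machine multi-card training, and more
- Added distributed training job submission, with one-step launch on MPI, Kubernetes, and PaddleCloud
- Added computation and distributed computation of several metrics, including AUC, Recall_k (accuracy of the top-k recalled results), PN (positive/negative pair ordering), and Precision_Recall
- Added BatchReader, which lets users assemble batches themselves inside the Reader
- Added a pre-training Trainer and a streaming-training Trainer to cover pre-training and streaming-training needs
- Added shuffling of the local file list, so data is shuffled at file granularity before training
- Added batch-level model saving
- Optimized data reading with the new SlotReader: users only need to generate data as required and configure the data format to get efficient PaddlePaddle training
- Fixed LOG printing; standardized log levels and the log output format
- Fixed an installation error on Windows
- Fixed a bug where data reading picked up hidden files
- Fixed a bug where uneven data partitioning across cards in collective mode caused training to fail
- Fixed a bug where the learning rate did not support scientific notation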
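
Two of the metrics named in the list above, Recall_k and PN, can be illustrated with a minimal pure-Python sketch. This is not PaddleRec's implementation: the function names `recall_at_k`, `mean_recall_at_k`, and `pos_neg_pairs` are hypothetical, and the definitions are assumptions based on the common reading of these metrics (top-k hit rate, and the count of correctly versus incorrectly ordered pairs within one query group).

```python
from itertools import combinations


def recall_at_k(scores, label, k):
    """Return 1.0 if the true class index `label` is among the k highest scores."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return 1.0 if label in ranked[:k] else 0.0


def mean_recall_at_k(batch_scores, batch_labels, k):
    """Average the per-sample top-k hit indicator over a batch."""
    hits = [recall_at_k(s, y, k) for s, y in zip(batch_scores, batch_labels)]
    return sum(hits) / len(hits)


def pos_neg_pairs(scores, labels):
    """Count correctly (pos) and incorrectly (neg) ordered pairs.

    Assumption: only pairs with different labels are compared,
    and pairs with tied scores are skipped.
    """
    pos = neg = 0
    for (s_a, y_a), (s_b, y_b) in combinations(zip(scores, labels), 2):
        if y_a == y_b or s_a == s_b:
            continue
        if (s_a - s_b) * (y_a - y_b) > 0:
            pos += 1  # the higher-labelled item also has the higher score
        else:
            neg += 1
    return pos, neg


# Hypothetical toy data: two samples over three classes, both labelled class 2.
batch_recall = mean_recall_at_k([[0.1, 0.7, 0.2], [0.5, 0.3, 0.2]], [2, 2], k=2)  # 0.5

# Hypothetical toy data: four items of one query with binary relevance labels.
pos, neg = pos_neg_pairs([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0])  # (3, 1)
pn_ratio = pos / neg if neg else float("inf")
```

In distributed training, the natural approach under these definitions is to sum the raw counts (hits and samples, or positive and negative pairs) across workers before dividing, rather than averaging per-worker ratios.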

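New Models and Fixes
- Added models including DIEN, BST, AutoInt, FGCNN, Fibinet, FLEN, RALM, Match-pyramid, and TDM
- Added the pre-trained model TextCNN
- Added a Readme, data processing, and result presentation for Fibinet, FLEN, youtubednn, gnn, word2vec, and other models, and fixed model performance issues
- Fixed the Readme of several models under the Rank directory, including DNN, LR, FM, and DeepFM
- Fixed model configuration and path issues in several Readmes under the Recall directory
- TDM now has a complete training workflow covering training, tree building, clustering, and online prediction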

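Tutorial Updates
- Added tutorials for single-machine training, distributed training, and streaming training, as well as English tutorials and a pre-trained model tutorial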

0.1.0

Major Features and Components:
- Start training with a one-line command
- Training framework with four extensible modules:
  - Engine: local and distributed training on CPU/GPU across multiple platforms
  - Trainer: supports user-defined training logic
  - Model: easy development of user-defined models and plugin models
  - Reader: high-performance data processing with user-defined processing functions
Model zoo:
- More than 30 plugin deep learning algorithms for recommendation system pipelines, including content understanding, recall, ranking, multi-task, reranking, tree-based, and matching models
Documentation:
- Quick start: a 10-minute hands-on tutorial with the MovieLens 1M dataset that walks through data processing, training, and validation to show how offline training of a recommender system works
- Basic tutorials covering data preprocessing, model hyper-parameter tuning, training, prediction, and deployment
- Advanced tutorials on user-defined data preprocessing, writing a user-defined network, and customizing the training pipeline

Special Thanks to our Contributors
xiexionghang (for initial commit contribution)

