Release Notes
Enhancements:
- Added CUDA support for operations including conv2d, MatMul, dwconv, and pool2d.
- Improved performance for operations such as MeanStdScale and softmax.
- Enhanced multi-core batch matmul and added support for bm168x with CUDA.
- Refined CUDA code style and adjusted interfaces for various operations.
Bug Fixes:
- Fixed matmul issues, calibration failures, conv padding problems, and various performance problems.
- Addressed bugs in model transformations, calibration, and various pattern issues.
- Resolved bugs affecting models such as ssd, vit, detr, and yolov5.
New Features:
- Added support for new models such as resnet50, mobilenet_v2, shufflenet_v2, yolox_s, and alphapose_res50.
- Introduced new operations such as RequantIntAxisOp and Depth2Space, with CUDA support.
- Implemented new functionalities for better model inference and compilation.
Documentation Updates:
- Updated weight.md, calibration sections, and user interface details.
- Improved documentation for quick start, developer manual, and various tpulang interfaces.
- Enhanced documentation for model transformation parameters and tensor data arrangements.
Miscellaneous:
- Added new npz tools, modelzoo regression, and support for bmodel encryption.
- Fixed various issues with model performance, shape inference, and CUDA backend optimizations.
- Restored performance for models such as yolov5s-6 and bm1690 swin multicore.