Deepsparse

Latest version: v1.7.1


0.12.2

This is a patch release for 0.12.0 that contains the following changes:

- Protobuf is restricted to version < 4.0 as the newer version breaks ONNX.
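The same constraint can be carried into your own environment through a requirements file; a minimal sketch, where the `deepsparse` pin shown is this patch release:

```
# requirements.txt fragment: keep protobuf below 4.0 so ONNX models load correctly
deepsparse==0.12.2
protobuf<4.0
```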

0.12.1

This is a patch release for 0.12.0 that contains the following changes:

- Improper label mapping no longer crashes validation flows within DeepSparse transformers.
- DeepSparse Server now exposes proper routes for SageMaker.
- Fixed a dependency issue where DeepSparse Server installed an old version of a library, causing crashes in some use cases.

0.12.0

New Features:
**Documentation:**
* [SparseServer.UI](https://github.com/neuralmagic/deepsparse/tree/main/examples/sparseserver-ui): a Streamlit app for deploying the DeepSparse Server for exploring the inference performance of BERT on the question answering task.
* [DeepSparse Server README](https://github.com/neuralmagic/deepsparse/tree/main/src/deepsparse/server): `deepsparse.server` capabilities, including single model and multi-model inferencing.
* [Twitter NLP Inference Examples](https://github.com/neuralmagic/deepsparse/tree/main/examples/twitter-nlp) added.

Changes:
**Performance:**
* Speedup for large batch sizes when using sync mode on AMD EPYC processors.
* AVX2 improvements:
  * Up to 40% speedup out of the box for dense quantized models.
  * Up to 20% speedup for pruned quantized BERT, ResNet-50, and MobileNet.
* Speedup from sparsity realized for ConvInteger operators.
* Model compilation time decreased on systems with many cores.
* Multi-stream Scheduler: certain computations that were executed during runtime are now precomputed.
* Hugging Face Transformers integration updated to latest state from upstream main branch.

**Documentation:**
* [DeepSparse README](https://github.com/neuralmagic/deepsparse): references to `deepsparse.server`, `deepsparse.benchmark`, and Transformer pipelines.
* [DeepSparse Benchmark README](https://github.com/neuralmagic/deepsparse/tree/main/src/deepsparse/benchmark_model): highlights of `deepsparse.benchmark` CLI command.
* [Transformers 🤗 Inference Pipelines](https://github.com/neuralmagic/deepsparse/tree/main/examples/huggingface-transformers): examples included on how to run inference via Python for several NLP tasks.

Resolved Issues:
* When running quantized BERT with a sequence length not divisible by 4, the DeepSparse Engine no longer disables optimizations and suffers very poor performance.
* Users executing `arch.bin` now receive a correct architecture profile of their system.

Known Issues:
* When running the DeepSparse engine on a system with a nonuniform system topology, for example, an AMD EPYC processor where some cores per core-complex (CCX) have been disabled, model compilation will never terminate. A workaround is to set the environment variable `NM_SERIAL_UNIT_GENERATION=1`.
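A minimal sketch of applying this workaround from Python, assuming the environment variable must be set before the engine compiles the model:

```python
import os

# Workaround for non-terminating model compilation on nonuniform topologies:
# set NM_SERIAL_UNIT_GENERATION=1 before the DeepSparse Engine is initialized.
os.environ["NM_SERIAL_UNIT_GENERATION"] = "1"
```

Equivalently, prefix the launch command in the shell: `NM_SERIAL_UNIT_GENERATION=1 python run_inference.py` (the script name is a placeholder).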

0.11.2

This is a patch release for 0.11.0 that contains the following changes:

- Fixed an assertion error that would occur when using `deepsparse.benchmark` on AMD machines with the argument `-pin none`.

Known Issues:
- When running quantized BERT with a sequence length not divisible by 4, the DeepSparse Engine will disable optimizations and see very poor performance.
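Until upgrading to a release with the fix, one workaround is to round the model's sequence length up to the next multiple of 4 before compilation. A minimal sketch; `pad_to_multiple_of_4` is a hypothetical helper, not part of DeepSparse:

```python
def pad_to_multiple_of_4(seq_len: int) -> int:
    """Round a sequence length up to the next multiple of 4 so the
    quantized BERT optimizations in the DeepSparse Engine stay enabled."""
    return ((seq_len + 3) // 4) * 4


print(pad_to_multiple_of_4(128))  # already aligned -> 128
print(pad_to_multiple_of_4(130))  # rounded up -> 132
```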

0.11.1

This is a patch release for 0.11.0 that contains the following changes:

* When running [NanoDet-Plus-m](https://github.com/RangiLyu/nanodet), the DeepSparse Engine will no longer fail with an assertion (See #279).
* The DeepSparse Engine now respects the cpu affinity set by the calling thread. This is essential for the new [Command-line (CLI) tool](https://github.com/neuralmagic/deepsparse/blob/main/examples/amd-azure/README.md) `multi-process-benchmark.py` to function correctly. This script allows users to measure the performance using multiple separate processes in parallel.
* Fixed a performance regression on BERT batch size 1 sequence length 128 models.
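Because the engine now respects the calling thread's affinity, a parent process can pin itself before constructing an engine. A minimal Linux-only sketch using only the standard library (core 0 is an arbitrary choice; `os.sched_setaffinity` is unavailable on macOS and Windows):

```python
import os

# Pin the calling process to core 0 (core 0 always exists). A DeepSparse
# Engine created afterwards in this process inherits this affinity.
os.sched_setaffinity(0, {0})
print(sorted(os.sched_getaffinity(0)))  # -> [0]
```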

0.11.0

New Features:
* High-performance sparse quantized convolutional neural networks supported on AVX2 systems.
* CCX detection added to the DeepSparse Engine for AMD systems.
* [`deepsparse.server`](https://github.com/neuralmagic/deepsparse/blob/main/src/deepsparse/server/main.py) integration and CLIs added with Hugging Face transformers pipelines support.
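As a sketch, the new CLI can host a single transformers pipeline from the command line; the flags and the model stub below are illustrative assumptions, so check `deepsparse.server --help` against your installed version:

```
# Hypothetical invocation; --task and --model_path follow the server README.
deepsparse.server --task question_answering --model_path "zoo:model/stub/placeholder"
```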

Changes:
Performance improvements made for:
* FP32 sparse BERT models
* batch size 1 networks
* quantized sparse BERT models
* pooling operations

Resolved Issues:
* Core/socket information is now detected on certain systems when hyperthreads are disabled in the BIOS.
* Hugging Face transformers validation flows for QQP now give correct accuracy metrics.
* PyTorch model downloads for YOLO model stubs are now supported.

Known Issues:
* When running NanoDet-Plus-m, the DeepSparse Engine will fail with an assertion (see #279). A hotfix is being pursued.

