Deepsparse

Latest version: v1.8.0

Page 3 of 7

1.3.1

This is a patch release for 1.3.0 that contains the following changes:

- Performance on some unstructured sparse quantized YOLOv5 models has been improved. This fixes a performance regression compared to DeepSparse 1.1.
- DeepSparse no longer throws an exception when it cannot determine L3 cache information and instead logs a warning message.
- An assertion failure on some compound sparse quantized transformer models has been fixed.
- Models with ONNX opset 13 Squeeze operators no longer exhibit poor performance, and DeepSparse now sees speedup from sparsity when running them.
- NumPy version pinned to <=1.21.6 to avoid deprecation warning/index errors in pipelines.
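The NumPy pin above can be checked in an environment before pipelines are used. The following is a minimal sketch, assuming plain `X.Y.Z` version strings; `parse_version` and `satisfies_numpy_pin` are hypothetical helpers written for illustration and are not part of DeepSparse.

```python
MAX_NUMPY = (1, 21, 6)  # upper bound taken from the 1.3.1 release note


def parse_version(version: str) -> tuple:
    """Turn the leading numeric 'X.Y.Z' part of a version string into a tuple of ints."""
    parts = []
    for piece in version.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break  # stop at suffixes such as 'rc1'
            digits += ch
        parts.append(int(digits) if digits else 0)
    while len(parts) < 3:
        parts.append(0)  # treat '1.21' as '1.21.0'
    return tuple(parts)


def satisfies_numpy_pin(version: str) -> bool:
    """Return True when `version` is at or below the pinned maximum."""
    return parse_version(version) <= MAX_NUMPY
```

In practice the installed version string would come from `numpy.__version__`; the sketch takes a string argument so it runs without NumPy installed.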

1.3.0

New Features:
* Bfloat16 is now supported on CPUs with the AVX512_BF16 extension. Users can expect a performance improvement of up to 30% for sparse FP32 networks and up to 75% for dense FP32 networks. This feature is opt-in and is specified with the `default_precision` parameter in the configuration file.
* Several options can now be specified using a configuration file.
* Max and min operators are now supported for performance.
* SQuAD 2.0 support provided.
* NLP multi-label and eval support added.
* Fraction of supported operations property added to `engine` class.
* New ML Ops logging capabilities implemented, including metrics logging, custom functions, and Prometheus support.

Changes:
* Minimum Python version set to 3.7.
* The default logging level has been changed to `warn`.
* Timing functions and a default no-op deallocator have been added to improve usability of the C++ API.
* DeepSparse now supports the `axes` parameter to be specified either as an input or an attribute in several ONNX operators.
* Model compilation times have been improved on machines with many cores.
* YOLOv5 pipelines upgraded to latest state from Ultralytics.
* Transformers pipelines upgraded to latest state from Hugging Face.

Resolved Issues:
* DeepSparse no longer crashes with an assertion failure for softmax operators on dimensions with a single element.
* DeepSparse no longer crashes with an assertion failure on some unstructured sparse quantized BERT models.
* Image classification evaluation script no longer crashes for larger batch sizes.
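For context on the softmax fix above: a softmax over a dimension with a single element is a valid edge case whose output is always 1.0. The sketch below is a plain-Python reference, not the engine's implementation, and the function name `softmax` is chosen only for illustration.

```python
import math


def softmax(values):
    """Numerically stable softmax over a non-empty list of floats."""
    peak = max(values)  # subtract the maximum to avoid overflow in exp
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


# A single-element input always normalizes to 1.0, whatever its value.
single = softmax([3.7])
```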

Known Issues:
* None

1.2.0

New Features:
* [DeepSparse Engine Trial](https://neuralmagic.com/deepsparse-engine-free-trial/) and Enterprise Editions now available, including license key activations.
* DeepSparse Pipelines document classification use case in NLP supported.

Changes:
* Mock engine tests added to enable faster and more precise unit tests in pipelines and Python code.
* DeepSparse Engine benchmarking updated to use `time.perf_counter` for more accurate benchmarks.
* Dynamic batch implemented to be more generic so it can support any pipeline.
* Minimum Python version changed to 3.7 as 3.6 reached EOL.
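The switch to `time.perf_counter` matters because it is a monotonic, high-resolution clock intended for measuring short intervals. A minimal sketch of a latency benchmark built on it follows; `benchmark` and `fake_inference` are hypothetical stand-ins and do not reflect the actual DeepSparse benchmarking code.

```python
import time


def benchmark(fn, num_iterations=10, num_warmup=2):
    """Time `fn` with time.perf_counter and return per-call latencies in seconds."""
    for _ in range(num_warmup):
        fn()  # warm-up calls are executed but not timed
    latencies = []
    for _ in range(num_iterations):
        start = time.perf_counter()
        fn()
        latencies.append(time.perf_counter() - start)
    return latencies


def fake_inference():
    """Placeholder workload standing in for a real engine call."""
    sum(i * i for i in range(10_000))


latencies = benchmark(fake_inference)
mean_ms = 1000 * sum(latencies) / len(latencies)
print(f"mean latency: {mean_ms:.3f} ms over {len(latencies)} runs")
```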

Performance:
* Performance improvements for unstructured sparse quantized convolutional neural networks implemented for throughput use cases.

Resolved Issues:
* In the C++ interface, the engine no longer crashes with a segmentation fault when the `num_streams` provided to the `engine_context_t` is greater than the number of physical CPU cores.
* The engine no longer crashes with assertion failures when running YOLOv4.
* YOLACT pipelines fixed where dynamic batch was not working and exported images had color channels improperly swapped.
* DeepSparse Server no longer crashes for hyphenated task names such as "question-answering".
* Computer vision pipelines now additionally accept single NumPy array inputs.
* Protobuf version for ONNX 1.12 compatibility pinned to prevent installation failures on some systems.

Known Issues:
* None

1.1.0

New Features:
* Python 3.10 support added.
* [Zero-shot text classification pipeline](https://github.com/neuralmagic/deepsparse/blob/release/1.1/src/deepsparse/transformers/pipelines/zero_shot_text_classification.py) implemented.
* [Haystack Information Retrieval pipeline](https://github.com/neuralmagic/deepsparse/blob/release/1.1/src/deepsparse/transformers/haystack/pipeline.py) implemented.
* [YOLACT pipeline](https://github.com/neuralmagic/deepsparse/tree/release/1.1/src/deepsparse/yolact) native integration for deployments is available.
* DeepSparse pipelines now support dynamic batch, dynamic shape through bucketing, and asynchronous execution.
* [CustomTaskPipeline](https://github.com/neuralmagic/deepsparse/blob/release/1.1/src/deepsparse/pipelines/custom_pipeline.py) added to enable easier custom pipeline creation.
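The general idea behind "dynamic shape through bucketing" is to prepare a small set of fixed input sizes and route each input to the smallest one that fits. The sketch below illustrates that idea only; the bucket sizes, `pick_bucket`, and `pad_to_bucket` are assumptions made for this example and are not taken from the DeepSparse source.

```python
import bisect

# Hypothetical sequence-length buckets, sorted ascending.
BUCKETS = [16, 64, 128, 384]


def pick_bucket(length, buckets=BUCKETS):
    """Return the smallest bucket >= length, or the largest bucket if none fits."""
    idx = bisect.bisect_left(buckets, length)
    return buckets[min(idx, len(buckets) - 1)]


def pad_to_bucket(token_ids, pad_id=0, buckets=BUCKETS):
    """Truncate or pad `token_ids` so its length equals the chosen bucket size."""
    size = pick_bucket(len(token_ids), buckets)
    clipped = token_ids[:size]  # only truncates inputs longer than the largest bucket
    return clipped + [pad_id] * (size - len(clipped))
```

With these assumed buckets, a 3-token input is padded to length 16, and a 400-token input is truncated to 384.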

Changes:
* The behavior of the Multi-stream scheduler is now identical to the Elastic scheduler, and the old Multi-stream scheduler has been removed.
* NLP pipelines for question answering, text classification, and token classification upgraded to improve accuracy and better match the SparseML training pathways.
* Updates made across the repository for new SparseZoo Python APIs.
* Max torchvision version increased to 0.12.0 for computer vision deployment pathways.

Performance:
* Inference performance improvements for:
  * unstructured sparse quantized Transformer models.
  * slow activation functions (such as Gelu or Swish) when they follow a QuantizeLinear operator.
  * some sparse 1D convolutions. Speedups of up to 3x are observed.
  * Squeeze, when operating on a single axis.

Resolved Issues:
* An assertion error no longer occurs when one node has multiple inputs that both come from the same node.
* An assertion error no longer occurs when a MatMul operator follows a Transpose or Reshape operator.
* Pipelines now support hyphenated versions of standard task names such as question-answering.
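One common way to accept hyphenated task names is to normalize them to a single canonical form before lookup. The following is a sketch under that assumption; the `TASK_REGISTRY` contents, the underscore canonical form, and both function names are hypothetical and do not describe how DeepSparse resolves tasks internally.

```python
def normalize_task_name(task: str) -> str:
    """Map variants such as 'Question-Answering' to 'question_answering'."""
    return task.strip().lower().replace("-", "_")


# Hypothetical registry keyed by the canonical (underscore) task name.
TASK_REGISTRY = {
    "question_answering": "QuestionAnsweringPipeline",
    "text_classification": "TextClassificationPipeline",
    "token_classification": "TokenClassificationPipeline",
}


def lookup_pipeline(task: str) -> str:
    """Resolve a task name to its registered pipeline, accepting hyphens or underscores."""
    key = normalize_task_name(task)
    if key not in TASK_REGISTRY:
        raise KeyError(f"unknown task: {task!r}")
    return TASK_REGISTRY[key]
```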

Known Issues:
* In the C++ interface, the engine will crash with a segmentation fault when the `num_streams` provided to the `engine_context_t` is greater than the number of physical CPU cores.

1.0.2

This is a patch release for 1.0.0 that contains the following changes:

* Question answering pipeline pre-processing now exactly matches the SparseML training pre-processing. Previously, differences between the two led to minor drops in accuracy.

1.0.1

This is a patch release for 1.0.0 that contains the following changes:

Crashes with an assertion failure no longer happen in the following cases:
* during model compilation for a convolution with a 1x1 kernel with 2x2 convolution strides.
* when setting the `num_streams` parameter to fewer than the number of NUMA nodes.

The engine no longer enters an infinite loop when an operation has multiple inputs coming from the same source.

Error messaging improved for installation failures on unsupported operating systems.

Supported transformers `datasets` version capped for compatibility with pipelines.
