DeepSparse

Latest version: v1.7.1


1.3.0

New Features:
* Bfloat16 is now supported on CPUs with the AVX512_BF16 extension. Users can expect up to a 30% performance improvement for sparse FP32 networks and up to a 75% performance improvement for dense FP32 networks. This feature is opt-in and is specified with the `default_precision` parameter in the configuration file.
* Several options can now be specified using a configuration file.
* Max and min operators are now supported for performance.
* SQuAD 2.0 support provided.
* NLP multi-label and eval support added.
* Fraction of supported operations property added to `engine` class.
* New ML Ops logging capabilities implemented, including metrics logging, custom functions, and Prometheus support.
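The bfloat16 opt-in above is set through the new configuration file. A minimal sketch of such a file; only the `default_precision` parameter name comes from these notes, and the overall file layout shown here is an assumption for illustration:

```yaml
# Hypothetical engine configuration file (layout is illustrative, not
# the documented schema). `default_precision` is the opt-in named above;
# bfloat16 takes effect only on CPUs with the AVX512_BF16 extension.
default_precision: bfloat16
```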

Changes:
* Minimum Python version set to 3.7.
* The default logging level has been changed to `warn`.
* Timing functions and a default no-op deallocator have been added to improve usability of the C++ API.
* DeepSparse now supports the `axes` parameter to be specified either as an input or an attribute in several ONNX operators.
* Model compilation times have been improved on machines with many cores.
* YOLOv5 pipelines upgraded to latest state from Ultralytics.
* Transformers pipelines upgraded to latest state from Hugging Face.

Resolved Issues:
* DeepSparse no longer crashes with an assertion failure for softmax operators on dimensions with a single element.
* DeepSparse no longer crashes with an assertion failure on some unstructured sparse quantized BERT models.
* Image classification evaluation script no longer crashes for larger batch sizes.

Known Issues:
* None

1.2.0

New Features:
* [DeepSparse Engine Trial](https://neuralmagic.com/deepsparse-engine-free-trial/) and Enterprise Editions now available, including license key activations.
* DeepSparse Pipelines document classification use case in NLP supported.

Changes:
* Mock engine tests added to enable faster and more precise unit tests in pipelines and Python code.
* DeepSparse Engine benchmarking updated to use `time.perf_counter` for more accurate benchmarks.
* Dynamic batch implemented to be more generic so it can support any pipeline.
* Minimum Python version changed to 3.7 as 3.6 reached EOL.
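The benchmarking change above switches timing to `time.perf_counter`, which is monotonic and high-resolution, unlike `time.time`. A minimal sketch of that style of measurement; the `benchmark` helper and the stand-in workload are illustrative, not DeepSparse's actual benchmarking code:

```python
import time

def benchmark(fn, iterations=100):
    """Return the mean seconds per call of `fn`, timed with
    time.perf_counter (monotonic, high-resolution)."""
    start = time.perf_counter()
    for _ in range(iterations):
        fn()
    elapsed = time.perf_counter() - start
    return elapsed / iterations

# Stand-in workload; a real run would invoke engine inference here.
mean_latency = benchmark(lambda: sum(range(1000)), iterations=10)
print(f"mean latency: {mean_latency * 1e6:.1f} us")
```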

Performance:
* Performance improvements for unstructured sparse quantized convolutional neural networks implemented for throughput use cases.

Resolved Issues:
* In the C++ interface, the engine no longer crashes with a segmentation fault when the `num_streams` provided to the `engine_context_t` is greater than the number of physical CPU cores.
* The engine no longer crashes with assertion failures when running YOLOv4.
* YOLACT pipelines fixed where dynamic batch was not working and exported images had color channels improperly swapped.
* DeepSparse Server no longer crashes for hyphenated task names such as "question-answering."
* Computer vision pipelines now additionally accept single NumPy array inputs.
* Protobuf version for ONNX 1.12 compatibility pinned to prevent installation failures on some systems.
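The hyphenated-task-name fix above amounts to normalizing the name before lookup, so "question-answering" and "question_answering" resolve to the same pipeline. A minimal sketch of the idea; the helper and task set below are hypothetical, not DeepSparse's actual implementation:

```python
def normalize_task_name(task: str) -> str:
    """Map hyphenated or mixed-case task names onto one canonical
    snake_case form. (Hypothetical helper illustrating the fix.)"""
    return task.strip().lower().replace("-", "_")

# Illustrative registry; real task names are defined by the library.
SUPPORTED_TASKS = {"question_answering", "text_classification"}

task = normalize_task_name("Question-Answering")
assert task in SUPPORTED_TASKS
```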

Known Issues:
* None

1.1.0

New Features:
* Python 3.10 support added.
* [Zero-shot text classification pipeline](https://github.com/neuralmagic/deepsparse/blob/release/1.1/src/deepsparse/transformers/pipelines/zero_shot_text_classification.py) implemented.
* [Haystack Information Retrieval pipeline](https://github.com/neuralmagic/deepsparse/blob/release/1.1/src/deepsparse/transformers/haystack/pipeline.py) implemented.
* [YOLACT pipeline](https://github.com/neuralmagic/deepsparse/tree/release/1.1/src/deepsparse/yolact) native integration for deployments is available.
* DeepSparse pipelines now support dynamic batch, dynamic shape through bucketing, and asynchronous execution support.
* [CustomTaskPipeline](https://github.com/neuralmagic/deepsparse/blob/release/1.1/src/deepsparse/pipelines/custom_pipeline.py) added to enable easier custom pipeline creation.
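Dynamic shape through bucketing, mentioned above, groups variable-length inputs into a small set of fixed shapes so each compiled shape serves a range of input sizes. A minimal sketch of the idea; the bucket boundaries and helper names are assumptions, not the library's internals:

```python
from collections import defaultdict

# Hypothetical bucket boundaries: each input is routed to the smallest
# bucket that can hold it, then padded to that bucket's length.
BUCKETS = [32, 64, 128, 256]

def pick_bucket(seq_len: int) -> int:
    for bucket in BUCKETS:
        if seq_len <= bucket:
            return bucket
    raise ValueError(f"sequence length {seq_len} exceeds largest bucket")

def group_by_bucket(sequences):
    """Group variable-length sequences so each group can be padded to
    one shape and run through a single compiled engine instance."""
    groups = defaultdict(list)
    for seq in sequences:
        groups[pick_bucket(len(seq))].append(seq)
    return dict(groups)

# Lengths 10 and 20 share the 32-bucket; length 40 goes to the 64-bucket.
batches = group_by_bucket([[1] * 10, [1] * 40, [1] * 20])
```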

Changes:
* The behavior of the Multi-stream scheduler is now identical to the Elastic scheduler, and the old Multi-stream scheduler has been removed.
* NLP pipelines for question answering, text classification, and token classification upgraded to improve accuracy and better match the SparseML training pathways.
* Updates made across the repository for new SparseZoo Python APIs.
* Max torchvision version increased to 0.12.0 for computer vision deployment pathways.

Performance:
* Inference performance improvements for:
  * unstructured sparse quantized Transformer models.
  * slow activation functions (such as Gelu or Swish) when they follow a QuantizeLinear operator.
  * some sparse 1D convolutions, with speedups of up to 3x observed.
  * Squeeze, when operating on a single axis.

Resolved Issues:
* An assertion error no longer occurs when a node has multiple inputs coming from the same node.
* An assertion error no longer occurs when a MatMul operator follows a Transpose or Reshape operator.
* Pipelines now support hyphenated versions of standard task names, such as question-answering.

Known Issues:
* In the C++ interface, the engine will crash with a segmentation fault when the `num_streams` provided to the `engine_context_t` is greater than the number of physical CPU cores.

1.0.2

This is a patch release for 1.0.0 that contains the following changes:

* Question answering pipeline pre-processing now exactly matches the SparseML training pre-processing. Previously, differences between the two led to minor drops in accuracy.

1.0.1

This is a patch release for 1.0.0 that contains the following changes:

Crashes with an assertion failure no longer happen in the following cases:
* during model compilation for a convolution with a 1x1 kernel with 2x2 convolution strides.
* when setting the `num_streams` parameter to fewer than the number of NUMA nodes.

The engine no longer enters an infinite loop when an operation has multiple inputs coming from the same source.

Error messaging improved for installation failures on unsupported operating systems.

Supported transformers `datasets` version capped for compatibility with pipelines.

1.0.0

New Features:
* Support added for running multiple models with the same engine when using the Elastic Scheduler.
* When using the Elastic Scheduler, the caller can now use the `num_streams` argument to tune the number of requests that are processed in parallel.
* Pipeline and annotation support added and generalized for transformers, yolov5, and torchvision.
* Documentation additions made for transformers, yolov5, torchvision, and serving that focus on model deployment for the given integrations.
* AWS SageMaker example created.
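The `num_streams` argument above trades single-request latency for parallel throughput, and the 1.1.0 known issues note a C++ crash when it exceeds the number of physical CPU cores. A defensive sketch of clamping the value before use; the helper is illustrative and not part of the DeepSparse API:

```python
import os

def safe_num_streams(requested: int) -> int:
    """Clamp a requested stream count to the available CPU count.
    Note: os.cpu_count() reports logical CPUs; detecting physical
    cores would need a third-party library, so this is conservative
    only on systems without SMT."""
    available = os.cpu_count() or 1
    return max(1, min(requested, available))

print(safe_num_streams(1024))  # never exceeds the logical CPU count
```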

Changes:
* Click added as a root dependency; it is now the preferred route for CLI invocation and argument management.

Performance:
* Inference performance has been improved for unstructured sparse quantized models on AVX2 and AVX-512 systems that do not support VNNI instructions. This includes up to 20% on BERT and 45% on ResNet-50.

Resolved Issues:
* When a layer operates on a dataset larger than 2GB, potential crashes no longer happen.
* Assertion error addressed for Reduce operations where the reduction axis is of length 1.
* Rare assertion failure addressed related to Tensor Columns.
* When running the DeepSparse Engine on a system with a non-uniform system topology, model compilation now properly terminates.

Known Issues:
* In rare cases, the engine may crash with an assertion failure during model compilation for a convolution with a 1x1 kernel with 2x2 convolution strides; hotfix forthcoming.
* The engine will crash with an assertion failure when setting the `num_streams` parameter to fewer than the number of NUMA nodes; hotfix forthcoming.
* In rare cases, the engine may enter an infinite loop when an operation has multiple inputs coming from the same source; hotfix forthcoming.
