New Features:
* Python 3.10 support added.
* [Zero-shot text classification pipeline](https://github.com/neuralmagic/deepsparse/blob/release/1.1/src/deepsparse/transformers/pipelines/zero_shot_text_classification.py) implemented.
* [Haystack Information Retrieval pipeline](https://github.com/neuralmagic/deepsparse/blob/release/1.1/src/deepsparse/transformers/haystack/pipeline.py) implemented.
* [YOLACT pipeline](https://github.com/neuralmagic/deepsparse/tree/release/1.1/src/deepsparse/yolact) native integration for deployments is available.
* DeepSparse pipelines now support dynamic batch, dynamic shape through bucketing, and asynchronous execution.
* [CustomTaskPipeline](https://github.com/neuralmagic/deepsparse/blob/release/1.1/src/deepsparse/pipelines/custom_pipeline.py) added to enable easier custom pipeline creation.
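
To illustrate what "dynamic shape through bucketing" generally means, here is a minimal sketch in plain Python. It is not DeepSparse's implementation: the function names `pick_bucket` and `pad_to_bucket`, the bucket sizes, and the padding value are all hypothetical. The idea is that a fixed set of input shapes is prepared ahead of time and each incoming input is padded up to the smallest shape that fits it.

```python
from bisect import bisect_left


def pick_bucket(length, buckets):
    """Return the smallest bucket size that can hold `length` items.

    Hypothetical helper for illustration; not part of the DeepSparse API.
    """
    ordered = sorted(buckets)
    idx = bisect_left(ordered, length)
    if idx == len(ordered):
        raise ValueError(f"input length {length} exceeds largest bucket {ordered[-1]}")
    return ordered[idx]


def pad_to_bucket(token_ids, buckets, pad_id=0):
    """Pad a list of token ids up to the size of its chosen bucket."""
    size = pick_bucket(len(token_ids), buckets)
    return token_ids + [pad_id] * (size - len(token_ids))


if __name__ == "__main__":
    # Assumed bucket sizes, chosen only for the example.
    buckets = [128, 256, 384]
    padded = pad_to_bucket(list(range(1, 131)), buckets)
    print(len(padded))
```

With a small number of buckets, each input pays only for padding up to the nearest bucket rather than up to the maximum supported length.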
Changes:
* The behavior of the Multi-stream scheduler is now identical to the Elastic scheduler, and the old Multi-stream scheduler has been removed.
* NLP pipelines for question answering, text classification, and token classification upgraded to improve accuracy and better match the SparseML training pathways.
* Updates made across the repository for new SparseZoo Python APIs.
* Max torchvision version increased to 0.12.0 for computer vision deployment pathways.
Performance:
* Inference performance improvements for:
  * Unstructured sparse quantized Transformer models.
  * Slow activation functions (such as Gelu or Swish) when they follow a QuantizeLinear operator.
  * Some sparse 1D convolutions, with observed speedups of up to 3x.
  * Squeeze, when operating on a single axis.
Resolved Issues:
* Assertion errors no longer occur when one node has multiple inputs that both come from the same node.
* An assertion error no longer occurs when a MatMul operator follows a Transpose or Reshape operator.
* Pipelines now support hyphenated versions of standard task names such as question-answering.
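
As an illustration of accepting hyphenated task names, the sketch below maps a hyphenated name onto an underscore form before lookup. This is an assumption about one way such matching could work, not the library's actual code; `normalize_task_name` is a hypothetical helper, and the underscore spelling as the canonical form is assumed here.

```python
def normalize_task_name(task: str) -> str:
    """Map a task name such as 'question-answering' to 'question_answering'.

    Hypothetical helper for illustration; not part of the DeepSparse API.
    """
    return task.strip().lower().replace("-", "_")


if __name__ == "__main__":
    print(normalize_task_name("question-answering"))
```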
Known Issues:
* In the C++ interface, the engine will crash with a segmentation fault when the `num_streams` provided to the `engine_context_t` is greater than the number of physical CPU cores.