Tensorrt

Latest version: v10.9.0.34


Page 1 of 4

10.8.0

Key Features and Updates:

- Demo changes
  - demoDiffusion
    - Added [Image-to-Image](demo/Diffusion#generate-an-image-guided-by-an-initial-image-and-a-text-prompt-using-flux) support for Flux.1-dev and Flux.1-schnell pipelines.
    - Added [ControlNet](demo/Diffusion#generate-an-image-guided-by-a-text-prompt-and-a-control-image-using-flux-controlnet) support for [FLUX.1-Canny-dev](https://huggingface.co/black-forest-labs/FLUX.1-Canny-dev) and [FLUX.1-Depth-dev](https://huggingface.co/black-forest-labs/FLUX.1-Depth-dev) pipelines. Native FP8 quantization is also supported for these pipelines.
    - Added support for an ONNX-export-only mode. See [--onnx-export-only](demo/Diffusion#use-separate-directories-for-individual-onnx-models).
    - Added FP16, BF16, FP8, and FP4 support for all Flux pipelines.
- Plugin changes
  - Added SM 100 and SM 120 support to `bertQKVToContextPlugin`. This enables demo/BERT on Blackwell GPUs.
- Sample changes
  - Added a new `sampleEditableTimingCache` to demonstrate how to build an engine with the desired tactics by modifying the timing cache.
  - Deleted the `sampleAlgorithmSelector` sample.
  - Fixed `sampleOnnxMNIST` by updating it to use the correct INT8 dynamic range.
- Parser changes
  - Added support for `FLOAT4E2M1` types for quantized networks.
  - Added support for dynamic axes and improved performance of `CumSum` operations.
  - Fixed the import of local functions when their input tensor names aliased one from an outside scope.
  - Added support for `Pow` ops with integer-typed exponent values.
- Fixed issues
  - Fixed segmentation of boolean constant nodes - [#4224](https://github.com/NVIDIA/TensorRT/issues/4224).
  - Fixed accuracy issue when multiple optimization profiles were defined - [#4250](https://github.com/NVIDIA/TensorRT/issues/4250).
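
For readers unfamiliar with the `FLOAT4E2M1` type mentioned under Parser changes, the following is a minimal sketch of how a 4-bit E2M1 code maps to a real value. It assumes the usual E2M1 layout (1 sign bit, 2 exponent bits, 1 mantissa bit, exponent bias 1, no infinities or NaNs). The function name `decode_fp4_e2m1` is hypothetical and this is not TensorRT or parser code.

```python
def decode_fp4_e2m1(nibble: int) -> float:
    """Decode a 4-bit E2M1 code (0..15) into a Python float.

    Assumed layout (illustrative, not taken from the release notes):
    bit 3 = sign, bits 2..1 = exponent (bias 1), bit 0 = mantissa.
    """
    if not 0 <= nibble <= 0xF:
        raise ValueError("expected a 4-bit value in the range 0..15")
    sign = -1.0 if nibble & 0b1000 else 1.0
    exponent = (nibble >> 1) & 0b11
    mantissa = nibble & 0b1
    if exponent == 0:
        # Subnormal: no implicit leading 1, so only 0 and 0.5 are possible.
        return sign * mantissa * 0.5
    # Normal: implicit leading 1 and an exponent bias of 1.
    return sign * (1.0 + 0.5 * mantissa) * 2.0 ** (exponent - 1)


# All non-negative values representable under this layout.
FP4_VALUES = [decode_fp4_e2m1(code) for code in range(8)]
```

Under these assumptions the type has only eight distinct magnitudes, which is why it is used together with scaling factors in quantized networks rather than as a general-purpose float.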

10.7.0

Key Features and Updates:

- Demo Changes
  - demoDiffusion
    - Enabled low-vram for the Flux pipeline. Users can now run the pipelines on systems with 32GB VRAM.
    - Added support for [FLUX.1-schnell](https://huggingface.co/black-forest-labs/FLUX.1-schnell) pipeline.
    - Enabled weight streaming mode for Flux pipeline.

- Plugin Changes
  - On Blackwell and later platforms, TensorRT will drop cuDNN support for the following categories of plugins:
    - User-written `IPluginV2Ext`, `IPluginV2DynamicExt`, and `IPluginV2IOExt` plugins that are dependent on cuDNN handles provided by TensorRT (via the `attachToContext()` API).
    - TensorRT standard plugins that use cuDNN, specifically:
      - `InstanceNormalization_TRT` (version: 1, 2, and 3) present in `plugin/instanceNormalizationPlugin/`.
      - `GroupNormalizationPlugin` (version: 1) present in `plugin/groupNormalizationPlugin/`.
      - Note: These normalization plugins are superseded by TensorRT’s native `INormalizationLayer` ([C++](https://docs.nvidia.com/deeplearning/tensorrt/api/c_api/classnvinfer1_1_1_i_normalization_layer.html), [Python](https://docs.nvidia.com/deeplearning/tensorrt/operators/docs/Normalization.html)).
  - TensorRT support for cuDNN-dependent plugins remains unchanged on pre-Blackwell platforms.

- Parser Changes
  - The parser now prioritizes plugins over local functions when a corresponding plugin is available in the registry.
  - Added dynamic axes support for `Squeeze` and `Unsqueeze` operations.
  - Added support for parsing mixed-precision `BatchNormalization` nodes in strongly-typed mode.

- Addressed Issues
  - Fixed [#4113](https://github.com/NVIDIA/TensorRT/issues/4113).

10.6.0

Key Features and Updates:
- Demo Changes
  - demoBERT: The use of `fcPlugin` in demoBERT has been removed.
  - demoBERT: All TensorRT plugins used in demoBERT (`CustomEmbLayerNormDynamic`, `CustomSkipLayerNormDynamic`, and `CustomQKVToContextDynamic`) now have versions that inherit from `IPluginV3` interface classes. Users can opt in to these V3 plugins by passing `--use-v3-plugins` to the builder scripts.
    - Opting in to V3 plugins does not affect performance, I/O, or plugin attributes.
    - There is a known issue in the V3 (version 4) `CustomQKVToContextDynamic` plugin from TensorRT 10.6.0 that causes an internal assertion error if either the batch or sequence dimension differs at runtime from the one used to serialize the engine. See the “known issues” section of the [TensorRT-10.6.0 release notes](https://docs.nvidia.com/deeplearning/tensorrt/release-notes/index.html#rel-10-6-0).
    - For smoother migration, the builder scripts still default to the deprecated `IPluginV2DynamicExt`-derived plugins when `--use-v3-plugins` isn't specified. The flag `--use-deprecated-plugins` was added as an explicit way to enforce the default behavior, and is mutually exclusive with `--use-v3-plugins`.
  - demoDiffusion
    - Introduced BF16 and FP8 support for the [Flux.1-dev](demo/Diffusion#generate-an-image-guided-by-a-text-prompt-using-flux) pipeline.
    - Expanded FP8 support on Ada platforms.
    - Enabled LoRA adapter compatibility for SDv1.5, SDv2.1, and SDXL pipelines using Diffusers version 0.30.3.
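
The interaction between `--use-v3-plugins` and `--use-deprecated-plugins` described for demoBERT can be sketched with `argparse`. This is an illustrative reconstruction, not the actual demoBERT builder script: the function names `build_arg_parser` and `selects_v3_plugins` are hypothetical, and only the two flag names and their documented behavior (mutually exclusive, deprecated plugins by default) come from the notes above.

```python
import argparse
from typing import List


def build_arg_parser() -> argparse.ArgumentParser:
    """Build a parser with the two plugin-selection flags (hypothetical sketch)."""
    parser = argparse.ArgumentParser(
        description="Illustrative builder options; not the real demoBERT script."
    )
    # The two flags are documented as mutually exclusive.
    group = parser.add_mutually_exclusive_group()
    group.add_argument(
        "--use-v3-plugins",
        action="store_true",
        help="Opt in to the IPluginV3-based plugin versions.",
    )
    group.add_argument(
        "--use-deprecated-plugins",
        action="store_true",
        help="Explicitly request the default IPluginV2DynamicExt-derived plugins.",
    )
    return parser


def selects_v3_plugins(argv: List[str]) -> bool:
    """Return True only when the V3 plugins were explicitly requested."""
    args = build_arg_parser().parse_args(argv)
    # With neither flag set, the deprecated plugins remain the default.
    return args.use_v3_plugins
```

With this arrangement, passing both flags makes `argparse` exit with a usage error, while passing neither yields the same result as `--use-deprecated-plugins`.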

- Sample Changes
  - Added the Python sample [quickly_deployable_plugins](samples/python/quickly_deployable_plugins), which demonstrates quickly deployable Python-based plugin definitions (QDPs) in TensorRT. QDPs are a simple and intuitive decorator-based approach to defining TensorRT plugins, requiring drastically less code.

- Plugin Changes
  - The `fcPlugin` has been deprecated. Its functionality has been superseded by the [IMatrixMultiplyLayer](https://docs.nvidia.com/deeplearning/tensorrt/api/c_api/classnvinfer1_1_1_i_matrix_multiply_layer.html) that is natively provided by TensorRT.
  - Migrated `IPluginV2`-descendent version 1 of `CustomEmbLayerNormDynamic` to version 6, which implements `IPluginV3`.
    - The newer versions preserve the attributes and I/O of the corresponding older plugin version.
    - The older plugin versions are deprecated and will be removed in a future release.

- Parser Changes
  - Updated ONNX submodule version to 1.17.0.
  - Fixed issue where conditional layers were incorrectly being added.
  - Updated local function metadata to contain more information.
  - Added support for parsing nodes with Quickly Deployable Plugins.
  - Fixed handling of optional outputs.

- Tool Updates
  - ONNX-GraphSurgeon updated to version 0.5.3.
  - Polygraphy updated to version 0.49.14.

10.5.0

Key Features and Updates:

- Demo changes
  - Added [Flux.1-dev](demo/Diffusion) pipeline.
- Sample changes
  - None
- Plugin changes
  - Migrated `IPluginV2`-descendent versions of `bertQKVToContextPlugin` (1, 2, 3) to newer versions (4, 5, 6 respectively) which implement `IPluginV3`.
  - Note:
    - The newer versions preserve the attributes and I/O of the corresponding older plugin version.
    - The older plugin versions are deprecated and will be removed in a future release.
- Quickstart guide
  - None
- Parser changes
  - Added support for real-valued `STFT` operations.
  - Improved error handling in `IParser`.

Known issues:

- Demos:
  - The TensorRT engine might not be built successfully when using the `--fp8` flag on H100 GPUs.

10.4.0

Key Features and Updates:

- Demo changes
  - Added [Stable Cascade](demo/Diffusion) pipeline.
  - Enabled INT8 and FP8 quantization for Stable Diffusion v1.5, v2.0 and v2.1 pipelines.
  - Enabled FP8 quantization for Stable Diffusion XL pipeline.
- Sample changes
  - Added a new Python sample `aliased_io_plugin`, which demonstrates how in-place updates to plugin inputs can be achieved through I/O aliasing.
- Plugin changes
  - Migrated `IPluginV2`-descendent versions (a) of the following plugins to newer versions (b) which implement `IPluginV3` (a->b):
    - scatterElementsPlugin (1->2)
    - skipLayerNormPlugin (1->5, 2->6, 3->7, 4->8)
    - embLayerNormPlugin (2->4, 3->5)
    - bertQKVToContextPlugin (1->4, 2->5, 3->6)
  - Note
    - The newer versions preserve the attributes and I/O of the corresponding older plugin version.
    - The older plugin versions are deprecated and will be removed in a future release.
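
The (a->b) version pairs listed above can be captured in a small lookup table, which may help when auditing code that still requests a deprecated plugin version. This is a sketch only: the keys are the plugin directory names exactly as they appear in the list above (the names under which the plugins are registered may differ), and the helper `v3_version_for` is hypothetical.

```python
from typing import Dict

# Old (IPluginV2-descendent) version -> new (IPluginV3) version, per plugin,
# copied from the 10.4.0 list above.
V2_TO_V3_PLUGIN_VERSIONS: Dict[str, Dict[int, int]] = {
    "scatterElementsPlugin": {1: 2},
    "skipLayerNormPlugin": {1: 5, 2: 6, 3: 7, 4: 8},
    "embLayerNormPlugin": {2: 4, 3: 5},
    "bertQKVToContextPlugin": {1: 4, 2: 5, 3: 6},
}


def v3_version_for(plugin: str, old_version: int) -> int:
    """Return the IPluginV3 version that replaces a deprecated plugin version.

    Raises KeyError when the plugin or version is not part of the 10.4.0 list.
    """
    try:
        return V2_TO_V3_PLUGIN_VERSIONS[plugin][old_version]
    except KeyError:
        raise KeyError(
            f"no documented IPluginV3 replacement for {plugin} version {old_version}"
        ) from None
```

For example, `v3_version_for("skipLayerNormPlugin", 3)` returns `7`, matching the `3->7` entry in the list.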

- Quickstart guide
  - Updated deploy_to_triton guide and removed legacy APIs.
  - Removed legacy TF-TRT code as the project is no longer supported.
  - Removed quantization_tutorial as pytorch_quantization has been deprecated. Check out https://github.com/NVIDIA/TensorRT-Model-Optimizer for the latest quantization support. Check [Stable Diffusion XL (Base/Turbo) and Stable Diffusion 1.5 Quantization with Model Optimizer](https://github.com/NVIDIA/TensorRT-Model-Optimizer/tree/main/diffusers/quantization) for integration with TensorRT.
- Parser changes
  - Added support for tensor `axes` for `Pad` operations.
  - Added support for `BlackmanWindow`, `HammingWindow`, and `HannWindow` operations.
  - Improved error handling in `IParserRefitter`.
  - Fixed kernel shape inference in multi-input convolutions.
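
To illustrate what the `axes` input of `Pad` means, here is a minimal sketch of how a partial `pads` list could be expanded to a full-rank one. It assumes the ONNX convention that `pads` holds all begin values followed by all end values, in the order given by `axes`, and that negative axes count from the back. The helpers `expand_pads` and `padded_shape` are hypothetical and do not reflect the parser's implementation.

```python
from typing import List, Sequence


def expand_pads(rank: int, pads: Sequence[int], axes: Sequence[int]) -> List[int]:
    """Expand per-axis pads into the full [begins..., ends...] layout.

    Axes that are not listed receive zero padding.
    """
    if len(pads) != 2 * len(axes):
        raise ValueError("pads must contain a begin and an end value per axis")
    begins = [0] * rank
    ends = [0] * rank
    for i, axis in enumerate(axes):
        # Negative axes count from the last dimension.
        normalized = axis + rank if axis < 0 else axis
        if not 0 <= normalized < rank:
            raise ValueError(f"axis {axis} is out of range for rank {rank}")
        begins[normalized] = pads[i]
        ends[normalized] = pads[len(axes) + i]
    return begins + ends


def padded_shape(shape: Sequence[int], pads: Sequence[int], axes: Sequence[int]) -> List[int]:
    """Compute the output shape of Pad for the given shape, pads, and axes."""
    rank = len(shape)
    full = expand_pads(rank, pads, axes)
    return [dim + full[i] + full[rank + i] for i, dim in enumerate(shape)]
```

For example, padding only the two spatial axes of a `[1, 3, 8, 8]` tensor with `pads=[1, 2, 1, 2]` and `axes=[2, -1]` gives a `[1, 3, 10, 12]` output under these assumptions.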

- Updated tooling
  - polygraphy-extension-trtexec v0.0.9
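
The `HannWindow`, `HammingWindow`, and `BlackmanWindow` operators listed under Parser changes for 10.4.0 all belong to the generalized cosine window family. The sketch below computes them in plain Python. The coefficients and the periodic/symmetric handling follow my reading of the ONNX operator definitions and should be treated as assumptions. The function name `cosine_window` is hypothetical.

```python
import math
from typing import List

# (a0, a1, a2) in w[n] = a0 - a1*cos(2*pi*n/N) + a2*cos(4*pi*n/N).
# Assumed coefficients; verify against the ONNX operator documentation.
_WINDOW_COEFFICIENTS = {
    "hann": (0.5, 0.5, 0.0),
    "hamming": (25.0 / 46.0, 21.0 / 46.0, 0.0),
    "blackman": (0.42, 0.5, 0.08),
}


def cosine_window(kind: str, size: int, periodic: bool = True) -> List[float]:
    """Return `size` samples of the requested window.

    A periodic window divides by `size`; a symmetric one divides by `size - 1`.
    """
    a0, a1, a2 = _WINDOW_COEFFICIENTS[kind]
    denominator = size if periodic else size - 1
    if denominator < 1:
        raise ValueError("window size is too small for the requested mode")
    return [
        a0
        - a1 * math.cos(2.0 * math.pi * n / denominator)
        + a2 * math.cos(4.0 * math.pi * n / denominator)
        for n in range(size)
    ]
```

A symmetric Hann window of size 5, `cosine_window("hann", 5, periodic=False)`, evaluates to approximately `[0, 0.5, 1, 0.5, 0]`.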

10.3.0

Key Features and Updates:

- Demo changes
  - Added [Stable Video Diffusion](demo/Diffusion) (`SVD`) pipeline.
- Plugin changes
  - Deprecated Version 1 of [ScatterElements plugin](plugin/scatterElementsPlugin). It is superseded by Version 2, which implements the `IPluginV3` interface.
- Quickstart guide
  - Updated the [SemanticSegmentation](quickstart/SemanticSegmentation) guide with latest APIs.
- Parser changes
  - Added support for tensor `axes` inputs for `Slice` node.
  - Updated `ScatterElements` importer to use Version 2 of [ScatterElements plugin](plugin/scatterElementsPlugin), which implements the `IPluginV3` interface.
- Updated tooling
  - Polygraphy v0.49.13
