TensorRT

Latest version: v10.9.0.34

Page 3 of 4

9.1.0

Key Features and Updates:

- Updated the [trt_python_plugin](samples/python/python_plugin) sample.
- Python plugins API reference is part of the official TRT Python API.
- Added samples demonstrating the usage of the progress monitor API.
  - Check [sampleProgressMonitor](samples/sampleProgressMonitor) for the C++ sample.
  - Check [simple_progress_monitor](samples/python/simple_progress_monitor) for the Python sample.
- Removed dependencies related to Python < 3.8 from the Python samples, which no longer support Python < 3.8.
- Demo changes
  - Added LAMBADA dataset accuracy checks in the [HuggingFace](demo/HuggingFace) demo.
  - Enabled structured sparsity and FP8 quantized batch matrix multiplications (BMMs) in attention in the [NeMo](demo/NeMo) demo.
  - Replaced deprecated APIs in the [BERT](demo/BERT) demo.
- Updated tooling
  - Polygraphy v0.49.1
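
The progress monitor samples mentioned above report build progress through callbacks. The following is a minimal sketch of that callback shape, assuming the three methods (`phase_start`, `step_complete`, `phase_finish`) of `trt.IProgressMonitor`. To stay runnable without TensorRT or a GPU, the class is plain Python, and `fake_build` is a hypothetical driver that only imitates how a builder might call the monitor. In a real build the class would subclass `trt.IProgressMonitor`; the linked samples show the actual wiring.

```python
class ConsoleProgressMonitor:
    """Plain-Python stand-in mirroring the callback shape of a build progress monitor."""

    def __init__(self):
        self.phases = {}        # phase name -> [steps done, total steps]
        self.log = []           # human-readable progress lines
        self.cancelled = False  # set to True to request an early stop

    def phase_start(self, phase_name, parent_phase, num_steps):
        # Called when a build phase begins; parent_phase is None for top-level phases.
        self.phases[phase_name] = [0, num_steps]
        self.log.append(f"start {phase_name} ({num_steps} steps)")

    def step_complete(self, phase_name, step):
        # Called after each step; returning False asks the build to stop.
        self.phases[phase_name][0] = step + 1
        done, total = self.phases[phase_name]
        self.log.append(f"{phase_name}: {done}/{total}")
        return not self.cancelled

    def phase_finish(self, phase_name):
        # Called when a phase ends, whether it completed or was cancelled.
        self.log.append(f"finish {phase_name}")
        del self.phases[phase_name]


def fake_build(monitor, phases):
    """Hypothetical driver (not a TensorRT API) that walks the monitor through phases."""
    for name, num_steps in phases:
        monitor.phase_start(name, None, num_steps)
        for step in range(num_steps):
            if not monitor.step_complete(name, step):
                monitor.phase_finish(name)
                return False
        monitor.phase_finish(name)
    return True
```

Calling `fake_build(ConsoleProgressMonitor(), [("parse", 2), ("optimize", 1)])` fills the monitor's `log` with one line per callback, which is a convenient way to check the ordering before moving the same methods into a real `trt.IProgressMonitor` subclass.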

9.0.1

Key Features and Updates:

- TensorRT plugin authoring in Python is now supported.
  - See the [trt_python_plugin](samples/python/python_plugin) sample for reference.
- Updated default CUDA version to 12.2
- Added support for BLIP models and for Seq2Seq and Vision2Seq abstractions in the HuggingFace demo.
- demoDiffusion refactoring and SDXL enhancements
- Additional validation asserts for NV Plugins
- Updated tooling
  - TensorRT Engine Explorer v0.1.7: graph rendering for TensorRT 9.0 `kgen` kernels
  - ONNX-GraphSurgeon v0.3.29
  - PyTorch quantization toolkit v2.2.0

9.0.0

Key Features and Updates:

- Added the NeMo demo to demonstrate the performance benefit of using E4M3 FP8 data type with the GPT models trained with the [NVIDIA NeMo Toolkit](https://github.com/NVIDIA/NeMo) and [TransformerEngine](https://github.com/NVIDIA/TransformerEngine).
- Demo Diffusion updates
  - Added SDXL 1.0 txt2img pipeline
  - Added ControlNet pipeline
- HuggingFace demo updates
  - Added Flan-T5, OPT, BLOOM, BLOOMZ, GPT-Neo, GPT-NeoX, Cerebras-GPT support with accuracy check
  - Refactored code and extracted common utils into Seq2Seq class
  - Optimized shape-changing overhead and achieved a >30% end-to-end performance gain
  - Added stable KV-cache, beam search and fp16 support for all models
  - Added dynamic batch size TRT inference
  - Added uneven-length multi-batch inference with attention_mask support
  - Added `chat` command (interactive CLI)
  - Upgraded PyTorch and HuggingFace version to support Hopper GPU
  - Updated notebooks with a much simplified demo API.
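
To illustrate what "uneven-length multi-batch inference with attention_mask support" involves, here is a minimal sketch of padding sequences of different lengths into one batch and building the matching mask. The helper name `pad_batch`, the pad token id of 0, and the choice of right-padding are assumptions made for illustration; the demo itself may pad differently.

```python
def pad_batch(sequences, pad_id=0):
    """Right-pad token-id lists to a common length and build a matching attention mask.

    pad_id=0 and right-padding are illustrative assumptions, not the demo's settings.
    """
    max_len = max(len(seq) for seq in sequences)
    input_ids, attention_mask = [], []
    for seq in sequences:
        pad = max_len - len(seq)
        input_ids.append(list(seq) + [pad_id] * pad)
        # 1 marks a real token, 0 marks padding that attention should ignore.
        attention_mask.append([1] * len(seq) + [0] * pad)
    return input_ids, attention_mask
```

The mask lets a single fixed-shape batch carry prompts of different lengths, because positions marked 0 are excluded from attention.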

- Added two new TensorRT samples, sampleProgressMonitor (C++) and simple_progress_reporter (Python), which show how to use the Progress Monitor during an engine build.
- The following plugins were deprecated:
  - ``BatchedNMS_TRT``
  - ``BatchedNMSDynamic_TRT``
  - ``BatchTilePlugin_TRT``
  - ``Clip_TRT``
  - ``CoordConvAC``
  - ``CropAndResize``
  - ``EfficientNMS_ONNX_TRT``
  - ``CustomGeluPluginDynamic``
  - ``LReLU_TRT``
  - ``NMSDynamic_TRT``
  - ``NMS_TRT``
  - ``Normalize_TRT``
  - ``Proposal``
  - ``SingleStepLSTMPlugin``
  - ``SpecialSlice_TRT``
  - ``Split``
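
When upgrading to 9.0, it can help to check which plugins a project uses against the deprecation list above. The sketch below does that with a plain set lookup. The helper name `find_deprecated` is hypothetical, and the sketch assumes you already have the plugin names your model references as a list of strings; it does not extract them from a model.

```python
# Plugin names copied from the 9.0.0 deprecation list above.
DEPRECATED_PLUGINS = frozenset({
    "BatchedNMS_TRT",
    "BatchedNMSDynamic_TRT",
    "BatchTilePlugin_TRT",
    "Clip_TRT",
    "CoordConvAC",
    "CropAndResize",
    "EfficientNMS_ONNX_TRT",
    "CustomGeluPluginDynamic",
    "LReLU_TRT",
    "NMSDynamic_TRT",
    "NMS_TRT",
    "Normalize_TRT",
    "Proposal",
    "SingleStepLSTMPlugin",
    "SpecialSlice_TRT",
    "Split",
})


def find_deprecated(plugin_names):
    """Return the sorted, de-duplicated names that appear in the deprecation list."""
    return sorted(set(plugin_names) & DEPRECATED_PLUGINS)
```

For example, `find_deprecated(["NMS_TRT", "MyCustomPlugin", "Split"])` returns only the two listed names, leaving the made-up `MyCustomPlugin` out.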

- Ubuntu 18.04 has reached end of life and is no longer supported by TensorRT starting with 9.0, and the corresponding Dockerfile(s) have been removed.
- Support for aarch64 builds will not be available in this release, and the corresponding Dockerfiles have been removed.
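
For readers unfamiliar with the E4M3 FP8 data type named in the NeMo demo entry for this release, the following sketch decodes a single E4M3 byte into a Python float. It assumes the commonly published E4M3 layout (1 sign bit, 4 exponent bits, 3 mantissa bits, exponent bias 7, no infinities, one NaN bit pattern per sign); the function name `decode_e4m3` is made up for illustration and is not a TensorRT API.

```python
import math


def decode_e4m3(byte):
    """Decode one E4M3 byte (assumed layout: 1 sign, 4 exponent, 3 mantissa bits, bias 7)."""
    sign = -1.0 if byte & 0x80 else 1.0
    exponent = (byte >> 3) & 0xF
    mantissa = byte & 0x7
    if exponent == 0xF and mantissa == 0x7:
        # All exponent and mantissa bits set is NaN; the format has no infinities.
        return math.nan
    if exponent == 0:
        # Subnormal: no implicit leading 1, fixed exponent of -6.
        return sign * (mantissa / 8.0) * 2.0 ** -6
    # Normal: implicit leading 1, biased exponent.
    return sign * (1.0 + mantissa / 8.0) * 2.0 ** (exponent - 7)
```

Under this layout the largest finite magnitude is 448, which is why FP8 quantization typically pairs the format with per-tensor scaling factors.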

8.6.1

TensorRT OSS release corresponding to TensorRT 8.6.1.6 GA release.
- Updates since [TensorRT 8.6.0 EA release](https://docs.nvidia.com/deeplearning/tensorrt/release-notes/#rel-8-6-0-EA).
- Please refer to the [TensorRT 8.6.1.6 GA release notes](https://docs.nvidia.com/deeplearning/tensorrt/release-notes/#rel-8-6-1) for more information.

Key Features and Updates:

- Added a new flag `--use-cuda-graph` to demoDiffusion to improve performance.
- Optimized GPT2 and T5 HuggingFace demos to use fp16 I/O tensors for fp16 networks.

8.6.0

TensorRT OSS release corresponding to TensorRT 8.6.0.12 EA release.
- Updates since [TensorRT 8.5.3 GA release](https://docs.nvidia.com/deeplearning/tensorrt/release-notes/#rel-8-5-3).
- Please refer to the [TensorRT 8.6.0.12 EA release notes](https://docs.nvidia.com/deeplearning/tensorrt/release-notes/#rel-8-6-0-EA) for more information.

Key Features and Updates:

- demoDiffusion acceleration is now supported out of the box in TensorRT without requiring plugins.
  - The following plugins have been removed accordingly: GroupNorm, LayerNorm, MultiHeadCrossAttention, MultiHeadFlashAttention, SeqLen2Spatial, and SplitGeLU.
- Added a new sample called onnx_custom_plugin.

8.5.3

TensorRT OSS release corresponding to TensorRT 8.5.3.1 GA release.
- Updates since [TensorRT 8.5.2 GA release](https://docs.nvidia.com/deeplearning/tensorrt/release-notes/#rel-8-5-2).
- Please refer to the [TensorRT 8.5.3 GA release notes](https://docs.nvidia.com/deeplearning/tensorrt/release-notes/#rel-8-5-3) for more information.

Key Features and Updates:

- Added the following HuggingFace demos: GPT-J-6B, GPT2-XL, and GPT2-Medium
- Added nvinfer1::plugin namespace
- Optimized KV Cache performance for T5
