# Announcements
* For Execution Provider maintainers/owners: the [lightweight compile API](https://github.com/microsoft/onnxruntime/blob/master/include/onnxruntime/core/framework/execution_provider.h#L249) is now the default compile API for all Execution Providers (previously it was only available for the mobile build). If you have an EP using the [legacy compiler API](https://github.com/microsoft/onnxruntime/blob/master/include/onnxruntime/core/framework/execution_provider.h#L237), please migrate to the lightweight compile API as soon as possible. The legacy API will be deprecated in the next release (ORT 1.13).
* netstandard1.1 support is being deprecated in this release and will be removed in the next release (ORT 1.13).
# Key Updates
## General
* ONNX spec support
* ONNX opset 17
* ONNX-ML opset 3 (TreeEnsemble update)
* BeamSearch operator for encoder-decoder transformer models
* Support for invoking individual ops without the need to create a separate graph
* For use with custom op development to reuse ORT code
* Support for feeding external initializers (for large models) as byte arrays for model inferencing (see the sketch after this list)
* Build switch to disable usage of the abseil library to remove the dependency
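A minimal Python sketch of the external-initializer feature, assuming the `SessionOptions.add_external_initializers` binding that takes parallel lists of initializer names and `OrtValue`s; the model and tensor names here are hypothetical:

```python
import numpy as np
import onnxruntime as ort

# Large weights held in memory (e.g. unpacked from an archive) rather than in
# the external-data file the model references on disk.
weights = np.fromfile("weights.bin", dtype=np.float32).reshape(1024, 1024)
weight_value = ort.OrtValue.ortvalue_from_numpy(weights)

so = ort.SessionOptions()
# Names must match the external initializer names recorded in the model.
so.add_external_initializers(["encoder.weight"], [weight_value])

sess = ort.InferenceSession("big_model.onnx", so, providers=["CPUExecutionProvider"])
```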
## Packages
* Python 3.10 support
* Mac M1 support in Python and Java packages
* .NET 6/MAUI support in NuGet C# package
* Additional target frameworks: net6.0, net6.0-android, net6.0-ios, net6.0-macos
* NOTE: netstandard1.1 support is being deprecated in this release and will be removed in the 1.13 release
* [onnxruntime-openvino](https://pypi.org/project/onnxruntime-openvino/1.12.0/) package available on PyPI (from Intel)
## Performance and Quantization
* Improved C++ APIs that now utilize RAII for better memory management
* Operator performance optimizations, including GatherElements
* Memory optimizations to support compute-intensive real-time inferencing scenarios (e.g. audio inferencing)
* CPU usage savings for infrequent inference requests by reducing thread spinning (see the sketch after this list)
* Memory usage reduction through use of containers from the abseil library, especially inlined vectors used to store tensor shapes and inlined hash maps
* New quantized kernels for weight symmetry to improve performance on ARM64 little core (GEMM and Conv)
* Specialized kernel that improves performance of quantized Resize by up to 2x
* Improved thread job partitioning for QLinearConv, yielding up to ~20% performance gain for certain models
* Quantization tool: improved ONNX shape inference for large models
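A minimal sketch of the thread-spinning control, using the `session.intra_op.allow_spinning` session config key (defined in onnxruntime_session_options_config_keys.h):

```python
import onnxruntime as ort

# Disable intra-op thread spinning: worker threads sleep between requests
# instead of busy-waiting, trading a little latency for lower idle CPU usage.
so = ort.SessionOptions()
so.add_session_config_entry("session.intra_op.allow_spinning", "0")

sess = ort.InferenceSession("model.onnx", so, providers=["CPUExecutionProvider"])
```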
## Execution Providers
* TensorRT EP
* TensorRT 8.4 support
* Provide option to share execution context memory between TensorRT subgraphs
* Work around long CI test times caused by frequent initialization/de-initialization of the TensorRT builder
* Improve subgraph partitioning and consolidate TensorRT subgraphs when possible
* Refactor engine cache serialization/deserialization logic (see the sketch after this list)
* Miscellaneous bug fixes and performance improvements
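For example, engine caching can be enabled through TensorRT EP provider options so that serialized engines are reused across runs. The `trt_*` keys below are standard TensorRT EP options, though availability of individual options can vary by version, so treat this as an illustrative sketch:

```python
import onnxruntime as ort

trt_options = {
    "trt_engine_cache_enable": True,         # serialize/deserialize built engines
    "trt_engine_cache_path": "./trt_cache",  # directory for cached engines
    "trt_fp16_enable": True,                 # allow FP16 kernels where supported
}

sess = ort.InferenceSession(
    "model.onnx",
    providers=[
        ("TensorrtExecutionProvider", trt_options),
        "CUDAExecutionProvider",  # fallback for nodes TensorRT cannot take
        "CPUExecutionProvider",
    ],
)
```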
* OpenVINO EP
* Pre-built ONNX Runtime binaries with OpenVINO now available on PyPI: [onnxruntime-openvino](https://pypi.org/project/onnxruntime-openvino/1.12.0/)
* Performance optimizations of existing supported models
* New runtime configuration option `enable_dynamic_shapes` added to enable dynamic shapes for each iteration (see the sketch after this list)
* ORTModule included as part of OVEP Python Package to enable Torch ORT Inference
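A minimal sketch of passing the new option through OpenVINO EP provider options; the `device_type` value and the exact option spellings are assumptions based on the OVEP documentation, so check the docs for the values your build supports:

```python
import onnxruntime as ort

ov_options = {
    "device_type": "CPU_FP32",      # target OpenVINO device (assumed value)
    "enable_dynamic_shapes": True,  # reshape per iteration for dynamic inputs
}

sess = ort.InferenceSession(
    "model.onnx",
    providers=[("OpenVINOExecutionProvider", ov_options), "CPUExecutionProvider"],
)
```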
* DirectML EP
* Updated to [DirectML 1.9](https://github.com/microsoft/DirectML/blob/master/Releases.md#directml-190)
* Opset 13-15 support: [#11827](https://github.com/microsoft/onnxruntime/pull/11827), [#11814](https://github.com/microsoft/onnxruntime/pull/11814), [#11782](https://github.com/microsoft/onnxruntime/pull/11782), [#11772](https://github.com/microsoft/onnxruntime/pull/11772)
* Bug fixes: [Xbox command list reuse](https://github.com/microsoft/onnxruntime/pull/12063), [descriptor heap reset](https://github.com/microsoft/onnxruntime/pull/12059), [command allocator memory growth](https://github.com/microsoft/onnxruntime/pull/12114), [negative pad counts](https://github.com/microsoft/onnxruntime/pull/11974), [node suffix removal](https://github.com/microsoft/onnxruntime/pull/11879)
* TVM EP - [details](https://onnxruntime.ai/docs/execution-providers/TVM-ExecutionProvider.html)
* Updated to add model .dll ingestion and execution on Windows
* Updated documentation and CI tests
* ***[New]*** SNPE EP - [details](https://onnxruntime.ai/docs/execution-providers/SNPE-ExecutionProvider.html)
* ***[Preview]*** XNNPACK EP - initial infrastructure with limited operator support, for use with ORT Mobile and ORT Web
* Currently supports Conv and MaxPool, with work in progress to add more kernels (see the usage sketch below)
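A minimal usage sketch, assuming a build with XNNPACK enabled (e.g. built with `--use_xnnpack`); unsupported ops fall back to the CPU EP:

```python
import onnxruntime as ort

# Ask for XNNPACK first; anything it cannot handle (most ops at this stage)
# is assigned to the default CPU EP.
sess = ort.InferenceSession(
    "model.onnx",
    providers=["XnnpackExecutionProvider", "CPUExecutionProvider"],
)
print(sess.get_providers())  # confirms which providers were registered
```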
## Mobile
* Binary size reductions in Android minimal build - 12% reduction in size of base build with no operator kernels
* Added new operator support to the NNAPI and CoreML EPs to improve the ability to run super-resolution and BERT models using the NPU
* NNAPI: DepthToSpace, PRelu, Gather, Unsqueeze, Pad
* CoreML: DepthToSpace, PRelu
* Added [Docker file](https://onnxruntime.ai/docs/build/custom.html#android) to simplify running a custom minimal build to create an ORT Android package
* Initial XNNPACK EP compatibility
## Web
* Memory usage optimizations
* Initial XNNPACK EP compatibility
## ORT Training
* ***[New]*** ORT Training acceleration is also natively available through [HuggingFace Optimum](https://github.com/huggingface/optimum#training)
* ***[New]*** FusedAdam optimizer now available through the torch-ort package for easier training integration (see the sketch after this list)
* FP16_Optimizer support for more DeepSpeed versions
* Bfloat16 support for AtenOp
* Added gradient ops for ReduceMax and ReduceMin
* Updates to Min and Max grad ops to use distributed logic
* Optimizations
* Optimized performance of the Gelu and GeluGrad kernels for mixed-precision models
* Enabled fusions for SimplifiedLayerNorm
* Added bitmask versions of Dropout, BiasDropout and DropoutGrad, which bring ~8x space savings for the mask output.
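A minimal training sketch combining `ORTModule` from torch-ort with FusedAdam; the `FusedAdam` import path is an assumption, so consult the torch-ort / onnxruntime-training docs for your installed version:

```python
import torch
from torch_ort import ORTModule
from onnxruntime.training.optim import FusedAdam  # assumed import path

# Wrap an ordinary PyTorch model; forward/backward now run through ORT.
model = ORTModule(torch.nn.Sequential(
    torch.nn.Linear(784, 128), torch.nn.ReLU(), torch.nn.Linear(128, 10)))
optimizer = FusedAdam(model.parameters(), lr=1e-3)

x = torch.randn(32, 784)
y = torch.randint(0, 10, (32,))
loss = torch.nn.functional.cross_entropy(model(x), y)
loss.backward()
optimizer.step()
```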
# Known issues
* The [Microsoft.ML.OnnxRuntime.DirectML](https://www.nuget.org/packages/Microsoft.ML.OnnxRuntime.DirectML) package on NuGet has an issue and will be fixed in a patch. Fix: #12368
* The [Maven package](https://search.maven.org/artifact/com.microsoft.onnxruntime/onnxruntime) has a packaging issue for Mac M1 builds and will be fixed in a patch. Fix: #12335 / [Workaround discussion](https://github.com/microsoft/onnxruntime/issues/11054#issuecomment-1195391571)
* Windows builds are not compatible with Windows 8.x in this release. Please use v1.11 for now.
---
# Contributions
Contributors to ONNX Runtime include members across teams at Microsoft, along with our community members:
[snnn](https://github.com/snnn), [edgchen1](https://github.com/edgchen1), [fdwr](https://github.com/fdwr), [skottmckay](https://github.com/skottmckay), [iK1D](https://github.com/iK1D), [fs-eire](https://github.com/fs-eire), [mszhanyi](https://github.com/mszhanyi), [WilBrady](https://github.com/WilBrady), [justinchuby](https://github.com/justinchuby), [tianleiwu](https://github.com/tianleiwu), [PeixuanZuo](https://github.com/PeixuanZuo), [garymm](https://github.com/garymm), [yufenglee](https://github.com/yufenglee), [adrianlizarraga](https://github.com/adrianlizarraga), [yuslepukhin](https://github.com/yuslepukhin), [dependabot[bot]](https://github.com/dependabot[bot]), [chilo-ms](https://github.com/chilo-ms), [vvchernov](https://github.com/vvchernov), [oliviajain](https://github.com/oliviajain), [ytaous](https://github.com/ytaous), [hariharans29](https://github.com/hariharans29), [sumitsays](https://github.com/sumitsays), [wangyems](https://github.com/wangyems), [pengwa](https://github.com/pengwa), [baijumeswani](https://github.com/baijumeswani), [smk2007](https://github.com/smk2007), [RandySheriffH](https://github.com/RandySheriffH), [gramalingam](https://github.com/gramalingam), [xadupre](https://github.com/xadupre), [yihonglyu](https://github.com/yihonglyu), [zhangyaobit](https://github.com/zhangyaobit), [YUNQIUGUO](https://github.com/YUNQIUGUO), [jcwchen](https://github.com/jcwchen), [chenfucn](https://github.com/chenfucn), [souptc](https://github.com/souptc), [chandru-r](https://github.com/chandru-r), [jstoecker](https://github.com/jstoecker), [hanbitmyths](https://github.com/hanbitmyths), [RyanUnderhill](https://github.com/RyanUnderhill), [georgen117](https://github.com/georgen117), [jywu-msft](https://github.com/jywu-msft), [mindest](https://github.com/mindest), [sfatimar](https://github.com/sfatimar), [HectorSVC](https://github.com/HectorSVC), [Craigacp](https://github.com/Craigacp), [jeffdaily](https://github.com/jeffdaily), [zhijxu-MS](https://github.com/zhijxu-MS), [natke](https://github.com/natke), [stevenlix](https://github.com/stevenlix), [jeffbloo](https://github.com/jeffbloo), [guoyu-wang](https://github.com/guoyu-wang), [daquexian](https://github.com/daquexian), [faxu](https://github.com/faxu), [jingyanwangms](https://github.com/jingyanwangms), [adtsai](https://github.com/adtsai), [wschin](https://github.com/wschin), [weixingzhang](https://github.com/weixingzhang), [wenbingl](https://github.com/wenbingl), [MaajidKhan](https://github.com/MaajidKhan), [ashbhandare](https://github.com/ashbhandare), [ajindal1](https://github.com/ajindal1), [zhanghuanrong](https://github.com/zhanghuanrong), [tiagoshibata](https://github.com/tiagoshibata), [askhade](https://github.com/askhade), [liqunfu](https://github.com/liqunfu)