This is a patch release containing the following changes to v2.6.2:
* Fixed potential integer overflow in BRGEMM-based convolution implementation (deb5595a0f96b54f9106cb846e6fc4e0af49aadf)
* Fixed a defect with incorrect caching of BRGEMM-based matmul primitive implementations with trivial dimensions (305bed526492f2400a1a7fdfcb54b0ee41adc67e)
* Extended benchdnn performance benchmarking capabilities on GPU with a device-side performance measurement mode (ba8632592018070a46e4d349bbe3628756022c15)
* Fixed segfault in pooling primitive on CPUs (689d874bbf0a3e1bdc75e99ad2453e6aac9cfe84)
graph-v0.7
This is the Beta Update release for oneDNN Graph API based on [oneDNN v2.7 release](https://github.com/oneapi-src/oneDNN/releases/tag/v2.7).
Functionality
* Added operations `Select`, `LogicalAnd`, `LogicalOr`, `LogicalXor`, `LogicalNot`, `Greater`, `GreaterEqual`, `Equal`, `NotEqual`, `Less`, and `LessEqual` (see the sketch after this list).
* Added `boolean` data type to support logical operations.
* Added support for passing a compilation context to the compile API. This feature allows passing additional information, such as tensor shape context, that helps the backend generate better kernel code.
* Introduced convolution block fusion via oneDNN Graph Compiler.
* **Experimental**: Introduced dynamic shapes support for the multi-layer perceptron (MLP) block via oneDNN Graph Compiler.
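A minimal sketch of how one of the new comparison operations could be wired into a graph with a `boolean` output. It assumes the integrated Graph API layout described in the graph-v0.6 notes below; the enum spellings `op::kind::Greater` and `logical_tensor::data_type::boolean` are assumptions derived from the operation and data type names above rather than verified signatures.

```cpp
// Sketch: build a graph containing a Greater op whose output uses the new
// boolean data type. Shapes and ids are arbitrary illustrative values.
#include "oneapi/dnnl/dnnl_graph.hpp"

using namespace dnnl::graph;

int main() {
    // Two f32 inputs and a boolean output: dst = (src0 > src1), element-wise.
    logical_tensor src0 {0, logical_tensor::data_type::f32, {8, 16},
            logical_tensor::layout_type::strided};
    logical_tensor src1 {1, logical_tensor::data_type::f32, {8, 16},
            logical_tensor::layout_type::strided};
    logical_tensor dst {2, logical_tensor::data_type::boolean, {8, 16},
            logical_tensor::layout_type::strided};

    op greater {0, op::kind::Greater, {src0, src1}, {dst}, "greater"};

    graph g {dnnl::engine::kind::cpu};
    g.add_op(greater);
    g.finalize();

    // Partitions returned here can be compiled and executed as usual.
    auto partitions = g.get_partitions();
    return partitions.empty() ? 1 : 0;
}
```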
Known Issues and Limitations
* The weight’s opaque layout can be queried only from a compiled partition, which requires input tensor shapes to be known at compilation time.
* MHA and MLP fusions are not activated on machines without Intel AVX-512 support.
Thanks to the Contributors
This release contains contributions from the project core teams as well as Jiong Gong, Chunyuan Wu, Sanchit Jain, Yiqiang Li, Yunfei Mao, Kiefer Kuah and others.
graph-v0.6
This is the Beta release for oneDNN Graph based on [oneDNN v2.7 release](https://github.com/oneapi-src/oneDNN/releases/tag/v2.7).
Functionality
* Introduced FP32, BF16, FP16, and INT8 inference support on GPU.
* Introduced FP32 and BF16 training support on GPU.
* Introduced support for floating-point math mode at the graph construction phase. The mode allows the implementation to use lower-precision data types for computation when possible.
* Added the `graph::finalize()` function to indicate that the user has finished adding operations to the graph and the graph is ready for partitioning (see the sketch after this list).
* Added operations `AbsBackprop`, `Mish`, `MishBackprop`, and `LeakyReLU`.
* Updated API and operation definitions to comply with [oneDNN Graph Specification 1.0-beta](https://spec.oneapi.io/onednn-graph/v1.0-beta/index.html).
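A minimal sketch of the construction flow with a floating-point math mode and the new `graph::finalize()` call. The `bf16` math mode, the `Mish` op, and the tensor shapes are illustrative choices; header and enum names follow the integrated API layout described under Usability below and may differ slightly in this beta.

```cpp
// Sketch: construct a graph that may downconvert f32 math to bf16,
// add one of the newly introduced ops, and finalize before partitioning.
#include "oneapi/dnnl/dnnl.hpp"
#include "oneapi/dnnl/dnnl_graph.hpp"

using namespace dnnl::graph;

int main() {
    // The math mode set at construction applies to the whole graph.
    graph g {dnnl::engine::kind::cpu, dnnl::fpmath_mode::bf16};

    logical_tensor src {0, logical_tensor::data_type::f32, {1, 64, 56, 56},
            logical_tensor::layout_type::strided};
    logical_tensor dst {1, logical_tensor::data_type::f32, {1, 64, 56, 56},
            logical_tensor::layout_type::strided};

    // Mish is one of the operations added in this release.
    op mish {0, op::kind::Mish, {src}, {dst}, "mish"};
    g.add_op(mish);

    // Signal that no more ops will be added; the graph can now be partitioned.
    g.finalize();
    auto partitions = g.get_partitions();
    return partitions.empty() ? 1 : 0;
}
```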
Usability
* Integrated Graph component headers, source and build system into oneDNN:
* Headers moved to `include/oneapi/dnnl`.
* Source moved to `src/graph`.
* Graph functionality is included in the single shared object or dynamic library produced by the build system.
* Aligned API with oneDNN:
* Shared common `dnnl::engine` and `dnnl::stream`. The original `dnnl::graph::engine` and `dnnl::graph::stream` APIs were removed.
* Added a new `make_engine_with_allocator()` API to create a `dnnl::engine` with a `dnnl::graph::allocator` (see the sketch after this list).
* Common basic types, including `dnnl_status_t`, `dnnl_data_type_t`, and `dnnl_dims_t`, are shared between oneDNN and oneDNN Graph.
* Introduced the `ONEDNN_BUILD_GRAPH` build option to manage the Graph component build.
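A minimal sketch of creating a shared `dnnl::engine` that carries a `dnnl::graph::allocator` via `make_engine_with_allocator()`. The host allocation callbacks and their `(size, alignment)` signature are assumptions for illustration; a default-constructed allocator can be used instead.

```cpp
// Sketch: wrap host allocation callbacks in a graph allocator and create
// a dnnl::engine that the Graph component can use for temporary buffers.
#include <cstdlib>

#include "oneapi/dnnl/dnnl.hpp"
#include "oneapi/dnnl/dnnl_graph.hpp"

// Assumed callback shapes: allocate(size, alignment) and deallocate(ptr).
void *host_malloc(size_t size, size_t alignment) {
    // A real allocator should honor `alignment`; plain malloc keeps the
    // sketch portable.
    (void)alignment;
    return std::malloc(size);
}

void host_free(void *ptr) {
    std::free(ptr);
}

int main() {
    dnnl::graph::allocator alloc {host_malloc, host_free};

    // Create a CPU engine (device index 0) that carries the allocator.
    dnnl::engine eng = dnnl::graph::make_engine_with_allocator(
            dnnl::engine::kind::cpu, 0, alloc);

    // The same engine object works with both primitive and graph APIs.
    dnnl::stream strm {eng};
    return 0;
}
```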
Validation
* Introduced the `ONEDNN_GRAPH_DUMP` environment variable, which serializes library graphs and subgraphs into JSON files.
* Added the initial version of the benchdnn graph driver, which can be used to benchmark performance with a dumped graph JSON file.
Breaking changes
* Removed operations `HardTanh`, `Index`, `Pow`, etc. Please check the operation kind list for details.
Known Issues and Limitations
* The Graph Compiler component is not included in this release. It will be reinstated in the oneDNN Graph Beta Update release.
* The weight’s opaque layout can be queried only from a compiled partition, which requires input tensor shapes to be known at compilation time.
* The `ONEDNN_BUILD_GRAPH` build option is not compatible with some of the build options supported by the build system, including `ONEDNN_GPU_RUNTIME=OCL`, `ONEDNN_ENABLE_WORKLOAD=INFERENCE`, `ONEDNN_ENABLE_PRIMITIVE`, and others.
Thanks to the Contributors
This release contains contributions from the project core teams as well as Jiong Gong, Chunyuan Wu, Sanchit Jain, Yiqiang Li, Yunfei Mao, Kiefer Kuah and others.