oneDNN

1.4rc

This is a release candidate for DNNL v1.4. Please provide feedback and report bugs in [GitHub issues](https://github.com/intel/mkl-dnn/issues).

1.3

Performance optimizations
* Introduced broad release quality optimizations for future Intel(R) Xeon(R) Scalable processor (code name Cooper Lake).
* Improved performance of matmul primitive for 3D tensors (batched matrix-matrix multiplication) on all supported processors.
* Improved performance of binary primitive for the case when one of the tensors has to be broadcast on all supported processors.
* Improved performance of convolution primitive for 3D tensors and 1x1 kernel size on all supported processors.

New functionality
* Introduced fusion of depthwise convolution into convolution with 1x1 filter. The implementation is available for all supported processors and data types. The functionality is not implemented for Intel Processor Graphics.
* Introduced peephole support for [LSTM cell](http://intel.github.io/mkl-dnn/dev_guide_rnn.html) on all supported processors. The functionality is not implemented for Intel Processor Graphics.
* Implemented [matmul primitive](http://intel.github.io/mkl-dnn/dev_guide_matmul.html) for Intel Processor Graphics.
* Extended [binary primitive](http://intel.github.io/mkl-dnn/dev_guide_binary.html) with support for min and max algorithms.
* Extended [eltwise primitive](http://intel.github.io/mkl-dnn/dev_guide_eltwise.html) (see the first sketch after this list):
  * Introduced an erf-based implementation of the gelu algorithm
  * Introduced the pow algorithm
  * Introduced backpropagation flavors relying on the destination tensor as input for the elu, exp, logistic, relu, sqrt, and tanh algorithms
* Extended the set of operations for memory descriptors (see the second sketch after this list):
  * Added support for changing the number of dimensions with the existing [`dnnl::memory::desc::reshape()`](https://intel.github.io/mkl-dnn/v1.3/structdnnl_1_1memory_1_1desc.html#a0325f155fdaf54443a31df5bd7a1ab88) method
  * Introduced the [`dnnl::memory::desc::permute_axes()`](https://intel.github.io/mkl-dnn/v1.3/structdnnl_1_1memory_1_1desc.html#a9dd8e97d94b247910fcf2473da12844d) method to change logical axes order
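
To make the new eltwise flavors concrete, here is a minimal sketch of running the erf-based gelu algorithm forward with the v1.3 C++ API; the tensor shape, engine choice, and variable names are illustrative assumptions, not part of the release notes.

```cpp
#include "dnnl.hpp"
using namespace dnnl;

int main() {
    engine eng(engine::kind::cpu, 0);
    stream s(eng);

    // Illustrative 4D activation tensor; any supported shape works.
    memory::desc md({1, 32, 14, 14}, memory::data_type::f32,
            memory::format_tag::nchw);
    memory src(md, eng), dst(md, eng);

    // erf-based gelu introduced in v1.3; alpha and beta are unused here.
    eltwise_forward::desc d(prop_kind::forward_inference,
            algorithm::eltwise_gelu_erf, md, 0.f, 0.f);
    eltwise_forward::primitive_desc pd(d, eng);

    eltwise_forward(pd).execute(s, {{DNNL_ARG_SRC, src}, {DNNL_ARG_DST, dst}});
    s.wait();
    return 0;
}
```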

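A companion sketch of the extended memory descriptor operations, assuming a plain row-major fp32 layout; the concrete sizes are made up for illustration.

```cpp
#include "dnnl.hpp"
using namespace dnnl;

int main() {
    // 2D descriptor: 6x4 fp32, plain row-major (ab) layout.
    memory::desc md2d({6, 4}, memory::data_type::f32, memory::format_tag::ab);

    // reshape() can now change the number of dimensions,
    // here splitting the first axis: {6, 4} -> {2, 3, 4}.
    memory::desc md3d = md2d.reshape({2, 3, 4});

    // permute_axes() changes the logical axes order; permutation
    // {0, 2, 1} swaps the last two axes: {2, 3, 4} -> {2, 4, 3}.
    memory::desc md3d_perm = md3d.permute_axes({0, 2, 1});
    (void)md3d_perm;
    return 0;
}
```
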
Thanks to the contributors
This release contains contributions from the project core team as well as Arthur Araujo Mitrano aaraujom, Aaron Mark Johnson aaronjohnson, Benjamin Hipple bhipple, Sergey Nesterov cepera, gaurav1086, Ilya Taraban itaraban, Mesut Meterelliyoz mmeterel, nSircombe, Peter Caday petercad, and Rafik Saliev rsaliev. We would also like to thank everyone who asked questions and reported issues.

1.3rc

This is a release candidate for DNNL v1.3. Please provide feedback and report bugs in [GitHub issues](https://github.com/intel/mkl-dnn/issues).

1.2.2

This is a patch release containing the following changes to v1.2.1:

* Fixed overflow in transposition in bfloat16 weights gradient convolution (0d283894be89ba22ba6251c1ab8cae816ebe3f24)
* Added a workaround for corrupted unique_ptr usage in scratchpad (91c89a9628feee9e4539b53c7c96f7d1f3110269)
* Fixed int8 deconvolution with int32 output on Intel AVX2 systems (ef2d6527209b104efe8a7fd2c1ec7b7f70c695bc)
* Fixed segmentation fault in concat due to incorrect memory alignment (#668) (7a0c3a922827632308aafb03037dc4c3ae2af9da)
* Fixed performance regression in no-copy gemm dispatching (#525) (89a303b68e7a3497490e37bf11025d7d31b5d283)
* Fixed segmentation fault in fp32 weights gradient convolution with dilation and large padding (50546ad4426ea48f4b6bb67665560f1c9cb26333)
* Fixed bfloat16/fp32 scalability for eltwise primitive (e281a4a5d312115cdd1f97d43b14e0d6eb494a43)

1.2.1

This is a patch release containing the following changes to v1.2:
* Improved GEMM performance for 1 thread (1fd2bc010ba09b44e3e675d68d80d8f41c747fec)
* Fixed RNN cell backpropagation computations (4b15a0cbbf13e5c7e6aca66f40847e9b27619087)
* Fixed alpha and beta handling in vanilla RNN cell (70f8b879ea7a0c38caedb3320b7c85e8497ff50d)
* Reduced sizes in performance profiling example to avoid memory overflow for systems with less than 2 GB memory (f6e2ef9896d63302c5e6eba2094dca3ac346e5ad)
* Fixed correctness for strided convolution with 1x1 filter and non-matching source and destination formats (0405c9a29f15899883ee62a905716cdeed5ce1fa)
* Removed lambda calls from OpenMP loops as a workaround for Intel C/C++ Compiler 19.1 (a603593fd6186ba0385cf5b1630c13f6909ab3ac)
* Added -O1 flag for backward convolution gtests as a workaround for Intel C/C++ Compiler 19.1 (495b91fdc6fdfd6647eac193e8c80e41d23c24e8)

1.2

Performance optimizations
* Improved 1D backward convolution performance on CPU.
* Improved int8 inference performance on pre-Intel AVX512 systems.
* Improved int8 inference performance for 3D spatial data on CPU.
* Improved performance of convolution and other primitives on GPU.

New functionality
* Introduced general purpose [matrix-matrix multiplication primitive](http://intel.github.io/mkl-dnn/dev_guide_matmul.html) (see the sketch after this list). The functionality supports fp32, bfloat16, and int8 data types with asymmetric quantization.
* Introduced [logsoftmax](http://intel.github.io/mkl-dnn/dev_guide_logsoftmax.html) and [resampling](http://intel.github.io/mkl-dnn/dev_guide_resampling.html) primitives.
* Introduced clip and log algorithms support in [elementwise](http://intel.github.io/mkl-dnn/dev_guide_eltwise.html) primitive.
* Introduced int8 and bf16 data types support for [binary primitive](http://intel.github.io/mkl-dnn/dev_guide_binary.html) (CPU only).
* Introduced fully functional support for int8 (inference) and bfloat16 (inference and training) data types on GPU. The functionality is not intended to deliver performance improvements over fp32 on current Intel Integrated Graphics, but rather to enable conformance experiments.
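
To illustrate, here is a minimal fp32 sketch of the new matmul primitive using the v1.2 C++ API; the sizes, plain `ab` layouts, and variable names are illustrative assumptions rather than prescribed usage.

```cpp
#include "dnnl.hpp"
using namespace dnnl;

int main() {
    engine eng(engine::kind::cpu, 0);
    stream s(eng);

    // Illustrative sizes: C = A * B with A of MxK, B of KxN, C of MxN.
    const memory::dim M = 128, K = 256, N = 64;
    memory::desc src_md({M, K}, memory::data_type::f32, memory::format_tag::ab);
    memory::desc wei_md({K, N}, memory::data_type::f32, memory::format_tag::ab);
    memory::desc dst_md({M, N}, memory::data_type::f32, memory::format_tag::ab);

    // Operation descriptor -> primitive descriptor -> primitive (v1.x flow).
    matmul::desc d(src_md, wei_md, dst_md);
    matmul::primitive_desc pd(d, eng);
    matmul mm(pd);

    memory src(src_md, eng), wei(wei_md, eng), dst(dst_md, eng);
    mm.execute(s, {{DNNL_ARG_SRC, src},
            {DNNL_ARG_WEIGHTS, wei},
            {DNNL_ARG_DST, dst}});
    s.wait();
    return 0;
}
```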

Usability improvements
* Added JIT code annotations for [linux-perf profiler](http://intel.github.io/mkl-dnn/dev_guide_profilers.html).
* Added a mechanism to control [CPU dispatcher behavior](http://intel.github.io/mkl-dnn/dev_guide_cpu_dispatcher_control.html) at runtime via the DNNL_MAX_CPU_ISA environment variable or a function call (see the sketch after this list).
* Extended DNNL_VERBOSE output with more information about runtimes and devices.
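
A minimal sketch of the programmatic flavor of the dispatcher control; capping at Intel AVX2 is an arbitrary example, and the call must precede the first primitive creation to take effect.

```cpp
#include "dnnl.hpp"

int main() {
    // Equivalent to exporting DNNL_MAX_CPU_ISA=AVX2 before the run.
    // Must be called before the first primitive is created; once JIT
    // code has been generated, the ISA cap can no longer be changed.
    dnnl::set_max_cpu_isa(dnnl::cpu_isa::avx2);

    // ... build engines and primitives as usual; generated code
    // will not use instruction sets newer than Intel AVX2.
    return 0;
}
```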

Thanks to the contributors
This release contains contributions from the project core team as well as Aaron Johnson aaronjohnson, Attila T. Áfra atafra, Ben Fitch, Ilya Taraban itaraban, Michał Gallus Sand3r-, Peter Caday petercad, Qiyou Chen chenqy4933 and Jun Luan junluan. We would also like to thank everyone who asked questions and reported issues.
