oneDNN

Latest version: v2025.0.0

0.21.5

This is a patch release containing the following changes to v0.21.4:

* Fixed s8 reorders that did not compute compensation correctly; a sketch of the compensation arithmetic follows this list (d446661de2865741b1ad5f35a913feb6953b2592, 7a497726bfaf009eeb92ca62f873b20d53b7a3d9)
* Fixed potential buffer overflow in int8 convolution scratchpad (8c5c7cf34e1e36a4c47afa506ab3af510423e28e)
* Fixed segfault for s8 reorders on blocked formats (9497accb06f3d0e4f53ac8719d0f9c6721e5df38, 6f1d0c93bf461be9adcbf25201a8905bd055e478)
* Fixed a correctness issue in fp32 convolution weight gradient with dilation and padding (503bf57e447b458dd26af03189b21603395c89aa, d00afabbdd8fb67eb07e65b8eb8445934789dfa6)
* Fixed a correctness issue in 1D bfloat16 dilated convolution (481dd391bee2442994db7589e00ddba3044ca682)
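The s8 reorder fixes above touch the compensation term that weight reorders precompute for int8 convolution: on hardware without a native s8·s8 multiply, activations are shifted into the u8 range by adding 128, and the accumulator is then corrected by 128 times the sum of the weights along the reduced dimension. The following is a minimal standalone sketch of that arithmetic in plain C++, with illustrative names, not the library API:

```cpp
#include <cstdint>
#include <cstdio>
#include <vector>

// Toy dot product showing why a reorder-time "compensation" term makes
// an s8 x s8 product computable on a u8 x s8 multiply-accumulate path.
int main() {
    std::vector<int8_t> src = {-3, 7, -120, 45}; // s8 activations
    std::vector<int8_t> wei = {2, -5, 9, 1};     // s8 weights

    // Reference s32 accumulation: what the convolution should produce.
    int32_t ref = 0;
    for (size_t i = 0; i < src.size(); ++i)
        ref += int32_t(src[i]) * int32_t(wei[i]);

    // Compensation, computed once when the weights are reordered:
    // 128 * sum of weights along the reduced dimension.
    int32_t comp = 0;
    for (int8_t w : wei) comp += int32_t(w);
    comp *= 128;

    // Runtime path: shift activations by +128 into the u8 range, accumulate,
    // then subtract the precomputed compensation.
    int32_t acc = 0;
    for (size_t i = 0; i < src.size(); ++i)
        acc += (int32_t(src[i]) + 128) * int32_t(wei[i]);
    acc -= comp;

    std::printf("reference=%d compensated=%d\n", ref, acc);
    return ref == acc ? 0 : 1;
}
```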

0.21.4

This is a patch release containing the following changes to v0.21.3:

* Fixed handling of large padding during input tensor transposition in bfloat16 weight-gradient convolution (6df67fe)
* Fixed performance of reference convolution (2e1d048)
* Fixed "code is too big" error in case of extreme large spatial size (ed0be61, 4dee389, 59759ba)

0.21.3

This is a patch release containing the following changes to v0.21.2:

* Reduced the upper-bound of memory requirement for gemm-based convolution to reduce the probability of OOM error (cd99749c97e1cb6a7ec96f3ffa9e225a445b8a24)
* Significantly reduced the memory required for 1x1 convolution; see the GEMM sketch after this list (564344566ad5cd8e1f9e6bdb5defc77b88a19b64)
* Added new dummy stream (cba5823ad881b837957c89d388241bbdc245a0bf)
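Both memory items above relate to how gemm-based convolution stages its data: the general path lowers the input with im2col into an (IC·KH·KW) × (OH·OW) scratch matrix, which dominates the memory bound, while a 1x1 kernel with unit stride and no padding needs no lowering at all, since the input is already the right-hand matrix of the GEMM. A minimal fp32 sketch of that special case (illustrative code, not the library implementation):

```cpp
#include <cstdio>
#include <vector>

// Plain fp32 1x1 convolution expressed as a GEMM:
//   dst[oc][hw] = sum_ic wei[oc][ic] * src[ic][hw]
// With a 1x1 kernel, unit stride, and no padding, no im2col scratch
// buffer is needed; the input is already shaped as an IC x (H*W) matrix.
void conv_1x1_as_gemm(const std::vector<float> &src,  // IC x (H*W)
                      const std::vector<float> &wei,  // OC x IC
                      std::vector<float> &dst,        // OC x (H*W)
                      int OC, int IC, int HW) {
    for (int oc = 0; oc < OC; ++oc)
        for (int hw = 0; hw < HW; ++hw) {
            float acc = 0.f;
            for (int ic = 0; ic < IC; ++ic)
                acc += wei[oc * IC + ic] * src[ic * HW + hw];
            dst[oc * HW + hw] = acc;
        }
}

int main() {
    const int OC = 2, IC = 3, H = 2, W = 2, HW = H * W;
    std::vector<float> src(IC * HW, 1.f), wei(OC * IC, 0.5f), dst(OC * HW);
    conv_1x1_as_gemm(src, wei, dst, OC, IC, HW);
    std::printf("dst[0] = %g (expected %g)\n", dst[0], 0.5f * IC);
    return 0;
}
```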

0.21.2

This is a patch release containing the following changes to v0.21.1:

* Fixed performance regression in GEMM (95346214b9cbd689b750ab093910e439f0f83d9b)
* Fixed int8 dilated convolution for some shapes where the input height does not exceed the dilation along the height dimension; see the output-size sketch after this list (e68f1514061e4f58cc67a9669985ea3c4563acaf)
* Addressed static initialization order issue in bf16 converters (ae8efdeebf1b576e9d25a8601301b4791219cde9)
* Fixed fast reference backward convolution dispatching for 3D-spatial case (5994d63ffeec9830c280b5d6fb38ab6d6d97da4e)
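The dilated-convolution shape class mentioned above is easiest to see from the output-size formula. The sketch below uses the common convention where dilation d = 1 means a dense kernel; the library's API counts the extra dilation (d − 1), but the resulting sizes match:

```cpp
#include <cstdio>

// Output height of a dilated convolution, with dilation d = 1 meaning a
// dense kernel (effective kernel extent (KH - 1) * d + 1).
int conv_out_h(int IH, int KH, int pad_top, int pad_bottom, int stride, int d) {
    const int effective_kh = (KH - 1) * d + 1;
    return (IH + pad_top + pad_bottom - effective_kh) / stride + 1;
}

int main() {
    // An edge case similar to the one fixed above: the input height does
    // not exceed the dilation step, so few kernel taps land inside the input.
    std::printf("OH = %d\n",
                conv_out_h(/*IH=*/3, /*KH=*/2, /*pad=*/1, 1,
                           /*stride=*/1, /*dilation=*/3));
    return 0;
}
```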

0.21.1

This is a patch release containing the following changes to Intel MKL-DNN v0.21:
* Fixed output channel blocking logic in forward AVX2 convolution that could lead to an incorrect result or a segfault (6accb47c4588ab6f0c350117faf7f26e850446d2)
* Fixed int8 grouped convolution for some shapes with the number of input or output channels not being a multiple of 8 on Intel AVX512 systems (878ac2d4b2d561b44a9c2dc19f6988a7da0a71a6)

0.21

Performance optimizations
* Improved int8 and fp32 GEMM and inner product performance.
* Improved reorder performance for certain shapes.
* Improved RNN, LSTM, GRU and LBR-GRU training performance.

New functionality
* Added GELU activation support.
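GELU is the Gaussian Error Linear Unit, GELU(x) = x · Φ(x), with Φ the standard normal CDF. The sketch below shows only the math, in both the exact erf form and the common tanh approximation; it is not the library API, which exposes element-wise activations such as this through its eltwise primitive:

```cpp
#include <cmath>
#include <cstdio>

// Exact GELU: x * Phi(x), with Phi the standard normal CDF (via erf).
double gelu_erf(double x) {
    return 0.5 * x * (1.0 + std::erf(x / std::sqrt(2.0)));
}

// Widely used tanh-based approximation of GELU.
double gelu_tanh(double x) {
    const double pi = std::acos(-1.0);
    const double c = std::sqrt(2.0 / pi);
    return 0.5 * x * (1.0 + std::tanh(c * (x + 0.044715 * x * x * x)));
}

int main() {
    for (double x : {-2.0, -0.5, 0.0, 0.5, 2.0})
        std::printf("x=% .1f  erf-GELU=% .6f  tanh-GELU=% .6f\n",
                    x, gelu_erf(x), gelu_tanh(x));
    return 0;
}
```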

Thanks to the contributors
This release contains contributions from many Intel Performance Libraries developers. We would also like to thank everyone who asked questions and reported issues.
