This is a patch release containing the following changes to v1.6.3:
* Fixed performance regression in `dnnl_sgemm` with `N=1` (379a216b94393f17a37d5f042323fc923a7553af, f35e9917608925b57bb4e1486f77720f36970aef)
* Extended matmul to support multiple dimensions and broadcast (0728f265f18448a3375574e622bdd6fcad0d2787)
* Fixed performance regression in the convolution weight gradient implementation for Intel AVX2 (9ab050b0f4a3d434cbb14b7ddb7056736564b9dc, 6cd0c352f9949191dac1938b8f16b53b5967c1ea)
* Fixed `unknown primitive kind` assertion on GPU (c95a01cea1bd43445497eae4f1323947bd56c977)
* Fixed build issue on Windows when oneDNN is built as a submodule (2fceddf2f564b729550b288eb2e7bba5523c223e)
* Fixed issues with `NaN` results produced by `dnnl_sgemm` in some scenarios (5ce95efe6f5e86cddbf704b637063cd8dc914125)
* Improved performance for convolution backpropagation with 1x1 filter and NHWC activations on systems with Intel AVX2 support (74bfc74ccb089c32829ffb1711842f880a1fb99b)
* Fixed correctness issue in convolution with 3D spatial data (bf6ee840bef680223ccdb0c358bfce460f10d371)
* Fixed potential segmentation fault when destroying RNN primitive (0d9839b085263c0f4f6dcaf95e1bc2618a684297)
* Fixed performance regression in the fp32 convolution implementation for Intel AVX512 (668e28289ccf17dad541238155c03a42e99802ba)
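The matmul extension above adds batched multi-dimensional operands whose leading (batch) dimensions are broadcast against each other. A rough sketch of the resulting shape semantics, illustrated here with NumPy rather than the oneDNN matmul primitive itself (the tensor shapes are hypothetical):

```python
import numpy as np

# Batched matmul with broadcasting over leading (batch) dimensions.
# Shapes: batch dims (2, 1) and (1, 5) broadcast to (2, 5);
# the trailing two dims multiply as ordinary matrices: (3, 4) @ (4, 6) -> (3, 6).
a = np.ones((2, 1, 3, 4), dtype=np.float32)
b = np.ones((1, 5, 4, 6), dtype=np.float32)

c = np.matmul(a, b)
print(c.shape)  # (2, 5, 3, 6)
```

This mirrors the familiar NumPy broadcasting convention; consult the oneDNN matmul primitive documentation for the exact dimension and layout rules it supports.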