Highlights
* Continued VNNI acceleration support: we add optimizations for more CNN models, including object detection models, and enhance model scales generation for VNNI.
* Attention-based model support: we add a Transformer implementation for both language models and translation models.
* RNN optimization: we support LSTM integration with MKL-DNN, which achieves ~3x performance speedup.
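
As an illustration of the model scales generation mentioned above, here is a minimal sketch; it is not BigDL's implementation. It assumes that a scale is the maximum absolute value of a tensor slice, and that bit d of the mask means dimension d keeps its own scale (the convention MKL-DNN uses for scale masks). The function name `max_abs_scales` and the flat row-major data layout are hypothetical choices made for this example.

```python
from itertools import product


def max_abs_scales(values, shape, mask):
    """Return {kept-index tuple: max |value|} for a row-major tensor.

    Assumption: bit d of `mask` set means dimension d gets its own scale;
    all other dimensions are reduced over.
    """
    kept = [d for d in range(len(shape)) if mask & (1 << d)]
    scales = {}
    # product() yields indices in row-major order, matching `values`.
    for index, value in zip(product(*(range(n) for n in shape)), values):
        key = tuple(index[d] for d in kept)
        scales[key] = max(scales.get(key, 0.0), abs(value))
    return scales


# A 2x3 tensor stored row by row: [[1, -4, 2], [3, -0.5, 0.25]]
tensor = [1.0, -4.0, 2.0, 3.0, -0.5, 0.25]
per_tensor = max_abs_scales(tensor, (2, 3), mask=0)  # one scale for everything
per_row = max_abs_scales(tensor, (2, 3), mask=1)     # one scale per index of dim 0
```

With `mask=0` the whole tensor shares a single scale, while any other mask yields one scale per combination of indices along the selected dimensions.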
Details
* [New Feature] Add attention layer support
* [New Feature] Add FeedForwardNetwork layer support
* [New Feature] Add ExpandSize layer support
* [New Feature] Add TableOperation layer to support table calculation with different input sizes
* [New Feature] Add LayerNormalization layer support
* [New Feature] Add Transformer support for both language and translation models
* [New Feature] Add beam search support in Transformer model
* [New Feature] Add Layer-wise adaptive rate scaling optim method
* [New Feature] Add LSTM integration with MKL-DNN support
* [New Feature] Add dilated convolution integration with MKL-DNN support
* [New Feature] Add parameter process for LarsSGD optim method
* [New Feature] Support affinity binding option with MKL-DNN
* [Enhancement] Document enhancement for configuration and build
* [Enhancement] Reflection enhancement to get default values for constructor parameters
* [Enhancement] Use one AllReduceParameter for multi-optim method training
* [Enhancement] CAddTable layer enhancement to support input expansion along specific dimension
* [Enhancement] Resnet-50 preprocessing pipeline enhancement to replace RandomCropper with CenterCropper
* [Enhancement] Calculate model scales for arbitrary mask
* [Enhancement] Enable global average pooling
* [Enhancement] Check input shape and underlying MKL-DNN layout consistency
* [Enhancement] Threadpool enhancement to throw proper exception at executor runtime
* [Enhancement] Support MKL-DNN format conversion from ntc to tnc
* [Bug Fix] Fix backward graph generation topology ordering issue
* [Bug Fix] Fix MemoryData hash code calculation
* [Bug Fix] Fix log output for BCECriterion
* [Bug Fix] Fix setting mask for container quantization
* [Bug Fix] Fix validation accuracy issue when multiple executors run on the same worker
* [Bug Fix] Fix INT8 layer fusion between convolution with multi-group masks and BatchNormalization
* [Bug Fix] Fix JoinTable scales generation issue
* [Bug Fix] Fix CMul forward issue with special input format
* [Bug Fix] Fix weights change issue after model fusion
* [Bug Fix] Fix SpatialConvolution primitives initialization issue
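
For readers unfamiliar with the Layer-wise adaptive rate scaling optim method listed among the new features, the sketch below shows the per-layer learning rate as commonly defined for LARS: a trust coefficient times the ratio of the weight norm to the gradient norm plus a weight-decay term. It is only an illustration and may differ from what LarsSGD does; the function name `lars_local_lr`, the parameter names `trust` and `weight_decay`, and the fallback for zero norms are assumptions of this example.

```python
import math


def lars_local_lr(weights, grads, trust=0.001, weight_decay=0.0):
    """Layer-local learning rate: trust * ||w|| / (||g|| + weight_decay * ||w||)."""
    w_norm = math.sqrt(sum(w * w for w in weights))
    g_norm = math.sqrt(sum(g * g for g in grads))
    denom = g_norm + weight_decay * w_norm
    # Fall back to 1.0 when a norm is zero (an assumption of this sketch).
    if w_norm == 0.0 or denom == 0.0:
        return 1.0
    return trust * w_norm / denom


layer_weights = [3.0, 4.0]  # ||w|| = 5
layer_grads = [0.6, 0.8]    # ||g|| = 1
local_lr = lars_local_lr(layer_weights, layer_grads, trust=0.01)
```

Each layer gets its own rate, so layers whose gradients are small relative to their weights take proportionally larger steps than a single global learning rate would allow.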