- Highlights
- New Features
- Improvements
- Bug Fixes
- Productivity
- Examples
- Validated Configurations
**Highlights**
We are excited to announce the release of Intel® Neural Compressor v1.14! This release introduces a new Pruning API for PyTorch that lets users select better combinations of pruning criteria, patterns, and schedulers to achieve higher pruning accuracy. It also adds support for Keras input in TensorFlow quantization and for self-distilled quantization to further improve quantization accuracy.
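As a rough illustration of how the new pruning options fit together, here is a minimal sketch based on the experimental `Pruning` entry point from earlier 1.x releases; the YAML file name, attribute names, and the idea that pattern, criterion, and scheduler are selected in the config are illustrative assumptions, not the confirmed v1.14 API.

```python
import torch
from neural_compressor.experimental import Pruning  # earlier 1.x experimental entry point

# Toy model standing in for a real PyTorch network.
model = torch.nn.Sequential(torch.nn.Linear(16, 16), torch.nn.ReLU(), torch.nn.Linear(16, 4))

def train_func(model):
    # User-provided training loop; pruning hooks are driven by the framework.
    ...

prune = Pruning("pruning_conf.yaml")  # hypothetical YAML choosing pattern (e.g. NxM), criterion (e.g. snip momentum), and scheduler
prune.model = model
prune.train_func = train_func         # attribute name follows the earlier experimental API and may differ
pruned_model = prune.fit()
```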
**New Features**
- Pruning/Sparsity
  - Support new structured sparse patterns N in M and NxM (commit [6cec70](https://github.com/intel/neural-compressor/commit/6cec70bb2c5fd3079e4d572e22a89b152a229941))
  - Add pruning criteria snip and snip momentum (commit [6cec70](https://github.com/intel/neural-compressor/commit/6cec70bb2c5fd3079e4d572e22a89b152a229941))
  - Add iterative pruning and decay types (commit [6cec70](https://github.com/intel/neural-compressor/commit/6cec70bb2c5fd3079e4d572e22a89b152a229941))
- Quantization
  - Support different Keras formats (h5, keras, Keras saved model) as input, with quantized output saved as a TensorFlow saved model (commit [5a6f09](https://github.com/intel/neural-compressor/commit/5a6f092088e0deaa64601ab5aa88a572180cca8a)); see the sketch after this list
  - Enable Distillation for Quantization (commit [03f1f3](https://github.com/intel/neural-compressor/commit/03f1f3e049494192200c304e051a34d2ce654c18) & [e20c76](https://github.com/intel/neural-compressor/commit/e20c76a148b4aaf97492e297413795aacfdad987))
- GUI
  - Add mixed precision (commit [26e902](https://github.com/intel/neural-compressor/commit/26e902d24e2993a43d8fb52373ab4841377d0efb))
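To give a sense of the new Keras-input path, the sketch below follows the 1.x experimental `Quantization` flow; passing an h5 path directly as the model is the capability described above, while the config file name and calibration data are placeholders.

```python
import numpy as np
from neural_compressor.experimental import Quantization, common

# Placeholder calibration samples shaped for a hypothetical image model.
calib_dataset = [(np.random.rand(224, 224, 3).astype(np.float32), 0) for _ in range(32)]

quantizer = Quantization("quant_conf.yaml")        # hypothetical YAML config
quantizer.model = "./keras_model.h5"               # Keras h5/keras/saved-model input (new in this release)
quantizer.calib_dataloader = common.DataLoader(calib_dataset)
quantized_model = quantizer.fit()
quantized_model.save("./quantized_saved_model")    # written out as a TensorFlow saved model
```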
**Improvements**
- Enhance tuning for Quantization with IPEX 1.12 to remove additional Quant/DeQuant (commit [192100](https://github.com/intel/neural-compressor/commit/1921007997d281121bf36d5356629b471800b101))
- Add upstream and download APIs for the Hugging Face model hub, which can handle configuration files, tokenizer files, and int8 model weights in the Transformers format (commit [46d945](https://github.com/intel/neural-compressor/commit/46d945348c3144e20ab3f54854a9f4e6566220c4))
- Align with the new Intel Extension for PyTorch (IPEX) API (commit [cc368a](https://github.com/intel/neural-compressor/commit/cc368a8f7433d98fedf699dfcde98b9b6ffe6cc7))
- Add loading from YAML and .pt files to stay compatible with the older PyTorch model saving format (commit [a28705](https://github.com/intel/neural-compressor/commit/a28705c09f7be415fdd348a56cc1a300f9159a44)); see the sketch below
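For the loading-compatibility item above, here is a minimal sketch assuming the `neural_compressor.utils.pytorch.load` helper from earlier 1.x releases; the checkpoint directory and the toy model are placeholders.

```python
import torch
from neural_compressor.utils.pytorch import load  # 1.x utility for restoring tuned PyTorch models

# Rebuild the original FP32 architecture, then load the tuned result
# (stored as YAML plus .pt in older releases) on top of it.
fp32_model = torch.nn.Sequential(torch.nn.Linear(16, 16), torch.nn.ReLU(), torch.nn.Linear(16, 4))
int8_model = load("./saved_results", fp32_model)   # placeholder checkpoint directory
```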
**Bug Fixes**
- Quantization
  - Fix data type of ONNX Runtime quantization from fp64 to fp32 (commit [cb7b48](https://github.com/intel/neural-compressor/commit/cb7b4859bf3c9c6b6ca6d4140c4d896d97364e74))
  - Fix MXNet issue with the default config (commit [b75ff2](https://github.com/intel/neural-compressor/commit/b75ff270979f2612d82b509dbbb186dcc16e508c))
- Export
  - Fix export_to_onnx API (commit [158c7f](https://github.com/intel/neural-compressor/commit/158c7f41f40c7b18ef0eb9f295e9f82b57491ebd))
**Productivity**
- Support TensorFlow 2.10.0 (commit [d6b6c9](https://github.com/intel/neural-compressor/commit/d6b6c9d2b59403fd40476361c0b1aa9f345bcdf8) & [8130e7](https://github.com/intel/neural-compressor/commit/8130e7fcdad97e6a098d59538316449b7a125d8e))
- Support ONNX Runtime 1.12 (commit [498ac4](https://github.com/intel/neural-compressor/commit/498ac48c67db61105e5c83322b2b737c7e7b3760))
- Export PyTorch QAT model to ONNX (commit [029a63](https://github.com/intel/neural-compressor/commit/029a6325748210e102a566603ad7220a0fc70eea))
- Add TensorFlow and PyTorch container TPP files (commit [d245b5](https://github.com/intel/neural-compressor/commit/d245b51e369f51a0706d78803bc64089d03655a4))
**Examples**
- Add examples of downloading models from the Hugging Face model hub and upstreaming models to the hub (commit [46d945](https://github.com/intel/neural-compressor/commit/46d945348c3144e20ab3f54854a9f4e6566220c4))
- Add notebooks for Neural Coder (commit [105db7](https://github.com/intel/neural-compressor/commit/105db7b1c141ef78ac98e83f9c42d37b9b3d6cce))
- Add 2 IPEX examples: bert_large (SQuAD) and distilbert_base (SQuAD) (commit [192100](https://github.com/intel/neural-compressor/commit/1921007997d281121bf36d5356629b471800b101))
- Add 2 DDP Prune Once for All examples: roberta-base and bert-base (commit [26a476](https://github.com/intel/neural-compressor/commit/26a47627895072d7d7bc1ecfa2537cdcf3917e10))
**Validated Configurations**
- Python 3.7, 3.8, 3.9, 3.10
- CentOS 8.3 & Ubuntu 18.04 & Windows 10
- TensorFlow 2.9, 2.10
- Intel TensorFlow 2.7, 2.8, 2.9
- PyTorch 1.10.0+cpu, 1.11.0+cpu, 1.12.0+cpu
- IPEX 1.10.0, 1.11.0, 1.12.0
- MXNet 1.7, 1.9
- ONNX Runtime 1.10, 1.11, 1.12