- Half-precision Inference Doubles On-Device Inference Performance.
- On certain ARM cores, specifying a hint that the model is a float16-precision model roughly doubles inference speed compared to conventional float32 inference.
```
-eatfp16, --enable_accumulation_type_float16 ENABLE_ACCUMULATION_TYPE_FLOAT16
    Hint for XNNPack fp16 inference on float16 tflite model.
```
- https://blog.tensorflow.org/2023/11/half-precision-inference-doubles-on-device-inference-performance.html
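For reference, a conversion command using this hint might look like the following sketch, where `model.onnx` is a placeholder input and `-i`/`-o` are onnx2tf's usual input/output options:

```bash
# Sketch: convert an ONNX model; onnx2tf also emits a float16 tflite
# model, and -eatfp16 embeds the XNNPack fp16 inference hint in it.
onnx2tf -i model.onnx -o saved_model -eatfp16
```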
These results were measured on a MacBook:
- Chip: Apple M1 Pro (Armv8 processor)
- Python 3.9
- TensorFlow 2.15.0 (installed via `pip install tensorflow==2.15.0`)
<img width="811" alt="image" src="https://github.com/PINTO0309/onnx2tf/assets/74748700/51799aff-b006-46e1-a372-bd8b2195b854">
On x86, AVX2 support is required, and the prebuilt Python package on PyPI does not appear to be built with AVX2 enabled. According to the blog, the AVX2-based fp16 emulation on x86 is intended for precision checking and is slow.
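To gauge the effect on your own hardware, one option is to time the converted float16 tflite model with the TFLite Python interpreter, which applies the XNNPack delegate by default in recent prebuilt TensorFlow packages. A minimal sketch, where `model_float16.tflite` is a placeholder for the converted model:

```python
import time

import numpy as np
import tensorflow as tf

# Load the float16 tflite model produced with -eatfp16.
# "model_float16.tflite" is a placeholder file name.
interpreter = tf.lite.Interpreter(model_path="model_float16.tflite", num_threads=4)
interpreter.allocate_tensors()
inp = interpreter.get_input_details()[0]

# Random input matching the model's declared input shape and dtype
# (typically float32 even for fp16 models; fp16 is used internally).
x = np.random.rand(*inp["shape"]).astype(inp["dtype"])

# Warm up once, then time repeated invocations.
interpreter.set_tensor(inp["index"], x)
interpreter.invoke()
start = time.perf_counter()
for _ in range(100):
    interpreter.set_tensor(inp["index"], x)
    interpreter.invoke()
print(f"avg latency: {(time.perf_counter() - start) / 100 * 1000:.2f} ms")
```

Comparing this timing against the float32 tflite model from the same conversion shows whether the fp16 path is actually being taken on your CPU.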
## What's Changed
* Improve macOS compatibility via docker by ysohma in https://github.com/PINTO0309/onnx2tf/pull/548
* Add option for tflite float16 model by ysohma in https://github.com/PINTO0309/onnx2tf/pull/553
## New Contributors
* ysohma made their first contribution in https://github.com/PINTO0309/onnx2tf/pull/548
**Full Changelog**: https://github.com/PINTO0309/onnx2tf/compare/1.18.14...1.18.15