What's Changed
AWS SDK
* Use AWS Neuron SDK 2.16 (398)
* Use the official serialization API for transformers_neuronx models instead of the beta one by aws-yishanm (387, 393)
Inference
* Improve the support of sentence transformers by JingyaHuang (408)
* Add Neuronx compile cache Hub proxy and use it for LLM decoder models by dacorvo (410)
* Add support for Mistral models by dacorvo (411)
* Do not upload Neuron LLM weights when they can be fetched from the Hub by dacorvo (413)
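
The compile cache item above can be pictured as a lookup keyed by the compilation parameters: identical configurations map to the same key, so a previously compiled artifact can be fetched instead of rebuilt. The following is only an illustrative sketch in plain Python. The names `cache_key`, `get_or_compile`, `fake_compile`, and the in-memory `store` are hypothetical and do not reflect the actual optimum-neuron implementation or its Hub layout.

```python
import hashlib
import json


def cache_key(model_id: str, compile_args: dict) -> str:
    """Build a deterministic key from the model id and compilation arguments."""
    # sort_keys makes the key independent of argument ordering.
    payload = json.dumps({"model": model_id, "args": compile_args}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def get_or_compile(store: dict, model_id: str, compile_args: dict, compile_fn):
    """Return (artifact, was_cache_hit); compile only when the key is absent."""
    key = cache_key(model_id, compile_args)
    if key in store:
        return store[key], True
    artifact = compile_fn(model_id, compile_args)
    store[key] = artifact
    return artifact, False


# Hypothetical stand-in for an expensive compilation step.
calls = []


def fake_compile(model_id, compile_args):
    calls.append(model_id)
    return f"compiled:{model_id}"


store = {}  # assumption: a dict stands in for the remote cache
first = get_or_compile(
    store, "some-org/some-model", {"batch_size": 1, "sequence_length": 2048}, fake_compile
)
# Same arguments in a different order resolve to the same cache entry.
second = get_or_compile(
    store, "some-org/some-model", {"sequence_length": 2048, "batch_size": 1}, fake_compile
)
```

In this sketch the second call is served from the store without invoking `fake_compile` again, which is the behavior a compile cache is meant to provide.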
Training
* Add general support for generation on TRN with NxD by aws-tianquaw (370)
Tutorials and doc improvements
* Add Llama 2 fine-tuning tutorial by philschmid (390)
Major bugfixes
* Skip pushing if the user does not have write access to the cache repo by michaelbenayoun (405)
Other changes
* Bump Hugging Face library versions by JingyaHuang (403)
New Contributors
* aws-tianquaw made their first contribution in 370
* aws-yishanm made their first contribution in 387
**Full Changelog**: https://github.com/huggingface/optimum-neuron/compare/v0.0.16...v0.0.17