Optimum-neuron

Latest version: v0.1.0

0.0.18

What's Changed

AWS SDK

* Use AWS Neuron SDK 2.16.1 (449)

Inference

* Preliminary support for neff/weights decoupling by JingyaHuang (402)
* Allow exporting decoder models using optimum-cli by dacorvo (422)
* Add Neuron X cache registry by dacorvo (442)
* Add StoppingCriteria to generate() of NeuronModelForCausalLM by dacorvo (454)
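The new `stopping_criteria` support follows the standard `transformers` interface. A minimal sketch of a custom criterion (the class name, model id, and the commented-out Neuron calls are illustrative assumptions; actually running `generate()` requires an Inferentia instance with `optimum-neuron` installed):

```python
# Hedged sketch: a custom stopping criterion usable with
# NeuronModelForCausalLM.generate(). The criterion itself is plain Python
# following the transformers StoppingCriteria call signature.

class StopOnTokenId:
    """Stops generation once every sequence ends with a chosen token id."""

    def __init__(self, stop_token_id):
        self.stop_token_id = stop_token_id

    def __call__(self, input_ids, scores, **kwargs):
        # input_ids: batch of generated token-id sequences
        return all(seq[-1] == self.stop_token_id for seq in input_ids)

# On a Neuron device one would then do something like (hedged, not executed here):
# from optimum.neuron import NeuronModelForCausalLM
# from transformers import AutoTokenizer, StoppingCriteriaList
# model = NeuronModelForCausalLM.from_pretrained("my-compiled-decoder")
# outputs = model.generate(
#     **inputs, stopping_criteria=StoppingCriteriaList([StopOnTokenId(2)])
# )

print(StopOnTokenId(2)([[5, 3, 2]], None))
```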

Training

* Initial support for pipeline parallelism by michaelbenayoun (279)

TGI

* TGI: support vanilla transformer models whose configuration is cached by dacorvo (445)

Tutorials and doc improvement

* Various fixes by jimburtoft, michaelbenayoun and JingyaHuang (428, 429, 432)
* Improve Stable Diffusion Notebooks by JingyaHuang (431)
* Add Sentence Transformers Guide and Notebook by philschmid (434)
* Add benchmark section by dacorvo (435)

Major bugfixes

* TGI: correctly identify special tokens during generation by dacorvo (438)
* TGI: do not include the input_text in generated text by dacorvo (454)

Other changes

* API change for compatibility with Optimum by JingyaHuang (421)

New Contributors

* jimburtoft made their first contribution in 432

**Full Changelog**: https://github.com/huggingface/optimum-neuron/compare/v0.0.17...v0.0.18

0.0.17

What's Changed

AWS SDK

* Use AWS Neuron SDK 2.16 (398)
* Use official serialization API for transformers_neuronx models instead of the beta one by aws-yishanm (387, 393)

Inference

* Improve the support of sentence transformers by JingyaHuang (408)
* Add Neuronx compile cache Hub proxy and use it for LLM decoder models by dacorvo (410)
* Add support for Mistral models by dacorvo (411)
* Do not upload Neuron LLM weights when they can be fetched from the hub by dacorvo (413)
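With Mistral support added, a decoder checkpoint can be compiled ahead of time from the command line. A hedged sketch of the invocation (the model id, output directory, and exact flag values are illustrative, and compilation itself needs a Neuron-enabled instance):

```shell
# Compile a Mistral decoder for Inferentia2
# (flags assumed from optimum-cli export conventions)
optimum-cli export neuron \
  --model mistralai/Mistral-7B-v0.1 \
  --batch_size 1 \
  --sequence_length 2048 \
  --num_cores 2 \
  --auto_cast_type fp16 \
  mistral_neuron/
```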

Training

* Add general support for generation on TRN with NxD by aws-tianquaw (370)

Tutorials and doc improvement

* Add llama 2 fine tuning tutorial by philschmid (390)

Major bugfixes

* Skip pushing if the user does not have write access to the cache repo by michaelbenayoun (405)

Other changes

* Bump Hugging Face library versions by JingyaHuang (403)

New Contributors

* aws-tianquaw made their first contribution in 370
* aws-yishanm made their first contribution in 387

**Full Changelog**: https://github.com/huggingface/optimum-neuron/compare/v0.0.16...v0.0.17

0.0.16

What's Changed

Training

A few fixes related to precompilation and checkpointing. These fixes enable training LLMs on AWS Trainium instances without friction.

- Skip model saving during precompilation and provide option to skip cache push (365)
- Fix checkpoint saving and consolidation for TP (378)
- A `torch_xla` compatible version of `safetensors.torch.save_file` is now used in the `NeuronTrainer` (329)

Inference

- Support for the export and inference of T5 (267)
- New documentation for Stable Diffusion XL Turbo (374)
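T5 checkpoints can likewise be exported from the CLI before inference. A hedged sketch (the task name, flags, and output directory are illustrative assumptions; the actual compilation requires Neuron hardware):

```shell
# Export a T5 encoder-decoder model for Neuron inference
# (task and flag names assumed from optimum-cli export conventions)
optimum-cli export neuron \
  --model t5-small \
  --task text2text-generation \
  --batch_size 1 \
  --sequence_length 64 \
  t5_small_neuron/
```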

0.0.15

What's Changed

Training

Distributed Training

- `parallel_cross_entropy` loss support for tensor parallelism (246)
- Support for training the Mistral architecture with tensor parallelism (303)

AWS SDK

- Fix: `neuron_parallel_compile` is compatible with the cache system (352)
- Full support for `neuron_parallel_compile` with the cache system: compilation files produced by `neuron_parallel_compile` will be pushed to the remote cache repo on the Hugging Face Hub at the beginning of the next training job (354)
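With this change, a typical workflow looks like the following (the script name and arguments are placeholders): a precompilation pass wraps the normal training command, and the compilation files it produces are pushed to the remote cache repo when the next real training job starts.

```shell
# Trace-and-compile pass: neuron_parallel_compile wraps the usual launch command
# (placeholder script and arguments)
neuron_parallel_compile torchrun --nproc_per_node=2 train.py --max_steps 10

# The next unwrapped run reuses the compiled graphs; optimum-neuron then pushes
# them to the remote cache repo on the Hugging Face Hub
torchrun --nproc_per_node=2 train.py
```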

Documentation

- [Guide](https://huggingface.co/docs/optimum-neuron/guides/distributed_training) explaining how distributed training works in `optimum-neuron` (339)

Inference

- Data parallelism option for Stable Diffusion - LCM allowing multi-device inference (346)
- Support decoding sequences of byte tokens in TGI (350)
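Data-parallel inference splits a batch of prompts across the two NeuronCores of an Inferentia2 device. The pipeline call below is a hedged assumption about the `optimum-neuron` diffusion API and requires Neuron hardware, so it is shown commented out; the batch-splitting helper is a plain-Python illustration of the idea.

```python
# Hedged sketch of the data-parallel idea: shard a prompt batch across cores.

def split_across_cores(prompts, num_cores=2):
    """Round-robin assignment of prompts to NeuronCores (illustrative only)."""
    shards = [[] for _ in range(num_cores)]
    for i, prompt in enumerate(prompts):
        shards[i % num_cores].append(prompt)
    return shards

# On an inf2 instance one might then run (names and arguments hedged):
# from optimum.neuron import NeuronLatentConsistencyModelPipeline
# pipe = NeuronLatentConsistencyModelPipeline.from_pretrained(
#     "my-compiled-lcm", data_parallel_mode="all"
# )
# images = pipe(prompt=["a cat", "a dog"], num_inference_steps=4).images

print(split_across_cores(["a cat", "a dog", "a bird"]))
```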

Documentation

- Updated the documentation on LCM (351)

0.0.14

What's Changed

LCM support

* [Stable Diffusion] Add LCM (Latent Consistency Models) support by JingyaHuang in https://github.com/huggingface/optimum-neuron/pull/323

Tutorials and doc improvement

* notebooks: add llama2 chatbot example by dacorvo in https://github.com/huggingface/optimum-neuron/pull/300
* Add llama 2 tutorial by dacorvo in https://github.com/huggingface/optimum-neuron/pull/321
* Migrate documentation of Stable Diffusion and add notebooks by JingyaHuang in https://github.com/huggingface/optimum-neuron/pull/312

Major bugfixes

* Noisy loss fix by bocchris-aws in https://github.com/huggingface/optimum-neuron/pull/293
* Fix neuron cache starting compilation before fetching by michaelbenayoun in https://github.com/huggingface/optimum-neuron/pull/280
* fix(pipelines): support passing decoder model + tokenizer by dacorvo in https://github.com/huggingface/optimum-neuron/pull/319

Other changes

* chore: update dev version by dacorvo in https://github.com/huggingface/optimum-neuron/pull/276
* Explicitly mention aws repo extra url in documentation by dacorvo in https://github.com/huggingface/optimum-neuron/pull/277
* Update supported architecture in the doc by JingyaHuang in https://github.com/huggingface/optimum-neuron/pull/281
* Fix doc build source code broken links by JingyaHuang in https://github.com/huggingface/optimum-neuron/pull/282
* Add revision to push_to_hub by philschmid in https://github.com/huggingface/optimum-neuron/pull/292
* Set default device id for SD and SDXL by JingyaHuang in https://github.com/huggingface/optimum-neuron/pull/297
* Add missing decoder model architectures by dacorvo in https://github.com/huggingface/optimum-neuron/pull/298
* Official support for AWS inferentia2 TGI container by dacorvo in https://github.com/huggingface/optimum-neuron/pull/302
* Transformers fix by dacorvo in https://github.com/huggingface/optimum-neuron/pull/320
* Add sagemaker compatible image by dacorvo in https://github.com/huggingface/optimum-neuron/pull/322
* Fix broken tests by michaelbenayoun in https://github.com/huggingface/optimum-neuron/pull/274
* chore: align with AWS Neuron SDK 2.15.1 by dacorvo in https://github.com/huggingface/optimum-neuron/pull/325
* Deleted the 'maybe_free_model_hooks()' from Diffusers Pipelines by Cerrix in https://github.com/huggingface/optimum-neuron/pull/330
* Bump diffusers version by JingyaHuang in https://github.com/huggingface/optimum-neuron/pull/335

New Contributors
* Cerrix made their first contribution in https://github.com/huggingface/optimum-neuron/pull/330

**Full Changelog**: https://github.com/huggingface/optimum-neuron/compare/v0.0.13...v0.0.14

0.0.13

What's Changed

The main change in this release is the alignment with AWS Neuron SDK 2.15.

Text-generation
* Add support for `bloom` and `opt` models by dacorvo in 275

Other changes
* Use attention masks for TGI generation by dacorvo in 264
* Various fixes for TP by michaelbenayoun in 260
* Fix neuron pipelines by dacorvo in 265
* Fix 241 by michaelbenayoun in 268
* Fixes generation during the evaluation step by michaelbenayoun in 266
* Save / load from checkpoint TP by michaelbenayoun in 269

**Full Changelog**: https://github.com/huggingface/optimum-neuron/compare/v0.0.12...v0.0.13
