Optimum-neuron

Latest version: v0.0.28

Page 4 of 5

0.0.12

What's Changed

Stable Diffusion: SDXL Refiner, Stable Diffusion Img2Img, Inpaint support
* [Stable Diffusion] Image2image and inpaint pipeline support by JingyaHuang in https://github.com/huggingface/optimum-neuron/pull/161
* [SDXL] Add SDXL image to image support by JingyaHuang in https://github.com/huggingface/optimum-neuron/pull/239

Distributed Training:
* Sequence parallelism by michaelbenayoun in https://github.com/huggingface/optimum-neuron/pull/233
* Parallelism support for GPTNeoX by michaelbenayoun in https://github.com/huggingface/optimum-neuron/pull/244

Text generation updates
* Add text generation pipeline by dacorvo in https://github.com/huggingface/optimum-neuron/pull/258

Other changes
* TGI stability fixes by dacorvo in https://github.com/huggingface/optimum-neuron/pull/226
* Remove experimental compilation flag for text-generation models by dacorvo in https://github.com/huggingface/optimum-neuron/pull/228
* Patch for diffusers 0.21.0 release by JingyaHuang in https://github.com/huggingface/optimum-neuron/pull/229
* test_examples uses ExampleRunner by michaelbenayoun in https://github.com/huggingface/optimum-neuron/pull/227
* Using the real model name instead of hard code "model" by davidshtian in https://github.com/huggingface/optimum-neuron/pull/231
* Replace transformers list of logits warpers by a fused logic warper by dacorvo in https://github.com/huggingface/optimum-neuron/pull/234
* Use AWS Neuron SDK 2.14 by dacorvo in https://github.com/huggingface/optimum-neuron/pull/236
* Weight loading after lazy loading fix by michaelbenayoun in https://github.com/huggingface/optimum-neuron/pull/238
* Add `debug` attribute to `NeuronPartialState` by michaelbenayoun in https://github.com/huggingface/optimum-neuron/pull/240
* Update `tests/test_examples.py` for AWS team by michaelbenayoun in https://github.com/huggingface/optimum-neuron/pull/242
* Rework text-generation example by dacorvo in https://github.com/huggingface/optimum-neuron/pull/245
* Fix evaluation recompilation issue by michaelbenayoun in https://github.com/huggingface/optimum-neuron/pull/248
* test(generation): specify revision for hub test model by dacorvo in https://github.com/huggingface/optimum-neuron/pull/250
* Add sequence length for generative models and llama tests by dacorvo in https://github.com/huggingface/optimum-neuron/pull/251
* Fix noisy loss for T5 when doing TP by michaelbenayoun in https://github.com/huggingface/optimum-neuron/pull/257
* Fix bug with transformers 4.34 by michaelbenayoun in https://github.com/huggingface/optimum-neuron/pull/259

New Contributors
* davidshtian made their first contribution in https://github.com/huggingface/optimum-neuron/pull/231

**Full Changelog**: https://github.com/huggingface/optimum-neuron/compare/v0.0.11...v0.0.12

0.0.11

SDXL Export and Inference
Optimum CLI now supports compiling the components of the SDXL pipeline for inference on Neuron devices (inf2/trn1).

Below is an example of compiling SDXL models. You can compile them either on an inf2 instance (`inf2.8xlarge` or larger recommended) or on a CPU-only instance (in which case, disable validation with `--disable-validation`):
```bash
optimum-cli export neuron --model stabilityai/stable-diffusion-xl-base-1.0 --task stable-diffusion-xl --batch_size 1 --height 1024 --width 1024 --auto_cast matmul --auto_cast_type bf16 sdxl_neuron/
```
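
If the export needs to be scripted, for instance to toggle validation depending on the machine, the command can be assembled in Python. This is only a sketch: the helper name `sdxl_export_command` and its `validate` parameter are hypothetical, only flags that appear in these notes are used, and placing `--disable-validation` before the output directory is an assumption about argument ordering.

```python
import shlex


def sdxl_export_command(output_dir, validate=True):
    """Build the SDXL export command using only flags shown in the release notes."""
    cmd = [
        "optimum-cli", "export", "neuron",
        "--model", "stabilityai/stable-diffusion-xl-base-1.0",
        "--task", "stable-diffusion-xl",
        "--batch_size", "1",
        "--height", "1024",
        "--width", "1024",
        "--auto_cast", "matmul",
        "--auto_cast_type", "bf16",
    ]
    if not validate:
        # CPU-only instances cannot run the validation step on a Neuron device.
        cmd.append("--disable-validation")
    cmd.append(output_dir)
    return cmd


cpu_only_cmd = sdxl_export_command("sdxl_neuron/", validate=False)
print(shlex.join(cpu_only_cmd))
# With optimum-neuron installed, the list could be passed to
# subprocess.run(cpu_only_cmd, check=True).
```

Keeping the command as a list avoids shell-quoting issues when it is eventually handed to `subprocess`.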

Then run inference with the `NeuronStableDiffusionXLPipeline` class:
```python
from optimum.neuron import NeuronStableDiffusionXLPipeline

prompt = "Astronaut in a jungle, cold color palette, muted colors, detailed, 8k"
stable_diffusion_xl = NeuronStableDiffusionXLPipeline.from_pretrained(
    model_id="sdxl_neuron/", device_ids=[0, 1]
)
image = stable_diffusion_xl(prompt).images[0]
```


* Add sdxl exporter support by JingyaHuang in https://github.com/huggingface/optimum-neuron/pull/203
* Add Stable Diffusion XL inference support by JingyaHuang in https://github.com/huggingface/optimum-neuron/pull/212


Llama v1, v2 Inference
* Add support for Llama inference through NeuronModelForCausalLM by dacorvo in https://github.com/huggingface/optimum-neuron/pull/223

Llama v2 Training
* Llama V2 training support by michaelbenayoun in https://github.com/huggingface/optimum-neuron/pull/211
* Llama V1 training fix by michaelbenayoun in #211

TGI
* AWS Inferentia2 TGI server by dacorvo in https://github.com/huggingface/optimum-neuron/pull/214

Major bugfixes
* `neuron_parallel_compile`, `ParallelLoader` and Zero-1 fixes for torchneuron 8+ by michaelbenayoun in https://github.com/huggingface/optimum-neuron/pull/200
* flan-t5 fix: `T5Parallelizer`, `NeuronCacheCallback` and `NeuronHash` refactors by michaelbenayoun in https://github.com/huggingface/optimum-neuron/pull/207
* Fix optimum-cli broke by optimum 1.13.0 release by JingyaHuang in https://github.com/huggingface/optimum-neuron/pull/217

Other changes
* Bump Inference APIs to Neuron 2.13 by JingyaHuang in https://github.com/huggingface/optimum-neuron/pull/206
* Add log for SD when applying optim attn & pipelines lazy loading by JingyaHuang in https://github.com/huggingface/optimum-neuron/pull/208
* Cancel concurreny CIs for inference by JingyaHuang in https://github.com/huggingface/optimum-neuron/pull/218
* fix(tgi): typer does not support Union types by dacorvo in https://github.com/huggingface/optimum-neuron/pull/219
* Bump neuron-cc version to 1.18.* by JingyaHuang in https://github.com/huggingface/optimum-neuron/pull/224

**Full Changelog**: https://github.com/huggingface/optimum-neuron/compare/v0.0.10...v0.0.11

0.0.10

Major bugfixes
* Improve and Fix inferentia exporter by JingyaHuang in https://github.com/huggingface/optimum-neuron/pull/168
* [Stable Diffusion] Fix the image size value inferral by JingyaHuang in https://github.com/huggingface/optimum-neuron/pull/167
* Fix inferral of dynamic batch size from the config & Be compatible with transformers 4.32 by JingyaHuang in https://github.com/huggingface/optimum-neuron/pull/190


Enhancements of APIs
* Enable exporter on non INF instances by JingyaHuang in https://github.com/huggingface/optimum-neuron/pull/178
* Support multiple prompts for generation example by dacorvo in https://github.com/huggingface/optimum-neuron/pull/173
* Fix unet export when using optimized attn score by JingyaHuang in https://github.com/huggingface/optimum-neuron/pull/165
* Improve default compilation arguments for stable diffusion by JingyaHuang in https://github.com/huggingface/optimum-neuron/pull/182
* Add `num_image_per_prompt` support for stable diffusion by JingyaHuang in https://github.com/huggingface/optimum-neuron/pull/192

Other changes
* minor doc fix by oOraph in https://github.com/huggingface/optimum-neuron/pull/164
* Fix duplicates handling in converting to `safetensors` by michaelbenayoun in https://github.com/huggingface/optimum-neuron/pull/172
* Fix empty preprocessor issue by JingyaHuang in https://github.com/huggingface/optimum-neuron/pull/180
* Update models.mdx by philschmid in https://github.com/huggingface/optimum-neuron/pull/183
* Only run INF2 CI for .code change by JingyaHuang in https://github.com/huggingface/optimum-neuron/pull/184
* Improve Readme and installation guide by JingyaHuang in https://github.com/huggingface/optimum-neuron/pull/181
* Fixes #150 by michaelbenayoun in https://github.com/huggingface/optimum-neuron/pull/177
* Fix TP for t5 by michaelbenayoun in https://github.com/huggingface/optimum-neuron/pull/179
* Improve SD logging by JingyaHuang in https://github.com/huggingface/optimum-neuron/pull/194
* Add mark step after optimizer step by michaelbenayoun in https://github.com/huggingface/optimum-neuron/pull/195
* Option to disable the parallelization of the embedding with TP by michaelbenayoun in https://github.com/huggingface/optimum-neuron/pull/191
* Restrict generation to sampling and greedy search by dacorvo in https://github.com/huggingface/optimum-neuron/pull/201

New Contributors
* oOraph made their first contribution in https://github.com/huggingface/optimum-neuron/pull/164

**Full Changelog**: https://github.com/huggingface/optimum-neuron/compare/v0.0.9...v0.0.10

0.0.9

Tensor Parallelism support for T5 on training

* TP tests and additional support by michaelbenayoun in https://github.com/huggingface/optimum-neuron/pull/155

Enhance Stable Diffusion Inference

* Enhance robustness of stable diffusion inference by JingyaHuang in https://github.com/huggingface/optimum-neuron/pull/156
* Some other enhancement for stable diffusion by JingyaHuang in https://github.com/huggingface/optimum-neuron/pull/159
* SD quick fix export by JingyaHuang in https://github.com/huggingface/optimum-neuron/pull/160
* Stable Diffusion quick fix by JingyaHuang in https://github.com/huggingface/optimum-neuron/pull/162

What's Changed
* Doc upgrade by michaelbenayoun in https://github.com/huggingface/optimum-neuron/pull/152

**Full Changelog**: https://github.com/huggingface/optimum-neuron/compare/v0.0.8...v0.0.9

0.0.8

Tensor Parallelism and ZeRO-1 optimization

Tensor Parallelism

It is now possible to shard a model's parameters across several Neuron cores using tensor parallelism, which enables training much larger models than before.

The following model architectures are supported:

- BERT
- RoBERTa
- GPT Neo
- LLaMa

Relevant PRs: #125 and #143
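
To make the idea of sharding parameters concrete, here is a conceptual sketch of column-wise tensor parallelism for a single linear layer. It is not optimum-neuron's implementation: plain Python lists stand in for tensors, each shard stands in for one Neuron core, and all function names are made up for illustration.

```python
def matmul(x, w):
    """Multiply an (n x k) matrix by a (k x m) matrix, both given as nested lists."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*w)] for row in x]


def split_columns(w, num_shards):
    """Split the columns of w into num_shards equal, contiguous blocks."""
    cols = len(w[0])
    if cols % num_shards != 0:
        raise ValueError("output dimension must be divisible by num_shards")
    step = cols // num_shards
    return [[row[i * step:(i + 1) * step] for row in w] for i in range(num_shards)]


def column_parallel_linear(x, w, num_shards):
    """Each shard computes its slice of the output; the slices are then concatenated."""
    partial_outputs = [matmul(x, shard) for shard in split_columns(w, num_shards)]
    return [sum((part[r] for part in partial_outputs), []) for r in range(len(x))]


x = [[1.0, 2.0], [3.0, 4.0]]                      # batch of 2, hidden size 2
w = [[1.0, 0.0, 2.0, 1.0], [0.0, 1.0, 1.0, 3.0]]  # 2 x 4 weight matrix

full = matmul(x, w)
sharded = column_parallel_linear(x, w, num_shards=2)
print(sharded == full)
```

Each shard only ever holds half of the weight columns, which is why the per-core memory footprint shrinks as more cores are used.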

ZeRO-1

[DeepSpeed ZeRO Stage 1](https://www.deepspeed.ai/tutorials/zero/) optimization is supported as well. It shards the optimizer state across data-parallel ranks, which yields significant memory savings.

Relevant PRs: #140
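
A back-of-the-envelope estimate shows where the saving comes from. The sketch below assumes an Adam-style optimizer that keeps two fp32 states per parameter; that assumption, the 1B-parameter model, and the 8 ranks are illustrative and not taken from the release notes. It also ignores weights, gradients, and activations.

```python
BYTES_PER_STATE = 4    # assumption: fp32 optimizer states
STATES_PER_PARAM = 2   # assumption: Adam keeps a first and a second moment


def optimizer_state_bytes_per_rank(num_params, dp_ranks, zero1):
    """Estimate the optimizer-state memory held by one data-parallel rank."""
    total = num_params * STATES_PER_PARAM * BYTES_PER_STATE
    if not zero1:
        return total  # without ZeRO-1, every rank holds a full replica
    # With ZeRO-1, each rank owns roughly 1/dp_ranks of the states (rounded up).
    return -(-total // dp_ranks)


num_params = 1_000_000_000
replicated = optimizer_state_bytes_per_rank(num_params, dp_ranks=8, zero1=False)
sharded = optimizer_state_bytes_per_rank(num_params, dp_ranks=8, zero1=True)
print(f"replicated: {replicated / 1e9:.1f} GB, ZeRO-1: {sharded / 1e9:.1f} GB per rank")
```

Under these assumptions the per-rank optimizer state drops by a factor equal to the number of data-parallel ranks.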

**Note**: Tensor Parallelism and ZeRO-1 can be combined.

Stable Diffusion Models Inference support

`NeuronStableDiffusionPipeline` lets you export a Stable Diffusion checkpoint to a neuronx-compatible format and run inference on inf2 or trn1 instances while preserving the Python interface you are used to from [`🤗 diffusers`](https://huggingface.co/docs/diffusers/v0.18.2/en/api/pipelines/stable_diffusion/text2img#diffusers.StableDiffusionPipeline).

Example:
```python
from optimum.neuron import NeuronStableDiffusionPipeline

model_id = "runwayml/stable-diffusion-v1-5"
input_shapes = {"batch_size": 1, "height": 512, "width": 512}
stable_diffusion = NeuronStableDiffusionPipeline.from_pretrained(model_id, export=True, **input_shapes)

prompt = "a photo of an astronaut riding a horse on mars"
image = stable_diffusion(prompt).images[0]
```


Currently, only the text-to-image generation task is supported.

0.0.7

Stable diffusion

Stable Diffusion compilation with `neuronx-cc` is now supported for inference on inf2 / trn1.

Components chosen to be exported from `StableDiffusionPipeline` are:
* CLIP text encoder
* VAE decoder
* UNet
* VAE_post_quant_conv

The export can be done with `optimum-cli` as follows:

```bash
optimum-cli export neuron --model stabilityai/stable-diffusion-2-1-base --task stable-diffusion --batch_size 1 --num_channels 4 --height 64 --width 64 --sequence_length 32 sd_neuron/
```


*Relevant PR: #101*

*Further guidance: [Exporting stable diffusion to neuron](https://huggingface.co/docs/optimum-neuron/guides/export_model#exporting-stable-diffusion-to-neuron)*
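
The `--height 64 --width 64 --num_channels 4` values in the export command look like latent-space dimensions rather than pixel sizes. The notes do not say so explicitly, so treat this as an assumption. Assuming the usual Stable Diffusion VAE downsampling factor of 8, a 512x512 image corresponds to a 64x64 latent, which the following sketch works out (the helper name `latent_size` is hypothetical):

```python
# Assumption (not stated in the notes): the Stable Diffusion VAE downsamples each
# spatial dimension by a factor of 8, and the export flags refer to latent sizes.
VAE_SCALE_FACTOR = 8


def latent_size(pixel_size, scale=VAE_SCALE_FACTOR):
    """Convert an image side length in pixels to the matching latent side length."""
    if pixel_size % scale != 0:
        raise ValueError(f"pixel size must be a multiple of {scale}")
    return pixel_size // scale


latent_height = latent_size(512)
latent_width = latent_size(512)
print(latent_height, latent_width)
```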

`transformers` pipeline support

Pipelines running on Inferentia instances are now supported.

They can be used with an online export as follows:

```python
from optimum.neuron.pipelines import pipeline

clf = pipeline("text-classification", model="distilbert-base-uncased-finetuned-sst-2-english", export=True)
clf("Amazon is a great company")
```
