Optimum-neuron

Latest version: v0.1.0

0.0.24

What's Changed

* Use AWS Neuron SDK 2.19.1 by dacorvo in https://github.com/huggingface/optimum-neuron/pull/661

Training
* Initial PEFT support by michaelbenayoun in https://github.com/huggingface/optimum-neuron/pull/612
* PEFT + TP support by michaelbenayoun in https://github.com/huggingface/optimum-neuron/pull/620
* Fix MPMD detected error during training with TP by michaelbenayoun in https://github.com/huggingface/optimum-neuron/pull/648

Inference
* Add Stable Diffusion ControlNet support by JingyaHuang in https://github.com/huggingface/optimum-neuron/pull/622
* Add InstructPix2Pix pipeline support. by asntr in https://github.com/huggingface/optimum-neuron/pull/625
* Add ViT export support and image classification by JingyaHuang in https://github.com/huggingface/optimum-neuron/pull/616
* Add wav2vec2 support - export and audio tasks modeling by JingyaHuang in https://github.com/huggingface/optimum-neuron/pull/645
* Add more audio models: ast, hubert, unispeech, unispeech-sat, wavlm by JingyaHuang in https://github.com/huggingface/optimum-neuron/pull/651

TGI
* Extending TGI benchmarking and documentation by jimburtoft in https://github.com/huggingface/optimum-neuron/pull/621
* Add support for TGI truncate parameter by dacorvo in https://github.com/huggingface/optimum-neuron/pull/647

Other changes
* enable unequal height and width by yahavb in https://github.com/huggingface/optimum-neuron/pull/592
* Skip invalid gen config by dacorvo in https://github.com/huggingface/optimum-neuron/pull/618
* Deprecate resume_download by Wauplin in https://github.com/huggingface/optimum-neuron/pull/586
* Remove a line non-intentionally merged by JingyaHuang in https://github.com/huggingface/optimum-neuron/pull/628
* Add secrets scanning workflow by mfuntowicz in https://github.com/huggingface/optimum-neuron/pull/631
* fix bad link to distributed-training how-to guide in optimum-neuron docs by aws-amj in https://github.com/huggingface/optimum-neuron/pull/627
* Do not copy local checkpoint by dacorvo in https://github.com/huggingface/optimum-neuron/pull/630
* Make neuron_cc_optlevel `None` by default by michaelbenayoun in https://github.com/huggingface/optimum-neuron/pull/632
* Remove print by michaelbenayoun in https://github.com/huggingface/optimum-neuron/pull/633
* Set bf16 to true when needed by michaelbenayoun in https://github.com/huggingface/optimum-neuron/pull/635
* Fix gradient checkpointing with PEFT by michaelbenayoun in https://github.com/huggingface/optimum-neuron/pull/634
* Refactor decoder tests by dacorvo in https://github.com/huggingface/optimum-neuron/pull/641
* CI cache builder by dacorvo in https://github.com/huggingface/optimum-neuron/pull/642
* Restore optimized attention score for sd15 & fix the generated images quality issue by JingyaHuang in https://github.com/huggingface/optimum-neuron/pull/646
* Add and remove some mark steps by michaelbenayoun in https://github.com/huggingface/optimum-neuron/pull/644
* Fix consolidation for TP by michaelbenayoun in https://github.com/huggingface/optimum-neuron/pull/649
* Fix spelling in error message by jimburtoft in https://github.com/huggingface/optimum-neuron/pull/656
* Update docs by michaelbenayoun in https://github.com/huggingface/optimum-neuron/pull/588
* Fixes NxDPPModel for Neuron SDK 2.19 by michaelbenayoun in https://github.com/huggingface/optimum-neuron/pull/663
* Various fixes for training by michaelbenayoun in https://github.com/huggingface/optimum-neuron/pull/654
* migrate ci by XciD in https://github.com/huggingface/optimum-neuron/pull/662
* ci: fix inference cache pipeline by dacorvo in https://github.com/huggingface/optimum-neuron/pull/667
* broken link by pagezyhf in https://github.com/huggingface/optimum-neuron/pull/669
* Bump TGI version and fix bugs by dacorvo in https://github.com/huggingface/optimum-neuron/pull/666

New Contributors
* mfuntowicz made their first contribution in https://github.com/huggingface/optimum-neuron/pull/631
* aws-amj made their first contribution in https://github.com/huggingface/optimum-neuron/pull/627
* asntr made their first contribution in https://github.com/huggingface/optimum-neuron/pull/625
* XciD made their first contribution in https://github.com/huggingface/optimum-neuron/pull/662

**Full Changelog**: https://github.com/huggingface/optimum-neuron/compare/v0.0.23...v0.0.24

0.0.23

What's Changed

* bump required packages versions: `transformers==4.41.1`, `accelerate==0.29.2`, `optimum==1.20.*`
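A minimal sketch of checking an environment against the pins listed above. It uses only the standard library; the names `PINS`, `matches_pin`, and `check_pins` are illustrative and not part of optimum-neuron. The wildcard in `1.20.*` is matched as a simple glob, which is an approximation of full PEP 440 matching.

```python
from fnmatch import fnmatch
from importlib import metadata

# Pins copied from the 0.0.23 release note; "1.20.*" is treated as a glob.
PINS = {
    "transformers": "4.41.1",
    "accelerate": "0.29.2",
    "optimum": "1.20.*",
}


def matches_pin(installed: str, pin: str) -> bool:
    """Return True if an installed version string satisfies a pin pattern."""
    return fnmatch(installed, pin)


def check_pins(pins=PINS):
    """Map each package name to (installed_version_or_None, satisfies_pin)."""
    report = {}
    for name, pin in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Package is not installed in this environment.
            report[name] = (None, False)
            continue
        report[name] = (installed, matches_pin(installed, pin))
    return report


if __name__ == "__main__":
    for name, (installed, ok) in check_pins().items():
        status = "ok" if ok else "mismatch"
        print(f"{name}: installed={installed} pin={PINS[name]} -> {status}")
```

Running the script prints one line per pinned package; a package that is not installed is reported as a mismatch rather than raising.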

Inference

* Fix diffusion caching by oOraph in https://github.com/huggingface/optimum-neuron/pull/594
* Fix inference latency issue when weights/neff are separated by JingyaHuang in #584
* Enable caching for inlined models by JingyaHuang in #604
* Patch attention score far off issue for sd 1.5 by JingyaHuang in #611

TGI

* Fix excessive CPU memory consumption on TGI startup by dacorvo in #595
* Avoid clearing all pending requests on early user cancellations by dacorvo in #609
* Include tokenizer during export and simplify deployment by dacorvo in #610

Training

* Performance improvements and neuron_parallel_compile and gradient checkpointing fixes by michaelbenayoun in #602

New Contributors
* pagezyhf made their first contribution in https://github.com/huggingface/optimum-neuron/pull/601

**Full Changelog**: https://github.com/huggingface/optimum-neuron/compare/v0.0.22...v0.0.23

0.0.22

What's Changed

Training
* Integrate new API for saving and loading with `neuronx_distributed` by michaelbenayoun in https://github.com/huggingface/optimum-neuron/pull/560

Inference

* Add support for Mixtral by dacorvo in https://github.com/huggingface/optimum-neuron/pull/569
* Improve Llama models performance by dacorvo in https://github.com/huggingface/optimum-neuron/pull/587
* Make Stable Diffusion pipelines compatible with compel by JingyaHuang and neo in https://github.com/huggingface/optimum-neuron/pull/581 (with tests inspired by the snippets sent from Suprhimp)
* Add `SentenceTransformers` support to `pipeline` for `feature-extraction` by philschmid in https://github.com/huggingface/optimum-neuron/pull/583
* Allow download subfolder for caching models with subfolder by JingyaHuang in https://github.com/huggingface/optimum-neuron/pull/566
* Do not split decoder checkpoint files by dacorvo in https://github.com/huggingface/optimum-neuron/pull/567

TGI

* Set up TGI environment values with the ones used to build the model by oOraph in https://github.com/huggingface/optimum-neuron/pull/529
* TGI benchmark with llmperf by dacorvo in https://github.com/huggingface/optimum-neuron/pull/564
* Improve tgi env wrapper for neuron by oOraph in https://github.com/huggingface/optimum-neuron/pull/589

Caveat

Currently, traced models with `inline_weights_to_neff=False` have higher than expected latency during inference. This is because the weights are not automatically moved to Neuron devices. The issue will be fixed in #584; please avoid setting `inline_weights_to_neff=False` in this release.
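One way to follow this advice in a script is to guard the export options before they reach the export call. The sketch below is a hypothetical helper (`safe_export_kwargs` and `AFFECTED_RELEASE` are illustrative names, not optimum-neuron APIs); it only edits a dictionary and assumes the export call you use accepts an `inline_weights_to_neff` keyword, as the caveat implies.

```python
import warnings

AFFECTED_RELEASE = "0.0.22"  # release whose notes carry this caveat


def safe_export_kwargs(release: str, **kwargs) -> dict:
    """Return export kwargs, forcing inlined weights on the affected release.

    Hypothetical helper: it never imports optimum-neuron, it only rewrites
    the keyword arguments you intend to pass to your export call.
    """
    out = dict(kwargs)
    if release == AFFECTED_RELEASE and out.get("inline_weights_to_neff") is False:
        warnings.warn(
            "inline_weights_to_neff=False has higher than expected latency in "
            f"{AFFECTED_RELEASE}; overriding to True.",
            stacklevel=2,
        )
        out["inline_weights_to_neff"] = True
    return out


# On the affected release the flag is flipped; other options pass through.
kwargs = safe_export_kwargs("0.0.22", batch_size=1, inline_weights_to_neff=False)
print(kwargs)
```

On any other release string the options are returned unchanged, so the guard can stay in place after upgrading.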

Other changes
* Improve installation guide by JingyaHuang in https://github.com/huggingface/optimum-neuron/pull/559
* upgrade optimum and then install optimum-neuron by shub-kris in https://github.com/huggingface/optimum-neuron/pull/533
* Cleanup obsolete code by michaelbenayoun in https://github.com/huggingface/optimum-neuron/pull/555
* Extend TGI integration tests by dacorvo in https://github.com/huggingface/optimum-neuron/pull/561
* Modify benchmarks by dacorvo in https://github.com/huggingface/optimum-neuron/pull/563
* Bump PyTorch to 2.1 by JingyaHuang in https://github.com/huggingface/optimum-neuron/pull/502
* fix(decoder): specify libraryname to suppress warning by dacorvo in https://github.com/huggingface/optimum-neuron/pull/570
* missing \ in quickstart inference guide by yahavb in https://github.com/huggingface/optimum-neuron/pull/574
* Use AWS 2.18.0 AMI as base by dacorvo in https://github.com/huggingface/optimum-neuron/pull/572
* Update TGI router version to 2.0.1 by dacorvo in https://github.com/huggingface/optimum-neuron/pull/577
* Add guide for LoRA adapters by JingyaHuang in https://github.com/huggingface/optimum-neuron/pull/582
* eos_token_id can be a list in configs by dacorvo in https://github.com/huggingface/optimum-neuron/pull/580
* Ease the tests when there is no hf token by JingyaHuang in https://github.com/huggingface/optimum-neuron/pull/585
* Change inline weights to Neff default value to True by JingyaHuang in https://github.com/huggingface/optimum-neuron/pull/590

New Contributors
* yahavb made their first contribution in https://github.com/huggingface/optimum-neuron/pull/574

**Full Changelog**: https://github.com/huggingface/optimum-neuron/compare/v0.0.21...v0.0.22

0.0.21

What's Changed

Training

* Add GQA optimization for Tensor Parallel training to support the case `tp_size > num_key_value_heads` by michaelbenayoun in https://github.com/huggingface/optimum-neuron/pull/498
* Mixed-precision training with either `torch_xla` or `torch.autocast` by michaelbenayoun in https://github.com/huggingface/optimum-neuron/pull/523
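To illustrate why `tp_size > num_key_value_heads` needs special handling in grouped-query attention, here is a small sketch of one common approach: replicating each key/value head so that every tensor-parallel rank holds a copy of the head its query heads attend to. This is an assumption-laden illustration of the general idea, not the implementation from the pull request above; the function names are hypothetical and contiguous sharding of query heads is assumed.

```python
def kv_replication_factor(num_key_value_heads: int, tp_size: int) -> int:
    """How many ranks must share a copy of each KV head."""
    if tp_size <= num_key_value_heads:
        if num_key_value_heads % tp_size != 0:
            raise ValueError("num_key_value_heads must be divisible by tp_size")
        return 1  # plain sharding: every rank owns whole KV heads
    if tp_size % num_key_value_heads != 0:
        raise ValueError("tp_size must be a multiple of num_key_value_heads")
    return tp_size // num_key_value_heads


def kv_heads_for_rank(rank: int, num_key_value_heads: int, tp_size: int) -> list[int]:
    """KV head indices held by `rank`, assuming contiguous query-head sharding."""
    factor = kv_replication_factor(num_key_value_heads, tp_size)
    if factor == 1:
        per_rank = num_key_value_heads // tp_size
        return list(range(rank * per_rank, (rank + 1) * per_rank))
    # More ranks than KV heads: `factor` consecutive ranks share one head.
    return [rank // factor]


# Example: 8 KV heads spread over 32 tensor-parallel ranks.
print(kv_replication_factor(8, 32))   # 4
print(kv_heads_for_rank(5, 8, 32))    # [1]
```

With 8 KV heads and 32 ranks, ranks 4 to 7 all hold a copy of KV head 1; when `tp_size` does not exceed the number of KV heads, the helper falls back to ordinary sharding.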

Inference

* Add caching support for traced TorchScript models (eg. encoders, stable diffusion models) by JingyaHuang in https://github.com/huggingface/optimum-neuron/pull/510
* Support phi model on feature-extraction, text-classification, token-classification tasks by JingyaHuang in https://github.com/huggingface/optimum-neuron/pull/509

TGI

* TGI improvements by dacorvo in https://github.com/huggingface/optimum-neuron/pull/522

Caveat

AWS Neuron SDK 2.18 doesn't support the compilation of SDXL's unet with weights / neff separation, so `inline_weights_to_neff=True` is forced through:
* Disable weights / neff separation of SDXL's UNET for neuron sdk 2.18 by JingyaHuang in https://github.com/huggingface/optimum-neuron/pull/554

Other changes

* Fix/ami authorized keys by shub-kris in https://github.com/huggingface/optimum-neuron/pull/517
* Skip weight load during parallel compile by michaelbenayoun in https://github.com/huggingface/optimum-neuron/pull/524
* fixing format in getting-started.ipynb by jimburtoft in https://github.com/huggingface/optimum-neuron/pull/526
* Removing colab links in notebooks.mdx by jimburtoft in https://github.com/huggingface/optimum-neuron/pull/525
* ADD stale bot by philschmid in https://github.com/huggingface/optimum-neuron/pull/530
* Bump optimum version by JingyaHuang in https://github.com/huggingface/optimum-neuron/pull/534
* Fix style by JingyaHuang in https://github.com/huggingface/optimum-neuron/pull/538
* Fix GQA permutation computation and sequential weight initialization / loading when doing TP by michaelbenayoun in https://github.com/huggingface/optimum-neuron/pull/531
* Add setup runtime step for K8S by glegendre01 in https://github.com/huggingface/optimum-neuron/pull/541
* Disable logging during precompilation by michaelbenayoun in https://github.com/huggingface/optimum-neuron/pull/539
* Do not use deprecated list_files_info by Wauplin in https://github.com/huggingface/optimum-neuron/pull/536
* Adding link to existing Fine-tuning example in Notebooks by jimburtoft in https://github.com/huggingface/optimum-neuron/pull/527
* Add missing notebooks to doc by JingyaHuang in https://github.com/huggingface/optimum-neuron/pull/543
* fix: bug in get_available_cores within container by oOraph in https://github.com/huggingface/optimum-neuron/pull/546
* Init on the `xla` device by michaelbenayoun in https://github.com/huggingface/optimum-neuron/pull/521
* Adding CodeLlama-7B inference and compilation example notebook by jimburtoft in https://github.com/huggingface/optimum-neuron/pull/549
* Add tools for auto filling traced models cache by JingyaHuang in https://github.com/huggingface/optimum-neuron/pull/537
* Remove print that should not be there by michaelbenayoun in https://github.com/huggingface/optimum-neuron/pull/552
* Use AWS Neuron sdk 2.18 by dacorvo in https://github.com/huggingface/optimum-neuron/pull/547
* Cache utils related cleanup by michaelbenayoun in https://github.com/huggingface/optimum-neuron/pull/553

New Contributors
* glegendre01 made their first contribution in https://github.com/huggingface/optimum-neuron/pull/541
* Wauplin made their first contribution in https://github.com/huggingface/optimum-neuron/pull/536

**Full Changelog**: https://github.com/huggingface/optimum-neuron/compare/v0.0.20...v0.0.21

0.0.20

What's Changed

Training

- Multi-node training support by michaelbenayoun (#440)

TGI

- optimize continuous batching and improve export (#506)

Inference

- Add Lora support to stable diffusion by JingyaHuang (#483)
- Support sentence transformers clip by JingyaHuang (#495)
- Inference compile cache script by philschmid and dacorvo (#496, #504)

Doc

- Update Inference supported models list by JingyaHuang (#501)

Bug fixes

- inference cache: omit irrelevant config parameters in lookup by dacorvo (#494)
- Optimize disk usage when fetching model checkpoints by dacorvo (#505)

**Full Changelog**: https://github.com/huggingface/optimum-neuron/compare/v0.0.19...v0.0.20

0.0.19

What's Changed

Training

* Integrate new cache system for training by michaelbenayoun in https://github.com/huggingface/optimum-neuron/pull/472

TGI

* Support higher batch sizes using transformers-neuronx continuous batching by dacorvo in https://github.com/huggingface/optimum-neuron/pull/488
* Lift max-concurrent-request limitation using TGI 1.4.1 by dacorvo in https://github.com/huggingface/optimum-neuron/pull/488


AMI

* Add packer support for building AWS AMI by shub-kris in https://github.com/huggingface/optimum-neuron/pull/441
* [AMI] Updates base ami to new id by philschmid in https://github.com/huggingface/optimum-neuron/pull/482

Major bugfixes

* Fix sdxl inpaint pipeline for diffusers 0.26.* by JingyaHuang in https://github.com/huggingface/optimum-neuron/pull/458
* TGI: update to controller version 1.4.0 & bug fixes by dacorvo in https://github.com/huggingface/optimum-neuron/pull/470
* Fix optimum-cli export for inf1 by JingyaHuang in https://github.com/huggingface/optimum-neuron/pull/474

Other changes
* Add TGI tests and CI workflow by dacorvo in https://github.com/huggingface/optimum-neuron/pull/355
* Bump to optimum 1.17 - Adapt to optimum exporter refactoring by JingyaHuang in https://github.com/huggingface/optimum-neuron/pull/414
* [Training] Support for Transformers 4.37 by michaelbenayoun in https://github.com/huggingface/optimum-neuron/pull/459
* Add contribution guide for Neuron exporter by JingyaHuang in https://github.com/huggingface/optimum-neuron/pull/461
* Fix path, update versions by shub-kris in https://github.com/huggingface/optimum-neuron/pull/462
* Add issue and PR templates & build optimum env cli for Neuron by JingyaHuang in https://github.com/huggingface/optimum-neuron/pull/463
* Fix trigger for actions by philschmid in https://github.com/huggingface/optimum-neuron/pull/468
* TGI: bump rust version by dacorvo in https://github.com/huggingface/optimum-neuron/pull/477
* [documentation] Add Container overview page. by philschmid in https://github.com/huggingface/optimum-neuron/pull/481
* Bump to Neuron sdk 2.17.0 by JingyaHuang in https://github.com/huggingface/optimum-neuron/pull/487

New Contributors
* shub-kris made their first contribution in https://github.com/huggingface/optimum-neuron/pull/441

**Full Changelog**: https://github.com/huggingface/optimum-neuron/compare/v0.0.18...v0.0.19
