Optimum-neuron

Latest version: v0.0.21



```python
from optimum.neuron import pipeline

clf = pipeline("question-answering")
clf({"context": "This is a sample context", "question": "What is the context here?"})
# {'score': 0.4972594678401947, 'start': 8, 'end': 16, 'answer': 'a sample'}
```
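The `start` and `end` fields are character offsets into the context string, so the answer can be recovered by slicing (plain Python, no Neuron hardware needed):

```python
context = "This is a sample context"
pred = {"score": 0.4972594678401947, "start": 8, "end": 16, "answer": "a sample"}

# start/end are character offsets into the context string.
answer = context[pred["start"]:pred["end"]]
```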



Or with precompiled models as follows:

```python
from transformers import AutoTokenizer
from optimum.neuron import NeuronModelForQuestionAnswering, pipeline

tokenizer = AutoTokenizer.from_pretrained("deepset/roberta-base-squad2")

# Load the PyTorch checkpoint and convert it to the Neuron format by providing export=True
model = NeuronModelForQuestionAnswering.from_pretrained(
    "deepset/roberta-base-squad2",
    export=True
)

neuron_qa = pipeline("question-answering", model=model, tokenizer=tokenizer)
question = "What's my name?"
context = "My name is Philipp and I live in Nuremberg."

pred = neuron_qa(question=question, context=context)
```


*Relevant PR: 107*

Cache repo fix

The cache repo system was broken starting from Neuron 2.11.
*This release fixes that; the relevant PR is 119.*

0.0.20

What's Changed

Training

- Multi-node training support by michaelbenayoun (440)

TGI

- Optimize continuous batching and improve export (506)

Inference

- Add Lora support to stable diffusion by JingyaHuang (483)
- Support sentence transformers clip by JingyaHuang (495)
- Inference compile cache script by philschmid and dacorvo (496, 504)

Doc

- Update Inference supported models list by JingyaHuang (501)

Bug fixes

- Inference cache: omit irrelevant config parameters in lookup by dacorvo (494)
- Optimize disk usage when fetching model checkpoints by dacorvo (505)
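The cache-lookup fix above amounts to hashing only the configuration parameters that actually affect compilation; a toy illustration of the idea (plain Python, with hypothetical parameter names, not the actual optimum-neuron implementation):

```python
import hashlib
import json

# Hypothetical examples of parameters that do not affect the compiled artifact.
IRRELEVANT = {"_name_or_path", "transformers_version"}

def cache_key(config: dict) -> str:
    """Hash only the parameters relevant to compilation, in a stable order."""
    relevant = {k: v for k, v in sorted(config.items()) if k not in IRRELEVANT}
    return hashlib.sha256(json.dumps(relevant).encode()).hexdigest()

# Two configs differing only in irrelevant metadata map to the same cache entry.
a = {"hidden_size": 768, "_name_or_path": "local/path"}
b = {"hidden_size": 768, "_name_or_path": "deepset/roberta-base-squad2"}
```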

**Full Changelog**: https://github.com/huggingface/optimum-neuron/compare/v0.0.19...v0.0.20

0.0.19

What's Changed

Training

* Integrate new cache system for training by michaelbenayoun in https://github.com/huggingface/optimum-neuron/pull/472

TGI

* Support higher batch sizes using transformers-neuronx continuous batching by dacorvo in https://github.com/huggingface/optimum-neuron/pull/488
* Lift max-concurrent-request limitation using TGI 1.4.1 by dacorvo in https://github.com/huggingface/optimum-neuron/pull/488
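Continuous batching lets new requests join a running batch as soon as earlier sequences finish, instead of waiting for the whole batch to drain. A toy scheduler sketching the idea (plain Python, not the transformers-neuronx or TGI implementation):

```python
from collections import deque

def continuous_batching(requests, max_batch, steps_needed):
    """Toy simulation: each request needs steps_needed[req] decode steps.
    Freed batch slots are refilled from the queue immediately."""
    queue = deque(requests)
    active = {}          # request -> remaining decode steps
    completed = []
    while queue or active:
        # Refill free slots as soon as they open up (the "continuous" part).
        while queue and len(active) < max_batch:
            req = queue.popleft()
            active[req] = steps_needed[req]
        # One decode step for every active sequence.
        for req in list(active):
            active[req] -= 1
            if active[req] == 0:
                del active[req]
                completed.append(req)
    return completed

# "b" finishes first and frees a slot that "c" takes over mid-run.
done = continuous_batching(["a", "b", "c"], max_batch=2,
                           steps_needed={"a": 3, "b": 1, "c": 2})
```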


AMI

* Add packer support for building AWS AMI by shub-kris in https://github.com/huggingface/optimum-neuron/pull/441
* [AMI] Updates base ami to new id by philschmid in https://github.com/huggingface/optimum-neuron/pull/482

Major bugfixes

* Fix sdxl inpaint pipeline for diffusers 0.26.* by JingyaHuang in https://github.com/huggingface/optimum-neuron/pull/458
* TGI: update to controller version 1.4.0 & bug fixes by dacorvo in https://github.com/huggingface/optimum-neuron/pull/470
* Fix optimum-cli export for inf1 by JingyaHuang in https://github.com/huggingface/optimum-neuron/pull/474

Other changes
* Add TGI tests and CI workflow by dacorvo in https://github.com/huggingface/optimum-neuron/pull/355
* Bump to optimum 1.17 - Adapt to optimum exporter refactoring by JingyaHuang in https://github.com/huggingface/optimum-neuron/pull/414
* [Training] Support for Transformers 4.37 by michaelbenayoun in https://github.com/huggingface/optimum-neuron/pull/459
* Add contribution guide for Neuron exporter by JingyaHuang in https://github.com/huggingface/optimum-neuron/pull/461
* Fix path, update versions by shub-kris in https://github.com/huggingface/optimum-neuron/pull/462
* Add issue and PR templates & build optimum env cli for Neuron by JingyaHuang in https://github.com/huggingface/optimum-neuron/pull/463
* Fix trigger for actions by philschmid in https://github.com/huggingface/optimum-neuron/pull/468
* TGI: bump rust version by dacorvo in https://github.com/huggingface/optimum-neuron/pull/477
* [documentation] Add Container overview page. by philschmid in https://github.com/huggingface/optimum-neuron/pull/481
* Bump to Neuron sdk 2.17.0 by JingyaHuang in https://github.com/huggingface/optimum-neuron/pull/487

New Contributors
* shub-kris made their first contribution in https://github.com/huggingface/optimum-neuron/pull/441

**Full Changelog**: https://github.com/huggingface/optimum-neuron/compare/v0.0.18...v0.0.19

0.0.18

What's Changed

AWS SDK

* Use AWS Neuron SDK 2.16.1 (449)

Inference

* Preliminary support for neff/weights decoupling by JingyaHuang (402)
* Allow exporting decoder models using optimum-cli by dacorvo (422)
* Add Neuron X cache registry by dacorvo (442)
* Add StoppingCriteria to generate() of NeuronModelForCausalLM by dacorvo (454)

Training

* Initial support for pipeline parallelism by michaelbenayoun (279)

TGI

* TGI: support vanilla transformer models whose configuration is cached by dacorvo (445)

Tutorials and doc improvement

* Various fixes by jimburtoft, michaelbenayoun and JingyaHuang (428, 429, 432)
* Improve Stable Diffusion Notebooks by JingyaHuang (431)
* Add Sentence Transformers Guide and Notebook by philschmid (434)
* Add benchmark section by dacorvo (435)

Major bugfixes

* TGI: correctly identify special tokens during generation by dacorvo (438)
* TGI: do not include the input_text in generated text by dacorvo (454)

Other changes

* API change to be compatible to Optimum by JingyaHuang (421)

New Contributors

* jimburtoft made their first contribution in 432

**Full Changelog**: https://github.com/huggingface/optimum-neuron/compare/v0.0.17...v0.0.18

0.0.17

What's Changed

AWS SDK

* Use AWS Neuron SDK 2.16 (398)
* Use official serialization API for transformers_neuronx models instead of beta by aws-yishanm (387, 393)

Inference

* Improve the support of sentence transformers by JingyaHuang (408)
* Add Neuronx compile cache Hub proxy and use it for LLM decoder models by dacorvo (410)
* Add support for Mistral models by dacorvo (411)
* Do not upload Neuron LLM weights when they can be fetched from the hub by dacorvo (413)

Training

* Add general support for generation on TRN with NxD by aws-tianquaw (370)

Tutorials and doc improvement

* Add llama 2 fine tuning tutorial by philschmid (390)

Major bugfixes

* Skip pushing if the user does not have write access to the cache repo by michaelbenayoun (405)

Other changes

* Bump Hugging Face library versions by JingyaHuang (403)

New Contributors

* aws-tianquaw made their first contribution in 370
* aws-yishanm made their first contribution in 387

**Full Changelog**: https://github.com/huggingface/optimum-neuron/compare/v0.0.16...v0.0.17

0.0.16

What's Changed

Training

A few fixes related to precompilation and checkpointing. These fixes enable training LLMs on AWS Trainium instances without friction.

- Skip model saving during precompilation and provide option to skip cache push (365)
- Fix checkpoint saving and consolidation for TP (378)
- A `torch_xla` compatible version of `safetensors.torch.save_file` is now used in the `NeuronTrainer` (329)

Inference

- Support for the export and inference of T5 (267)
- New documentation for Stable Diffusion XL Turbo (374)
