## Highlights
* OpenAI-compatible chat completions API (#1427)
* ExLlamaV2 with tensor parallelism (#1490)
* GPTQ support for AMD GPUs (#1489)
* Phi model support (#1442)
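The headline feature is the OpenAI-compatible `/v1/chat/completions` route, which lets existing OpenAI-style clients talk to a TGI server. A minimal sketch of building such a request follows; the host, port, and `max_tokens` value are illustrative assumptions, not part of this release's text:

```python
import json

# Build an OpenAI-style chat completion request for TGI's
# /v1/chat/completions route. TGI serves a single model, so the
# "model" field is effectively a placeholder.
payload = {
    "model": "tgi",
    "messages": [
        {"role": "user", "content": "What is text-generation-inference?"}
    ],
    "stream": False,
    "max_tokens": 64,  # illustrative limit
}

body = json.dumps(payload)
print(body)

# To actually send it (requires a running TGI server and the
# `requests` package; the URL depends on how the server was launched):
# import requests
# resp = requests.post(
#     "http://localhost:8080/v1/chat/completions",
#     headers={"Content-Type": "application/json"},
#     data=body,
# )
# print(resp.json()["choices"][0]["message"]["content"])
```

Because the request shape follows the OpenAI schema, official OpenAI client libraries can generally be pointed at the TGI base URL instead of hand-rolling HTTP calls.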
## What's Changed
* fix: fix local loading for .bin models by OlivierDehaene in https://github.com/huggingface/text-generation-inference/pull/1419
* Fix missing make target platform for local install: 'install-flash-attention-v2' by deepily in https://github.com/huggingface/text-generation-inference/pull/1414
* fix: follow base model for tokenizer in router by OlivierDehaene in https://github.com/huggingface/text-generation-inference/pull/1424
* Fix local load for Medusa by PYNing in https://github.com/huggingface/text-generation-inference/pull/1420
* Return prompt vs generated tokens. by Narsil in https://github.com/huggingface/text-generation-inference/pull/1436
* feat: supports openai chat completions API by drbh in https://github.com/huggingface/text-generation-inference/pull/1427
* feat: support raise_exception, bos and eos tokens by drbh in https://github.com/huggingface/text-generation-inference/pull/1450
* chore: bump rust version and annotate/fix all clippy warnings by drbh in https://github.com/huggingface/text-generation-inference/pull/1455
* feat: conditionally toggle chat on invocations route by drbh in https://github.com/huggingface/text-generation-inference/pull/1454
* Disable `decoder_input_details` on OpenAI-compatible chat streaming, pass temp and top-k from API by EndlessReform in https://github.com/huggingface/text-generation-inference/pull/1470
* Fixing non divisible embeddings. by Narsil in https://github.com/huggingface/text-generation-inference/pull/1476
* Add messages api compatibility docs by drbh in https://github.com/huggingface/text-generation-inference/pull/1478
* Add a new `/tokenize` route to get the tokenized input by Narsil in https://github.com/huggingface/text-generation-inference/pull/1471
* feat: adds phi model by drbh in https://github.com/huggingface/text-generation-inference/pull/1442
* fix: read stderr in download by OlivierDehaene in https://github.com/huggingface/text-generation-inference/pull/1486
* fix: show warning with tokenizer config parsing error by drbh in https://github.com/huggingface/text-generation-inference/pull/1488
* fix: launcher doc typos by Narsil in https://github.com/huggingface/text-generation-inference/pull/1473
* Reinstate exl2 with tp by Narsil in https://github.com/huggingface/text-generation-inference/pull/1490
* Add sealion mpt support by Narsil in https://github.com/huggingface/text-generation-inference/pull/1477
* Trying to fix that flaky test. by Narsil in https://github.com/huggingface/text-generation-inference/pull/1491
* fix: launcher doc typos by thelinuxkid in https://github.com/huggingface/text-generation-inference/pull/1462
* Update the docs to include newer models. by Narsil in https://github.com/huggingface/text-generation-inference/pull/1492
* GPTQ support on ROCm by fxmarty in https://github.com/huggingface/text-generation-inference/pull/1489
* feat: add tokenizer-config-path to launcher args by drbh in https://github.com/huggingface/text-generation-inference/pull/1495
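Among the changes above, the new `/tokenize` route (#1471) returns the tokenized form of an input without running generation. A sketch of the request, assuming a local deployment (the URL is illustrative, and the exact response fields may vary):

```python
import json

# Request body for TGI's /tokenize route, which tokenizes the input
# server-side instead of generating text.
tokenize_request = {"inputs": "Hello, world!"}

# Assumed local deployment; adjust host/port to your launch settings.
url = "http://localhost:8080/tokenize"
print(json.dumps(tokenize_request))

# With a running server (and the `requests` package):
# import requests
# tokens = requests.post(url, json=tokenize_request).json()
# print(tokens)  # per-token entries for the input string
```

This is handy for checking prompt length against the model's context window before submitting a generation request.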
## New Contributors
* deepily made their first contribution in https://github.com/huggingface/text-generation-inference/pull/1414
* PYNing made their first contribution in https://github.com/huggingface/text-generation-inference/pull/1420
* drbh made their first contribution in https://github.com/huggingface/text-generation-inference/pull/1427
* EndlessReform made their first contribution in https://github.com/huggingface/text-generation-inference/pull/1470
* thelinuxkid made their first contribution in https://github.com/huggingface/text-generation-inference/pull/1462
**Full Changelog**: https://github.com/huggingface/text-generation-inference/compare/v1.3.4...v1.4.0