Text-generation

Latest version: v0.7.0

0.4.0

Features

- **router**: support best_of sampling
- **router**: support left truncation
- **server**: support typical sampling
- **launcher**: allow local models
- **clients**: add text-generation Python client
- **launcher**: allow parsing num_shard from CUDA_VISIBLE_DEVICES
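To illustrate the last item, deriving a shard count from `CUDA_VISIBLE_DEVICES` amounts to counting the comma-separated device entries. The sketch below is an assumption about the general idea, not the launcher's actual code; the function name `num_shard_from_env` is hypothetical.

```python
import os


def num_shard_from_env(env=None):
    """Return the number of devices listed in CUDA_VISIBLE_DEVICES.

    Hypothetical helper: returns None when the variable is unset or empty,
    so the caller can fall back to another default.
    """
    env = os.environ if env is None else env
    value = env.get("CUDA_VISIBLE_DEVICES")
    if value is None:
        return None
    # Ignore empty entries such as a trailing comma.
    devices = [d.strip() for d in value.split(",") if d.strip()]
    return len(devices) or None


if __name__ == "__main__":
    print(num_shard_from_env({"CUDA_VISIBLE_DEVICES": "0,1,3"}))
```

Passing the environment as a plain dict keeps the helper easy to test without touching the real process environment.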


Fix

- **server**: do not warp prefill logits
- **server**: fix formatting issues in generate_stream tokens
- **server**: fix galactica batch
- **server**: fix index out of range issue with watermarking

0.3.2

Features

- **router**: add support for huggingface api-inference
- **server**: add logits watermarking, following the paper "A Watermark for Large Language Models"
- **server**: use a fixed transformers commit

Fix

- **launcher**: add missing parameters to launcher
- **server**: update to hf_transfer==0.1.2 to fix corrupted files issue

0.3.1

Features

- **server**: allocate full attention mask to decrease latency
- **server**: enable hf-transfer for insane download speeds
- **router**: add CORS options

Fix

- **server**: remove position_ids from galactica forward

0.3.0

Features

- **server**: support t5 models
- **router**: add max_total_tokens and empty_input validation
- **launcher**: add the possibility to disable custom CUDA kernels
- **server**: add automatic safetensors conversion
- **router**: add prometheus scrape endpoint
- **server, router**: add distributed tracing
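As a rough illustration of the `max_total_tokens` and `empty_input` validation item, a request can be rejected when the prompt is empty or when the prompt length plus the requested new tokens exceeds the total budget. This is a minimal sketch under that assumption; the function name `validate_request` and the error messages are hypothetical, and the router's real checks may differ.

```python
def validate_request(input_length, max_new_tokens, max_total_tokens):
    """Return the total token budget used, or raise ValueError.

    Hypothetical check: input_length is the number of prompt tokens.
    """
    if input_length == 0:
        raise ValueError("empty input: the prompt must contain at least one token")
    total = input_length + max_new_tokens
    if total > max_total_tokens:
        raise ValueError(
            f"input tokens + max_new_tokens must be <= {max_total_tokens}, got {total}"
        )
    return total


if __name__ == "__main__":
    print(validate_request(input_length=10, max_new_tokens=20, max_total_tokens=32))
```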

Fix

- **launcher**: copy current env vars to subprocesses
- **docker**: add note around shared memory

0.2.1

Fix

- **server**: fix bug with repetition penalty when using GPUs and inference mode

0.2.0

Features

- **router**: support token streaming using Server-Sent Events
- **router**: support seeding
- **server**: support gpt-neox
- **server**: support santacoder
- **server**: support repetition penalty
- **server**: allow the server to use a local weight cache
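For readers unfamiliar with the repetition penalty item, one widely used formulation divides the logit of every previously seen token by the penalty when the logit is positive and multiplies it by the penalty when it is negative, so repeated tokens always become less likely for a penalty above 1. The changelog does not say which formulation the server uses, so the sketch below is an assumption, written in plain Python with a hypothetical function name.

```python
def apply_repetition_penalty(logits, previous_token_ids, penalty):
    """Return a copy of logits with already-seen tokens penalized.

    Assumed formulation: positive logits are divided by the penalty,
    negative logits are multiplied by it. Each seen token is penalized once.
    """
    out = list(logits)
    for token_id in set(previous_token_ids):
        score = out[token_id]
        out[token_id] = score / penalty if score > 0 else score * penalty
    return out


if __name__ == "__main__":
    print(apply_repetition_penalty([2.0, -1.0, 0.5], [0, 1, 1], penalty=2.0))
```

A penalty of 1.0 leaves the logits unchanged, which makes it a natural default.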

Breaking changes

- **router**: refactor Token API
- **router**: modify /generate API to only return generated text

Misc

- **router**: use background task to manage request queue
- **ci**: docker build/push on update
