Sglang

Latest version: v0.4.4.post3

Safety actively analyzes 724206 Python packages for vulnerabilities to keep your Python projects secure.

Scan your dependencies

Page 6 of 7

0.1.17

Highlights
- Add data parallelim 480
- Add speculative execution for OpenAI API 250
- Update vllm to v0.4.3 for new quantization features 511
- Better error handling (457, 449, 514)

What's Changed
* [Feat] Add llava qwen, llava mistral by kcz358 in https://github.com/sgl-project/sglang/pull/419
* Format code by hnyls2002 in https://github.com/sgl-project/sglang/pull/441
* Add finish_reason to OpenAI API by mgerstgrasser in https://github.com/sgl-project/sglang/pull/446
* Simplify port allocation by merrymercy in https://github.com/sgl-project/sglang/pull/447
* Add PUT for generate api by Ying1123 in https://github.com/sgl-project/sglang/pull/448
* Improve error handling & abort disconnected requests by merrymercy in https://github.com/sgl-project/sglang/pull/449
* Fix the broken `--disable-radix-cache` by hnyls2002 in https://github.com/sgl-project/sglang/pull/451
* openai chat speculative execution by ChuyueSun in https://github.com/sgl-project/sglang/pull/250
* Fix openai speculative execution by Ying1123 in https://github.com/sgl-project/sglang/pull/456
* Abort disconnected requests by merrymercy in https://github.com/sgl-project/sglang/pull/457
* Rename api_num_spec_tokens -> num_api_spec_tokens by merrymercy in https://github.com/sgl-project/sglang/pull/458
* Use model loader from vllm by merrymercy in https://github.com/sgl-project/sglang/pull/459
* port fp8 mixtral by merrymercy in https://github.com/sgl-project/sglang/pull/460
* fix test bug in srt_llava_next_test.py by bingwork in https://github.com/sgl-project/sglang/pull/470
* Add the instruction link to the LLaVA-NeXT-Video at README by ZhangYuanhan-AI in https://github.com/sgl-project/sglang/pull/463
* Improve logging & add logit cap by merrymercy in https://github.com/sgl-project/sglang/pull/471
* Optimize retract by hnyls2002 in https://github.com/sgl-project/sglang/pull/440
* Add benchmark scripts by Ying1123 in https://github.com/sgl-project/sglang/pull/476
* [Feat/Fix] Refactoring Llava models into single file by Luodian in https://github.com/sgl-project/sglang/pull/475
* Improve benchmark scripts & rename some scripts by merrymercy in https://github.com/sgl-project/sglang/pull/477
* Improve benchmark scripts & add more models by merrymercy in https://github.com/sgl-project/sglang/pull/484
* Support data parallelism (static) by Ying1123 in https://github.com/sgl-project/sglang/pull/480
* Make the server random by default by merrymercy in https://github.com/sgl-project/sglang/pull/488
* Revert "Make the server random by default" by Ying1123 in https://github.com/sgl-project/sglang/pull/492
* update the script: examples/usage/llava_video/srt_example_llava_v.sh by ZhangYuanhan-AI in https://github.com/sgl-project/sglang/pull/491
* Make the server random by default by merrymercy in https://github.com/sgl-project/sglang/pull/493
* Update vllm to v0.4.3 by merrymercy in https://github.com/sgl-project/sglang/pull/511
* remove redundant pad_input_ids function by amosyou in https://github.com/sgl-project/sglang/pull/500
* Litellm Backend by huyiwen in https://github.com/sgl-project/sglang/pull/502
* Fix rid state map leak + Refractor .finished by Qubitium in https://github.com/sgl-project/sglang/pull/505
* Crash the server when error or OOM happens by merrymercy in https://github.com/sgl-project/sglang/pull/514
* Update version to 0.1.17 by merrymercy in https://github.com/sgl-project/sglang/pull/515

New Contributors
* kcz358 made their first contribution in https://github.com/sgl-project/sglang/pull/419
* mgerstgrasser made their first contribution in https://github.com/sgl-project/sglang/pull/446
* bingwork made their first contribution in https://github.com/sgl-project/sglang/pull/470
* amosyou made their first contribution in https://github.com/sgl-project/sglang/pull/500
* huyiwen made their first contribution in https://github.com/sgl-project/sglang/pull/502

**Full Changelog**: https://github.com/sgl-project/sglang/compare/v0.1.16...v0.1.17

0.1.16

Highlight
* Support more models: DBRX, Command-R, Gemma
* Support llava-video (423, https://llava-vl.github.io/blog/2024-04-30-llava-next-video/)
* Cache performance improvements (418, 364)
* Marlin quantization kernels
* Many bug fixes
* Update dependencies to be compatible with their latest versions

What's Changed
* Fix Runtime missing some ServerArgs options by Qubitium in https://github.com/sgl-project/sglang/pull/281
* adding the triton docker build minimal example by amirarsalan90 in https://github.com/sgl-project/sglang/pull/242
* Fix flashinfer >= 0.0.3 compat by Qubitium in https://github.com/sgl-project/sglang/pull/282
* Fix Incorrect CURL Request Example in README by amirarsalan90 in https://github.com/sgl-project/sglang/pull/287
* enable marlin kernels by qeternity in https://github.com/sgl-project/sglang/pull/286
* Fix env (docker) compat due to __file__ usage by Qubitium in https://github.com/sgl-project/sglang/pull/288
* Fix marlin model loading compat with autogptq by Liurl21 in https://github.com/sgl-project/sglang/pull/290
* Fix outlines-0.0.35 incompatibility by ZhouGongZaiShi in https://github.com/sgl-project/sglang/pull/291
* [Fix/Potential Bugs] Can not correctly import models in python/sglang/srt/models by Luodian in https://github.com/sgl-project/sglang/pull/311
* Use Anthropic messages API by janimo in https://github.com/sgl-project/sglang/pull/304
* Add StableLM model. by janimo in https://github.com/sgl-project/sglang/pull/301
* Support oai in benchmark/mmlu by merrymercy in https://github.com/sgl-project/sglang/pull/323
* Update version to v0.1.14 by merrymercy in https://github.com/sgl-project/sglang/pull/324
* Cleanup codebase: removed unnecessary code/logic by Qubitium in https://github.com/sgl-project/sglang/pull/298
* Update dependencies by janimo in https://github.com/sgl-project/sglang/pull/326
* Openrouter usage example by janimo in https://github.com/sgl-project/sglang/pull/327
* `model_rpc` style improvement by hnyls2002 in https://github.com/sgl-project/sglang/pull/293
* `model_runner` simplify by hnyls2002 in https://github.com/sgl-project/sglang/pull/329
* Logprobs Refractor by hnyls2002 in https://github.com/sgl-project/sglang/pull/331
* `DBRX` support by hnyls2002 in https://github.com/sgl-project/sglang/pull/337
* Add support for new autogptq quant_config.checkpoint_format by Qubitium in https://github.com/sgl-project/sglang/pull/332
* Fix llava parallelism/fork bug by lockon-n in https://github.com/sgl-project/sglang/pull/315
* Eliminate 2 gpu ops during sampling when logit_bias is zero by hnyls2002 in https://github.com/sgl-project/sglang/pull/343
* Revert "Eliminate 2 gpu ops during sampling when logit_bias is zero" by hnyls2002 in https://github.com/sgl-project/sglang/pull/345
* Eliminate 2 gpu ops during sampling when logit_bias is zero by Qubitium in https://github.com/sgl-project/sglang/pull/338
* Add timeout to get_meta_info by SimoneRaponi in https://github.com/sgl-project/sglang/pull/346
* Fix typos in infer_batch.py by tom-doerr in https://github.com/sgl-project/sglang/pull/354
* Time cost utils by hnyls2002 in https://github.com/sgl-project/sglang/pull/355
* Update README.md by eltociear in https://github.com/sgl-project/sglang/pull/358
* support `command-r` by ZhouXingg in https://github.com/sgl-project/sglang/pull/369
* Fix issue 367 – System message not supported for Anthropic (anthropic.BadRequestError) by fronx in https://github.com/sgl-project/sglang/pull/368
* Update model support in readme by Ying1123 in https://github.com/sgl-project/sglang/pull/370
* Optimize radix tree matching by ispobock in https://github.com/sgl-project/sglang/pull/364
* Reduce overhead when `fork(1)` by hnyls2002 in https://github.com/sgl-project/sglang/pull/375
* llama3 instruct template by qeternity in https://github.com/sgl-project/sglang/pull/372
* add `.isort.cfg` by hnyls2002 in https://github.com/sgl-project/sglang/pull/378
* Revert removing the unused imports by hnyls2002 in https://github.com/sgl-project/sglang/pull/385
* Benchmark Updates by hnyls2002 in https://github.com/sgl-project/sglang/pull/382
* Improve performance when running with full parallel by hnyls2002 in https://github.com/sgl-project/sglang/pull/394
* Minor: style improvement of radix_cache and memory_pool by hnyls2002 in https://github.com/sgl-project/sglang/pull/395
* Format Benchmark Code by hnyls2002 in https://github.com/sgl-project/sglang/pull/399
* Fix chatml template by merrymercy in https://github.com/sgl-project/sglang/pull/406
* Adding RAG tracing & eval cookbook using Parea by joschkabraun in https://github.com/sgl-project/sglang/pull/390
* SamplingParams add "spaces_between_special_tokens" argument by ZhouXingg in https://github.com/sgl-project/sglang/pull/392
* Organize Benchmark by hnyls2002 in https://github.com/sgl-project/sglang/pull/381
* Add Cohere Command R chat template by noah-kim-theori in https://github.com/sgl-project/sglang/pull/411
* Fix `sync()` when `fork(1)` by hnyls2002 in https://github.com/sgl-project/sglang/pull/412
* Include finish reason in meta info response by qeternity in https://github.com/sgl-project/sglang/pull/415
* Make public APIs more standard. by hnyls2002 in https://github.com/sgl-project/sglang/pull/416
* Compat with latest VLLM 0.4.2 main + fork.number rename + Flashinfer 0.0.4 by Qubitium in https://github.com/sgl-project/sglang/pull/380
* Optimize the memory usage of logits processor by merrymercy in https://github.com/sgl-project/sglang/pull/420
* Clean up by merrymercy in https://github.com/sgl-project/sglang/pull/422
* Fix logit processor bugs by merrymercy in https://github.com/sgl-project/sglang/pull/427
* Minor fix for the import path by merrymercy in https://github.com/sgl-project/sglang/pull/428
* Move openai api server into a separate file by merrymercy in https://github.com/sgl-project/sglang/pull/429
* Fix flashinfer by merrymercy in https://github.com/sgl-project/sglang/pull/430
* Update version to 0.1.15 by merrymercy in https://github.com/sgl-project/sglang/pull/431
* Misc fixes by merrymercy in https://github.com/sgl-project/sglang/pull/432
* Allow `input_ids` in the input of the `/generate` endpoint by lolipopshock in https://github.com/sgl-project/sglang/pull/363
* Improve error handling by merrymercy in https://github.com/sgl-project/sglang/pull/433
* Cache optimizations by hnyls2002 in https://github.com/sgl-project/sglang/pull/418
* Update readme by merrymercy in https://github.com/sgl-project/sglang/pull/434
* Raise errors for prompts that are too long by merrymercy in https://github.com/sgl-project/sglang/pull/436
* support llava video by ZhangYuanhan-AI in https://github.com/sgl-project/sglang/pull/426
* Fix streaming by merrymercy in https://github.com/sgl-project/sglang/pull/437
* Update version to 0.1.16 by merrymercy in https://github.com/sgl-project/sglang/pull/438

New Contributors
* Qubitium made their first contribution in https://github.com/sgl-project/sglang/pull/281
* amirarsalan90 made their first contribution in https://github.com/sgl-project/sglang/pull/242
* Liurl21 made their first contribution in https://github.com/sgl-project/sglang/pull/290
* ZhouGongZaiShi made their first contribution in https://github.com/sgl-project/sglang/pull/291
* Luodian made their first contribution in https://github.com/sgl-project/sglang/pull/311
* janimo made their first contribution in https://github.com/sgl-project/sglang/pull/304
* lockon-n made their first contribution in https://github.com/sgl-project/sglang/pull/315
* SimoneRaponi made their first contribution in https://github.com/sgl-project/sglang/pull/346
* tom-doerr made their first contribution in https://github.com/sgl-project/sglang/pull/354
* ZhouXingg made their first contribution in https://github.com/sgl-project/sglang/pull/369
* fronx made their first contribution in https://github.com/sgl-project/sglang/pull/368
* ispobock made their first contribution in https://github.com/sgl-project/sglang/pull/364
* joschkabraun made their first contribution in https://github.com/sgl-project/sglang/pull/390
* noah-kim-theori made their first contribution in https://github.com/sgl-project/sglang/pull/411
* lolipopshock made their first contribution in https://github.com/sgl-project/sglang/pull/363
* ZhangYuanhan-AI made their first contribution in https://github.com/sgl-project/sglang/pull/426

**Full Changelog**: https://github.com/sgl-project/sglang/compare/v0.1.13...v0.1.16

0.1.13

Highlights
* Gemma Support by hnyls2002 in https://github.com/sgl-project/sglang/pull/256
* Add Together and AzureOpenAI examples by merrymercy in https://github.com/sgl-project/sglang/pull/184

What's Changed
* correct a mistake on the README.md by yaya-sy in https://github.com/sgl-project/sglang/pull/182
* correct reference dtype openai.py by yaya-sy in https://github.com/sgl-project/sglang/pull/181
* Add Together and AzureOpenAI examples by merrymercy in https://github.com/sgl-project/sglang/pull/184
* Fix server launch for jupyter notebook by merrymercy in https://github.com/sgl-project/sglang/pull/186
* Refactor decoding logprob and add completion_tokens_wo_jump_forward by comaniac in https://github.com/sgl-project/sglang/pull/189
* Pin outlines version by comaniac in https://github.com/sgl-project/sglang/pull/196
* Adjust outlines version. by hnyls2002 in https://github.com/sgl-project/sglang/pull/200
* Update README.md by eltociear in https://github.com/sgl-project/sglang/pull/207
* Added the ability to Modify the Context Length by psych0v0yager in https://github.com/sgl-project/sglang/pull/210
* Fix logprobs with logprob_start_len by comaniac in https://github.com/sgl-project/sglang/pull/193
* Support outlines > 0.0.31 by comaniac in https://github.com/sgl-project/sglang/pull/219
* Fix stop str merging by hnyls2002 in https://github.com/sgl-project/sglang/pull/225
* Fix interpreter.py `get_var(var_name)` in text iter when `stream` is not enabled by exceedzhang in https://github.com/sgl-project/sglang/pull/198
* fix chatml template by qeternity in https://github.com/sgl-project/sglang/pull/195
* Upload `agent_calls.jsonl` download link by hnyls2002 in https://github.com/sgl-project/sglang/pull/226
* Fix addr reuse in check_port by hnyls2002 in https://github.com/sgl-project/sglang/pull/253
* Add SSL Cert Functionality by nivibilla in https://github.com/sgl-project/sglang/pull/224
* Refactor ChatTemplate for Enhanced Clarity and Efficiency by cubxxw in https://github.com/sgl-project/sglang/pull/201
* Add `set_var` to interpreter.py by 1024th in https://github.com/sgl-project/sglang/pull/263
* Add logo by merrymercy in https://github.com/sgl-project/sglang/pull/275
* Fix qwen config by hnyls2002 in https://github.com/sgl-project/sglang/pull/261
* replace skip_embed with input_embeds by TideDra in https://github.com/sgl-project/sglang/pull/222
* Gemma Support by hnyls2002 in https://github.com/sgl-project/sglang/pull/256
* Improve gemma and documentations by merrymercy in https://github.com/sgl-project/sglang/pull/278
* Organize `server_args` by hnyls2002 in https://github.com/sgl-project/sglang/pull/277
* Add Support for API Key Authentication by alessiodallapiazza in https://github.com/sgl-project/sglang/pull/230
* Fix RuntimeEndpoint by merrymercy in https://github.com/sgl-project/sglang/pull/279
* Update version to v0.1.13 by merrymercy in https://github.com/sgl-project/sglang/pull/280

New Contributors
* psych0v0yager made their first contribution in https://github.com/sgl-project/sglang/pull/210
* exceedzhang made their first contribution in https://github.com/sgl-project/sglang/pull/198
* qeternity made their first contribution in https://github.com/sgl-project/sglang/pull/195
* cubxxw made their first contribution in https://github.com/sgl-project/sglang/pull/201
* 1024th made their first contribution in https://github.com/sgl-project/sglang/pull/263
* TideDra made their first contribution in https://github.com/sgl-project/sglang/pull/222
* alessiodallapiazza made their first contribution in https://github.com/sgl-project/sglang/pull/230

**Full Changelog**: https://github.com/sgl-project/sglang/compare/v0.1.12...v0.1.13

0.1.12

Highlights
- Fast JSON Decoding ([blog](https://lmsys.org/blog/2024-02-05-compressed-fsm/))
- Output logprobs for decoding tokens
- Multiple bug fixes

What's Changed
* Fix no-cache mode by Ying1123 in https://github.com/sgl-project/sglang/pull/136
* Support Faster JSON decoding for llava by hnyls2002 in https://github.com/sgl-project/sglang/pull/137
* fix undfined variable by yaya-sy in https://github.com/sgl-project/sglang/pull/142
* jump-forward rename by hnyls2002 in https://github.com/sgl-project/sglang/pull/144
* Add warmup to SRT server by comaniac in https://github.com/sgl-project/sglang/pull/146
* add openai error handler with retry and logger by ChuyueSun in https://github.com/sgl-project/sglang/pull/148
* Temporary fix OpenAI API for Pydantic v1/v2 by comaniac in https://github.com/sgl-project/sglang/pull/153
* Add gptq quantization model support by Arcmoon-Hu in https://github.com/sgl-project/sglang/pull/141
* Support decode token logprobs by comaniac in https://github.com/sgl-project/sglang/pull/130
* Format code & move functions by merrymercy in https://github.com/sgl-project/sglang/pull/155
* [Submodule] Change FlashInfer to import by comaniac in https://github.com/sgl-project/sglang/pull/156
* add `--disable-disk-cache` by hnyls2002 in https://github.com/sgl-project/sglang/pull/160
* Add Auth Token to RuntimeEndpoint by nivibilla in https://github.com/sgl-project/sglang/pull/162
* Fix BaseCache metric by comaniac in https://github.com/sgl-project/sglang/pull/170
* import outlines by hnyls2002 in https://github.com/sgl-project/sglang/pull/168
* Fix token usage with jump forward by comaniac in https://github.com/sgl-project/sglang/pull/174
* Support extra field regex in OpenAI API by comaniac in https://github.com/sgl-project/sglang/pull/172
* Fix the chat template for llava-v1.6-34b & format code by merrymercy in https://github.com/sgl-project/sglang/pull/177
* Update version to 0.1.12 by merrymercy in https://github.com/sgl-project/sglang/pull/178

New Contributors
* yaya-sy made their first contribution in https://github.com/sgl-project/sglang/pull/142
* ChuyueSun made their first contribution in https://github.com/sgl-project/sglang/pull/148
* nivibilla made their first contribution in https://github.com/sgl-project/sglang/pull/162

**Full Changelog**: https://github.com/sgl-project/sglang/compare/v0.1.11...v0.1.12

0.1.11

New Contributors
* isaac-vidas made their first contribution in https://github.com/sgl-project/sglang/pull/80
* Arcmoon-Hu made their first contribution in https://github.com/sgl-project/sglang/pull/75
* CSWellesSun made their first contribution in https://github.com/sgl-project/sglang/pull/84
* haotian-liu made their first contribution in https://github.com/sgl-project/sglang/pull/95
* parasol-aser made their first contribution in https://github.com/sgl-project/sglang/pull/48
* JustinLin610 made their first contribution in https://github.com/sgl-project/sglang/pull/114
* fozziethebeat made their first contribution in https://github.com/sgl-project/sglang/pull/113
* Ja1Zhou made their first contribution in https://github.com/sgl-project/sglang/pull/116

**Full Changelog**: https://github.com/sgl-project/sglang/compare/v0.1.6...v0.1.11

0.1.6

Major features
- Add OpenAI-compatible API server (Completion and ChatCompletion)
- Fix `sgl.select`

All PRs
* Support v1/chat/completions by comaniac in https://github.com/sgl-project/sglang/pull/50
* Fix select and normalized logprobs by merrymercy in https://github.com/sgl-project/sglang/pull/67
* Bump version to 0.1.5 by merrymercy in https://github.com/sgl-project/sglang/pull/33
* Use HTTP link in 3rdparty module by comaniac in https://github.com/sgl-project/sglang/pull/42
* Document sampling parameters by merrymercy in https://github.com/sgl-project/sglang/pull/45
* Increase interpreter parallelism by merrymercy in https://github.com/sgl-project/sglang/pull/46
* Add a llava example by merrymercy in https://github.com/sgl-project/sglang/pull/47
* Support stream=True in v1/completions by comaniac in https://github.com/sgl-project/sglang/pull/49
* Format code & Improve readme by merrymercy in https://github.com/sgl-project/sglang/pull/52
* Fix the possible bug of decode out of memory by hnyls2002 in https://github.com/sgl-project/sglang/pull/36
* Improve error message & Add vicuna template by merrymercy in https://github.com/sgl-project/sglang/pull/57
* Update README.md by eltociear in https://github.com/sgl-project/sglang/pull/58
* Disk FSM cache and adjust code. by hnyls2002 in https://github.com/sgl-project/sglang/pull/63
* Fix select by merrymercy in https://github.com/sgl-project/sglang/pull/64
* Bump version to 0.1.6 by merrymercy in https://github.com/sgl-project/sglang/pull/68

New Contributors
* comaniac made their first contribution in https://github.com/sgl-project/sglang/pull/42
* eltociear made their first contribution in https://github.com/sgl-project/sglang/pull/58

**Full Changelog**: https://github.com/sgl-project/sglang/compare/v0.1.5...v0.1.6

Page 6 of 7

© 2025 Safety CLI Cybersecurity Inc. All Rights Reserved.