SGLang

Latest version: v0.3.6

Page 2 of 5

0.3.4

* Simplify the interface of tp_worker by merrymercy in https://github.com/sgl-project/sglang/pull/1718
* Update vllm to 0.6.3 (#1711) by zhyncs in https://github.com/sgl-project/sglang/pull/1720
* Support qwen2 vl model by zhyncs in https://github.com/sgl-project/sglang/pull/1721
* Update README.md by Ying1123 in https://github.com/sgl-project/sglang/pull/1722
* Unify the memory pool api and tp worker API by merrymercy in https://github.com/sgl-project/sglang/pull/1724
* Temporarily skip the test_mixed_batch for QWen2VL by merrymercy in https://github.com/sgl-project/sglang/pull/1725
* Split the overlapped version of TpModelWorkerClient into a separate file by merrymercy in https://github.com/sgl-project/sglang/pull/1726
* [Bugfix] qwen2vl forward_extend by yizhang2077 in https://github.com/sgl-project/sglang/pull/1727
* Simplify the usage of device by merrymercy in https://github.com/sgl-project/sglang/pull/1734
* Simplify batch result resolution by merrymercy in https://github.com/sgl-project/sglang/pull/1735
* Add GLM-4 TextGeneration Model support for SGLang by sixsixcoder in https://github.com/sgl-project/sglang/pull/1736
* Make token mapping non-blocking in the overlapped mode by merrymercy in https://github.com/sgl-project/sglang/pull/1740
* Maintain seq_lens_sum to make more FlashInfer operations non-blocking by merrymercy in https://github.com/sgl-project/sglang/pull/1741
* Fix prefill oom by hnyls2002 in https://github.com/sgl-project/sglang/pull/1743
* Faster overlap mode scheduler by merrymercy in https://github.com/sgl-project/sglang/pull/1738
* misc: add CODEOWNERS by zhyncs in https://github.com/sgl-project/sglang/pull/1737
* Fix sliding window attention and gemma-2 unit tests in CI by merrymercy in https://github.com/sgl-project/sglang/pull/1746
* Llama3.2 vision model support by hnyls2002 in https://github.com/sgl-project/sglang/pull/1551
* Update `max_req_len` and `max_req_input_len` by hnyls2002 in https://github.com/sgl-project/sglang/pull/1748

0.3.3.post1

* [engine] support async and streaming by ByronHsu in https://github.com/sgl-project/sglang/pull/1614
* [Fix] Fix the style of test_large_max_new_tokens.py by merrymercy in https://github.com/sgl-project/sglang/pull/1638
* fix missing ignore_eos in v1/chat/completions by learninmou in https://github.com/sgl-project/sglang/pull/1642
* Fix ignore_eos in the OpenAI ChatCompletions API by merrymercy in https://github.com/sgl-project/sglang/pull/1645
* [Feature, Hardware] Enable SGLang on XPU GPUs via PyTorch by liangan1 in https://github.com/sgl-project/sglang/pull/1480
* Fix unit tests and type annotations by merrymercy in https://github.com/sgl-project/sglang/pull/1648
* Add an option to disable penalizer by merrymercy in https://github.com/sgl-project/sglang/pull/1651
* Add get_tokenizer function for Engine class by pjyi2147 in https://github.com/sgl-project/sglang/pull/1653
* Fix the batch_is_full check for jump-forward decoding by merrymercy in https://github.com/sgl-project/sglang/pull/1654
* Simplify the event loop and expose `--num-continuous-decode-steps` as an argument by merrymercy in https://github.com/sgl-project/sglang/pull/1652
* [doc] Add engine section in backend.md by ByronHsu in https://github.com/sgl-project/sglang/pull/1656
* [Fix] fix eos trim inconsistency by Ying1123 in https://github.com/sgl-project/sglang/pull/1650
* Add output_ids into ScheduleBatch by merrymercy in https://github.com/sgl-project/sglang/pull/1659
* [Minor] Rename no_eos_trim to no_stop_trim by Ying1123 in https://github.com/sgl-project/sglang/pull/1661
* Add a test case to test retract by merrymercy in https://github.com/sgl-project/sglang/pull/1662
* Move filter_batch out of stream_output by merrymercy in https://github.com/sgl-project/sglang/pull/1663
* Support double sparsity by andy-yang-1 in https://github.com/sgl-project/sglang/pull/1459
* Fix unit test order to balance the tasks in CI by merrymercy in https://github.com/sgl-project/sglang/pull/1665
* [Minor] Improve style by merrymercy in https://github.com/sgl-project/sglang/pull/1666
* Simplify chunked prefill by merrymercy in https://github.com/sgl-project/sglang/pull/1667
* [1/N] Remove `CacheConfig` import in all model files by ByronHsu in https://github.com/sgl-project/sglang/pull/1658
* [doc] improve engine doc and add to readme by ByronHsu in https://github.com/sgl-project/sglang/pull/1670
* [Minor] Add some utility functions by merrymercy in https://github.com/sgl-project/sglang/pull/1671
* Improve benchmark scripts by merrymercy in https://github.com/sgl-project/sglang/pull/1672
* Fix memory leak during abort by merrymercy in https://github.com/sgl-project/sglang/pull/1674
* Fix filter_batch function call by hnyls2002 in https://github.com/sgl-project/sglang/pull/1681
* Add OLMo model by janimo in https://github.com/sgl-project/sglang/pull/1676
* Add a new event loop by merrymercy in https://github.com/sgl-project/sglang/pull/1677
* Fix srt dependency by ispobock in https://github.com/sgl-project/sglang/pull/1685
* [Event] Add online meetup meeting link by Ying1123 in https://github.com/sgl-project/sglang/pull/1686
* Launch a thread to overlap CPU and GPU by merrymercy in https://github.com/sgl-project/sglang/pull/1687
* Returning a per request metric for number of cached_tokens read by havetc in https://github.com/sgl-project/sglang/pull/1599
* add orjson for jsonresponse by michaelfeil in https://github.com/sgl-project/sglang/pull/1688
* Update README.md by merrymercy in https://github.com/sgl-project/sglang/pull/1689
* Add date to logging messages (#1623) by zeng-zc in https://github.com/sgl-project/sglang/pull/1679
* Update the transformers version in CI by merrymercy in https://github.com/sgl-project/sglang/pull/1690
* Use SGLang imports for linear layer by janimo in https://github.com/sgl-project/sglang/pull/1696
* feat: radix tree code optimize by wxsms in https://github.com/sgl-project/sglang/pull/1697
* ORJson. Faster Json serialization by michaelfeil in https://github.com/sgl-project/sglang/pull/1694
* Fix the failed unit tests by merrymercy in https://github.com/sgl-project/sglang/pull/1699
* Fix failed ci tests on long prompts; Better error messages for embedding models by merrymercy in https://github.com/sgl-project/sglang/pull/1700
* Fix engine unit test by merrymercy in https://github.com/sgl-project/sglang/pull/1701
* Fix mixed batch for multi modal models by merrymercy in https://github.com/sgl-project/sglang/pull/1702
* Add matched_stop token or str to distinguish between eos or stop str finish_reason generation by g-drozdov in https://github.com/sgl-project/sglang/pull/1684
* Fix regex and logprob conflicts when chunked prefilling by hnyls2002 in https://github.com/sgl-project/sglang/pull/1703
* Simplify flashinfer utilities by merrymercy in https://github.com/sgl-project/sglang/pull/1704
* Add dtype for more operations by merrymercy in https://github.com/sgl-project/sglang/pull/1705
* Add grouped free operations by merrymercy in https://github.com/sgl-project/sglang/pull/1706
* Skip unnecessary penalizer by merrymercy in https://github.com/sgl-project/sglang/pull/1707
* Simplify the nan detection and greedy check in sampler by merrymercy in https://github.com/sgl-project/sglang/pull/1709
* Fix `is_all_ready` for overlap copy by merrymercy in https://github.com/sgl-project/sglang/pull/1710
* Fix the race condition in overlap mode by merrymercy in https://github.com/sgl-project/sglang/pull/1712
* Update README.md by merrymercy in https://github.com/sgl-project/sglang/pull/1713

0.3.3

* [Minor] Fix logging typo by amosyou in https://github.com/sgl-project/sglang/pull/1615
* Fix test_vision_openai_server on CI by ByronHsu in https://github.com/sgl-project/sglang/pull/1620
* [Performance, hardware] MoE tuning update to AMD MI300x GPUs by HaiShaw in https://github.com/sgl-project/sglang/pull/1619
* Update README.md by kushal34712 in https://github.com/sgl-project/sglang/pull/1625
* Update README.md by merrymercy in https://github.com/sgl-project/sglang/pull/1629
* Add device support by liangan1 in https://github.com/sgl-project/sglang/pull/1607
* Nit about the decorator of `PortArgs.init_new` by glen-amd in https://github.com/sgl-project/sglang/pull/1611
* [Bug] Fix the Image Input of Batch Generation by OBJECT907 in https://github.com/sgl-project/sglang/pull/1579
* Add the ability to enable and disable the Profiler via HTTP API. by abatom in https://github.com/sgl-project/sglang/pull/1626
* Fix the correctness test in bench_latency.py when tp > 1 and test_generation_models.py by merrymercy in https://github.com/sgl-project/sglang/pull/1631
* Add image_token in conversation.py by merrymercy in https://github.com/sgl-project/sglang/pull/1632
* Added a "Back To Top" Button by JanumalaAkhilendra in https://github.com/sgl-project/sglang/pull/1633
* Fix constrained decoding by merrymercy in https://github.com/sgl-project/sglang/pull/1634
* Add back data parallelism by merrymercy in https://github.com/sgl-project/sglang/pull/1635

0.3.2

New Contributors
* zifeitong made their first contribution in https://github.com/sgl-project/sglang/pull/1363
* wcsjtu made their first contribution in https://github.com/sgl-project/sglang/pull/1370
* Achazwl made their first contribution in https://github.com/sgl-project/sglang/pull/1371
* josephrocca made their first contribution in https://github.com/sgl-project/sglang/pull/1373
* blacker521 made their first contribution in https://github.com/sgl-project/sglang/pull/1367
* yzh119 made their first contribution in https://github.com/sgl-project/sglang/pull/1403
* hxer7963 made their first contribution in https://github.com/sgl-project/sglang/pull/1397
* Aphoh made their first contribution in https://github.com/sgl-project/sglang/pull/1427
* HaiShaw made their first contribution in https://github.com/sgl-project/sglang/pull/1420
* jasonyux made their first contribution in https://github.com/sgl-project/sglang/pull/1449
* Muennighoff made their first contribution in https://github.com/sgl-project/sglang/pull/1476
* rchen19 made their first contribution in https://github.com/sgl-project/sglang/pull/1481
* wellhowtosay made their first contribution in https://github.com/sgl-project/sglang/pull/1456
* luzengxiangcn made their first contribution in https://github.com/sgl-project/sglang/pull/1499
* TianyiQ made their first contribution in https://github.com/sgl-project/sglang/pull/1508

**Full Changelog**: https://github.com/sgl-project/sglang/compare/v0.3.0...v0.3.2

0.3.1.post2

* Fix env vars in bench_latency by merrymercy in https://github.com/sgl-project/sglang/pull/1472
* feat: update linear deps 1/N by zhyncs in https://github.com/sgl-project/sglang/pull/1305
* minor: add quant eval compared with base by zhyncs in https://github.com/sgl-project/sglang/pull/1475
* Add OLMoE by Muennighoff in https://github.com/sgl-project/sglang/pull/1476
* Fix triton head num by ispobock in https://github.com/sgl-project/sglang/pull/1482
* Add MLA gsm8k eval by ispobock in https://github.com/sgl-project/sglang/pull/1484
* chore: bump v0.3.1.post3 by zhyncs in https://github.com/sgl-project/sglang/pull/1483
* fix incorrect links in documentation by rchen19 in https://github.com/sgl-project/sglang/pull/1481
* doc: update backend by zhyncs in https://github.com/sgl-project/sglang/pull/1486
* Better unit tests for adding a new model by merrymercy in https://github.com/sgl-project/sglang/pull/1488
* Pr fix max workers by wellhowtosay in https://github.com/sgl-project/sglang/pull/1456
* Add a unit test for data parallelism by merrymercy in https://github.com/sgl-project/sglang/pull/1489
* Add AMD tests to CI by Ying1123 in https://github.com/sgl-project/sglang/pull/1491
* Update dockerfile to include datamodel_code_generator by merrymercy in https://github.com/sgl-project/sglang/pull/1492
* [API, Feature] Support response prefill for openai API by Ying1123 in https://github.com/sgl-project/sglang/pull/1490
* minor: add mla fp8 test by zhyncs in https://github.com/sgl-project/sglang/pull/1494
* Fix the overhead due to penalizer in bench_latency by merrymercy in https://github.com/sgl-project/sglang/pull/1496
* MoE torch compile by ispobock in https://github.com/sgl-project/sglang/pull/1497
* [CI] Move AMD test to a separate file by merrymercy in https://github.com/sgl-project/sglang/pull/1500
* Update test_srt_backend.py by merrymercy in https://github.com/sgl-project/sglang/pull/1502
* debug radixcache stack_overflow by luzengxiangcn in https://github.com/sgl-project/sglang/pull/1499
* Simplify bench_latency.py by merrymercy in https://github.com/sgl-project/sglang/pull/1503
* [Fix] Fix clean_up_tokenization_spaces in tokenizer by merrymercy in https://github.com/sgl-project/sglang/pull/1510
* Add support for tie_word_embeddings when loading weights + support for SmolLM by TianyiQ in https://github.com/sgl-project/sglang/pull/1508
* Revert "kernel: use tensor cores for flashinfer gqa kernels" by Ying1123 in https://github.com/sgl-project/sglang/pull/1511

0.3.1.post1

* Enable MLA by default by ispobock in https://github.com/sgl-project/sglang/pull/1447
* Fix attention backend by ispobock in https://github.com/sgl-project/sglang/pull/1448
* fix schedule bug by hnyls2002 in https://github.com/sgl-project/sglang/pull/1450
* Fix schedule bug by hnyls2002 in https://github.com/sgl-project/sglang/pull/1451
* Fixed n>1 causing list index out of range with VLM by jasonyux in https://github.com/sgl-project/sglang/pull/1449
* Add bench_server_latency.py by merrymercy in https://github.com/sgl-project/sglang/pull/1452
* [Bugfix] Enable SGLang on AMD GPUs via PyTorch for ROCm (#1419) by HaiShaw in https://github.com/sgl-project/sglang/pull/1453
* Fix oom issues with fp8 for llama by merrymercy in https://github.com/sgl-project/sglang/pull/1454
* Fuse top_k and top_p in the sampler by merrymercy in https://github.com/sgl-project/sglang/pull/1457
* [Event] Add public meeting invite to README by Ying1123 in https://github.com/sgl-project/sglang/pull/1458
* fix: create new dict every time for putting new frame by Luodian in https://github.com/sgl-project/sglang/pull/1464
* Fix padding in the cuda graph by merrymercy in https://github.com/sgl-project/sglang/pull/1469

© 2024 Safety CLI Cybersecurity Inc. All Rights Reserved.