Highlights since open-sourcing
We have introduced "sllm" (pronounced "slim") as the new abbreviation for ServerlessLLM and updated the corresponding file paths for simplicity. The PyPI package name remains `serverless-llm`.
For example:
- `serverless_llm/` has been renamed to `sllm/`
- `serverless_llm/store/` has been shortened to `sllm_store/`
You can now `import sllm` and `import sllm_store` directly.
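To see which naming an environment currently has, a quick check like the one below can help. This is a minimal sketch rather than part of ServerlessLLM: the `installed` helper is a hypothetical name, and it only asks Python's import system whether the top-level modules named above can be located, without importing them.

```python
import importlib.util

# Module names taken from the rename described above.
NEW_MODULES = ("sllm", "sllm_store")
OLD_MODULES = ("serverless_llm",)


def installed(names):
    """Map each top-level module name to whether Python can locate it."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    # Print one line per module so stale installs of the old name stand out.
    for name, found in installed(NEW_MODULES + OLD_MODULES).items():
        print(f"{name}: {'found' if found else 'not found'}")
```

If `serverless_llm` is reported as found while `sllm` is not, the environment likely still has a pre-rename installation.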
New contributors
We welcome several new contributors:
- Yinsicheng Jiang [SecretSettler](https://github.com/SecretSettler)
- Yanwei Ye [anyin233](https://github.com/anyin233)
- [eltociear](https://github.com/eltociear)
New Features
- **New inference backend: vLLM**:
  - Integrated the vLLM inference backend, enabling highly optimized model execution for large language models (#61).
  - Added the vLLM model save/load interface to manage model persistence across sessions (#31).
  - Enhanced the vLLM model downloader to improve GPU resource utilization and caching mechanisms, boosting efficiency and stability (#53, #101).
- **Expansion in functionality**:
  - Introduced support for the Embedding API for the transformers backend, expanding compatibility with additional AI models (#97).
  - Verified support for BF16 precision, ensuring better performance and reduced memory consumption for specific transformer-based models (#102).
- **Enhanced deployment**:
  - Enabled the ability to override default configurations during deployment, making the process more customizable for different environments (#32).
  - Added support for partial configurations, simplifying the setup for users by allowing them to modify only relevant parts of the configuration (#46).
- **Pip installation**: We now support installation directly from pip, making it easier to set up and use ServerlessLLM across different platforms.
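
The partial-configuration support mentioned above can be pictured as merging a user-supplied fragment over a set of defaults. The sketch below is only an illustration under stated assumptions: the `merge_config` helper, the configuration keys, and the model name are all hypothetical, and the merge logic actually used by the deploy command may differ.

```python
import copy

# Hypothetical defaults; these keys are illustrative, not the real sllm schema.
DEFAULT_CONFIG = {
    "model": "",
    "backend": "transformers",
    "auto_scaling_config": {"min_instances": 0, "max_instances": 10},
}


def merge_config(defaults, overrides):
    """Return a new dict in which `overrides` replaces matching keys of
    `defaults`, recursing into nested dicts so untouched keys keep defaults."""
    merged = copy.deepcopy(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_config(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


# A partial configuration that only names what should change (placeholder values).
partial = {"model": "my-org/my-model", "auto_scaling_config": {"max_instances": 2}}
config = merge_config(DEFAULT_CONFIG, partial)
```

Here `config` keeps the default `backend` and `min_instances` while taking `model` and `max_instances` from the partial fragment, and `DEFAULT_CONFIG` itself is left unmodified.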
Documentation
- **Multi-Machine Setup**: Added detailed instructions for setting up multi-machine environments, making it easier to scale deployments across multiple nodes (#30).
- **Storage-Aware Scheduling**: Documented the new storage-aware scheduler feature, which optimizes job placement based on available storage (#22).
- **ServerlessLLM Store Quickstart**: Provided a quickstart guide for ServerlessLLM Store, helping users quickly get up and running with this feature (#13).
Testing and CI Enhancements
- **CI Enhancements**: Added a linter and refined the CI workflows to ensure code quality and streamline development (#73, #99).
- **Automated Publish Workflow**: Added a workflow that automatically publishes releases, streamlining the release process and reducing manual steps (#88).
- **Unit Tests**: Added unit tests covering backend functionality (#55), ServerlessLLM Store (#58), and CLI commands (#65), along with end-to-end GPU tests (#85).
What's Changed
* Initial release by future-xy in https://github.com/ServerlessLLM/ServerlessLLM/pull/8
* Update README.md by luomai in https://github.com/ServerlessLLM/ServerlessLLM/pull/9
* Add contributors. by luomai in https://github.com/ServerlessLLM/ServerlessLLM/pull/10
* docs: add sllm-store guide by future-xy in https://github.com/ServerlessLLM/ServerlessLLM/pull/13
* docs: update readme and fix docs (#14) by future-xy in https://github.com/ServerlessLLM/ServerlessLLM/pull/15
* fix: re-add examples folder by JustinTong0323 in https://github.com/ServerlessLLM/ServerlessLLM/pull/18
* feat: add support for update autoscaling config by SiyangShao in https://github.com/ServerlessLLM/ServerlessLLM/pull/21
* [Docs] Add Delete Instructions for Model Deployment in Quickstart Guide (#17) by JustinTong0323 in https://github.com/ServerlessLLM/ServerlessLLM/pull/25
* docs: build sllm-store from source by SiyangShao in https://github.com/ServerlessLLM/ServerlessLLM/pull/29
* [Docs] Add Multi-Machine setup guide by JustinTong0323 in https://github.com/ServerlessLLM/ServerlessLLM/pull/30
* feat: change save model path with backend name by SiyangShao in https://github.com/ServerlessLLM/ServerlessLLM/pull/27
* [Doc] Update CONTRIBUTING.md by Chivier in https://github.com/ServerlessLLM/ServerlessLLM/pull/43
* Add essential format and ignore hints by future-xy in https://github.com/ServerlessLLM/ServerlessLLM/pull/28
* [Feat] Enhance deploy functionality to allow overriding default configuration by JustinTong0323 in https://github.com/ServerlessLLM/ServerlessLLM/pull/32
* [Doc] Remove empty pages and update introduction by future-xy in https://github.com/ServerlessLLM/ServerlessLLM/pull/41
* [Docs] Minor fix doc by JustinTong0323 in https://github.com/ServerlessLLM/ServerlessLLM/pull/42
* fix(cli): display model name when deploy using config by JustinTong0323 in https://github.com/ServerlessLLM/ServerlessLLM/pull/45
* feat: storage aware scheduler by future-xy in https://github.com/ServerlessLLM/ServerlessLLM/pull/22
* [Feat] Enhance deploy command to support partial configurations by JustinTong0323 in https://github.com/ServerlessLLM/ServerlessLLM/pull/46
* Add save/load interface for vLLM by drunkcoding in https://github.com/ServerlessLLM/ServerlessLLM/pull/31
* Create LICENSE by future-xy in https://github.com/ServerlessLLM/ServerlessLLM/pull/49
* docs: update README.md by eltociear in https://github.com/ServerlessLLM/ServerlessLLM/pull/56
* fix: vllm model downloader's GPU usage by SiyangShao in https://github.com/ServerlessLLM/ServerlessLLM/pull/53
* docs: update documents by future-xy in https://github.com/ServerlessLLM/ServerlessLLM/pull/50
* chore: fix intro in docs by Chivier in https://github.com/ServerlessLLM/ServerlessLLM/pull/64
* chore: fix document assets position by Chivier in https://github.com/ServerlessLLM/ServerlessLLM/pull/66
* feat: backend unit tests by SiyangShao in https://github.com/ServerlessLLM/ServerlessLLM/pull/55
* Update .gitignore by future-xy in https://github.com/ServerlessLLM/ServerlessLLM/pull/68
* Update issue and PR templates by andrei3131 in https://github.com/ServerlessLLM/ServerlessLLM/pull/52
* docs: minor improvements by JustinTong0323 in https://github.com/ServerlessLLM/ServerlessLLM/pull/51
* Revert "Update issue and PR templates" by future-xy in https://github.com/ServerlessLLM/ServerlessLLM/pull/70
* fix: backend workflow changes by SiyangShao in https://github.com/ServerlessLLM/ServerlessLLM/pull/69
* feat: ServerlessLLM Store c++ unit tests (CPU) by future-xy in https://github.com/ServerlessLLM/ServerlessLLM/pull/58
* Update test_sllm_store.yaml by future-xy in https://github.com/ServerlessLLM/ServerlessLLM/pull/79
* add linter CI by lrq619 in https://github.com/ServerlessLLM/ServerlessLLM/pull/73
* feat: vLLM integration by SiyangShao in https://github.com/ServerlessLLM/ServerlessLLM/pull/61
* [Tests] Add Unit Tests for CLI Commands in sllm-cli by JustinTong0323 in https://github.com/ServerlessLLM/ServerlessLLM/pull/65
* Issue and PR templates by andrei3131 in https://github.com/ServerlessLLM/ServerlessLLM/pull/80
* docs: add tips for usage with vllm by SiyangShao in https://github.com/ServerlessLLM/ServerlessLLM/pull/86
* [test] Gpu workflow by future-xy in https://github.com/ServerlessLLM/ServerlessLLM/pull/85
* fix: add script to apply patch by SiyangShao in https://github.com/ServerlessLLM/ServerlessLLM/pull/87
* fix: patch file and script by SiyangShao in https://github.com/ServerlessLLM/ServerlessLLM/pull/89
* [URGENT] docs: update sllm_store to dev4 by future-xy in https://github.com/ServerlessLLM/ServerlessLLM/pull/91
* Fy/unified model path by future-xy in https://github.com/ServerlessLLM/ServerlessLLM/pull/82
* docs: Add Activate Worker Env before Apply vLLM Patch by SiyangShao in https://github.com/ServerlessLLM/ServerlessLLM/pull/95
* Fix/update net address by future-xy in https://github.com/ServerlessLLM/ServerlessLLM/pull/94
* [test] change the trigger of cli_test by JustinTong0323 in https://github.com/ServerlessLLM/ServerlessLLM/pull/98
* [Fix] Fix gpu unavailable report by SecretSettler in https://github.com/ServerlessLLM/ServerlessLLM/pull/96
* Lrq/publish workflow by lrq619 in https://github.com/ServerlessLLM/ServerlessLLM/pull/88
* Fy/enhanced ci by future-xy in https://github.com/ServerlessLLM/ServerlessLLM/pull/99
* [FIX] vllm model cache by future-xy in https://github.com/ServerlessLLM/ServerlessLLM/pull/101
* fix: support bfloat16 by SiyangShao in https://github.com/ServerlessLLM/ServerlessLLM/pull/102
* New Feature: Supporting Embedding API for transformers backend by SecretSettler in https://github.com/ServerlessLLM/ServerlessLLM/pull/97
* FIX: outlines version by future-xy in https://github.com/ServerlessLLM/ServerlessLLM/pull/103
* [FIX]: checkpoint loader for transformer backend by future-xy in https://github.com/ServerlessLLM/ServerlessLLM/pull/104
* fix: mark model as registered only if model is registered successfully by anyin233 in https://github.com/ServerlessLLM/ServerlessLLM/pull/77
* Fix: Add formatting for commits by drunkcoding in https://github.com/ServerlessLLM/ServerlessLLM/pull/71
* docs: add code of conduct by future-xy in https://github.com/ServerlessLLM/ServerlessLLM/pull/109
* docs: update CONTRIBUTING.md by future-xy in https://github.com/ServerlessLLM/ServerlessLLM/pull/110
* Prepare for 0.5.0 release by future-xy in https://github.com/ServerlessLLM/ServerlessLLM/pull/112
New Contributors
* future-xy made their first contribution in https://github.com/ServerlessLLM/ServerlessLLM/pull/8
* luomai made their first contribution in https://github.com/ServerlessLLM/ServerlessLLM/pull/9
* JustinTong0323 made their first contribution in https://github.com/ServerlessLLM/ServerlessLLM/pull/18
* SiyangShao made their first contribution in https://github.com/ServerlessLLM/ServerlessLLM/pull/21
* Chivier made their first contribution in https://github.com/ServerlessLLM/ServerlessLLM/pull/43
* drunkcoding made their first contribution in https://github.com/ServerlessLLM/ServerlessLLM/pull/31
* eltociear made their first contribution in https://github.com/ServerlessLLM/ServerlessLLM/pull/56
* andrei3131 made their first contribution in https://github.com/ServerlessLLM/ServerlessLLM/pull/52
* lrq619 made their first contribution in https://github.com/ServerlessLLM/ServerlessLLM/pull/73
* SecretSettler made their first contribution in https://github.com/ServerlessLLM/ServerlessLLM/pull/96
* anyin233 made their first contribution in https://github.com/ServerlessLLM/ServerlessLLM/pull/77
**Full Changelog**: https://github.com/ServerlessLLM/ServerlessLLM/commits/v0.5.0