OpenLLM now releases a base container containing all compiled kernels, removing the need to build kernels with `openllm build` when using vLLM or auto-gptq.
## vLLM support (experimental)
Currently, only OPT and Llama 2 support vLLM. Simply set `OPENLLM_LLAMA_FRAMEWORK=vllm` to start OpenLLM runners with vLLM.
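For example, a minimal sketch of serving Llama 2 on the vLLM backend (assuming the model weights are already available locally or on the Hugging Face hub):

```bash
# Select vLLM as the runtime framework for the Llama runner,
# then start an OpenLLM server for Llama 2.
OPENLLM_LLAMA_FRAMEWORK=vllm openllm start llama
```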
## Installation
```bash
pip install openllm==0.2.11
```
## Usage
All available models: `openllm models`

To start an LLM: `python -m openllm start opt`
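Put together, a quick end-to-end session might look like the following (assuming the server comes up on the default port 3000 and that `openllm query` honours `OPENLLM_ENDPOINT`, as in the project README):

```bash
# List all models supported by this release
openllm models

# Start an OPT server; this blocks, so run it in its own terminal
python -m openllm start opt

# From another terminal, point the CLI client at the server and send a prompt
export OPENLLM_ENDPOINT=http://localhost:3000
openllm query 'What are Large Language Models?'
```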
Find more information about this release in the [CHANGELOG.md](https://github.com/bentoml/OpenLLM/blob/main/CHANGELOG.md)
## What's Changed
* fix(ci): correct tag for checkout by aarnphm in https://github.com/bentoml/OpenLLM/pull/150
* fix: disable auto fixes by aarnphm in https://github.com/bentoml/OpenLLM/pull/151
* chore: add nous to example default id as non-gated Llama by aarnphm in https://github.com/bentoml/OpenLLM/pull/152
* feat: supports embeddings for T5 and ChatGLM family generation by aarnphm in https://github.com/bentoml/OpenLLM/pull/153
* feat(ci): automatic release semver + git archival installation by aarnphm in https://github.com/bentoml/OpenLLM/pull/143
* docs: remove extraneous whitespace by aarnphm in https://github.com/bentoml/OpenLLM/pull/144
* docs: update fine tuning model support by aarnphm in https://github.com/bentoml/OpenLLM/pull/145
* fix(build): running from container choosing models correctly by aarnphm in https://github.com/bentoml/OpenLLM/pull/141
* feat(client): embeddings by aarnphm in https://github.com/bentoml/OpenLLM/pull/146
* ci: release python earlier than building binary wheels by aarnphm in https://github.com/bentoml/OpenLLM/pull/138
* docs: Update README.md by parano in https://github.com/bentoml/OpenLLM/pull/139