OpenLLM

Latest version: v0.6.19

Page 6 of 23

0.4.43

Usage

All available models: `openllm models`

To start an LLM: `python -m openllm start HuggingFaceH4/zephyr-7b-beta`

To run OpenLLM within a container environment (requires GPUs): `docker run --gpus all -it -P -v $PWD/data:$HOME/.cache/huggingface/ ghcr.io/bentoml/openllm:0.4.43 start HuggingFaceH4/zephyr-7b-beta`

Find more information about this release in the [CHANGELOG.md](https://github.com/bentoml/OpenLLM/blob/main/CHANGELOG.md)
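Once the server is up, it can be called over plain HTTP. The sketch below builds a chat-completion request in Python; the port (3000) and the OpenAI-compatible `/v1/chat/completions` route are assumptions for illustration — check the running server's OpenAPI page for the exact endpoints your version exposes.

```python
import json
from urllib import request

# Assumed endpoint and port; verify against your running server.
URL = "http://localhost:3000/v1/chat/completions"

payload = {
    "model": "HuggingFaceH4/zephyr-7b-beta",
    "messages": [{"role": "user", "content": "Hello!"}],
    "max_tokens": 64,
}

body = json.dumps(payload).encode("utf-8")
req = request.Request(URL, data=body,
                      headers={"Content-Type": "application/json"})

# Uncomment once the server is running:
# with request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```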



What's Changed
* fix: limit BentoML version range by larme in https://github.com/bentoml/OpenLLM/pull/881
* chore: bump up bentoml version to 1.1.11 by larme in https://github.com/bentoml/OpenLLM/pull/883
* Bump BentoML version in tools by larme in https://github.com/bentoml/OpenLLM/pull/884


**Full Changelog**: https://github.com/bentoml/OpenLLM/compare/v0.4.42...v0.4.43

0.4.42

Usage

All available models: `openllm models`

To start an LLM: `python -m openllm start HuggingFaceH4/zephyr-7b-beta`

To run OpenLLM within a container environment (requires GPUs): `docker run --gpus all -it -P -v $PWD/data:$HOME/.cache/huggingface/ ghcr.io/bentoml/openllm:0.4.42 start HuggingFaceH4/zephyr-7b-beta`

Find more information about this release in the [CHANGELOG.md](https://github.com/bentoml/OpenLLM/blob/main/CHANGELOG.md)



What's Changed
* docs: Update opt example to ms-phi by Sherlock113 in https://github.com/bentoml/OpenLLM/pull/805
* chore(script): run vendored scripts by aarnphm in https://github.com/bentoml/OpenLLM/pull/808
* docs: README.md typo by weibeu in https://github.com/bentoml/OpenLLM/pull/819
* ci: pre-commit autoupdate [pre-commit.ci] by pre-commit-ci in https://github.com/bentoml/OpenLLM/pull/818
* chore(deps): bump docker/metadata-action from 5.3.0 to 5.4.0 by dependabot in https://github.com/bentoml/OpenLLM/pull/814
* chore(deps): bump taiki-e/install-action from 2.22.5 to 2.23.1 by dependabot in https://github.com/bentoml/OpenLLM/pull/813
* chore(deps): bump github/codeql-action from 3.22.11 to 3.22.12 by dependabot in https://github.com/bentoml/OpenLLM/pull/815
* ci: pre-commit autoupdate [pre-commit.ci] by pre-commit-ci in https://github.com/bentoml/OpenLLM/pull/825
* chore(deps): bump crazy-max/ghaction-import-gpg from 6.0.0 to 6.1.0 by dependabot in https://github.com/bentoml/OpenLLM/pull/824
* chore(deps): bump taiki-e/install-action from 2.23.1 to 2.23.7 by dependabot in https://github.com/bentoml/OpenLLM/pull/823
* docs: Add Llamaindex in freedom to build by Sherlock113 in https://github.com/bentoml/OpenLLM/pull/826
* ci: pre-commit autoupdate [pre-commit.ci] by pre-commit-ci in https://github.com/bentoml/OpenLLM/pull/836
* chore(deps): bump docker/metadata-action from 5.4.0 to 5.5.0 by dependabot in https://github.com/bentoml/OpenLLM/pull/834
* chore(deps): bump aquasecurity/trivy-action from 0.16.0 to 0.16.1 by dependabot in https://github.com/bentoml/OpenLLM/pull/832
* chore(deps): bump taiki-e/install-action from 2.23.7 to 2.24.1 by dependabot in https://github.com/bentoml/OpenLLM/pull/833
* chore(deps): bump vllm to 0.2.7 by aarnphm in https://github.com/bentoml/OpenLLM/pull/837
* chore: update discord link by aarnphm in https://github.com/bentoml/OpenLLM/pull/838
* improv(package): use python slim base image and let pytorch install cuda by larme in https://github.com/bentoml/OpenLLM/pull/807
* fix(dockerfile): conflict deps by aarnphm in https://github.com/bentoml/OpenLLM/pull/841
* chore: fix typo in list_models pydoc by fuzzie360 in https://github.com/bentoml/OpenLLM/pull/847
* docs: update README.md telemetry code link by fuzzie360 in https://github.com/bentoml/OpenLLM/pull/842
* chore(deps): bump taiki-e/install-action from 2.24.1 to 2.25.1 by dependabot in https://github.com/bentoml/OpenLLM/pull/846
* chore(deps): bump github/codeql-action from 3.22.12 to 3.23.0 by dependabot in https://github.com/bentoml/OpenLLM/pull/844
* ci: pre-commit autoupdate [pre-commit.ci] by pre-commit-ci in https://github.com/bentoml/OpenLLM/pull/848
* ci: pre-commit autoupdate [pre-commit.ci] by pre-commit-ci in https://github.com/bentoml/OpenLLM/pull/858
* chore(deps): bump taiki-e/install-action from 2.25.1 to 2.25.9 by dependabot in https://github.com/bentoml/OpenLLM/pull/856
* chore(deps): bump github/codeql-action from 3.23.0 to 3.23.1 by dependabot in https://github.com/bentoml/OpenLLM/pull/855
* fix: proper SSE handling for vllm by larme in https://github.com/bentoml/OpenLLM/pull/877
* chore: set stop to empty list by default by larme in https://github.com/bentoml/OpenLLM/pull/878
* fix: all runners sse output by larme in https://github.com/bentoml/OpenLLM/pull/880

New Contributors
* weibeu made their first contribution in https://github.com/bentoml/OpenLLM/pull/819
* fuzzie360 made their first contribution in https://github.com/bentoml/OpenLLM/pull/847

**Full Changelog**: https://github.com/bentoml/OpenLLM/compare/v0.4.41...v0.4.42
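Several of the fixes in this release (#877, #878, #880) concern server-sent events (SSE) for streamed model output. As background only — this is not OpenLLM's actual implementation — a minimal sketch of consuming an SSE stream's `data:` lines, using the OpenAI-style convention of a `[DONE]` sentinel to terminate the stream:

```python
def iter_sse_data(lines):
    """Yield the payload of each `data:` line from an SSE stream,
    stopping at the OpenAI-style `[DONE]` sentinel."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank keep-alives and event:/id: fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break
        yield data

# Example over a canned stream:
stream = [
    'data: {"text": "Hel"}',
    '',
    'data: {"text": "lo"}',
    'data: [DONE]',
    'data: {"text": "ignored"}',
]
chunks = list(iter_sse_data(stream))
```

A real client would feed this generator line-by-line from the HTTP response body and JSON-decode each chunk as it arrives.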

0.4.41

Usage

All available models: `openllm models`

To start an LLM: `python -m openllm start HuggingFaceH4/zephyr-7b-beta`

To run OpenLLM within a container environment (requires GPUs): `docker run --gpus all -it -P -v $PWD/data:$HOME/.cache/huggingface/ ghcr.io/bentoml/openllm:0.4.41 start HuggingFaceH4/zephyr-7b-beta`

Find more information about this release in the [CHANGELOG.md](https://github.com/bentoml/OpenLLM/blob/main/CHANGELOG.md)



What's Changed
* docs: add notes about dtypes usage. by aarnphm in https://github.com/bentoml/OpenLLM/pull/786
* chore(deps): bump taiki-e/install-action from 2.22.0 to 2.22.5 by dependabot in https://github.com/bentoml/OpenLLM/pull/790
* chore(deps): bump github/codeql-action from 2.22.9 to 3.22.11 by dependabot in https://github.com/bentoml/OpenLLM/pull/794
* chore(deps): bump sigstore/cosign-installer from 3.2.0 to 3.3.0 by dependabot in https://github.com/bentoml/OpenLLM/pull/793
* chore(deps): bump actions/download-artifact from 3.0.2 to 4.0.0 by dependabot in https://github.com/bentoml/OpenLLM/pull/791
* chore(deps): bump actions/upload-artifact from 3.1.3 to 4.0.0 by dependabot in https://github.com/bentoml/OpenLLM/pull/792
* ci: pre-commit autoupdate [pre-commit.ci] by pre-commit-ci in https://github.com/bentoml/OpenLLM/pull/796
* fix(cli): avoid runtime `__origin__` check for older Python by aarnphm in https://github.com/bentoml/OpenLLM/pull/798
* feat(vllm): support GPTQ with 0.2.6 by aarnphm in https://github.com/bentoml/OpenLLM/pull/797
* fix(ci): lock to v3 iteration of `actions/artifacts` workflow by aarnphm in https://github.com/bentoml/OpenLLM/pull/799


**Full Changelog**: https://github.com/bentoml/OpenLLM/compare/v0.4.40...v0.4.41

0.4.40

Usage

All available models: `openllm models`

To start an LLM: `python -m openllm start HuggingFaceH4/zephyr-7b-beta`

To run OpenLLM within a container environment (requires GPUs): `docker run --gpus all -it -P -v $PWD/data:$HOME/.cache/huggingface/ ghcr.io/bentoml/openllm:0.4.40 start HuggingFaceH4/zephyr-7b-beta`

Find more information about this release in the [CHANGELOG.md](https://github.com/bentoml/OpenLLM/blob/main/CHANGELOG.md)



What's Changed
* fix(infra): conform ruff to 150 LL by aarnphm in https://github.com/bentoml/OpenLLM/pull/781
* infra: update blame ignore to formatter hash by aarnphm in https://github.com/bentoml/OpenLLM/pull/782
* perf: upgrade mixtral to use expert parallelism by aarnphm in https://github.com/bentoml/OpenLLM/pull/783


**Full Changelog**: https://github.com/bentoml/OpenLLM/compare/v0.4.39...v0.4.40

0.4.39

Usage

All available models: `openllm models`

To start an LLM: `python -m openllm start HuggingFaceH4/zephyr-7b-beta`

To run OpenLLM within a container environment (requires GPUs): `docker run --gpus all -it -P -v $PWD/data:$HOME/.cache/huggingface/ ghcr.io/bentoml/openllm:0.4.39 start HuggingFaceH4/zephyr-7b-beta`

Find more information about this release in the [CHANGELOG.md](https://github.com/bentoml/OpenLLM/blob/main/CHANGELOG.md)



What's Changed
* fix(logprobs): correct check logprobs by aarnphm in https://github.com/bentoml/OpenLLM/pull/779


**Full Changelog**: https://github.com/bentoml/OpenLLM/compare/v0.4.38...v0.4.39

0.4.38

Usage

All available models: `openllm models`

To start an LLM: `python -m openllm start HuggingFaceH4/zephyr-7b-beta`

To run OpenLLM within a container environment (requires GPUs): `docker run --gpus all -it -P -v $PWD/data:$HOME/.cache/huggingface/ ghcr.io/bentoml/openllm:0.4.38 start HuggingFaceH4/zephyr-7b-beta`

Find more information about this release in the [CHANGELOG.md](https://github.com/bentoml/OpenLLM/blob/main/CHANGELOG.md)



What's Changed
* fix(mixtral): correct chat templates to remove additional spacing by aarnphm in https://github.com/bentoml/OpenLLM/pull/774
* fix(cli): correct set arguments for `openllm import` and `openllm build` by aarnphm in https://github.com/bentoml/OpenLLM/pull/775
* fix(mixtral): setup hack atm to load weights from pt specifically instead of safetensors by aarnphm in https://github.com/bentoml/OpenLLM/pull/776


**Full Changelog**: https://github.com/bentoml/OpenLLM/compare/v0.4.37...v0.4.38

© 2025 Safety CLI Cybersecurity Inc. All Rights Reserved.