openllm Changelog

0.4.41

Usage

All available models: openllm models

To start a LLM: python -m openllm start HuggingFaceH4/zephyr-7b-beta

To run OpenLLM within a container environment (requires GPUs): docker run --gpus all -it -P -v $PWD/data:$HOME/.cache/huggingface/ ghcr.io/bentoml/openllm:0.4.41 start HuggingFaceH4/zephyr-7b-beta

Find more information about this release in the [CHANGELOG.md](https://github.com/bentoml/OpenLLM/blob/main/CHANGELOG.md)

What's Changed
* docs: add notes about dtypes usage. by aarnphm in https://github.com/bentoml/OpenLLM/pull/786
* chore(deps): bump taiki-e/install-action from 2.22.0 to 2.22.5 by dependabot in https://github.com/bentoml/OpenLLM/pull/790
* chore(deps): bump github/codeql-action from 2.22.9 to 3.22.11 by dependabot in https://github.com/bentoml/OpenLLM/pull/794
* chore(deps): bump sigstore/cosign-installer from 3.2.0 to 3.3.0 by dependabot in https://github.com/bentoml/OpenLLM/pull/793
* chore(deps): bump actions/download-artifact from 3.0.2 to 4.0.0 by dependabot in https://github.com/bentoml/OpenLLM/pull/791
* chore(deps): bump actions/upload-artifact from 3.1.3 to 4.0.0 by dependabot in https://github.com/bentoml/OpenLLM/pull/792
* ci: pre-commit autoupdate [pre-commit.ci] by pre-commit-ci in https://github.com/bentoml/OpenLLM/pull/796
* fix(cli): avoid runtime `__origin__` check for older Python by aarnphm in https://github.com/bentoml/OpenLLM/pull/798
* feat(vllm): support GPTQ with 0.2.6 by aarnphm in https://github.com/bentoml/OpenLLM/pull/797
* fix(ci): lock to v3 iteration of `actions/artifacts` workflow by aarnphm in https://github.com/bentoml/OpenLLM/pull/799

**Full Changelog**: https://github.com/bentoml/OpenLLM/compare/v0.4.40...v0.4.41

0.4.40

Usage

All available models: openllm models

To start a LLM: python -m openllm start HuggingFaceH4/zephyr-7b-beta

To run OpenLLM within a container environment (requires GPUs): docker run --gpus all -it -P -v $PWD/data:$HOME/.cache/huggingface/ ghcr.io/bentoml/openllm:0.4.40 start HuggingFaceH4/zephyr-7b-beta

Find more information about this release in the [CHANGELOG.md](https://github.com/bentoml/OpenLLM/blob/main/CHANGELOG.md)

What's Changed
* fix(infra): conform ruff to 150 LL by aarnphm in https://github.com/bentoml/OpenLLM/pull/781
* infra: update blame ignore to formatter hash by aarnphm in https://github.com/bentoml/OpenLLM/pull/782
* perf: upgrade mixtral to use expert parallelism by aarnphm in https://github.com/bentoml/OpenLLM/pull/783

**Full Changelog**: https://github.com/bentoml/OpenLLM/compare/v0.4.39...v0.4.40

0.4.39

Usage

All available models: openllm models

To start a LLM: python -m openllm start HuggingFaceH4/zephyr-7b-beta

To run OpenLLM within a container environment (requires GPUs): docker run --gpus all -it -P -v $PWD/data:$HOME/.cache/huggingface/ ghcr.io/bentoml/openllm:0.4.39 start HuggingFaceH4/zephyr-7b-beta

Find more information about this release in the [CHANGELOG.md](https://github.com/bentoml/OpenLLM/blob/main/CHANGELOG.md)

What's Changed
* fix(logprobs): correct check logprobs by aarnphm in https://github.com/bentoml/OpenLLM/pull/779

**Full Changelog**: https://github.com/bentoml/OpenLLM/compare/v0.4.38...v0.4.39

0.4.38

Usage

All available models: openllm models

To start a LLM: python -m openllm start HuggingFaceH4/zephyr-7b-beta

To run OpenLLM within a container environment (requires GPUs): docker run --gpus all -it -P -v $PWD/data:$HOME/.cache/huggingface/ ghcr.io/bentoml/openllm:0.4.38 start HuggingFaceH4/zephyr-7b-beta

Find more information about this release in the [CHANGELOG.md](https://github.com/bentoml/OpenLLM/blob/main/CHANGELOG.md)

What's Changed
* fix(mixtral): correct chat templates to remove additional spacing by aarnphm in https://github.com/bentoml/OpenLLM/pull/774
* fix(cli): correct set arguments for `openllm import` and `openllm build` by aarnphm in https://github.com/bentoml/OpenLLM/pull/775
* fix(mixtral): setup hack atm to load weights from pt specifically instead of safetensors by aarnphm in https://github.com/bentoml/OpenLLM/pull/776

**Full Changelog**: https://github.com/bentoml/OpenLLM/compare/v0.4.37...v0.4.38

0.4.37

Usage

All available models: openllm models

To start a LLM: python -m openllm start HuggingFaceH4/zephyr-7b-beta

To run OpenLLM within a container environment (requires GPUs): docker run --gpus all -it -P -v $PWD/data:$HOME/.cache/huggingface/ ghcr.io/bentoml/openllm:0.4.37 start HuggingFaceH4/zephyr-7b-beta

Find more information about this release in the [CHANGELOG.md](https://github.com/bentoml/OpenLLM/blob/main/CHANGELOG.md)

What's Changed
* feat(mixtral): correct support for mixtral by aarnphm in https://github.com/bentoml/OpenLLM/pull/772
* chore: running all script when installation by aarnphm in https://github.com/bentoml/OpenLLM/pull/773

**Full Changelog**: https://github.com/bentoml/OpenLLM/compare/v0.4.36...v0.4.37

0.4.36

Usage

All available models: openllm models

To start a LLM: python -m openllm start HuggingFaceH4/zephyr-7b-beta

To run OpenLLM within a container environment (requires GPUs): docker run --gpus all -it -P -v $PWD/data:$HOME/.cache/huggingface/ ghcr.io/bentoml/openllm:0.4.36 start HuggingFaceH4/zephyr-7b-beta

Find more information about this release in the [CHANGELOG.md](https://github.com/bentoml/OpenLLM/blob/main/CHANGELOG.md)

What's Changed
* feat(openai): supports echo by aarnphm in https://github.com/bentoml/OpenLLM/pull/760
* fix(openai): logprobs when echo is enabled by aarnphm in https://github.com/bentoml/OpenLLM/pull/761
* ci: pre-commit autoupdate [pre-commit.ci] by pre-commit-ci in https://github.com/bentoml/OpenLLM/pull/767
* chore(deps): bump docker/metadata-action from 5.2.0 to 5.3.0 by dependabot in https://github.com/bentoml/OpenLLM/pull/766
* chore(deps): bump actions/setup-python from 4.7.1 to 5.0.0 by dependabot in https://github.com/bentoml/OpenLLM/pull/765
* chore(deps): bump taiki-e/install-action from 2.21.26 to 2.22.0 by dependabot in https://github.com/bentoml/OpenLLM/pull/764
* chore(deps): bump aquasecurity/trivy-action from 0.14.0 to 0.16.0 by dependabot in https://github.com/bentoml/OpenLLM/pull/763
* chore(deps): bump github/codeql-action from 2.22.8 to 2.22.9 by dependabot in https://github.com/bentoml/OpenLLM/pull/762
* feat: mixtral support by aarnphm in https://github.com/bentoml/OpenLLM/pull/770

**Full Changelog**: https://github.com/bentoml/OpenLLM/compare/v0.4.35...v0.4.36

Openllm

Page 7 of 24