llama-cpp-python

Latest version: v0.3.5

0.3.5

- feat: Update llama.cpp to ggerganov/llama.cpp@26a8406ba9198eb6fdd8329fa717555b4f77f05f
- fix(ci): Fix release by updating macOS runner image to non-deprecated version by abetlen in afedfc888462f9a6e809dc9455eb3b663764cc3f
- fix(server): add missing await statements for async exit_stack handling by gjpower in 1858

0.3.4

- fix(ci): Build wheels for macOS 13-15, CUDA 12.1-12.4 by abetlen in ca808028bd16b8327bd84128d48015a4b1304690

0.3.3

- feat: Update llama.cpp to ggerganov/llama.cpp@ce8784bdb153ff7794dde5a50b0ebfa51baa6171
- fix: chat API logprobs format by domdomegg in 1788 (see the sketch after this list)
- feat: Add support for CUDA 12.6, fix CUDA 12.5 by Smartappli in 1775
- fix: Make content not required in ChatCompletionRequestAssistantMessage by feloy in 1807
- fix: Fix pickling of Llama class by setting seed from _seed member by abetlen in 2523472c3eccb9ab9277117cc4ff705212b6888a
- fix: Fix logit-bias type hint by ddh0 in 1802
- fix(server): Avoid thread starvation on many concurrent requests by making use of asyncio to lock llama_proxy context by gjpower in 1798
- fix(server): Added missing exit_stack.close() to /v1/chat/completions by Ian321 in 1796
- fix(examples): Refactor Batching notebook to use new sampler chain API by lukestanley in 1793
- fix(docs): Update development instructions by Florents-Tselai in 1833
- fix(docs): Remove ref to llama_eval in llama_cpp.py docs by richdougherty in 1819
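
One of the 0.3.3 fixes above concerns the logprobs format returned by the OpenAI-compatible chat API. A minimal sketch of requesting token log probabilities through the high-level Python API, assuming a placeholder local GGUF model path:

```python
from llama_cpp import Llama

# Placeholder path; any chat-capable GGUF model will do.
llm = Llama(model_path="./model.gguf", n_ctx=2048)

response = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Name one prime number."}],
    max_tokens=8,
    logprobs=True,    # request per-token log probabilities
    top_logprobs=3,   # include the top-3 alternatives for each token
)

# After the 0.3.3 fix the chat endpoint reports logprobs in the
# OpenAI-style {"content": [{"token": ..., "logprob": ..., "top_logprobs": [...]}]} shape.
print(response["choices"][0]["logprobs"])
```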

0.3.2

- feat: Update llama.cpp to ggerganov/llama.cpp@74d73dc85cc2057446bf63cc37ff649ae7cebd80

0.3.1

- feat: Update llama.cpp to ggerganov/llama.cpp@c919d5db39c8a7fcb64737f008e4b105ee0acd20
- feat: Expose libggml in internal APIs by abetlen in 1761
- fix: Fix speculative decoding by abetlen in 9992c5084a3df2f533e265d10f81d4269b97a1e6 and e975dabf74b3ad85689c9a07719cbb181313139b (see the sketch after this list)
- misc: Rename all_text to remaining_text by xu-song in 1658
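
The speculative-decoding fix above relates to the draft_model argument of the high-level API. A minimal sketch, assuming the built-in prompt-lookup decoder as the draft model (the model path is a placeholder):

```python
from llama_cpp import Llama
from llama_cpp.llama_speculative import LlamaPromptLookupDecoding

# Placeholder path; any GGUF model works. The draft model here is the
# built-in prompt-lookup decoder rather than a second GGUF file.
llm = Llama(
    model_path="./model.gguf",
    draft_model=LlamaPromptLookupDecoding(num_pred_tokens=10),
)

out = llm("Q: What is the capital of France? A:", max_tokens=32, stop=["\n"])
print(out["choices"][0]["text"])
```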

0.3.0

- feat: Update llama.cpp to ggerganov/llama.cpp@ea9c32be71b91b42ecc538bd902e93cbb5fb36cb
- feat: Enable detokenizing special tokens with special=True by benniekiss in 1596
- feat(ci): Speed up CI workflows using uv, add support for CUDA 12.5 wheels by Smartappli in e529940f45d42ed8aa31334123b8d66bc67b0e78
- feat: Add loading sharded GGUF files from HuggingFace with Llama.from_pretrained(additional_files=[...]) by Gnurro in 84c092063e8f222758dd3d60bdb2d1d342ac292e (see the sketch after this list)
- feat: Add option to configure n_ubatch by abetlen in 6c44a3f36b089239cb6396bb408116aad262c702
- feat: Update sampling API for llama.cpp. Sampling now uses sampler chain by abetlen in f8fcb3ea3424bcfba3a5437626a994771a02324b
- fix: Don't store scores internally unless logits_all=True. Reduces memory requirements for large context by abetlen in 29afcfdff5e75d7df4c13bad0122c98661d251ab
- fix: Fix memory allocation of ndarray by xu-song in 1704
- fix: Use system message in og qwen format by abetlen in 98eb092d3c6e7c142c4ba2faaca6c091718abbb3
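
Several 0.3.0 entries above are user-facing API additions (sharded downloads via additional_files, the n_ubatch option, detokenizing with special=True, and the logits_all memory behaviour). A combined sketch, where the repo id and shard patterns are placeholders:

```python
from llama_cpp import Llama

# Download a sharded GGUF model from the Hugging Face Hub: `filename`
# matches the first shard, `additional_files` (new in 0.3.0) fetches the rest.
llm = Llama.from_pretrained(
    repo_id="someorg/some-sharded-model-GGUF",       # placeholder repo
    filename="*-00001-of-00003.gguf",                # placeholder shard pattern
    additional_files=[
        "*-00002-of-00003.gguf",                     # remaining shards
        "*-00003-of-00003.gguf",
    ],
    n_ubatch=512,      # physical batch size, configurable since 0.3.0
    logits_all=False,  # scores are only stored when logits_all=True
)

# special=True keeps special tokens (e.g. BOS/EOS) when detokenizing.
tokens = llm.tokenize(b"Hello", special=True)
print(llm.detokenize(tokens, special=True))
```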
