Llama-cpp-cffi

Latest version: v0.4.40


0.2.4

Fixed:
- `sampler_init`, because `llama_sampler_init_penalties` in `llama.cpp` changed its behaviour (see the sketch below)
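
For context, a hedged sketch of the call after the upstream change: newer `llama.cpp` revisions reduced `llama_sampler_init_penalties` to the four penalty parameters shown here. `lib` stands in for the package's cffi handle, and the binding details are an assumption, not the package's actual code.

```python
# Hedged sketch, not the package's actual code: `lib` is assumed to be
# the cffi handle to llama.cpp. Newer llama.cpp revisions take only the
# four penalty parameters; older revisions also took vocab/special-token
# arguments, which is the behaviour change this fix tracks.
sampler = lib.llama_sampler_init_penalties(
    64,    # penalty_last_n: window of recent tokens to penalize
    1.1,   # penalty_repeat: repetition penalty
    0.0,   # penalty_freq: frequency penalty
    0.0,   # penalty_present: presence penalty
)
```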

0.2.3

Changed:
- `llama.cpp` revision `4f51968aca049080dc77e26603aa0681ea77fe45`
- Build process now has a global variable, `LLAMA_CPP_GIT_REF` (see the sketch below)
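
As an illustration, one way to pin the vendored `llama.cpp` commit when building a wheel locally. How the build scripts consume `LLAMA_CPP_GIT_REF` is an assumption here, and `python -m build` is just one possible build front end.

```python
# Hedged sketch: export LLAMA_CPP_GIT_REF so the build picks up a pinned
# llama.cpp commit. The exact way the package's build scripts read the
# variable is assumed, not confirmed.
import os
import subprocess

env = dict(
    os.environ,
    LLAMA_CPP_GIT_REF="4f51968aca049080dc77e26603aa0681ea77fe45",
)
subprocess.run(["python", "-m", "build", "--wheel"], env=env, check=True)
```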

Fixed:
- Issue with Phi 3.5-based models: decoding now uses `tokenizer.decode([new_token_id], clean_up_tokenization_spaces=False)`, as sketched below
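
A runnable sketch of the decode call in isolation, using a Hugging Face `transformers` tokenizer; the checkpoint name is illustrative and the surrounding generation loop is elided.

```python
# Sketch of the decode call from the fix, shown with a standalone
# Hugging Face tokenizer. The checkpoint name is illustrative.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("microsoft/Phi-3.5-mini-instruct")

new_token_id = tokenizer.encode("Hello", add_special_tokens=False)[0]

# clean_up_tokenization_spaces=False keeps the raw spacing, which matters
# when decoding one token at a time during streaming.
piece = tokenizer.decode([new_token_id], clean_up_tokenization_spaces=False)
print(repr(piece))
```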

0.2.2

Added:
- `Model.free` (see the hypothetical usage sketch below)
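
A hypothetical usage sketch only: the changelog adds `Model.free`, but the import path and constructor arguments shown here are assumptions, not the documented API.

```python
# Hypothetical sketch: the import path and constructor arguments are
# assumptions; only the existence of Model.free comes from the changelog.
from llama import Model  # hypothetical import path

model = Model("path/to/model.gguf")  # hypothetical constructor
try:
    pass  # run completions here
finally:
    model.free()  # release native llama.cpp resources deterministically
```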

Changed:
- Pinned the `llama.cpp` revision for all wheels
- `llama.cpp` revision `c27ac678dd393af0da9b8acf10266e760c8a0912`
- Disabled `llama_kv_cache_seq_cp` in `_decode_tokens`

0.2.1

Fixed:
- Batch `decode` process. Note: the encode step is still missing for encoder-decoder models.
- Thread-safe calls to the most important functions of the llama, llava, clip, and ggml APIs; a sketch of the locking pattern follows below.
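
A minimal sketch of the serialization pattern such thread-safety fixes typically use: one process-wide lock around every call into the native library. This is illustrative, not the package's actual implementation.

```python
import threading

# Illustrative pattern only, not the package's actual code: serialize
# all calls into the native library behind one process-wide lock.
_global_lock = threading.Lock()

def thread_safe(fn):
    """Wrap a native-API call so concurrent Python threads cannot
    enter the non-reentrant C code at the same time."""
    def wrapper(*args, **kwargs):
        with _global_lock:
            return fn(*args, **kwargs)
    return wrapper
```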

Removed:
- `mllama_completions`, the low-level function for Mllama-based VLMs

0.2.0

Added:
- New high-level Python API
- Low-level C API calls from `llama.h`, `llava.h`, `clip.h`, `ggml.h`
- `completions`, a high-level function for LLMs/VLMs (see the sketch after this list)
- `text_completions`, a low-level function for LLMs
- `clip_completions`, a low-level function for CLIP-based VLMs
- WIP: `mllama_completions`, a low-level function for Mllama-based VLMs
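
An illustrative call to the new high-level API. Apart from the function name `completions`, everything here (import path, parameter names, streaming behaviour) is an assumption rather than the documented signature.

```python
# Hypothetical sketch: only the name `completions` comes from the
# changelog; the import path and parameters below are assumptions.
from llama import completions  # hypothetical import path

for chunk in completions(
    model="path/to/model.gguf",              # hypothetical parameter
    prompt="Explain CFFI in one sentence.",  # hypothetical parameter
):
    print(chunk, end="", flush=True)
```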

Changed:
- All examples

Removed:
- `llama_generate` function
- `llama_cpp_cli`
- `llava_cpp_cli`
- `minicpmv_cpp_cli`

0.1.23

Added:
- Support and examples for `llava` and `minicpmv` models.
