llama-cpp-cffi

0.3.2

Added:
- `server` support for OpenAI extra fields: `grammar`, `json_schema`, `chat_template`
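
A minimal sketch of how such extra fields can ride along in an otherwise standard OpenAI-style request, using the `openai` client's `extra_body` pass-through; the base URL, port, API key, and model id below are placeholders, not part of this release:

```python
# Sketch only: host, port, api_key, and model id are assumed placeholders.
from openai import OpenAI

client = OpenAI(base_url="http://127.0.0.1:8000/v1", api_key="unused")

response = client.chat.completions.create(
    model="model-id",  # whatever id the server exposes
    messages=[{"role": "user", "content": "Reply with a JSON object holding a 'city' key."}],
    # Non-standard fields are merged into the request body and forwarded to the server.
    extra_body={
        "json_schema": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        }
    },
)
print(response.choices[0].message.content)
```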

Changed:
- `llama.cpp` revision `0827b2c1da299805288abbd556d869318f2b121e`

0.3.1

Added:
- llama-cpp-cffi server: support for dynamic model load/unload, i.e. hot-swapping of models on demand
- llama-cpp-cffi server: compatible with llama.cpp CLI options
- llama-cpp-cffi server: limited OpenAI API compatibility for `/v1/chat/completions` with text and vision models
- Support for `CompletionsOptions.messages` for VLM prompts, limited to a single message whose `content` contains one `text`/`image_url` pair (see the sketch below)
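
A sketch of that single-message VLM prompt shape, sent as a raw `/v1/chat/completions` request; the host, model id, and image URL are placeholders:

```python
# Sketch only: host, model id, and image URL are assumed placeholders.
import requests

payload = {
    "model": "qwen2-vl-2b",  # assumed model id
    "messages": [
        {
            "role": "user",
            # Exactly one message, with a single text/image_url pair in `content`.
            "content": [
                {"type": "text", "text": "What is in this image?"},
                {"type": "image_url", "image_url": {"url": "https://example.com/photo.png"}},
            ],
        }
    ],
}

r = requests.post("http://127.0.0.1:8000/v1/chat/completions", json=payload, timeout=120)
r.raise_for_status()
print(r.json()["choices"][0]["message"]["content"])
```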

Changed:
- `llama.cpp` revision `0827b2c1da299805288abbd556d869318f2b121e`

0.3.0

Added:
- Qwen2-VL 2B / 7B vision models support
- Work-in-progress llama-cpp-cffi server, compatible with llama.cpp CLI options rather than the OpenAI API

Changed:
- `llama.cpp` revision `5896c65232c7dc87d78426956b16f63fbf58dcf6`
- Refactored the `Options` class into two separate classes: `ModelOptions` and `CompletionsOptions`
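
A hypothetical sketch of that split; the import path and field names below are assumptions inferred from the class names in this entry, not the library's documented API:

```python
# Hypothetical: import path and field names are assumptions; only the
# ModelOptions/CompletionsOptions split itself comes from this changelog.
from llama import ModelOptions, CompletionsOptions  # assumed import path

# Load-time settings (model file, context size, ...) live on ModelOptions.
model_options = ModelOptions(model="path/to/model.gguf")

# Per-request settings (prompt/messages, sampling, ...) live on CompletionsOptions.
completions_options = CompletionsOptions(prompt="Hello!", temp=0.7)
```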

Fixed:
- LLaVA (moondream2, nanoLLaVA-1.5, llava-v1.6-mistral-7b) vision models support
- MiniCPM-V 2.5 / 2.6 vision models support

Removed:
- The ambiguous `Options` class, superseded by `ModelOptions`/`CompletionsOptions`

0.2.7

Changed:
- `format_messages` now accepts an optional `options` argument (see the sketch after this list)
- `llama.cpp` revision `081b29bd2a3d91e7772e3910ce223dd63b8d7d26`
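
A hypothetical sketch of the new optional argument; the import path and signature are assumptions, only the optional `options` parameter itself comes from this entry:

```python
# Hypothetical: import path and signature are assumptions; the changelog
# only states that `format_messages` gained an optional `options` argument.
from llama.formatter import format_messages  # assumed import path

messages = [{"role": "user", "content": "Hi!"}]

# Omitting options keeps the prior behavior; an options object may now be supplied.
prompt = format_messages(messages)
prompt = format_messages(messages, options=None)  # e.g. an options instance instead of None
```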

0.2.6

Changed:
- `llama.cpp` revision `5437d4aaf5132c879acda0bb67f2f8f71da4c9fe`

0.2.5

Fixed:
- Replaced `tokenizer.decode(new_token_id)` with a custom `_common_token_to_piece(context, new_token_id, True)`
