TurnkeyML Release Notes

Latest version: v6.1.3

6.1.3

What's Changed

- Fix default model selection on devices that do not support Hybrid (danielholanda)
- Add a new `lemonade-server` CLI for starting server and checking status (danielholanda)

**Full Changelog**: https://github.com/onnx/turnkeyml/compare/v6.1.1...v6.1.3

6.1.1

What's Changed
- Upgrade Ryzen AI SW to version 1.4.0 (amd-pworfolk, jeremyfowers)
- Add DeepSeek Hybrid models to Lemonade Server (danielholanda)
- Refactor the oga-load tool and oga.py (ramkrishna2910)
- Documentation overhaul (vgodsoe)
- New Lemonade Server demos:
  - CodeGPT (vgodsoe)
  - Microsoft AI Toolkit (danielholanda)
- Fixes:
  - Make sure that OGA models use their chat template in Lemonade Server (danielholanda)
  - Lemonade API can load checkpoints from folders on disk using `lemonade.api.from_pretrained()` (amd-pworfolk)


**Full Changelog**: https://github.com/onnx/turnkeyml/compare/v6.0.3...v6.1.1

6.0.3

Breaking Changes

OpenAI-Compatible Server Model Selection

Lemonade's server now requires models to be downloaded at install time. Apps that use our installer in silent mode now have to specify which models to download. See [docs/lemonade/server_integration.md](https://github.com/onnx/turnkeyml/blob/release_603/docs/lemonade/server_integration.md) for details.

Summary of Contributions

- Add guide on how to use Continue app with Lemonade Server (jeremyfowers)
- Overhaul the lemonade help menu (jeremyfowers)
- Stop importing tkml CLI in lemonade CLI (jeremyfowers)
- Only show hybrid models when Hybrid is available (danielholanda)
- Fix the OGA seed and prevent default params from being overwritten (jeremyfowers)
- Improve Server Integration Documentation (danielholanda)
- Add exception handler for server's generate thread (jeremyfowers)
- Improve server logger in debug mode (jeremyfowers)
- Add the MMLU accuracy test command format (vgodsoe)

6.0.2

What's Changed

- Add the "echo" parameter to OpenAI completions (danielholanda)
- New dedicated report tool for LLM CSVs, as well as ASCII tables (amd-pworfolk)
- Properly raise and transmit server model load failures (jeremyfowers)
- Add documentation for Lemonade_Server_Installer.exe (jeremyfowers)
- Add telemetry to server: performance, input tokens, output tokens, and prompt tracing (danielholanda)
- Ensure that Ryzen AI Hybrid support is not installed on incompatible devices (danielholanda)
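The `echo` parameter follows the usual OpenAI completions semantics: when it is true, the returned text includes the prompt ahead of the generated completion. A minimal plain-Python illustration of that behavior (a sketch of the semantics, not the server's implementation):

```python
def format_completion(prompt: str, generated: str, echo: bool = False) -> str:
    """Mimic the OpenAI-style `echo` flag: with echo=True, the
    response text contains the prompt followed by the completion."""
    return prompt + generated if echo else generated

print(format_completion("The capital of France is", " Paris.", echo=True))
# -> The capital of France is Paris.
```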


**Full Changelog**: https://github.com/onnx/turnkeyml/compare/v6.0.1...v6.0.2

6.0.1

Summary

This update extends OpenAI-compatible endpoints and enhances server reliability.

Summary of Contributions

- Significantly improve server reliability by avoiding race conditions (danielholanda)
- Curate list of Hybrid models shared in /models server endpoint (danielholanda)
- Increase the server's max new tokens default value to 1500 (jeremyfowers)
- Avoid sudden closure on server startup (danielholanda)
- Extend OpenAI-compatible endpoints: `stop` parameter and `/completions` endpoint (danielholanda)
- Fix server test name collision, add no-op test, and update black (jeremyfowers)
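Because the endpoints follow the OpenAI API shape, a `/completions` request with `stop` can be assembled with only the standard library. The host, port, path prefix, and model id below are placeholder assumptions based on the common OpenAI layout; consult the project's server_spec.md for the routes and model ids the server actually exposes:

```python
import json
import urllib.request

# Hypothetical base URL -- check the server's startup output for the real one.
BASE_URL = "http://localhost:8000/v1"

payload = {
    "model": "my-model",           # placeholder model id
    "prompt": "List three colors:",
    "max_tokens": 50,
    "stop": ["\n\n"],              # generation halts at the first matching sequence
}

req = urllib.request.Request(
    f"{BASE_URL}/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)
# urllib.request.urlopen(req) would send the request once the server is running.
```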

**Full Changelog**: https://github.com/onnx/turnkeyml/compare/v6.0.0...v6.0.1

6.0.0

Summary

This is a major release that introduces an OpenAI-compatible server in a completely new `serve` tool, support for Quark quantization in the new `quark` tool, and many other fixes/improvements.

Breaking Changes

New OpenAI-Compatible Server

The previous `serve` `Tool` has been replaced by a new standalone serving command. This new server has OpenAI API compatibility and will add Ollama compatibility in the near future.
- Old usage: `lemonade -i CHECKPOINT oga-load --args serve`
- New usage: `lemonade serve`, then use REST APIs to control model loading, completions, etc. See https://github.com/onnx/turnkeyml/blob/main/docs/lemonade/server_spec.md to learn more.

The server can also be installed and used without writing any code by running `Lemonade_Server_Installer.exe`, which is provided as a release asset in this and all future releases.
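Once the server is running, any OpenAI-style client can talk to it over REST. A minimal sketch of assembling a chat request and parsing the reply with only the standard library; the message shape and response layout are assumptions based on the OpenAI chat-completions format, and the model id is a placeholder (query the server's `/models` endpoint for real ids):

```python
import json

def build_chat_request(model: str, user_message: str, max_tokens: int = 256) -> str:
    """Assemble an OpenAI-style chat/completions request body as JSON."""
    return json.dumps({
        "model": model,  # placeholder id; ask the /models endpoint for real ones
        "messages": [{"role": "user", "content": user_message}],
        "max_tokens": max_tokens,
    })

def extract_reply(response_body: str) -> str:
    """Pull the assistant text out of an OpenAI-style chat response."""
    return json.loads(response_body)["choices"][0]["message"]["content"]
```

POSTing the body returned by `build_chat_request` to the server's chat-completions route and passing the response through `extract_reply` yields the assistant's text.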

The server code was also moved out of `tools/chat.py` into its own file, `tools/serve.py`. We also renamed `chat.py` to `prompt.py` for clarity, since that file now only contains the prompting tool.

The LEAP name has been deprecated

In the interest of reducing naming confusion, the "LEAP API" is now simply the "high-level lemonade API".
- Old usage: `from lemonade.leap import from_pretrained`
- New usage: `from lemonade.api import from_pretrained`

Summary of Contributions

- The base checkpoint for models is retrieved from the Hugging Face API at loading time (ramkrishna2910)
- The benchmarking tools (huggingface-bench, oga-bench, and llamacpp-bench) have been refactored to reduce code duplication and improve maintainability. They now also support a list of prompts (or prompt lengths) to be benchmarked: `--prompts 128 256 512` (amd-pworfolk)
- The `avg_accuracy` stat has been renamed to `average_mmlu_accuracy` for clarity with respect to non-MMLU accuracy tests (jeremyfowers), (attn apsonawane)
- Introduce `Lemonade_Server_Installer.exe` (jeremyfowers)
- Implement an OpenAI-compatible server and remove the old `serve` tool (danielholanda)
- Rename `chat` module to `prompt` (jeremyfowers)
- Improve the lemonade getting-started documentation and remove the "LEAP" branding (jeremyfowers)
- OGA 0.6.0 is the default package for CPU, CUDA, and DML (jeremyfowers)
- Add support for Quark quantization with a new `quark-quantize` tool (iswaryaalex)
- Clean up the lemonade getting started docs and remove some deprecated tools (jeremyfowers)

New Contributors

- iswaryaalex made their first contribution in #290


**Full Changelog**: https://github.com/onnx/turnkeyml/compare/v5.1.1...v6.0.0
