### Improvements 🚀
* `nexa eval` command now supports evaluating memory usage, latency, and energy consumption ([#166](https://github.com/NexaAI/nexa-sdk/pull/166))
### Upgrade Guide 📝

To upgrade the Nexa SDK, use the command for your system:

#### CPU

```bash
pip install -U nexaai --prefer-binary --index-url https://nexaai.github.io/nexa-sdk/whl/cpu --extra-index-url https://pypi.org/simple --no-cache-dir
```
#### GPU (Metal)

For the GPU version supporting **Metal (macOS)**:

```bash
CMAKE_ARGS="-DGGML_METAL=ON -DSD_METAL=ON" pip install -U nexaai --prefer-binary --index-url https://nexaai.github.io/nexa-sdk/whl/metal --extra-index-url https://pypi.org/simple --no-cache-dir
```
#### GPU (CUDA)

For **Linux**:

```bash
CMAKE_ARGS="-DGGML_CUDA=ON -DSD_CUBLAS=ON" pip install -U nexaai --prefer-binary --index-url https://nexaai.github.io/nexa-sdk/whl/cu124 --extra-index-url https://pypi.org/simple --no-cache-dir
```

For **Windows PowerShell**:

```powershell
$env:CMAKE_ARGS="-DGGML_CUDA=ON -DSD_CUBLAS=ON"; pip install -U nexaai --prefer-binary --index-url https://nexaai.github.io/nexa-sdk/whl/cu124 --extra-index-url https://pypi.org/simple --no-cache-dir
```

For **Windows Command Prompt**:

```bat
set CMAKE_ARGS="-DGGML_CUDA=ON -DSD_CUBLAS=ON" & pip install -U nexaai --prefer-binary --index-url https://nexaai.github.io/nexa-sdk/whl/cu124 --extra-index-url https://pypi.org/simple --no-cache-dir
```

For **Windows Git Bash**:

```bash
CMAKE_ARGS="-DGGML_CUDA=ON -DSD_CUBLAS=ON" pip install -U nexaai --prefer-binary --index-url https://nexaai.github.io/nexa-sdk/whl/cu124 --extra-index-url https://pypi.org/simple --no-cache-dir
```
#### GPU (ROCm)

For **Linux**:

```bash
CMAKE_ARGS="-DGGML_HIPBLAS=on" pip install -U nexaai --prefer-binary --index-url https://nexaai.github.io/nexa-sdk/whl/rocm621 --extra-index-url https://pypi.org/simple --no-cache-dir
```
For detailed installation instructions, please refer to the [Installation section](https://github.com/NexaAI/nexa-sdk?tab=readme-ov-file#installation) in the README.
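Once the upgrade finishes, pip's own metadata query (nothing Nexa-specific) shows which version ended up installed; the `|| echo` fallback below is only there so the command does not error out when the package is absent:

```shell
# Show metadata for the installed nexaai package, or a notice if it is missing.
INFO=$(pip show nexaai 2>/dev/null || echo "nexaai is not installed")
echo "$INFO"
```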
[Full Changelog - v0.0.8.7...v0.0.8.8](https://github.com/NexaAI/nexa-sdk/compare/v0.0.8.7...v0.0.8.8)
## v0.0.8.7-rocm621
### What's New ✨

* Support for running models from a user's local path ([#151](https://github.com/NexaAI/nexa-sdk/pull/151))
  * See details in the [CLI doc](https://github.com/NexaAI/nexa-sdk/blob/main/CLI.md#run-a-model) and the [Server doc](https://github.com/NexaAI/nexa-sdk/blob/main/SERVER.md#start-local-server)
  * Run an NLP model from a local path: `nexa run ../models/gemma-1.1-2b-instruct-q4_0.gguf -lp -mt NLP`
  * Start a multimodal model server from a local directory: `nexa server ../models/llava-v1.6-vicuna-7b/ -lp -mt MULTIMODAL`
* Support for embedding models ([#159](https://github.com/NexaAI/nexa-sdk/pull/159))
  * See details in [**nexa embed**](https://github.com/NexaAI/nexa-sdk/blob/main/CLI.md#generate-embeddings)
  * Quick example: `nexa embed nomic "Advancing on-device AI, together." >> generated_embeddings.txt` generates embeddings for the quoted text with the Nomic model and appends the result to `generated_embeddings.txt`
* VLM support in `/v1/chat/completions` ([#154](https://github.com/NexaAI/nexa-sdk/pull/154))
  * See details in the [Server doc](https://github.com/NexaAI/nexa-sdk/blob/main/SERVER.md#2-chat-completions-v1chatcompletions)
* Support for running model evaluation on your device ([#150](https://github.com/NexaAI/nexa-sdk/pull/150))
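With VLM support added to `/v1/chat/completions`, a locally started `nexa server` can be queried with an OpenAI-style request. The host/port (`localhost:8000`) and the payload fields below are assumptions for illustration, not confirmed by these notes; see the Server doc for the authoritative request schema:

```shell
# Hypothetical OpenAI-style request body; the field names are assumptions.
BODY='{"messages": [{"role": "user", "content": "Describe this image in one sentence."}]}'

# Send it to an assumed local server address. This prints the JSON response if
# a server is running; the `|| true` keeps the snippet from aborting otherwise.
curl -s -X POST http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d "$BODY" || true
```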
### Improvements 🚀

* Customizable maximum context window (`--nctx`) for NLP and VLM models ([#155](https://github.com/NexaAI/nexa-sdk/pull/155) and [#158](https://github.com/NexaAI/nexa-sdk/pull/158))
* CV models are now supported with the `-hf` flag ([#151](https://github.com/NexaAI/nexa-sdk/pull/151))
  * Pull and run a CV model from Hugging Face: `nexa run -hf Steward/lcm-dreamshaper-v7-gguf -mt COMPUTER_VISION`
### Fixes 🐞

* Fixed streaming issues with `/v1/chat/completions` ([#152](https://github.com/NexaAI/nexa-sdk/pull/152))
* Resolved download problems on macOS and Windows ([#146](https://github.com/NexaAI/nexa-sdk/pull/146))
### Upgrade Guide 📝

To upgrade the Nexa SDK, use the command for your system:

#### CPU

```bash
pip install -U nexaai --prefer-binary --index-url https://nexaai.github.io/nexa-sdk/whl/cpu --extra-index-url https://pypi.org/simple --no-cache-dir
```

#### GPU (Metal)

For the GPU version supporting **Metal (macOS)**:

```bash
CMAKE_ARGS="-DGGML_METAL=ON -DSD_METAL=ON" pip install -U nexaai --prefer-binary --index-url https://nexaai.github.io/nexa-sdk/whl/metal --extra-index-url https://pypi.org/simple --no-cache-dir
```

#### GPU (CUDA)

For **Linux**:

```bash
CMAKE_ARGS="-DGGML_CUDA=ON -DSD_CUBLAS=ON" pip install -U nexaai --prefer-binary --index-url https://nexaai.github.io/nexa-sdk/whl/cu124 --extra-index-url https://pypi.org/simple --no-cache-dir
```

For **Windows PowerShell**:

```powershell
$env:CMAKE_ARGS="-DGGML_CUDA=ON -DSD_CUBLAS=ON"; pip install -U nexaai --prefer-binary --index-url https://nexaai.github.io/nexa-sdk/whl/cu124 --extra-index-url https://pypi.org/simple --no-cache-dir
```

For **Windows Command Prompt**:

```bat
set CMAKE_ARGS="-DGGML_CUDA=ON -DSD_CUBLAS=ON" & pip install -U nexaai --prefer-binary --index-url https://nexaai.github.io/nexa-sdk/whl/cu124 --extra-index-url https://pypi.org/simple --no-cache-dir
```

For **Windows Git Bash**:

```bash
CMAKE_ARGS="-DGGML_CUDA=ON -DSD_CUBLAS=ON" pip install -U nexaai --prefer-binary --index-url https://nexaai.github.io/nexa-sdk/whl/cu124 --extra-index-url https://pypi.org/simple --no-cache-dir
```

#### GPU (ROCm)

For **Linux**:

```bash
CMAKE_ARGS="-DGGML_HIPBLAS=on" pip install -U nexaai --prefer-binary --index-url https://nexaai.github.io/nexa-sdk/whl/rocm621 --extra-index-url https://pypi.org/simple --no-cache-dir
```
For detailed installation instructions, please refer to the [Installation section](https://github.com/NexaAI/nexa-sdk?tab=readme-ov-file#installation) in the README.
[Full Changelog - v0.0.8.6.1...v0.0.8.7](https://github.com/NexaAI/nexa-sdk/compare/v0.0.8.6.1...v0.0.8.7)
## v0.0.8.7-metal

[Full Changelog - v0.0.8.6.1...v0.0.8.7](https://github.com/NexaAI/nexa-sdk/compare/v0.0.8.6.1...v0.0.8.7)
## v0.0.8.7-cu124

[Full Changelog - v0.0.8.6.1...v0.0.8.7](https://github.com/NexaAI/nexa-sdk/compare/v0.0.8.6.1...v0.0.8.7)