![vision](https://github.com/ollama/ollama/assets/251292/cb513ae7-a9c2-4a2b-891b-fd79bf997f86)
## New vision models
The [LLaVA](https://ollama.ai/library/llava) model family on Ollama has been updated to [version 1.6](https://llava-vl.github.io/blog/2024-01-30-llava-1-6/), and now includes a new `34b` version:
- `ollama run llava` A new 7B LLaVA model based on Mistral
- `ollama run llava:13b` 13B LLaVA model
- `ollama run llava:34b` 34B LLaVA model – one of the most powerful open-source vision models available
These models share several improvements:
- **More permissive licenses:** LLaVA 1.6 models are distributed via the Apache 2.0 license or the LLaMA 2 Community License.
- **Higher image resolution:** support for up to 4x more pixels, allowing the model to grasp more details.
- **Improved text recognition and reasoning capabilities:** these models are trained on additional document, chart and diagram data sets.
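For example, one of these models can be asked about a local image by including the image's path in the prompt (the file path below is a placeholder; substitute a real image on your machine):

```shell
# Ask LLaVA to describe a local image
ollama run llava "What is in this image? ./photo.jpg"
```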
## `keep_alive` parameter: control how long models stay loaded
When making API requests, the new `keep_alive` parameter can be used to control how long a model stays loaded in memory:
```shell
curl http://localhost:11434/api/generate -d '{
  "model": "mistral",
  "prompt": "Why is the sky blue?",
  "keep_alive": "30s"
}'
```
* If set to a positive duration (e.g. `20m`, `1h`, or `30` for 30 seconds), the model will stay loaded for the provided duration
* If set to a negative duration (e.g. `-1`), the model will stay loaded indefinitely
* If set to `0`, the model will be unloaded immediately once finished
* If not set, the model will stay loaded for 5 minutes by default
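Two illustrative requests against a locally running Ollama server, assuming the `mistral` model has already been pulled:

```shell
# Keep the model loaded in memory indefinitely after this request
curl http://localhost:11434/api/generate -d '{
  "model": "mistral",
  "prompt": "Why is the sky blue?",
  "keep_alive": -1
}'

# Unload the model immediately once the response is finished
curl http://localhost:11434/api/generate -d '{
  "model": "mistral",
  "prompt": "Why is the sky blue?",
  "keep_alive": 0
}'
```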
## Support for more Nvidia GPUs
* GeForce GTX `TITAN X` `980 Ti` `980` `970` `960` `950` `750 Ti` `750`
* GeForce GTX `980M` `970M` `965M` `960M` `950M` `860M` `850M`
* GeForce `940M` `930M` `910M` `840M` `830M`
* Quadro `M6000` `M5500M` `M5000` `M2200` `M1200` `M620` `M520`
* Tesla `M60` `M40`
* NVS `810`
## What's Changed
* New `keep_alive` API parameter to control how long models stay loaded
* Image paths can now be provided to `ollama run` when running multimodal models
* Fixed issue where downloading models via `ollama pull` would slow down to 99%
* Fixed error when running Ollama with Nvidia GPUs and CPUs without AVX instructions
* Support for additional Nvidia GPUs (compute capability 5)
* Fixed issue where system prompt would be repeated in subsequent messages
* `ollama serve` will now print prompt when `OLLAMA_DEBUG=1` is set
* Fixed issue where exceeding context size would cause erroneous responses in `ollama run` and the `/api/chat` API
* `ollama run` will now allow sending messages without images to multimodal models
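For example, prompt logging can be enabled by setting the debug variable when starting the server:

```shell
# Start the server with prompt logging enabled
OLLAMA_DEBUG=1 ollama serve
```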
## New Contributors
* jaglinux made their first contribution in https://github.com/ollama/ollama/pull/2224
* textspur made their first contribution in https://github.com/ollama/ollama/pull/2252
* rjmacarthy made their first contribution in https://github.com/ollama/ollama/pull/1950
* hugo53 made their first contribution in https://github.com/ollama/ollama/pull/1957
* RussellCanfield made their first contribution in https://github.com/ollama/ollama/pull/2313
**Full Changelog**: https://github.com/ollama/ollama/compare/v0.1.22...v0.1.23