<img width="1548" alt="Screenshot 2024-08-06 at 8 16 44 PM" src="https://github.com/user-attachments/assets/9a53a40e-4649-4d67-8433-052a5941a5b6">
New embedding models
* [BGE-M3](https://ollama.com/library/bge-m3): a large embedding model from BAAI distinguished for its versatility in Multi-Functionality, Multi-Linguality, and Multi-Granularity.
* [BGE-Large](https://ollama.com/library/bge-large): a large embedding model trained on English text.
* [Paraphrase-Multilingual](https://ollama.com/library/paraphrase-multilingual): a multilingual embedding model trained on parallel data for 50+ languages.
New embedding API with batch support
Ollama now supports a new API endpoint `/api/embed` for embedding generation:
curl http://localhost:11434/api/embed -d '{
  "model": "all-minilm",
  "input": ["Why is the sky blue?", "Why is the grass green?"]
}'
This API endpoint supports new features:
* **Batches**: generate embeddings for several documents in one request
* **Normalized embeddings**: embeddings are now normalized, improving similarity results
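* **Truncation**: a new `truncate` parameter (default `true`) that truncates each input to fit the model's context length; if set to `false`, the request returns an error when an input exceeds the context length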
* **Metrics**: responses include `load_duration`, `total_duration` and `prompt_eval_count` metrics
See the [API documentation](https://github.com/ollama/ollama/blob/main/docs/api.md#generate-embeddings) for more details and examples.
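As a rough illustration of the batch and truncation features above, the following Python sketch builds the same request as the curl example and compares the returned vectors. The helper names (`build_embed_request`, `embed`, `cosine_similarity`) are hypothetical and not part of Ollama. The sketch assumes a server running at the default local address and a response that carries the vectors under an `embeddings` key, in the same order as the inputs.

```python
import json
import math
import urllib.request

# Default local address, as used in the curl example above.
OLLAMA_URL = "http://localhost:11434/api/embed"


def build_embed_request(model, inputs, truncate=True, url=OLLAMA_URL):
    """Build a POST request that embeds a batch of inputs in one call.

    Hypothetical helper: only the JSON fields (model, input, truncate)
    come from the release notes.
    """
    payload = {"model": model, "input": list(inputs), "truncate": truncate}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def embed(model, inputs, truncate=True):
    """Send the batch to a running Ollama server and return the parsed JSON."""
    request = build_embed_request(model, inputs, truncate)
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def cosine_similarity(a, b):
    """Cosine similarity of two vectors.

    For vectors that are already normalized to unit length, this equals
    the plain dot product.
    """
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm
```

With a server running and the `all-minilm` model pulled, calling `embed("all-minilm", ["Why is the sky blue?", "Why is the grass green?"])` should return a dictionary whose `embeddings` entry holds one vector per input, alongside the `load_duration`, `total_duration` and `prompt_eval_count` metrics. Passing two of those vectors to `cosine_similarity` gives a similarity score for the corresponding inputs.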
What's Changed
* Fixed initial slow download speeds on Windows
* Ollama now autodetects NUMA support to improve performance
* Fixed an issue where `/api/embed` would sometimes return embedding results out of order
New Contributors
* av made their first contribution in https://github.com/ollama/ollama/pull/6147
* sryu1 made their first contribution in https://github.com/ollama/ollama/pull/6151
* rick-github made their first contribution in https://github.com/ollama/ollama/pull/6154
**Full Changelog**: https://github.com/ollama/ollama/compare/v0.3.3...v0.3.4