## What's Changed
* anthropic prompt caching cost tracking by krrishdholakia in https://github.com/BerriAI/litellm/pull/5453
* [Feat-Proxy] track spend logs for vertex pass through endpoints by ishaan-jaff in https://github.com/BerriAI/litellm/pull/5457
* [Feat] New Provider - Add Cerebras AI API by ishaan-jaff in https://github.com/BerriAI/litellm/pull/5461
* [Feat - Prometheus] - Track error_code, model metric by ishaan-jaff in https://github.com/BerriAI/litellm/pull/5463
* Minor LiteLLM Fixes and Improvements by krrishdholakia in https://github.com/BerriAI/litellm/pull/5456
**Full Changelog**: https://github.com/BerriAI/litellm/compare/v1.44.13...v1.44.14
## Docker Run LiteLLM Proxy

```shell
docker run \
  -e STORE_MODEL_IN_DB=True \
  -p 4000:4000 \
  ghcr.io/berriai/litellm:main-v1.44.14
```
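Once the container is up, the proxy exposes an OpenAI-compatible API on port 4000. A minimal smoke test, assuming a model is already configured on the proxy; `sk-1234` and `gpt-3.5-turbo` are placeholders, not values from this release:

```shell
# Placeholder values: replace sk-1234 with your proxy master key and
# gpt-3.5-turbo with a model name from your own proxy config.
curl http://localhost:4000/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-1234" \
  -d '{
    "model": "gpt-3.5-turbo",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
```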
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
## Load Test LiteLLM Proxy Results
| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| /chat/completions | Passed ✅ | 140.0 | 174.82 | 6.33 | 0.0 | 1895 | 0 | 108.72 | 5381.37 |
| Aggregated | Passed ✅ | 140.0 | 174.82 | 6.33 | 0.0 | 1895 | 0 | 108.72 | 5381.37 |
# v1.44.13-stable

## What's Changed
* Clarify support-related Exceptions in utils.py by jhtobigs in https://github.com/BerriAI/litellm/pull/5447
* Merge: fix `TypeError: 'CompletionUsage' object is not subscriptable` (#5441) by krrishdholakia in https://github.com/BerriAI/litellm/pull/5448
* [Fix-Proxy] Allow running /health checks on vertex multimodal embedding requests by ishaan-jaff in https://github.com/BerriAI/litellm/pull/5449
* [Fix] Use correct Vertex AI AI21 Cost tracking by ishaan-jaff in https://github.com/BerriAI/litellm/pull/5439
* (models): Add gemini-1.5-pro-exp-0827 pricing. by Manouchehri in https://github.com/BerriAI/litellm/pull/5419
* [Fix-Proxy] Vertex SDK pass through - pass all relevant vertex creds by ishaan-jaff in https://github.com/BerriAI/litellm/pull/5451
* [Fix-Proxy] - Allow Qdrant API Key to be optional by ishaan-jaff in https://github.com/BerriAI/litellm/pull/5452
* [Feat-Proxy] Load config.yaml from GCS Bucket by ishaan-jaff in https://github.com/BerriAI/litellm/pull/5450
* [Refactor] Refactor vertex text to speech to be in vertex directory by ishaan-jaff in https://github.com/BerriAI/litellm/pull/5454
* [Fix-Proxy-Auth] allow pass through routes as LLM API routes by ishaan-jaff in https://github.com/BerriAI/litellm/pull/5458
* [Feat] Vertex embeddings - map `input_type` to `text_type` by ishaan-jaff in https://github.com/BerriAI/litellm/pull/5455
## New Contributors
* jhtobigs made their first contribution in https://github.com/BerriAI/litellm/pull/5447
**Full Changelog**: https://github.com/BerriAI/litellm/compare/v1.44.12...v1.44.13-stable
## Docker Run LiteLLM Proxy

```shell
docker run \
  -e STORE_MODEL_IN_DB=True \
  -p 4000:4000 \
  ghcr.io/berriai/litellm:main-v1.44.13-stable
```
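A quick way to verify the deployment (and the `/health` behavior touched by #5449) is the proxy's health endpoint. A minimal sketch, again with `sk-1234` standing in for your master key:

```shell
# Placeholder key: substitute your own proxy master key.
curl http://localhost:4000/health \
  -H "Authorization: Bearer sk-1234"
```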
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
## Load Test LiteLLM Proxy Results
| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| /chat/completions | Passed ✅ | 100.0 | 123.22 | 6.45 | 0.0 | 1930 | 0 | 85.37 | 2112.86 |
| Aggregated | Passed ✅ | 100.0 | 123.22 | 6.45 | 0.0 | 1930 | 0 | 85.37 | 2112.86 |