## What's Changed
* [Feat-Proxy] send prometheus fallbacks stats to slack by ishaan-jaff in https://github.com/BerriAI/litellm/pull/5154
* [Feat-Security] Send Slack Alert when CRUD ops done on Virtual Keys, Teams, Internal Users by ishaan-jaff in https://github.com/BerriAI/litellm/pull/5166
* [Proxy docstring] fix curl on docstring on /team endpoints by ishaan-jaff in https://github.com/BerriAI/litellm/pull/5167
* [Feat Proxy] Send slack alert on CRUD endpoints for Internal Users by ishaan-jaff in https://github.com/BerriAI/litellm/pull/5168
* [Feat] Log GCS logs in folders based on dd-m-yyyy by ishaan-jaff in https://github.com/BerriAI/litellm/pull/5171
* [Feat] GCS Bucket logging - log api key metadata + response cost by ishaan-jaff in https://github.com/BerriAI/litellm/pull/5169 (see the config sketch below)
**Full Changelog**: https://github.com/BerriAI/litellm/compare/v1.43.6.dev1...v1.43.7.dev1
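To try the Slack alerting and GCS bucket logging changes above, the proxy needs a config file and a few environment variables. The sketch below is a minimal, hypothetical setup based on the LiteLLM proxy docs; the webhook URL, bucket name, and service-account path are placeholders, so verify the exact keys against the alerting and GCS docs for your version.
```
# Hypothetical setup for Slack alerts + GCS logging (all values are placeholders)
export SLACK_WEBHOOK_URL="https://hooks.slack.com/services/..."   # Slack incoming webhook
export GCS_BUCKET_NAME="my-litellm-logs"                          # target GCS bucket
export GCS_PATH_SERVICE_ACCOUNT="/path/to/service_account.json"   # GCP credentials

cat > proxy_config.yaml <<'EOF'
litellm_settings:
  success_callback: ["gcs_bucket"]   # log request/response data to GCS
general_settings:
  alerting: ["slack"]                # send proxy alerts to Slack
EOF
```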
## Docker Run LiteLLM Proxy

```
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.43.7.dev1
```
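Once the container is up, you can smoke-test the proxy with a chat completion request. This is a minimal sketch: the model name and the `sk-1234` key are placeholders and must match a model and master key actually configured on your proxy.
```
curl http://localhost:4000/chat/completions \
  -H 'Content-Type: application/json' \
  -H 'Authorization: Bearer sk-1234' \
  -d '{
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "ping"}]
  }'
```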
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
## Load Test LiteLLM Proxy Results
| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| /chat/completions | Passed ✅ | 83 | 99.23 | 6.53 | 0.0 | 1954 | 0 | 70.33 | 1331.33 |
| Aggregated | Passed ✅ | 83 | 99.23 | 6.53 | 0.0 | 1954 | 0 | 70.33 | 1331.33 |
# v1.43.7-stable

## 📈 New Prometheus Metrics
doc: https://docs.litellm.ai/docs/proxy/prometheus#llm-api--provider-metrics
Release: https://github.com/BerriAI/litellm/releases/tag/v1.43.7-stable
* `llm_deployment_latency_per_output_token` -> track latency per output token
* `llm_deployment_failure_responses` -> calculate the error rate per deployment (divide this by `llm_deployment_total_requests`; see the example query below)
* `llm_deployment_successful_fallbacks` -> number of successful fallback requests from a primary model to a fallback model
* `llm_deployment_failed_fallbacks` -> number of failed fallback requests from a primary model to a fallback model
![Group 5949](https://github.com/user-attachments/assets/9a213d46-ecf3-423c-a58e-b1d598cb892d)
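As a sketch of the error-rate calculation described above, the failure counter can be divided by the total-request counter in PromQL. The query below assumes a Prometheus server scraping the proxy at `localhost:9090` and that both metrics are counters; the label names to group by for a per-deployment breakdown depend on how the metrics are exported, so treat this as illustrative.
```
# Hypothetical query: overall error rate over the last 5 minutes
curl -G 'http://localhost:9090/api/v1/query' \
  --data-urlencode 'query=sum(rate(llm_deployment_failure_responses[5m])) / sum(rate(llm_deployment_total_requests[5m]))'
```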
## What's Changed
* [Refactor+Testing] Refactor Prometheus metrics to use CustomLogger class + add testing for prometheus by ishaan-jaff in https://github.com/BerriAI/litellm/pull/5149
* fix(main.py): safely fail stream_chunk_builder calls by krrishdholakia in https://github.com/BerriAI/litellm/pull/5151
* Feat - track response latency on prometheus by ishaan-jaff in https://github.com/BerriAI/litellm/pull/5152
* Feat - Proxy track fallback metrics on prometheus by ishaan-jaff in https://github.com/BerriAI/litellm/pull/5153
**Full Changelog**: https://github.com/BerriAI/litellm/compare/v1.43.6...v1.43.7-stable
## Docker Run LiteLLM Proxy

```
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.43.7-stable
```
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
## Load Test LiteLLM Proxy Results
| Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| /chat/completions | Passed ✅ | 130 | 158.25 | 6.32 | 0.0 | 1892 | 0 | 111.10 | 2661.26 |
| Aggregated | Passed ✅ | 130 | 158.25 | 6.32 | 0.0 | 1892 | 0 | 111.10 | 2661.26 |