⚡️ LiteLLM Proxy: 100+ LLMs, track number of requests and avg latency per model deployment
<img width="1250" alt="model_latency" src="https://github.com/BerriAI/litellm/assets/29436595/3170b3a9-7c9f-4e27-9624-01c314d0ee32">
🛠️ High Traffic Fixes - fix for hitting the DB connection limit when model fallbacks occur
🚀 High Traffic Fixes - /embedding - fixed "Dictionary changed size during iteration" bug
⚡️ High Traffic Fixes - --detailed_debug is now switched off in the default Dockerfile; users must pass the --detailed_debug flag explicitly to opt in to those logs. (This led to a 5% decrease in avg latency across 1K concurrent calls)
📖 Docs - fixes for /user/new in the LiteLLM Proxy Swagger docs, showing how to set tpm/rpm limits per user (see the sketch below) https://docs.litellm.ai/docs/proxy/virtual_keys#usernew
⭐️ Admin UI - separate latency and request-count graphs per model deployment https://docs.litellm.ai/docs/proxy/ui
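As a quick reference for the /user/new docs fix above, here is a minimal sketch of setting per-user tpm/rpm limits through the proxy API. The base URL, master key, user id, and limit values are placeholders, and the field names follow the linked docs, so verify them against your proxy version:

```python
import requests

PROXY_BASE_URL = "http://0.0.0.0:4000"  # placeholder: where your LiteLLM Proxy is running
MASTER_KEY = "sk-1234"                  # placeholder: your proxy master key

# Create a user with per-user TPM/RPM limits via /user/new.
resp = requests.post(
    f"{PROXY_BASE_URL}/user/new",
    headers={"Authorization": f"Bearer {MASTER_KEY}"},
    json={
        "user_id": "user@example.com",  # placeholder user id
        "tpm_limit": 100_000,           # tokens per minute allowed for this user
        "rpm_limit": 1_000,             # requests per minute allowed for this user
    },
)
resp.raise_for_status()
print(resp.json())  # response includes the generated key and the applied limits
```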
What's Changed
* (Fix) High Traffic Fix - handle litellm circular ref error by ishaan-jaff in https://github.com/BerriAI/litellm/pull/2363
* (feat) admin UI show model avg latency, num requests by ishaan-jaff in https://github.com/BerriAI/litellm/pull/2367
* (fix) admin UI swagger by ishaan-jaff in https://github.com/BerriAI/litellm/pull/2371
* [FIX] 🐛 embedding - "Dictionary changed size during iteration" Debug Log by ishaan-jaff in https://github.com/BerriAI/litellm/pull/2378
* [Fix] Switch off detailed_debug in default docker by ishaan-jaff in https://github.com/BerriAI/litellm/pull/2375
* feat(proxy_server.py): retry if virtual key is rate limited by krrishdholakia in https://github.com/BerriAI/litellm/pull/2347
* fix(caching.py): add s3 path as a top-level param by krrishdholakia in https://github.com/BerriAI/litellm/pull/2379
**Full Changelog**: https://github.com/BerriAI/litellm/compare/v1.29.4...v1.29.7