What's Changed
- Use S3 buckets for caching `/chat/completions` and embedding responses. Docs: proxy caching (https://docs.litellm.ai/docs/proxy/caching), caching with `litellm.completion` (https://docs.litellm.ai/docs/caching/redis_cache)
- `litellm.completion_cost()`: support for cost calculation for embedding responses (Azure embedding and `text-embedding-ada-002-v2`) @jeromeroussin
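```python
import asyncio

import litellm


async def _test():
    # Request embeddings from an Azure embedding deployment
    response = await litellm.aembedding(
        model="azure/azure-embedding-model",
        input=["good morning from litellm", "gm"],
    )
    print(response)
    return response


response = asyncio.run(_test())
# Calculate the cost of the embedding response
cost = litellm.completion_cost(completion_response=response)
```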
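For intuition, the cost of an embedding call is roughly the input token count multiplied by a per-token price, since embeddings bill input tokens only. The sketch below illustrates that arithmetic and nothing more: `embedding_cost` and `ASSUMED_INPUT_COST_PER_TOKEN` are hypothetical names, not litellm internals, and the price shown ($0.0001 per 1K tokens for `text-embedding-ada-002`) is an assumption that may be out of date.

```python
# Hypothetical price table in USD per input token.
# The value is an assumption for illustration, not litellm's pricing data.
ASSUMED_INPUT_COST_PER_TOKEN = {
    "text-embedding-ada-002": 0.0001 / 1000,
}


def embedding_cost(model: str, prompt_tokens: int) -> float:
    """Estimate the USD cost of an embedding request.

    Raises KeyError for an unknown model rather than silently returning 0.
    """
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    return prompt_tokens * ASSUMED_INPUT_COST_PER_TOKEN[model]


if __name__ == "__main__":
    # Print the estimated cost of an 8-token request
    print(f"{embedding_cost('text-embedding-ada-002', 8):.10f}")
```

Raising on an unknown model, instead of returning a default, keeps a missing price from being mistaken for a free request.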
- `litellm.completion_cost()` now raises exceptions instead of swallowing them @jeromeroussin
- Improved token counting for Azure streaming responses @langgg0511 https://github.com/BerriAI/litellm/issues/1304
- Set `os.environ/` variables for the litellm proxy cache @Manouchehri
```yaml
model_list:
  - model_name: gpt-3.5-turbo
    litellm_params:
      model: gpt-3.5-turbo
  - model_name: text-embedding-ada-002
    litellm_params:
      model: text-embedding-ada-002

litellm_settings:
  set_verbose: True
  cache: True          # set cache responses to True
  cache_params:        # set cache params for s3
    type: s3
    s3_bucket_name: cache-bucket-litellm   # AWS Bucket Name for S3
    s3_region_name: us-west-2              # AWS Region Name for S3
    s3_aws_access_key_id: os.environ/AWS_ACCESS_KEY_ID  # use os.environ/<variable name> to pass environment variables. This is AWS Access Key ID for S3
    s3_aws_secret_access_key: os.environ/AWS_SECRET_ACCESS_KEY  # AWS Secret Access Key for S3
```
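The `os.environ/<variable name>` convention can be pictured as a small substitution pass over the loaded config. The following is a minimal sketch of that idea, not litellm's actual implementation: `resolve_env_refs` and `ENV_PREFIX` are hypothetical names, and raising on a missing variable is an assumed design choice.

```python
import os

ENV_PREFIX = "os.environ/"  # marker that flags a config value as an env-var reference


def resolve_env_refs(value):
    """Recursively replace 'os.environ/NAME' strings with the value of os.environ['NAME']."""
    if isinstance(value, dict):
        return {k: resolve_env_refs(v) for k, v in value.items()}
    if isinstance(value, list):
        return [resolve_env_refs(v) for v in value]
    if isinstance(value, str) and value.startswith(ENV_PREFIX):
        name = value[len(ENV_PREFIX):]
        if name not in os.environ:
            raise KeyError(f"environment variable {name!r} is not set")
        return os.environ[name]
    # Non-string scalars and plain strings pass through unchanged
    return value


if __name__ == "__main__":
    os.environ["AWS_ACCESS_KEY_ID"] = "example-key-id"  # placeholder, not a real credential
    cache_params = {
        "type": "s3",
        "s3_aws_access_key_id": "os.environ/AWS_ACCESS_KEY_ID",
    }
    print(resolve_env_refs(cache_params))
```

Keeping secrets as references in the YAML and resolving them at load time means the config file itself can be committed without exposing credentials.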
* build(Dockerfile): moves prisma logic to dockerfile by @krrishdholakia in https://github.com/BerriAI/litellm/pull/1342

**Full Changelog**: https://github.com/BerriAI/litellm/compare/1.16.14...v1.16.15