openai-ratelimiter is a Python library that offers a simple and efficient way to avoid hitting OpenAI API rate limits. The package supports both synchronous and asynchronous usage, providing classes such as ChatCompletionLimiter and TextCompletionLimiter along with their asynchronous equivalents. The current version supports only Redis as the caching backend and has been tested with Python 3.11.4.
Key methods include clear_locks(), which removes all locks currently held for the model, and is_locked(), which checks whether a request with the given parameters would currently be blocked.
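To illustrate the idea behind these two methods, here is a minimal, self-contained sketch of the lock pattern they describe. This is a conceptual in-memory stand-in, not the library's actual implementation: the real package stores its locks in Redis, and the class and parameter names below (SimpleChatLimiter, max_requests_per_minute, acquire) are invented for this example; only is_locked() and clear_locks() mirror names from the library.

```python
import time

class SimpleChatLimiter:
    """Conceptual in-memory sketch of the limiter pattern.
    (The real openai-ratelimiter keeps this state in Redis.)"""

    def __init__(self, max_requests_per_minute: int):
        self.max_requests_per_minute = max_requests_per_minute
        self._timestamps: list[float] = []  # stand-in for Redis lock keys

    def is_locked(self) -> bool:
        # A request is "locked" when the sliding one-minute window
        # already holds the maximum number of recent requests.
        now = time.monotonic()
        self._timestamps = [t for t in self._timestamps if now - t < 60]
        return len(self._timestamps) >= self.max_requests_per_minute

    def acquire(self) -> bool:
        # Record the request only if we are still under the limit.
        if self.is_locked():
            return False
        self._timestamps.append(time.monotonic())
        return True

    def clear_locks(self) -> None:
        # Drop all recorded requests, analogous to clear_locks().
        self._timestamps.clear()

limiter = SimpleChatLimiter(max_requests_per_minute=2)
print(limiter.acquire())    # True: first request fits in the window
print(limiter.acquire())    # True: second request fits as well
print(limiter.is_locked())  # True: window is now full
limiter.clear_locks()
print(limiter.is_locked())  # False: locks were cleared
```

In the real library the same check-then-record step happens against Redis, which is what lets several processes or machines share one rate-limit budget.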
Planned features include in-memory caching, rate limiting for additional model types such as embeddings and the DALL·E image models, more functions for inspecting the limiter's current state, and organization-level rate limiting.
Contributions to enhance the library are welcome. It is developed and maintained by Youssef Benhammouda.
To install the library, please refer to the instructions on the main page of the repository.