* added an example for phi-3.5-vision-instruct (sketch below)
* removed the "wait for deployment" step
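The phi-3.5-vision-instruct example boils down to a multimodal /chat/completions request. A minimal sketch is below, assuming the deployment exposes an OpenAI-compatible endpoint; the `base_url`, `api_key`, and model name are placeholders, not this project's actual values.

```python
# Minimal sketch: send an image plus a prompt to phi-3.5-vision-instruct
# through an OpenAI-compatible /chat/completions endpoint.
import base64
from openai import OpenAI

# Placeholder endpoint and key; point these at your own deployment.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

# Encode a local image as a data URL so it can be sent inline.
with open("invoice.png", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode()

response = client.chat.completions.create(
    model="microsoft/Phi-3.5-vision-instruct",  # placeholder model identifier
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "Describe this image."},
            {"type": "image_url", "image_url": {"url": f"data:image/png;base64,{image_b64}"}},
        ],
    }],
)
print(response.choices[0].message.content)
```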
0.5.0
- fixed bugs with the adapter client
- ez deployment fixes
- supported models:
  - numind/NuExtract / tiny
  - gemma-2b-9b-it
  - llava-v1.5-7b
  - llava-v1.6-32b
  - llama3-llava-next-8b
  - phi-3.5-vision-instruct (multiple context lengths due to kv cache block size on smaller gpus)
0.4.0
* support for APIs beyond /chat/completions
* documented support for SGLang, LangChain, and the OpenAI SDK (sketch below)
* support for SGLang frontend integration with models
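A minimal sketch of the LangChain path, assuming the deployment is reachable as an OpenAI-compatible server; the `base_url`, `api_key`, and model name are placeholders rather than this project's actual values.

```python
# Minimal sketch: point LangChain's OpenAI-compatible chat client at the deployment.
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(
    model="meta-llama/Meta-Llama-3-8B-Instruct",  # placeholder model identifier
    base_url="http://localhost:8000/v1",          # placeholder endpoint
    api_key="EMPTY",
)
print(llm.invoke("Summarize what a kv cache does in one sentence.").content)
```

The OpenAI SDK works the same way: construct the client with the deployment's `base_url` and call it as if it were the hosted API.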
0.3.0
* support for Ollama
* documented an example with Ollama (sketch below)
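A minimal sketch of the documented Ollama flow, using the `ollama` Python client; the model name is a placeholder for whatever has been pulled locally with `ollama pull`.

```python
# Minimal sketch: chat with a locally served Ollama model.
import ollama

response = ollama.chat(
    model="llama3",  # placeholder; any locally pulled model works
    messages=[{"role": "user", "content": "Explain guided decoding in one sentence."}],
)
print(response["message"]["content"])
```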
0.2.0
* updated docs on guided decoding (sketch below)
* created a clear implementation for
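For illustration, a guided-decoding sketch against an OpenAI-compatible server. The `guided_json` extra_body parameter follows vLLM's convention and is an assumption here, not necessarily this project's own guided-decode interface; the `base_url` and model name are placeholders.

```python
# Minimal sketch: constrain generation to a JSON schema via guided decoding.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")  # placeholder endpoint

schema = {
    "type": "object",
    "properties": {"name": {"type": "string"}, "total": {"type": "number"}},
    "required": ["name", "total"],
}

response = client.chat.completions.create(
    model="numind/NuExtract",  # placeholder model identifier
    messages=[{"role": "user", "content": "Extract the buyer name and total from: 'Acme Corp, $42.50'."}],
    extra_body={"guided_json": schema},  # vLLM-style guided decoding; an assumption, not this project's confirmed API
)
print(response.choices[0].message.content)
```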