What's Changed
* frontend: enable kobold api by default by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/803
* feat: add serviceinfo endpoint by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/807
* feat: update to serviceinfo v0.2 by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/808
* Mask dynatemp using min/max, rather than exp by 50h100a in https://github.com/PygmalionAI/aphrodite-engine/pull/813
* fix: temperature issues by 50h100a in https://github.com/PygmalionAI/aphrodite-engine/pull/814
* fix: --max-seq-len-to-capture arg by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/818
* [IMPORTANT] updating test units by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/769
* fix: tokenization api test by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/821
* feat: add chat method for LLM class by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/822
* feat: support chunked prefill with LoRA by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/823
* SPMD optimizations by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/824
* fix: sampler test with new transformers version by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/826
* feat: add cuda sampling kernels for top_k and top_p by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/828
* feat: add metrics for prefix cache hit rate by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/829
* fix: unbound tokenizer error by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/830
* feat: multi-step scheduling by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/831
* feat: Add DRY (Do not Repeat Yourself) sampling by selalipop in https://github.com/PygmalionAI/aphrodite-engine/pull/827
* feat: add no_repeat_ngram sampler by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/832
* feat: add skew sampling by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/834
* fix: hidden states handling in batch expansion for spec decoding by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/839
* chore: refactor executor classes for easier inheritance by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/840
* fix: latency and serving benchmarks by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/841
* feat: Machete Kernels for Hopper GPUs by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/842
* feat: add sampler_priority by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/837
* fix: disable awq_marlin override for awq models by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/843
* chore: bump mistral_common to 1.5.0 by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/844
* ci: bump version to 0.6.4 by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/845

New Contributors
* dependabot made their first contribution in https://github.com/PygmalionAI/aphrodite-engine/pull/796
* selalipop made their first contribution in https://github.com/PygmalionAI/aphrodite-engine/pull/827

**Full Changelog**: https://github.com/PygmalionAI/aphrodite-engine/compare/v0.6.3...v0.6.4