Aphrodite-engine

Latest version: v0.6.5

0.6.5

What's Changed
* xpu: refactor XPU worker & executor by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/861
* build: add jinja2 to requirements file by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/862
* attention: add `AttentionState` abstraction by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/863
* xpu: disable punica kernels for XPU by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/864
* executor: pipe `worker_class_fn` arg in executor by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/865
* server: log the process occupying our port by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/866
* feat: AWQ quantization for InternVL by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/867
* Rewrite DRY sampler to be a lot faster by 50h100a in https://github.com/PygmalionAI/aphrodite-engine/pull/868
* fix: ROCm build by Naomiusearch in https://github.com/PygmalionAI/aphrodite-engine/pull/817
* fix: temp_last warning being repeated for every output token by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/869
* feat: add support for chunked prefill + prefix caching by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/871
* async: avoid premature exit in the async generator by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/872
* cpu: fix `mm_limits` initialization by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/873
* spec decoding: set the draft model ctxlen to target model by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/874
* sampler: pad dry sequence breakers tensor by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/875
* fix: `add_generation_template` -> `add_generation_prompt` in llm by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/877
* Update README.md by NoahBPeterson in https://github.com/PygmalionAI/aphrodite-engine/pull/876
* api: fix crashes under very high loads by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/878
* build: pass `PYTHONPATH` from setup.py to cmake by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/879
* async: disable multi-step scheduling for sync engine by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/880
* api: better startup failure UX by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/881
* chore: consolidate environment variables within one file by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/882
* core: fix spec decode metrics and envs circular import by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/889
* feat: add support for audio models by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/891
* distributed: fix issue for when nodes have multiple network interfaces by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/892
* rocm: fix compile issues with rocm 6.2 by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/893
* build: fix invalid path for envs.py in setup by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/894
* kernel: use `cub::BlockReduce` instead of custom impl by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/895
* fix: Phi 3.5 Vision model loading by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/896
* api: add client timeouts for the ZeroMQ server by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/897
* feat: add torch.compile for GemmaRMSNorm by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/898
* spec decode: add support for EAGLE by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/899
* fix: `ShardedStateLoader` with fp8 quant by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/900
* kernel: do not compile machete for cuda 11 and below by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/901
* chore: add AphroditeParameter support for FP8 quant by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/902
* spec decode: fix logprobs when using speculative decoding by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/904
* api: error suppression cleanup + timeout suppression on aborts by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/905
* ray: better error when placement group topology is incorrect by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/906
* xpu: refactor the model runner for tensor parallelism by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/910
* fix: empty prompt crashing the server by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/912
* quantization: update marlin to use `AphroditeParameters` by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/913
* core: add multi-step scheduling support for the synchronous engine by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/914
* api: add json_schema to OpenAI server by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/915
* fix: phi3v crash with unusual image sizes by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/916
* feat: multi-image input support for Phi3V by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/917
* spec decode: streamline batch expansion tensor manipulation by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/918
* api: use fp32 for base64 embeddings by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/919
* core: improve warmup times for prefix caching in block manager v2 by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/920
* quants: update `qqq` and `gptq_marlin_24` to use AphroditeParameters by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/921
* distributed: fix custom allreduce p2p cache file generation by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/922
* neuron: add support for tensor parallelism by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/923
* quants: update compressed tensors lifecycle to remove `prefix` from `create_weights` by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/924
* feat: add async postprocessor by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/925
* api: add endpoint for loading and unloading the model by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/926
* feat: add single user mode by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/927
* api: add inline model loading by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/928
* api: support aphrodite_config.yaml with inline loading by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/929
* fix: inline model loading conflicts with lora by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/930
* core: do not compile for profiling by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/931
* xpu: support pipeline parallel by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/932
* fix: phi3v image_idx in async server by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/933
* feat: add fused Marlin MoE kernel by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/934
* chore: multi-image support for llava-next by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/935
* model: add support for paligemma2 by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/936
* vlm: stack multimodal tensors to represent multiple images within each prompt by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/937
* core: do not compile ScalarType for torch < 2.4.0 by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/938
* core: add virtual engine for async outproc by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/939
* api: log prompt truncation by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/940
* vlm: fix incompatibility between nested tensors and multi-image llava-next by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/941
* vlm: fix persimmon and fuyu issues with transformers 4.45 by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/942
* Fix SentencePieceTokenizer error when generating on Mistral Large 2411 with `--tokenizer-mode mistral` by khanonnie in https://github.com/PygmalionAI/aphrodite-engine/pull/943
* core: use flashinfer for FP8 KV when available by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/944
* tests: update flashinfer test for 944 by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/945
* quants: add triton kernels for AWQ by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/946
* tests: add kernel tests for causal_conv1d and mamba_ssm by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/947
* fix: do not register punica with torch if using older torch by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/948
* tpu: avoid dynamo guard eval overhead by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/949
* fix: issues with flashinfer fp8 kv by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/950
* api: optimize zeromq frontend performance by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/951
* tpu: remove torch._dynamo.reset() by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/952
* vlm: fix errors on ragged NestedTensors by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/953
* spec decode: match the original rank computation impl for spec decoding by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/954
* core: support multi-step scheduling w/ async post-processor by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/955
* Revert "fix: issues with flashinfer fp8 kv (950)" by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/956
* misc: extend cuda graph capture size for H200 by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/957
* fix: gguf vocab embeddings in TP by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/958
* quant: update tpu_int8 to use AphroditeParameters by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/959
* neuron: support for context length and token bucketing by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/960
* quant: support pre-quanted bitsandbytes checkpoints by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/961
* vlm: do not allow max_model_len overflow by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/962
* core: support logprobs with multi-step scheduling by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/963
* ci: bump aphrodite version to 0.6.5 by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/964

New Contributors
* NoahBPeterson made their first contribution in https://github.com/PygmalionAI/aphrodite-engine/pull/876
* khanonnie made their first contribution in https://github.com/PygmalionAI/aphrodite-engine/pull/943

**Full Changelog**: https://github.com/PygmalionAI/aphrodite-engine/compare/v0.6.4.post1...v0.6.5

0.6.4.post1

What's Changed
* add linux arm64/aarch64/GH200 installation tips by qpwo in https://github.com/PygmalionAI/aphrodite-engine/pull/851
* DRY Fix: Add output_tokens to sampler by selalipop in https://github.com/PygmalionAI/aphrodite-engine/pull/849
* sampler: fix DRY concurrency issue by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/852
* sampler: add range parameter for DRY by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/855
* sampler: optimize DRY performance using z-algorithm by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/856
* sampler: allow parsing sampler order using strings by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/858
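
For readers unfamiliar with the Z-algorithm mentioned in the DRY entry above, here is a minimal sketch of the standard Z-function in Python. It is general background only and is not taken from the Aphrodite code base; the function name `z_function` and the idea of applying it to a list of token ids are assumptions made for illustration. For each position `i`, it computes the length of the longest run starting at `i` that matches the start of the sequence, in linear time.

```python
def z_function(seq):
    """Return the Z-array of seq (a string or a list of token ids).

    z[i] is the length of the longest common prefix of seq and seq[i:].
    Illustrative only; not the implementation used by any sampler.
    """
    n = len(seq)
    z = [0] * n
    if n == 0:
        return z
    z[0] = n
    # [left, right) is the rightmost window known to match a prefix of seq.
    left, right = 0, 0
    for i in range(1, n):
        if i < right:
            # Reuse the value computed for the mirrored position.
            z[i] = min(right - i, z[i - left])
        # Extend the match by direct comparison.
        while i + z[i] < n and seq[i + z[i]] == seq[z[i]]:
            z[i] += 1
        if i + z[i] > right:
            left, right = i, i + z[i]
    return z


print(z_function("abacaba"))
print(z_function([1, 2, 1, 2, 3]))
```

A repetition penalty could, hypothetically, run this over the reversed token history to see how far the current suffix repeats earlier text. How the sampler actually uses it is described in the linked pull request.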

New Contributors
* qpwo made their first contribution in https://github.com/PygmalionAI/aphrodite-engine/pull/851

**Full Changelog**: https://github.com/PygmalionAI/aphrodite-engine/compare/v0.6.4...v0.6.4.post1

0.6.4

What's Changed
* frontend: enable kobold api by default by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/803
* feat: add serviceinfo endpoint by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/807
* feat: update to serviceinfo v0.2 by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/808
* Mask dynatemp using min/max, rather than exp by 50h100a in https://github.com/PygmalionAI/aphrodite-engine/pull/813
* fix: temperature issues by 50h100a in https://github.com/PygmalionAI/aphrodite-engine/pull/814
* fix: --max-seq-len-to-capture arg by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/818
* [IMPORTANT] updating test units by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/769
* fix: tokenization api test by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/821
* feat: add chat method for LLM class by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/822
* feat: support chunked prefill with LoRA by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/823
* SPMD optimizations by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/824
* fix: sampler test with new transformers version by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/826
* feat: add cuda sampling kernels for top_k and top_p by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/828
* feat: add metrics for prefix cache hit rate by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/829
* fix: unbound tokenizer error by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/830
* feat: multi-step scheduling by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/831
* feat: Add DRY (Do not Repeat Yourself) sampling by selalipop in https://github.com/PygmalionAI/aphrodite-engine/pull/827
* feat: add no_repeat_ngram sampler by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/832
* feat: add skew sampling by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/834
* fix: hidden states handling in batch expansion for spec decoding by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/839
* chore: refactor executor classes for easier inheritance by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/840
* fix: latency and serving benchmarks by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/841
* feat: Machete Kernels for Hopper GPUs by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/842
* feat: add sampler_priority by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/837
* fix: disable awq_marlin override for awq models by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/843
* chore: bump mistral_common to 1.5.0 by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/844
* ci: bump version to 0.6.4 by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/845

New Contributors
* dependabot made their first contribution in https://github.com/PygmalionAI/aphrodite-engine/pull/796
* selalipop made their first contribution in https://github.com/PygmalionAI/aphrodite-engine/pull/827

**Full Changelog**: https://github.com/PygmalionAI/aphrodite-engine/compare/v0.6.3...v0.6.4

0.6.3.post1

What's Changed
* build(deps): bump rollup from 4.21.0 to 4.24.3 in /docs by dependabot in https://github.com/PygmalionAI/aphrodite-engine/pull/796
* fix: compilation of gptq_marlin_gemm object by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/800
* ci: bump to 0.6.3.post1 by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/801

New Contributors
* dependabot made their first contribution in https://github.com/PygmalionAI/aphrodite-engine/pull/796

**Full Changelog**: https://github.com/PygmalionAI/aphrodite-engine/compare/v0.6.3...v0.6.3.post1

0.6.3

What's Changed
* Stream models rather than load them completely into RAM. by 50h100a in https://github.com/PygmalionAI/aphrodite-engine/pull/785
* feat: windows support by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/790
* fix: windows wheel url by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/794
* fix: kobold lite embedded UI on windows by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/797
* feat: add HQQ quantization support by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/795
* frontend: minor logging improvements by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/787
* ci: bump version to 0.6.3 by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/799


**Full Changelog**: https://github.com/PygmalionAI/aphrodite-engine/compare/v0.6.2.post1...v0.6.3

0.6.2.post1

What's Changed
* fix: kobold api for horde by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/763
* Fix for a crash from token bans by Pyroserenus in https://github.com/PygmalionAI/aphrodite-engine/pull/764
* Modified throughput benchmark to allow --max-num-seqs by Pyroserenus in https://github.com/PygmalionAI/aphrodite-engine/pull/770
* Simplify construction of sampling_metadata by 50h100a in https://github.com/PygmalionAI/aphrodite-engine/pull/766
* Add OLMoE by fizzAI in https://github.com/PygmalionAI/aphrodite-engine/pull/772
* feat: ministral support by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/776
* Make amd usable by Naomiusearch in https://github.com/PygmalionAI/aphrodite-engine/pull/775
* docker: apply AMD patch in the dockerfile by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/777
* fix: demote skip_special_tokens assertion to logger error by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/778
* ci: bump version to 0.6.2.post1 by AlpinDale in https://github.com/PygmalionAI/aphrodite-engine/pull/779

New Contributors
* fizzAI made their first contribution in https://github.com/PygmalionAI/aphrodite-engine/pull/772

**Full Changelog**: https://github.com/PygmalionAI/aphrodite-engine/compare/v0.6.2...v0.6.2.post1
