Highlights
- This release features new modules in Ray Serve and Ray Data for integration with large language models, marking the first step of addressing [50639](https://github.com/ray-project/ray/issues/50639). Existing Ray Data and Ray Serve have limited support for LLM deployments, where users have to manually configure and manage the underlying LLM engine. In this release, we offer APIs for both batch inference and serving of LLMs within Ray in `ray.data.llm` and `ray.serve.llm`. See the below notes for more details. These APIs are marked as **alpha** -- meaning they may change in future releases without a deprecation period.
- Ray Train V2 is available to try starting in Ray 2.43! Run your next Ray Train job with the `RAY_TRAIN_V2_ENABLED=1` environment variable. See [the migration guide](https://github.com/ray-project/ray/issues/49454) for more information.
- A new integration with `uv run` that allows easily specifying Python dependencies for both driver and workers in a consistent way and enables quick iterations for development of Ray applications ([50160](https://github.com/ray-project/ray/pull/50160), [50462](https://github.com/ray-project/ray/pull/50462)), check out our [blog post](https://www.anyscale.com/blog/uv-ray-pain-free-python-dependencies-in-clusters)
Ray Libraries<a id="ray-libraries"></a>
Ray Data<a id="ray-data"></a>
π New Features:
- *Ray Data LLM*: We are introducing a new module in Ray Data for batch inference with LLMs (currently marked as **alpha**). It offers a new `Processor` abstraction that interoperates with existing Ray Data pipelines. This abstraction can be configured two ways:
- Using the `vLLMEngineProcessorConfig`, which configures vLLM to load model replicas for high throughput model inference
- Using the `HttpRequestProcessorConfig`, which sends HTTP requests to an OpenAI-compatible endpoint for inference.
- Documentation for these features can be [found here.](https://docs.ray.io/en/master/data/working-with-llms.html)
- Implement accurate memory accounting for `UnionOperator` ([50436](https://github.com/ray-project/ray/pull/50436))
- Implement accurate memory accounting for all-to-all operations ([50290](https://github.com/ray-project/ray/pull/50290))
π« Enhancements:
- Support class constructor args for filter() ([50245](https://github.com/ray-project/ray/pull/50245))
- Persist ParquetDatasource metadata. ([50332](https://github.com/ray-project/ray/pull/50332))
- Rebasing `ShufflingBatcher` onto `try_combine_chunked_columns` ([50296](https://github.com/ray-project/ray/pull/50296))
- Improve warning message if required dependency isn't installed ([50464](https://github.com/ray-project/ray/pull/50464))
- Move data-related test logic out of core tests directory ([50482](https://github.com/ray-project/ray/pull/50482))
- Pass executor as an argument to ExecutionCallback ([50165](https://github.com/ray-project/ray/pull/50165))
- Add operator id info to task+actor ([50323](https://github.com/ray-project/ray/pull/50323))
- Abstracting common methods, removing duplication in `ArrowBlockAccessor`, `PandasBlockAccessor` ([50498](https://github.com/ray-project/ray/pull/50498))
- Warn if map UDF is too large ([50611](https://github.com/ray-project/ray/pull/50611))
- Replace `AggregateFn` with `AggregateFnV2`, cleaning up Aggregation infrastructure ([50585](https://github.com/ray-project/ray/pull/50585))
- Simplify Operator.__repr__ ([50620](https://github.com/ray-project/ray/pull/50620))
- Adding in `TaskDurationStats` and `on_execution_step` callback ([50766](https://github.com/ray-project/ray/pull/50766))
- Print Resource Manager stats in release tests ([50801](https://github.com/ray-project/ray/pull/50801))
π¨ Fixes:
- Fix invalid escape sequences in `grouped_data.py` docstrings ([50392](https://github.com/ray-project/ray/pull/50392))
- Deflake `test_map_batches_async_generator` ([50459](https://github.com/ray-project/ray/pull/50459))
- Avoid memory leak with `pyarrow.infer_type` on datetime arrays ([50403](https://github.com/ray-project/ray/pull/50403))
- Fix parquet partition cols to support tensors types ([50591](https://github.com/ray-project/ray/pull/50591))
- Fixing aggregation protocol to be appropriately associative ([50757](https://github.com/ray-project/ray/pull/50757))
π Documentation:
- Remove "Stable Diffusion Batch Prediction with Ray Data" example ([50460](https://github.com/ray-project/ray/pull/50460))
Ray Train<a id="ray-train"></a>
π New Features:
- Ray Train V2 is available to try starting in Ray 2.43! Run your next Ray Train job with the `RAY_TRAIN_V2_ENABLED=1` environment variable. See [the migration guide](https://github.com/ray-project/ray/issues/49454) for more information.
π« Enhancements:
- Add a training ingest benchmark release test ([50019](https://github.com/ray-project/ray/pull/50019), [#50299](https://github.com/ray-project/ray/pull/50299)) with a fault tolerance variant ([#50399](https://github.com/ray-project/ray/pull/50399))
- Add telemetry for Trainer usage in V2 ([50321](https://github.com/ray-project/ray/pull/50321))
- Add pydantic as a `ray[train]` extra install ([46682](https://github.com/ray-project/ray/pull/46682))
- Add state tracking to train v2 to make run status, run attempts, and training worker metadata observable ([50515](https://github.com/ray-project/ray/pull/50515))
π¨ Fixes:
- Increase doc test parallelism ([50326](https://github.com/ray-project/ray/pull/50326))
- Disable TF test for py312 ([50382](https://github.com/ray-project/ray/pull/50382))
- Increase test timeout to deflake ([50796](https://github.com/ray-project/ray/pull/50796))
π Documentation:
- Add missing xgboost pip install in example ([50232](https://github.com/ray-project/ray/pull/50232))
π Architecture refactoring:
- Add deprecation warnings pointing to a migration guide for Ray Train V2 ([49455](https://github.com/ray-project/ray/pull/49455), [#50101](https://github.com/ray-project/ray/pull/50101), [#50322](https://github.com/ray-project/ray/pull/50322))
- Refactor internal Train controller state management ([50113](https://github.com/ray-project/ray/pull/50113), [#50181](https://github.com/ray-project/ray/pull/50181), [#50388](https://github.com/ray-project/ray/pull/50388))
Ray Tune<a id="ray-tune"></a>
π¨ Fixes:
- Fix worker node failure test ([50109](https://github.com/ray-project/ray/pull/50109))
π Documentation:
- Update all doc examples off of ray.train imports ([50458](https://github.com/ray-project/ray/pull/50458))
- Update all ray/tune/examples off of ray.train imports ([50435](https://github.com/ray-project/ray/pull/50435))
- Fix typos in persistent storage guide ([50127](https://github.com/ray-project/ray/pull/50127))
- Remove Binder notebook links in Ray Tune docs ([50621](https://github.com/ray-project/ray/pull/50621))
π Architecture refactoring:
- Update RLlib to use ray.tune imports instead of ray.air and ray.train ([49895](https://github.com/ray-project/ray/pull/49895))
Ray Serve<a id="ray-serve"></a>
π New Features:
- *Ray Serve LLM*: We are introducing a new module in Ray Serve to easily integrate open source LLMs in your Ray Serve deployment, currently marked as **alpha**. This opens up a powerful capability of composing complex applications with multiple LLMs, which is a use case in emerging applications like agentic workflows. Ray Serve LLM offers a couple core components, including:
- `VLLMService`: A prebuilt deployment that offers a full-featured vLLM engine integration, with support for features such as LoRA multiplexing and multimodal language models.
- `LLMRouter`: An out-of-the-box OpenAI compatible model router that can route across multiple LLM deployments.
- Documentation can be found at https://docs.ray.io/en/releases-2.43.0/serve/llm/overview.html
π« Enhancements:
- Add `required_resources` to REST API ([50058](https://github.com/ray-project/ray/pull/50058))
π¨ Fixes:
- Fix batched requests hanging after cancellation ([50054](https://github.com/ray-project/ray/pull/50054))
- Properly propagate backpressure error ([50311](https://github.com/ray-project/ray/pull/50311))
RLlib<a id="rllib"></a>
π New Features:
- Added env vectorization support for multi-agent (new API stack). ([50437](https://github.com/ray-project/ray/pull/50437))
π« Enhancements:
- APPO/IMPALA various acceleration efforts. Reached 100k ts/sec on Atari benchmark with 400 EnvRunners and 16 (multi-node) GPU Learners: [50760](https://github.com/ray-project/ray/pull/50760), [#50162](https://github.com/ray-project/ray/pull/50162), [#50249](https://github.com/ray-project/ray/pull/50249), [#50353](https://github.com/ray-project/ray/pull/50353), [#50368](https://github.com/ray-project/ray/pull/50368), [#50379](https://github.com/ray-project/ray/pull/50379), [#50440](https://github.com/ray-project/ray/pull/50440), [#50477](https://github.com/ray-project/ray/pull/50477), [#50527](https://github.com/ray-project/ray/pull/50527), [#50528](https://github.com/ray-project/ray/pull/50528), [#50600](https://github.com/ray-project/ray/pull/50600), [#50309](https://github.com/ray-project/ray/pull/50309)
- Offline RL:
- Remove all weight synching to `eval_env_runner_group` from the training steps. ([50057](https://github.com/ray-project/ray/pull/50057))
- Enable single-learner/multi-learner GPU training. ([50034](https://github.com/ray-project/ray/pull/50034))
- Remove reference to MARWILOfflinePreLearner in `OfflinePreLearner` docstring. ([50107](https://github.com/ray-project/ray/pull/50107))
- Add metrics to multi-agent replay buffers. ([49959](https://github.com/ray-project/ray/pull/49959)[)](https://github.com/ray-project/ray/commit/00de19036cfcd125012711658833124edaf66c53)
π¨ Fixes:
- Fix SPOT preemption tolerance for large AlgorithmConfig: Pass by reference to RolloutWorker ([50688](https://github.com/ray-project/ray/pull/50688))
- `on_workers/env_runners_recreated` callback would be called twice. ([50172](https://github.com/ray-project/ray/pull/50172))
- `default_resource_request`: aggregator actors missing in placement group for local Learner. ([50219](https://github.com/ray-project/ray/pull/50219), [#50475](https://github.com/ray-project/ray/pull/50475))
π Documentation:
- Docs re-do (new API stack):
- Rewrite/enhance "getting started" rst page. ([49950](https://github.com/ray-project/ray/pull/49950))
- Remove rllib-models.rst and fix broken html links. ([49966](https://github.com/ray-project/ray/pull/49966), [#50126](https://github.com/ray-project/ray/pull/50126))
Ray Core and Ray Clusters
Ray Core<a id="ray-core"></a>
π« Enhancements:
- [Core] Enable users to configure python standard log attributes for structured logging (49871)
- [Core] Prestart worker with runtime env (49994)
- [compiled graphs] Support experimental_compile(_default_communicator=comm) (50023)
- [Core] ray.util.Queue Empty and Full exceptions extend queue.Empty and Full (50261)
- [Core] Initial port of Ray to Python 3.13 (47984)
π¨ Fixes:
- [Core] Ignore stale ReportWorkerBacklogRequest (50280)
- [Core] Fix check failure due to negative available resource (50517)
Ray Clusters <a id="ray-clusters"></a>
π Documentation:
- Update the KubeRay docs to v1.3.0.
Ray Dashboard <a id="ray-dashboard"></a>
π New Features:
- Additional filters for job list page ([50283](https://github.com/ray-project/ray/pull/50283))
Thanks
Thank you to everyone who contributed to this release! π₯³
liuxsh9, justinrmiller, CheyuWu, 400Ping, scottsun94, bveeramani, bhmiller, tylerfreckmann, hefeiyun, pcmoritz, matthewdeng, dentiny, erictang000, gvspraveen, simonsays1980, aslonnie, shorbaji, LeoLiao123, justinvyu, israbbani, zcin, ruisearch42, khluu, kouroshHakha, sijieamoy, SergeCroise, raulchen, anson627, bluenote10, allenyin55, martinbomio, rueian, rynewang, owenowenisme, Betula-L, alexeykudinkin, crypdick, jujipotle, saihaj, EricWiener, kevin85421, MengjinYan, chris-ray-zhang, SumanthRH, chiayi, comaniac, angelinalg, kenchung285, tanmaychimurkar, andrewsykim, MortalHappiness, sven1977, richardliaw, omatthew98, fscnick, akyang-anyscale, cristianjd, Jay-ju, spencer-p, win5923, wxsms, stfp, letaoj, JDarDagran, jjyao, srinathk10, edoakes, vincent0426, dayshah, davidxia, DmitriGekhtman, GeneDer, HYLcool, gameofby, can-anyscale, ryanaoleary, eddyxu