Ray

Latest version: v2.44.1

2.44.1

Not secure

Under screen-lit skies
A ray of bliss in each patch
Joy at any scale

2.44.0

Not secure

Release Highlights

- This release features Ray Compiled Graph (beta). Ray Compiled Graph gives you a classic Ray Core-like API, but with (1) less than 50us system overhead for workloads that repeatedly execute the same task graph; and (2) native support for GPU-GPU communication via NCCL. Ray Compiled Graph APIs simplify high-performance multi-GPU workloads such as LLM inference and training. The beta release refines the API, enhances stability, and adds or improves features like visualization, profiling, and experimental GPU compute/communication overlap. For more information, refer to the Ray documentation: https://docs.ray.io/en/latest/ray-core/compiled-graph/ray-compiled-graph.html
- The experimental Ray Workflows library has been deprecated and will be removed in a future version of Ray. Ray Workflows has been marked experimental since its inception and hasn’t been maintained due to the Ray team focusing on other priorities. If you are using Ray Workflows, we recommend pinning your Ray version to 2.44.
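
For projects that still depend on Ray Workflows, a requirements pin such as `ray==2.44.*` can be paired with a small startup check. The sketch below is only an illustration: the helper name `is_ray_2_44` is hypothetical, and it assumes a plain `major.minor.patch` version string.

```python
def is_ray_2_44(version: str) -> bool:
    """Return True if `version` belongs to the 2.44 release line."""
    parts = version.split(".")
    return len(parts) >= 2 and parts[0] == "2" and parts[1] == "44"


# In a real project the argument would be ray.__version__.
print(is_ray_2_44("2.44.1"))  # True
print(is_ray_2_44("2.45.0"))  # False
```

Comparing the parsed components rather than a string prefix avoids treating a version such as `2.440.0` as a match.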

Ray Libraries

Ray Data

🎉 New Features:
- Add Iceberg write support through pyiceberg ([50590](https://github.com/ray-project/ray/pull/50590))
- [LLM] Various feature enhancements to Ray Data LLM, including LoRA support (50804) and structured outputs (50901)

💫 Enhancements:
- Add dataset/operator state, progress, total metrics ([50770](https://github.com/ray-project/ray/pull/50770))
- Make chunk combination threshold configurable ([51200](https://github.com/ray-project/ray/pull/51200))
- Store average memory use per task in OpRuntimeMetrics ([51126](https://github.com/ray-project/ray/pull/51126))
- Avoid unnecessary conversion to Numpy when creating Arrow/Pandas blocks ([51238](https://github.com/ray-project/ray/pull/51238))
- Append-mode API for preprocessors (50848, 50847, 50642, 50856, 50584). Note that vectorizers and hashers now output a single column instead of one column per feature. In the near future, we will be graduating preprocessors to *beta*.
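
The change in vectorizer and hasher output layout can be pictured with a plain-Python sketch. This is not Ray's preprocessor code: the function names and the column names (`text`, `text_<token>`) are hypothetical and only illustrate the difference between one column per feature and a single vector-valued column.

```python
from collections import Counter
from typing import Dict, List


def token_counts(text: str, vocabulary: List[str]) -> List[int]:
    """Count how often each vocabulary token occurs in `text`."""
    counts = Counter(text.split())
    return [counts[token] for token in vocabulary]


def one_column_per_feature(text: str, vocabulary: List[str]) -> Dict[str, int]:
    """Earlier layout: every vocabulary token gets its own output column."""
    values = token_counts(text, vocabulary)
    return {f"text_{token}": n for token, n in zip(vocabulary, values)}


def single_column(text: str, vocabulary: List[str]) -> Dict[str, List[int]]:
    """Newer layout: the whole feature vector lives in one output column."""
    return {"text": token_counts(text, vocabulary)}


vocabulary = ["data", "ray", "train"]
print(one_column_per_feature("ray data ray", vocabulary))
print(single_column("ray data ray", vocabulary))
```

Downstream code that selected individual feature columns by name would need to index into the single vector column instead.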

🔨 Fixes:
- Fixing Map Operators to avoid unconditionally overriding generator's back-pressure configuration ([50900](https://github.com/ray-project/ray/pull/50900))
- Fix filter expr equating negative numbers ([50932](https://github.com/ray-project/ray/pull/50932))
- Fix error message for `override_num_blocks` when reading from a HuggingFace Dataset ([50998](https://github.com/ray-project/ray/pull/50998))
- Make num_blocks in repartition optional ([50997](https://github.com/ray-project/ray/pull/50997))
- Always pin the seed when doing file-based random shuffle ([50924](https://github.com/ray-project/ray/pull/50924))
- Fix `StandardScaler` to handle `NaN` stats ([51281](https://github.com/ray-project/ray/pull/51281))

Ray Train

🎉 New Features:
- Implement state export API (50622, 51085, 51177)

💫 Enhancements:
- Folded v2.XGBoostTrainer API into the public trainer class as an alternate constructor (50045)
- Created a default ScalingConfig if one is not provided to the trainer (51093)
- Improved TrainingFailedError message (51199)
- Utilize FailurePolicy factory (51067)

🔨 Fixes:
- Fixed trainer import deserialization when captured within a Ray task (50862)
- Fixed serialize import test for Python 3.12 (50963)
- Fixed RunConfig deprecation message in Tune being emitted in trainer.fit usage (51198)

📖 Documentation:
- [Train V2] Updated API references (51222)
- [Train V2] Updated persistent storage guide (51202)
- [Train V2] Updated user guides for metrics, checkpoints, results, and experiment tracking (51204)
- [Train V2] Added updated Train + Tune user guide (51048)
- [Train V2] Added updated fault tolerance user guide (51083)
- Improved HF Transformers example (50896)
- Improved Train DeepSpeed example (50906)
- Use correct mean and standard deviation norm values in image tutorials (50240)

🏗 Architecture refactoring:
- Deprecated Torch AMP wrapper utilities (51066)
- Hid private functions of train context to avoid abuse (50874)
- Removed ray storage dependency and deprecated RAY_STORAGE env var configuration option (50872)
- Moved library usage tests out of core (51161)

Ray Tune

📖 Documentation:
- Various improvements to Tune PyTorch CIFAR tutorial (50316)
- Various improvements to the Ray Tune XGBoost tutorial (50455)
- Various enhancements to Tune Keras example (50581)
- Minor improvements to Hyperopt tutorial (50697)
- Various improvements to LightGBM tutorial (50704)
- Fixed non-runnable Optuna tutorial (50404)
- Added documentation for Asynchronous HyperBand Example in Tune (50708)
- Replaced reuse actors example with a fuller demonstration (51234)
- Fixed broken PB2/RLlib example (51219)
- Fixed typo and standardized equations across the two APIs (51114)
- Improved PBT example (50870)
- Removed broken links in documentation (50995, 50996)

🏗 Architecture refactoring:
- Removed ray storage dependency and deprecated RAY_STORAGE env var configuration option (50872)
- Moved library usage tests out of core (51161)

Ray Serve

🎉 New Features:
- Faster bulk imperative Serve Application deploys ([49168](https://github.com/ray-project/ray/pull/49168))
- [LLM] Add gen-config ([51235](https://github.com/ray-project/ray/pull/51235))

💫 Enhancements:
- Clean up shutdown behavior of serve ([51009](https://github.com/ray-project/ray/pull/51009))
- Add `additional_log_standard_attrs` to serve logging config ([51144](https://github.com/ray-project/ray/pull/51144))
- [LLM] remove `asyncache` and `cachetools` from dependencies ([50806](https://github.com/ray-project/ray/pull/50806))
- [LLM] remove `backoff` dependency ([50822](https://github.com/ray-project/ray/pull/50822))
- [LLM] Remove `asyncio_timeout` from `ray[llm]` deps on python<3.11 ([50815](https://github.com/ray-project/ray/pull/50815))
- [LLM] Made JSON validator a singleton and `jsonref` packages lazy imported ([50821](https://github.com/ray-project/ray/pull/50821))
- [LLM] Reuse `AutoscalingConfig` and `DeploymentConfig` from Serve ([50871](https://github.com/ray-project/ray/pull/50871))
- [LLM] Use `pyarrow` FS for cloud remote storage interaction ([50820](https://github.com/ray-project/ray/pull/50820))
- [LLM] Add usage telemetry for `serve.llm` ([51221](https://github.com/ray-project/ray/pull/51221))

🔨 Fixes:
- Exclude redirects from request error count ([51130](https://github.com/ray-project/ray/pull/51130))
- [LLM] Fix the wrong `device_capability` issue in vllm on quantized models ([51007](https://github.com/ray-project/ray/pull/51007))
- [LLM] add `gen-config` related data file to the package ([51347](https://github.com/ray-project/ray/pull/51347))

📖 Documentation:
- [LLM] Fix quickstart serve LLM docs ([50910](https://github.com/ray-project/ray/pull/50910))
- [LLM] update `build_openai_app` to include yaml example ([51283](https://github.com/ray-project/ray/pull/51283))
- [LLM] remove old vllm+serve doc ([51311](https://github.com/ray-project/ray/pull/51311))

RLlib

💫 Enhancements:
- APPO/IMPALA accelerate:
  - `LearnerGroup` should not pickle remote functions on each update-call; Refactor `LearnerGroup` and `Learner` APIs. ([50665](https://github.com/ray-project/ray/pull/50665))
  - `EnvRunner` sync enhancements. ([50918](https://github.com/ray-project/ray/pull/50918))
  - Various other speedups: [51302](https://github.com/ray-project/ray/pull/51302), [#50923](https://github.com/ray-project/ray/pull/50923), [#50919](https://github.com/ray-project/ray/pull/50919), [#50791](https://github.com/ray-project/ray/pull/50791)
- Unify namings for actor managers' outstanding in-flight requests metrics. ([51159](https://github.com/ray-project/ray/pull/51159))
- Add timers to env step, forward pass, and complete connector pipelines runs. ([51160](https://github.com/ray-project/ray/pull/51160))

🔨 Fixes:
- Multi-agent env vectorization:
  - Fix `MultiAgentEnvRunner` env check bug. ([50891](https://github.com/ray-project/ray/pull/50891))
  - Add `single_action_space` and `single_observation_space` to `VectorMultiAgentEnv`. ([51096](https://github.com/ray-project/ray/pull/51096))
- Other fixes: [51255](https://github.com/ray-project/ray/pull/51255), [#50920](https://github.com/ray-project/ray/pull/50920), [#51369](https://github.com/ray-project/ray/pull/51369)

📖 Documentation:
- Smaller fixes: [51015](https://github.com/ray-project/ray/pull/51015), [#51219](https://github.com/ray-project/ray/pull/51219)

Ray Core and Ray Clusters

Ray Core

🎉 New Features:
- Enhanced `uv` support ([51233](https://github.com/ray-project/ray/pull/51233))

💫 Enhancements:
- Made infeasible task errors much more obvious ([45909](https://github.com/ray-project/ray/issues/45909))
- Log rotation for workers, runtime env agent, and dashboard agent ([50759](https://github.com/ray-project/ray/pull/50759), [#50877](https://github.com/ray-project/ray/pull/50877), [#50909](https://github.com/ray-project/ray/pull/50909))
- Support customizing gloo timeout ([50223](https://github.com/ray-project/ray/pull/50223))
- Support torch profiling in Compiled Graph ([51022](https://github.com/ray-project/ray/pull/51022))
- Change default tensor deserialization in Compiled Graph ([50778](https://github.com/ray-project/ray/pull/50778))
- Use current node id if no node is specified on ray drain-node ([51134](https://github.com/ray-project/ray/pull/51134))

🔨 Fixes:
- Fixed an issue where the raylet continued to have high CPU overhead after a job was terminated ([49999](https://github.com/ray-project/ray/issues/49999)).
- Fixed compiled graph buffer release issues ([50434](https://github.com/ray-project/ray/pull/50434))
- Improved logic for `ray.wait` on object store objects ([50680](https://github.com/ray-project/ray/pull/50680))
- Ray metrics now perform the same validation as Prometheus for invalid names ([40586](https://github.com/ray-project/ray/issues/40586))
- Make executor a long-running Python thread ([51016](https://github.com/ray-project/ray/pull/51016))
- Fix plasma client memory leak ([51051](https://github.com/ray-project/ray/pull/51051))
- Fix using `ray.actor.exit_actor()` from within an async background thread ([49451](https://github.com/ray-project/ray/issues/49451))
- Fix UV hook to support Ray Job submission ([51150](https://github.com/ray-project/ray/pull/51150))
- Fix resource leakage after ray job is finished ([49999](https://github.com/ray-project/ray/issues/49999))
- Use the correct way to check whether an actor task is running ([51158](https://github.com/ray-project/ray/pull/51158))
- Controllably destroy CUDA events in `GPUFuture`s (Compiled Graph) ([51090](https://github.com/ray-project/ray/pull/51090))
- Avoid creating a thread pool with 0 threads ([50837](https://github.com/ray-project/ray/pull/50837))
- Fix the logic to calculate the number of workers based on the TPU version ([51227](https://github.com/ray-project/ray/pull/51227))

📖 Documentation:
- Updated error message and anti-pattern when forking new processes in worker processes ([50705](https://github.com/ray-project/ray/pull/50705))
- Compiled Graph API Documentation ([50754](https://github.com/ray-project/ray/pull/50754))
- Doc for nsight and torch profile for Compiled Graph ([51037](https://github.com/ray-project/ray/pull/51037))
- Compiled Graph Troubleshooting Doc ([51030](https://github.com/ray-project/ray/pull/51030))
- Completion of Compiled Graph Docs ([51206](https://github.com/ray-project/ray/pull/51206))
- Updated `jemalloc` profiling doc ([51031](https://github.com/ray-project/ray/pull/51031))
- Add information about standard Python logger attributes ([51038](https://github.com/ray-project/ray/pull/51038))
- Add description for named placement groups to require a namespace ([51285](https://github.com/ray-project/ray/pull/51285))
- Deprecation warnings for Ray Workflows and cluster-wide storage ([51309](https://github.com/ray-project/ray/pull/51309))

Ray Clusters

🎉 New Features:
- Add CUDA 12.8 images ([51210](https://github.com/ray-project/ray/pull/51210))

💫 Enhancements:
- Add Pod names to the output of `ray status -v` ([51192](https://github.com/ray-project/ray/pull/51192))

🔨 Fixes:
- Fix autoscaler v1 crash from infeasible strict spread placement groups ([39691](https://github.com/ray-project/ray/issues/39691))

🏗 Architecture refactoring:
- Refactor autoscaler v2 log formatting ([49350](https://github.com/ray-project/ray/pull/49350))
- Update yaml example for `CoordinatorSenderNodeProvider` ([51292](https://github.com/ray-project/ray/pull/51292))

Dashboard

🎉 New Features:
- Discover TPU logs on the Ray Dashboard ([47737](https://github.com/ray-project/ray/pull/47737))

🔨 Fixes:
- Return the correct error message when trying to kill non-existent actors ([51341](https://github.com/ray-project/ray/pull/51341))

----
Many thanks to all those who contributed to this release!
crypdick, rueian, justinvyu, MortalHappiness, CheyuWu, GeneDer, dayshah, lk-chen, matthewdeng, co63oc, win5923, sven1977, akshay-anyscale, ShaochenYu-YW, gvspraveen, bveeramani, jakac, VamshikShetty, raulchen, PaulFenton, elimelt, comaniac, qinyiyan, ruisearch42, nadongjun, AndyUB, israbbani, hongpeng-guo, laysfire, alexeykudinkin, Drice1999, harborn, scottsun94, abrarsheikh, martinbomio, MengjinYan, HollowMan6, orcahmlee, kenchung285, csy1204, noemotiovon, jujipotle, davidxia, kevin85421, hcc429, edoakes, kouroshHakha, omatthew98, alanwguo, farridav, aslonnie, simonsays1980, pcmoritz, terraflops1048576, JoshKarpel, SumanthRH, sijieamoy, zcin, can-anyscale, akyang-anyscale, angelinalg, saihaj, jjyao, anmscale, ryanaoleary, dentiny, jimmyxie-figma, stephanie-wang, khluu, maofagui

2.43.0

Not secure

Highlights
- This release features new modules in Ray Serve and Ray Data for integration with large language models, marking the first step of addressing [50639](https://github.com/ray-project/ray/issues/50639). Until now, Ray Data and Ray Serve had limited support for LLM deployments, and users had to manually configure and manage the underlying LLM engine. This release offers APIs for both batch inference and serving of LLMs within Ray in `ray.data.llm` and `ray.serve.llm`. See the notes below for more details. These APIs are marked as **alpha**, meaning they may change in future releases without a deprecation period.
- Ray Train V2 is available to try starting in Ray 2.43! Run your next Ray Train job with the `RAY_TRAIN_V2_ENABLED=1` environment variable. See [the migration guide](https://github.com/ray-project/ray/issues/49454) for more information.
- A new integration with `uv run` lets you specify Python dependencies for both driver and workers in a consistent way and enables quick iteration when developing Ray applications ([50160](https://github.com/ray-project/ray/pull/50160), [50462](https://github.com/ray-project/ray/pull/50462)). For details, see our [blog post](https://www.anyscale.com/blog/uv-ray-pain-free-python-dependencies-in-clusters).

Ray Libraries

Ray Data
🎉 New Features:
- *Ray Data LLM*: We are introducing a new module in Ray Data for batch inference with LLMs (currently marked as **alpha**). It offers a new `Processor` abstraction that interoperates with existing Ray Data pipelines. This abstraction can be configured two ways:
  - Using the `vLLMEngineProcessorConfig`, which configures vLLM to load model replicas for high throughput model inference.
  - Using the `HttpRequestProcessorConfig`, which sends HTTP requests to an OpenAI-compatible endpoint for inference.
  - Documentation for these features can be [found here](https://docs.ray.io/en/master/data/working-with-llms.html).
- Implement accurate memory accounting for `UnionOperator` ([50436](https://github.com/ray-project/ray/pull/50436))
- Implement accurate memory accounting for all-to-all operations ([50290](https://github.com/ray-project/ray/pull/50290))

💫 Enhancements:
- Support class constructor args for filter() ([50245](https://github.com/ray-project/ray/pull/50245))
- Persist ParquetDatasource metadata. ([50332](https://github.com/ray-project/ray/pull/50332))
- Rebasing `ShufflingBatcher` onto `try_combine_chunked_columns` ([50296](https://github.com/ray-project/ray/pull/50296))
- Improve warning message if required dependency isn't installed ([50464](https://github.com/ray-project/ray/pull/50464))
- Move data-related test logic out of core tests directory ([50482](https://github.com/ray-project/ray/pull/50482))
- Pass executor as an argument to ExecutionCallback ([50165](https://github.com/ray-project/ray/pull/50165))
- Add operator id info to task+actor ([50323](https://github.com/ray-project/ray/pull/50323))
- Abstracting common methods, removing duplication in `ArrowBlockAccessor`, `PandasBlockAccessor` ([50498](https://github.com/ray-project/ray/pull/50498))
- Warn if map UDF is too large ([50611](https://github.com/ray-project/ray/pull/50611))
- Replace `AggregateFn` with `AggregateFnV2`, cleaning up Aggregation infrastructure ([50585](https://github.com/ray-project/ray/pull/50585))
- Simplify Operator.__repr__ ([50620](https://github.com/ray-project/ray/pull/50620))
- Adding in `TaskDurationStats` and `on_execution_step` callback ([50766](https://github.com/ray-project/ray/pull/50766))
- Print Resource Manager stats in release tests ([50801](https://github.com/ray-project/ray/pull/50801))

🔨 Fixes:
- Fix invalid escape sequences in `grouped_data.py` docstrings ([50392](https://github.com/ray-project/ray/pull/50392))
- Deflake `test_map_batches_async_generator` ([50459](https://github.com/ray-project/ray/pull/50459))
- Avoid memory leak with `pyarrow.infer_type` on datetime arrays ([50403](https://github.com/ray-project/ray/pull/50403))
- Fix parquet partition cols to support tensors types ([50591](https://github.com/ray-project/ray/pull/50591))
- Fixing aggregation protocol to be appropriately associative ([50757](https://github.com/ray-project/ray/pull/50757))

📖 Documentation:
- Remove "Stable Diffusion Batch Prediction with Ray Data" example ([50460](https://github.com/ray-project/ray/pull/50460))

Ray Train
🎉 New Features:
- Ray Train V2 is available to try starting in Ray 2.43! Run your next Ray Train job with the `RAY_TRAIN_V2_ENABLED=1` environment variable. See [the migration guide](https://github.com/ray-project/ray/issues/49454) for more information.
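
The opt-in can also be done from inside a Python entry point. The sketch below is a minimal illustration: the helper name `enable_train_v2` is hypothetical, and setting the variable before importing `ray.train` is a cautious assumption so that the flag is already present if it is read at import time.

```python
import os


def enable_train_v2() -> None:
    """Opt this process in to Ray Train V2 via the documented environment variable."""
    os.environ["RAY_TRAIN_V2_ENABLED"] = "1"


enable_train_v2()

# A real training script would import ray.train only after this point, e.g.:
# import ray.train

print(os.environ["RAY_TRAIN_V2_ENABLED"])  # 1
```

Exporting `RAY_TRAIN_V2_ENABLED=1` in the shell before launching the job achieves the same thing without code changes.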

💫 Enhancements:
- Add a training ingest benchmark release test ([50019](https://github.com/ray-project/ray/pull/50019), [#50299](https://github.com/ray-project/ray/pull/50299)) with a fault tolerance variant ([#50399](https://github.com/ray-project/ray/pull/50399))
- Add telemetry for Trainer usage in V2 ([50321](https://github.com/ray-project/ray/pull/50321))
- Add pydantic as a `ray[train]` extra install ([46682](https://github.com/ray-project/ray/pull/46682))
- Add state tracking to train v2 to make run status, run attempts, and training worker metadata observable ([50515](https://github.com/ray-project/ray/pull/50515))

🔨 Fixes:
- Increase doc test parallelism ([50326](https://github.com/ray-project/ray/pull/50326))
- Disable TF test for py312 ([50382](https://github.com/ray-project/ray/pull/50382))
- Increase test timeout to deflake ([50796](https://github.com/ray-project/ray/pull/50796))

📖 Documentation:
- Add missing xgboost pip install in example ([50232](https://github.com/ray-project/ray/pull/50232))

🏗 Architecture refactoring:
- Add deprecation warnings pointing to a migration guide for Ray Train V2 ([49455](https://github.com/ray-project/ray/pull/49455), [#50101](https://github.com/ray-project/ray/pull/50101), [#50322](https://github.com/ray-project/ray/pull/50322))
- Refactor internal Train controller state management ([50113](https://github.com/ray-project/ray/pull/50113), [#50181](https://github.com/ray-project/ray/pull/50181), [#50388](https://github.com/ray-project/ray/pull/50388))

Ray Tune
🔨 Fixes:
- Fix worker node failure test ([50109](https://github.com/ray-project/ray/pull/50109))

📖 Documentation:
- Update all doc examples off of ray.train imports ([50458](https://github.com/ray-project/ray/pull/50458))
- Update all ray/tune/examples off of ray.train imports ([50435](https://github.com/ray-project/ray/pull/50435))
- Fix typos in persistent storage guide ([50127](https://github.com/ray-project/ray/pull/50127))
- Remove Binder notebook links in Ray Tune docs ([50621](https://github.com/ray-project/ray/pull/50621))

🏗 Architecture refactoring:
- Update RLlib to use ray.tune imports instead of ray.air and ray.train ([49895](https://github.com/ray-project/ray/pull/49895))

Ray Serve
🎉 New Features:
- *Ray Serve LLM*: We are introducing a new module in Ray Serve to easily integrate open source LLMs in your Ray Serve deployment, currently marked as **alpha**. This opens up a powerful capability of composing complex applications with multiple LLMs, which is a use case in emerging applications like agentic workflows. Ray Serve LLM offers a couple of core components, including:
  - `VLLMService`: A prebuilt deployment that offers a full-featured vLLM engine integration, with support for features such as LoRA multiplexing and multimodal language models.
  - `LLMRouter`: An out-of-the-box OpenAI compatible model router that can route across multiple LLM deployments.
  - Documentation can be found at https://docs.ray.io/en/releases-2.43.0/serve/llm/overview.html

💫 Enhancements:
- Add `required_resources` to REST API ([50058](https://github.com/ray-project/ray/pull/50058))

🔨 Fixes:
- Fix batched requests hanging after cancellation ([50054](https://github.com/ray-project/ray/pull/50054))
- Properly propagate backpressure error ([50311](https://github.com/ray-project/ray/pull/50311))

RLlib
🎉 New Features:
- Added env vectorization support for multi-agent (new API stack). ([50437](https://github.com/ray-project/ray/pull/50437))

💫 Enhancements:
- APPO/IMPALA various acceleration efforts. Reached 100k ts/sec on Atari benchmark with 400 EnvRunners and 16 (multi-node) GPU Learners: [50760](https://github.com/ray-project/ray/pull/50760), [#50162](https://github.com/ray-project/ray/pull/50162), [#50249](https://github.com/ray-project/ray/pull/50249), [#50353](https://github.com/ray-project/ray/pull/50353), [#50368](https://github.com/ray-project/ray/pull/50368), [#50379](https://github.com/ray-project/ray/pull/50379), [#50440](https://github.com/ray-project/ray/pull/50440), [#50477](https://github.com/ray-project/ray/pull/50477), [#50527](https://github.com/ray-project/ray/pull/50527), [#50528](https://github.com/ray-project/ray/pull/50528), [#50600](https://github.com/ray-project/ray/pull/50600), [#50309](https://github.com/ray-project/ray/pull/50309)
- Offline RL:
  - Remove all weight synching to `eval_env_runner_group` from the training steps. ([50057](https://github.com/ray-project/ray/pull/50057))
  - Enable single-learner/multi-learner GPU training. ([50034](https://github.com/ray-project/ray/pull/50034))
  - Remove reference to MARWILOfflinePreLearner in `OfflinePreLearner` docstring. ([50107](https://github.com/ray-project/ray/pull/50107))
- Add metrics to multi-agent replay buffers. ([49959](https://github.com/ray-project/ray/pull/49959))

🔨 Fixes:
- Fix SPOT preemption tolerance for large AlgorithmConfig: Pass by reference to RolloutWorker ([50688](https://github.com/ray-project/ray/pull/50688))
- `on_workers/env_runners_recreated` callback would be called twice. ([50172](https://github.com/ray-project/ray/pull/50172))
- `default_resource_request`: aggregator actors missing in placement group for local Learner. ([50219](https://github.com/ray-project/ray/pull/50219), [#50475](https://github.com/ray-project/ray/pull/50475))

📖 Documentation:
- Docs re-do (new API stack):
  - Rewrite/enhance "getting started" rst page. ([49950](https://github.com/ray-project/ray/pull/49950))
  - Remove rllib-models.rst and fix broken html links. ([49966](https://github.com/ray-project/ray/pull/49966), [#50126](https://github.com/ray-project/ray/pull/50126))

Ray Core and Ray Clusters

Ray Core
💫 Enhancements:
- [Core] Enable users to configure python standard log attributes for structured logging (49871)
- [Core] Prestart worker with runtime env (49994)
- [compiled graphs] Support experimental_compile(_default_communicator=comm) (50023)
- [Core] ray.util.Queue Empty and Full exceptions extend queue.Empty and Full (50261)
- [Core] Initial port of Ray to Python 3.13 (47984)
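
The practical effect of the queue exception change is that handlers written against the standard library's `queue.Empty` and `queue.Full` also catch the Ray variants. The sketch below uses local stand-in classes (hypothetical, defined here so it runs without Ray installed) to show the inheritance relationship at work.

```python
import queue


# Stand-ins for Ray's queue exceptions; the point is only the base class.
class Empty(queue.Empty):
    pass


class Full(queue.Full):
    pass


def get_or_none(get_nowait):
    """Call get_nowait() and return None if it reports an empty queue."""
    try:
        return get_nowait()
    except queue.Empty:  # also catches subclasses such as Empty above
        return None


def always_empty():
    raise Empty()


print(get_or_none(always_empty))  # None
```

Code that already catches the Ray-specific exception classes keeps working, since a subclass is still caught by its own name.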

🔨 Fixes:
- [Core] Ignore stale ReportWorkerBacklogRequest (50280)
- [Core] Fix check failure due to negative available resource (50517)

Ray Clusters
📖 Documentation:
- Update the KubeRay docs to v1.3.0.

Ray Dashboard
🎉 New Features:
- Additional filters for job list page ([50283](https://github.com/ray-project/ray/pull/50283))

Thanks

Thank you to everyone who contributed to this release! 🥳
liuxsh9, justinrmiller, CheyuWu, 400Ping, scottsun94, bveeramani, bhmiller, tylerfreckmann, hefeiyun, pcmoritz, matthewdeng, dentiny, erictang000, gvspraveen, simonsays1980, aslonnie, shorbaji, LeoLiao123, justinvyu, israbbani, zcin, ruisearch42, khluu, kouroshHakha, sijieamoy, SergeCroise, raulchen, anson627, bluenote10, allenyin55, martinbomio, rueian, rynewang, owenowenisme, Betula-L, alexeykudinkin, crypdick, jujipotle, saihaj, EricWiener, kevin85421, MengjinYan, chris-ray-zhang, SumanthRH, chiayi, comaniac, angelinalg, kenchung285, tanmaychimurkar, andrewsykim, MortalHappiness, sven1977, richardliaw, omatthew98, fscnick, akyang-anyscale, cristianjd, Jay-ju, spencer-p, win5923, wxsms, stfp, letaoj, JDarDagran, jjyao, srinathk10, edoakes, vincent0426, dayshah, davidxia, DmitriGekhtman, GeneDer, HYLcool, gameofby, can-anyscale, ryanaoleary, eddyxu

2.42.1

Not secure

Ray Data

🔨 Fixes:

- Fixes incorrect assertion (50210)

2.42.0

Not secure

Ray Libraries

Ray Data
🎉 New Features:
- Added read_audio and read_video ([50016](https://github.com/ray-project/ray/pull/50016))

💫 Enhancements:
- Optimized multi-column groupbys ([45667](https://github.com/ray-project/ray/pull/45667))
- Included Ray user-agent in BigQuery client construction ([49922](https://github.com/ray-project/ray/pull/49922))

🔨 Fixes:
- Fixed bug that made read tasks non-deterministic ([49897](https://github.com/ray-project/ray/pull/49897))

🗑️ Deprecations:
- Deprecated num_rows_per_file in favor of min_rows_per_file ([49978](https://github.com/ray-project/ray/pull/49978))

Ray Train
💫 Enhancements:
- Add Train v2 user-facing callback interface (49819)
- Add TuneReportCallback for propagating intermediate Train results to Tune (49927)

Ray Tune
📖 Documentation:
- Fix BayesOptSearch docs (49848)

Ray Serve
💫 Enhancements:
- Cache metrics in replica and report on an interval ([49971](https://github.com/ray-project/ray/pull/49971))
- Cache expensive calls to inspect.signature ([49975](https://github.com/ray-project/ray/pull/49975))
- Remove extra pickle serialization for gRPCRequest ([49943](https://github.com/ray-project/ray/pull/49943))
- Shared LongPollClient for Routers ([48807](https://github.com/ray-project/ray/pull/48807))
- DeploymentHandle API is now stable ([49840](https://github.com/ray-project/ray/pull/49840))

🔨 Fixes:
- Fix batched requests hanging after request cancellation bug ([50054](https://github.com/ray-project/ray/pull/50054))

RLlib
💫 Enhancements:
- Add metrics to replay buffers. ([49822](https://github.com/ray-project/ray/pull/49822))
- Enhance node-failure tolerance (new API stack). ([50007](https://github.com/ray-project/ray/pull/50007))
- MetricsLogger cleanup throughput logic. ([49981](https://github.com/ray-project/ray/pull/49981))
- Split AddStates... connectors into 2 connector pieces (`AddTimeDimToBatchAndZeroPad` and `AddStatesFromEpisodesToBatch`) ([49835](https://github.com/ray-project/ray/pull/49835))

🔨 Fixes:
- Old API stack IMPALA/APPO: Re-introduce mixin-replay-buffer pass, even if `replay-ratio=0` (fixes a memory leak). ([49964](https://github.com/ray-project/ray/pull/49964))
- Fix MetricsLogger race conditions. ([49888](https://github.com/ray-project/ray/pull/49888))
- APPO/IMPALA: Bug fix for > 1 Learner actor. ([49849](https://github.com/ray-project/ray/pull/49849))

📖 Documentation:
- New MetricsLogger API rst page. ([49538](https://github.com/ray-project/ray/pull/49538))
- Move "new API stack" info box right below page titles for better visibility. ([49921](https://github.com/ray-project/ray/pull/49921))
- Add example script for how to log custom metrics in `training_step()`. ([49976](https://github.com/ray-project/ray/pull/49976))
- Enhance/redo autoregressive action distribution example. ([49967](https://github.com/ray-project/ray/pull/49967))
- Make the "tiny CNN" example RLModule run with APPO (by implementing `TargetNetAPI`) ([49825](https://github.com/ray-project/ray/pull/49825))

Ray Core and Ray Clusters

Ray Core
💫 Enhancements:
- Only get single node info rather than all when needed ([49727](https://github.com/ray-project/ray/pull/49727))
- Introduce with_tensor_transport API ([49753](https://github.com/ray-project/ray/pull/49753))

🔨 Fixes:
- Fix tqdm manager thread safety ([50040](https://github.com/ray-project/ray/pull/50040))

Ray Clusters
🔨 Fixes:
- Fix token expiration for ray autoscaler ([48481](https://github.com/ray-project/ray/pull/48481))

Thanks

Thank you to everyone who contributed to this release! 🥳
wingkitlee0, saihaj, win5923, justinvyu, kevin85421, edoakes, cristianjd, rynewang, richardliaw, LeoLiao123, alexeykudinkin, simonsays1980, aslonnie, ruisearch42, pcmoritz, fscnick, bveeramani, mattip, till-m, tswast, ujjawal-khare, wadhah101, nikitavemuri, akshay-anyscale, srinathk10, zcin, dayshah, dentiny, LydiaXwQ, matthewdeng, JoshKarpel, MortalHappiness, sven1977, omatthew98

2.41.0

Not secure

Highlights

- Major update of RLlib docs and example scripts for the new API stack.

Ray Libraries

Ray Data

🎉 New Features:

- Expression support for filters (49016)
- Support `partition_cols` in `write_parquet` (49411)
- Feature: implement multi-directional sort over Ray Data datasets (49281)
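
What "multi-directional" means for a sort can be shown with plain Python lists. The sketch below is not Ray Data's implementation, and the function name `multi_sort` is hypothetical; it only shows the semantics of sorting on several keys where each key has its own direction.

```python
from typing import Any, Dict, List

Row = Dict[str, Any]


def multi_sort(rows: List[Row], keys: List[str], descending: List[bool]) -> List[Row]:
    """Sort rows by several keys, each with its own direction."""
    if len(keys) != len(descending):
        raise ValueError("keys and descending must have the same length")
    result = list(rows)
    # Stable sorts applied from the least to the most significant key
    # yield the combined multi-key ordering.
    for key, desc in reversed(list(zip(keys, descending))):
        result.sort(key=lambda row: row[key], reverse=desc)
    return result


rows = [
    {"team": "a", "score": 1},
    {"team": "b", "score": 3},
    {"team": "a", "score": 2},
]
# Team ascending, then score descending within each team.
print(multi_sort(rows, ["team", "score"], [False, True]))
```

Here the two `"a"` rows come first with the higher score leading, followed by the `"b"` row.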

💫 Enhancements:

- Use dask 2022.10.2 (48898)
- Clarify schema validation error (48882)
- Raise `ValueError` when the data sort key is `None` (48969)
- Provide more informative messages when the webdataset format is invalid (48643)
- Upgrade Arrow version from 17 to 18 (48448)
- Update `hudi` version to 0.2.0 (48875)
- `webdataset`: expand JSON objects into individual samples (48673)
- Support passing kwargs to map tasks. (49208)
- Add `ExecutionCallback` interface (49205)
- Add seed for read files (49129)
- Make `select_columns` and `rename_columns` use Project operator (49393)

🔨 Fixes:

- Fix partial function name parsing in `map_groups` (48907)
- Always launch one task for `read_sql` (48923)
- Reimplement the pandas memory fix (48970)
- `webdataset`: flatten return args (48674)
- Handle `numpy > 2.0.0` behaviour in `_create_possibly_ragged_ndarray` (48064)
- Fix `DataContext` sealing for multiple datasets. (49096)
- Fix `to_tf` for `List` types (49139)
- Fix type mismatch error while mapping nullable column (49405)
- Datasink: support passing write results to `on_write_completes` (49251)
- Fix `groupby` hang when value contains `np.nan` (49420)
- Fix bug where `file_extensions` doesn't work with compound extensions (49244)
- Fix map operator fusion when concurrency is set (49573)
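
As an illustration of why compound extensions such as `tar.gz` need care, the sketch below compares a suffix-based check with `os.path.splitext`, which only sees the last suffix. `has_extension` is a hypothetical helper written for this example; it is not Ray's implementation and does not describe the actual cause of the bug fixed in 49244.

```python
import os


def has_extension(path, extensions):
    """Return True if the file name ends with any of the given extensions (hypothetical helper)."""
    name = os.path.basename(path).lower()
    # Normalize each extension to a single leading dot before comparing.
    return any(name.endswith("." + ext.lower().lstrip(".")) for ext in extensions)


# os.path.splitext returns only the final suffix, so comparing it against
# "tar.gz" would never match.
last_suffix = os.path.splitext("shard-0001.tar.gz")[1]

matched = has_extension("shard-0001.tar.gz", ["tar.gz"])
not_matched = has_extension("shard-0001.gz.bak", ["tar.gz"])
```

Matching on the full file-name suffix handles both simple and compound extensions with the same code path.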

Ray Train

🎉 New Features:

- Output JSON structured log files for system and application logs (49414)
- Add support for AMD ROCR_VISIBLE_DEVICES (49346)

💫 Enhancements:

- Implement Train Tune API Revamp REP (49376, 49467, 49317, 49522)

🏗 Architecture refactoring:

- LightGBM: Rewrite `get_network_params` implementation (49019)

Ray Tune

🎉 New Features:

- Update `optuna_search` to allow users to configure optuna storage (48547)

🏗 Architecture refactoring:

- Make changes to support Train Tune API Revamp REP (49308, 49317, 49519)

Ray Serve

💫 Enhancements:

- Improved request_id generation to reduce proxy CPU overhead (49537)
- Tune GC threshold by default in proxy (49720)
- Use `pickle.dumps` for faster serialization from `proxy` to `replica` (49539)

🔨 Fixes:

- Handle nested ‘=’ in serve run arguments (49719)
- Fix bug when `ray.init()` is called multiple times with different `runtime_envs` (49074)
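
The nested `=` case can be pictured with a small `key=value` parser. This is a minimal sketch under the assumption that arguments arrive as `key=value` strings; `parse_key_value_args` is a hypothetical function, not the code used by `serve run`.

```python
def parse_key_value_args(args):
    """Parse ["key=value", ...] into a dict, keeping '=' characters inside values."""
    parsed = {}
    for arg in args:
        if "=" not in arg:
            raise ValueError(f"expected key=value, got {arg!r}")
        # Splitting only on the first '=' leaves any later '=' in the value.
        key, value = arg.split("=", 1)
        parsed[key] = value
    return parsed


parsed = parse_key_value_args(["model=llama", "env=MODE=debug"])
```

With a plain `arg.split("=")` the second argument would split into three parts; limiting the split to the first `=` keeps `MODE=debug` intact as the value.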

🗑️ Deprecations:

- Adds a warning that the default behavior for sync methods will change in a future release. They will be run in a threadpool by default. You can opt into this behavior early by setting `RAY_SERVE_RUN_SYNC_IN_THREADPOOL=1`. (48897)
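
To show what running a sync method in a threadpool means in general, the sketch below uses only `asyncio` from the standard library. It is not Ray Serve code: `blocking_handler` and `heartbeat` are hypothetical names, and the timings are arbitrary. The point is that a blocking call offloaded to a thread pool no longer stalls the event loop.

```python
import asyncio
import time


def blocking_handler(x):
    """A synchronous function that blocks its thread for a while."""
    time.sleep(0.2)
    return x * 2


async def heartbeat(ticks):
    """Record timestamps while the event loop is free to run."""
    for _ in range(3):
        await asyncio.sleep(0.05)
        ticks.append(time.monotonic())


async def main():
    ticks = []
    loop = asyncio.get_running_loop()
    hb = asyncio.create_task(heartbeat(ticks))
    # Offload the blocking call to the default thread pool so the
    # event loop keeps servicing other coroutines in the meantime.
    result = await loop.run_in_executor(None, blocking_handler, 21)
    done_at = time.monotonic()
    await hb
    return result, ticks, done_at


result, ticks, done_at = asyncio.run(main())
```

Calling `blocking_handler(21)` directly inside `main` would instead hold the event loop for the full sleep, and the heartbeat could not tick until it returned. Per the note above, the early opt-in for Serve is the environment variable `RAY_SERVE_RUN_SYNC_IN_THREADPOOL=1`.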

RLlib

🎉 New Features:

- Add support for external Envs to the new API stack: new example script and custom TCP-capable EnvRunner. (49033)

💫 Enhancements:

- Offline RL:
  - Add sequence sampling to `EpisodeReplayBuffer`. (48116)
  - Allow incomplete `SampleBatch` data and fully compressed observations. (48699)
  - Add option to customize `OfflineData`. (49015)
  - Enable offline training without specifying an environment. (49041)
  - Various fixes: 48309, 49194, 49195
- APPO/IMPALA acceleration (new API stack):
  - Add support for `AggregatorActors` per Learner. (49284)
  - Auto-sleep time AND thread-safety for MetricsLogger. (48868)
  - Activate APPO cont. actions release- and CI tests (HalfCheetah-v1 and Pendulum-v1 new in `tuned_examples`). (49068)
- Add "burn-in" period setting to the training of stateful RLModules. (49680)
- Callbacks API: Add support for individual lambda-style callbacks. (49511)
- Other enhancements: 49687, 49714, 49693, 49497, 49800, 49098

📖 Documentation:

- New example scripts:
  - How to write a custom algorithm (VPG) from scratch. (49536)
  - How to customize an offline data pipeline. (49046)
  - GPUs on EnvRunners. (49166)
  - Hierarchical training. (49127)
  - Async gym vector env. (49527)
  - Other fixes and enhancements: 48988, 49071
- New/rewritten html pages:
  - Rewrite checkpointing page. (49504)
  - New scaling guide. (49528)
  - New callbacks page. (49513)
  - Rewrite `RLModule` page. (49387)
  - New AlgorithmConfig page and redo `package_ref` page for algo configs. (49464)
  - Rewrite offline RL page. (48818)
  - Rewrite "key concepts" rst page. (49398)
  - Rewrite RL environments pages. (49165, 48542)
  - Fixes and enhancements: 49465, 49037, 49304, 49428, 49474, 49399, 49713, 49518

🔨 Fixes:

- Add `on_episode_created` callback to SingleAgentEnvRunner. (49487)
- Fix `train_batch_size_per_learner` problems. (49715)
- Various other fixes: 48540, 49363, 49418, 49191

🏗 Architecture refactoring:

- RLModule: Introduce `Default[algo]RLModule` classes (49366, 49368)
- Remove RLlib dependencies from setup.py; add `ormsgpack` (49489)

🗑️ Deprecations:

- 49488, 49144

Ray Core and Ray Clusters

Ray Core

💫 Enhancements:

- Add `task_name`, `task_function_name` and `actor_name` in Structured Logging (48703)
- Support redis/valkey authentication with username (48225)
- Add v6e TPU Head Resource Autoscaling Support (48201)
- compiled graphs: Support all driver and actor read combinations (48963)
- compiled graphs: Add ASCII-based CG visualization (48315)
- compiled graphs: Add ray[cg] pip install option (49220)
- Allow uv cache at installation (49176)
- Support != Filter in GCS for Task State API (48983)
- compiled graphs: Add CPU-based NCCL communicator for development (48440)
- Support gcs and raylet log rotation (48952)
- compiled graphs: Support `nsight.nvtx` profiling (49392)

🔨 Fixes:

- autoscaler: Health check logs are not visible in the autoscaler container's stdout (48905)
- Only publish `WORKER_OBJECT_EVICTION` when the object is out of scope or manually freed (47990)
- autoscaler: Autoscaler doesn't scale up correctly when the KubeRay RayCluster is not in the goal state (48909)
- autoscaler: Fix incorrectly terminating nodes misclassified as idle in autoscaler v1 (48519)
- compiled graphs: Fix the missing dependencies when num_returns is used (49118)
- autoscaler: Fuse scaling requests together to avoid overloading the Kubernetes API server (49150)
- Fix bug to support S3 pre-signed url for `.whl` file (48560)
- Fix data race on gRPC client context (49475)
- Make sure draining node is not selected for scheduling (49517)

Ray Clusters

💫 Enhancements:

- Azure: Add a flag to enable accelerated networking on Azure VMs (47988)

📖 Documentation:

- Kuberay: Logging: Add Fluent Bit `DaemonSet` and Grafana Loki to "Persist KubeRay Operator Logs" (48725)
- Kuberay: Logging: Specify the Helm chart version in "Persist KubeRay Operator Logs" (48937)

Dashboard

💫 Enhancements:

- Add instance variable to many default dashboard graphs (49174)
- Display duration in milliseconds if under 1 second. (49126)
- Add `RAY_PROMETHEUS_HEADERS` env for carrying additional headers to Prometheus (49353)
- Document the `RAY_PROMETHEUS_HEADERS` env for carrying additional headers to Prometheus (49700)

🏗 Architecture refactoring:

- Move `memray` dependency from default to observability (47763)
- Move `StateHead`'s methods into free functions. (49388)

Thanks

raulchen, alanwguo, omatthew98, xingyu-long, tlinkin, yantzu, alexeykudinkin, andrewsykim, win5923, csy1204, dayshah, richardliaw, stephanie-wang, gueraf, rueian, davidxia, fscnick, wingkitlee0, KPostOffice, GeneDer, MengjinYan, simonsays1980, pcmoritz, petern48, kashiwachen, pfldy2850, zcin, scottjlee, Akhil-CM, Jay-ju, JoshKarpel, edoakes, ruisearch42, gorloffslava, jimmyxie-figma, bthananjeyan, sven1977, bnorick, jeffreyjeffreywang, ravi-dalal, matthewdeng, angelinalg, ivanthewebber, rkooo567, srinathk10, maresb, gvspraveen, akyang-anyscale, mimiliaogo, bveeramani, ryanaoleary, kevin85421, richardsliu, hartikainen, coltwood93, mattip, Superskyyy, justinvyu, hongpeng-guo, ArturNiederfahrenhorst, jecsand838, Bye-legumes, hcc429, WeichenXu123, martinbomio, HollowMan6, MortalHappiness, dentiny, zhe-thoughts, anyadontfly, smanolloff, richo-anyscale, khluu, xushiyan, rynewang, japneet-anyscale, jjyao, sumanthratna, saihaj, aslonnie

Many thanks to all those who contributed to this release!
