Ray

Latest version: v2.44.1


2.8.1

Release Highlights
The Ray 2.8.1 patch release contains fixes for the Ray Dashboard.

Additional context can be found here: https://www.anyscale.com/blog/update-on-ray-cves-cve-2023-6019-cve-2023-6020-cve-2023-6021-cve-2023-48022-cve-2023-48023

Ray Dashboard
🔨 Fixes:

[core][state][log] Cherry pick changes to prevent state API from reading files outside the Ray log directory (41520)
[Dashboard] Migrate Logs page to use state api. (41474) (41522)

2.8.0

Release Highlights
This release features stability improvements and API clean-ups across the Ray libraries.

- In Ray Serve, we are deprecating the previously experimental DAG API for deployment graphs. Model composition will be supported through [deployment handles](https://docs.ray.io/en/latest/serve/model_composition.html), providing more flexibility and stability. The previously deprecated Ray Serve 1.x APIs have also been removed. We’ve also added a new Java API that aligns with the Ray Serve 2.x APIs. More API changes in the release notes below.
- In RLlib, we’ve moved 24 algorithms into `rllib_contrib` (still available within RLlib for Ray 2.8).
- We’ve added support for PyTorch-compatible input file shuffling in Ray Data. This allows users to randomly shuffle input files for better model training accuracy. This release also features new Ray Data datasources for Databricks and BigQuery.
- On the Ray Dashboard, we’ve added new metrics for Ray Data in the Metrics tab. This allows users to monitor Ray Data workloads, including real-time metrics of cluster memory, CPU, GPU, output data size, etc. See [the doc](https://docs.ray.io/en/master/data/performance-tips.html#monitoring-your-application) for more details.
- Ray Core now supports profiling GPU tasks or actors using Nvidia Nsight. See [the documentation](https://docs.ray.io/en/master/ray-observability/user-guides/profiling.html?highlight=nsight#nsight-system-profiler) for instructions.
- We fixed 2 critical bugs raised by many KubeRay / ML library users: a child process leak from Ray workers that leaked GPU memory (40182), and excessive job page loading times when a Ray HA cluster restarts a head node (40742)
- Python 3.7 support is officially deprecated from Ray.

Ray Libraries
Ray Data
🎉 New Features:
- Add support for shuffling input files (40154)
- Support streaming read of PyTorch dataset (39554)
- Add BigQuery datasource (37380)
- Add Databricks table / SQL datasource (39852)
- Add inverse transform functionality to LabelEncoder (37785)
- Add function arg params to `Dataset.map` and `Dataset.flat_map` (40010)
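
File-level input shuffling (the first item above) conceptually amounts to reordering the list of input files with a seeded RNG before any reads happen. A minimal pure-Python sketch of the idea (this is not the Ray Data API itself):

```python
import random

def shuffle_files(files, seed=None):
    """Return a new list with the input file order randomized.

    A seeded shuffle keeps the order reproducible across runs while
    still decorrelating file order from on-disk layout.
    """
    shuffled = list(files)
    random.Random(seed).shuffle(shuffled)
    return shuffled

files = [f"part-{i:05d}.parquet" for i in range(8)]
print(shuffle_files(files, seed=42))
```

Seeding makes the shuffled order reproducible, which matters when comparing training runs.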

💫Enhancements:
- Hard deprecate `DatasetPipeline` (40129)
- Remove `BulkExecutor` code path (40200)
- Deprecate extraneous `Dataset` parameters and methods (40385)
- Remove legacy iteration code path (40013)
- Implement streaming output backpressure (40387)
- Cap op concurrency with exponential ramp-up (40275)
- Store ray dashboard metrics in `_StatsActor` (40118)
- Slice output blocks to respect target block size (40248)
- Drop columns before grouping by in `Dataset.unique()` (40016)
- Standardize physical operator runtime metrics (40173)
- Estimate blocks for limit and union operator (40072)
- Store bytes spilled/restored after plan execution (39361)
- Optimize `sample_boundaries` in `SortTaskSpec` (39581)
- Optimization to reduce ArrowBlock building time for blocks of size 1 (38833)
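
The exponential ramp-up of operator concurrency mentioned above can be pictured as a per-round cap that doubles until it hits the configured limit. A hypothetical sketch of that schedule (not the actual Ray Data scheduler code):

```python
def concurrency_caps(limit, start=1):
    """Yield per-round concurrency caps: start, 2*start, 4*start, ...
    clamped at `limit`. Ramping up gradually avoids launching the full
    set of tasks before per-task resource usage is known."""
    cap = start
    while cap < limit:
        yield cap
        cap *= 2
    yield limit

print(list(concurrency_caps(20)))  # [1, 2, 4, 8, 16, 20]
```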

🔨 Fixes:
- Fix bug where `_StatsActor` errors with `PandasBlock` (40481)
- Remove deprecated `do_write` (40422)
- Improve error message when reading HTTP files (40462)
- Add flag to skip `get_object_locations` for metrics (39884)
- Fall back to fetch files info in parallel for multiple directories (39592)
- Replace deprecated `.pieces` with updated `.fragments` (39523)
- Backwards compatibility for `Preprocessor`s that have been fit in older versions (39173)
- Remove unnecessary data copy in `convert_udf_returns_to_numpy` (39188)
- Do not eagerly free root `RefBundles` (39016)

📖Documentation:
- Remove out-of-date Data examples (40127)
- Remove unused and outdated source examples (40271)

Ray Train
🎉 New Features:
- Add initial support for scheduling workers on neuron_cores (39091)

💫Enhancements:
- Update PyTorch Lightning import path to support both `pytorch_lightning` and `lightning` (39841, 40266)
- Propagate driver `DataContext` to `RayTrainWorkers` (40116)

🔨 Fixes:
- Fix error propagation for as_directory if to_directory fails (40025)

📖Documentation:
- Update checkpoint hierarchy documentation for RayTrainReportCallbacks. (40174)
- Update Lightning RayDDPStrategy docstring (40376)

🏗 Architecture refactoring:
- Deprecate `LightningTrainer`, `AccelerateTrainer`, `TransformersTrainer` (40163)
- Clean up legacy persistence mode code paths (39921, 40061, 40069, 40168)
- Deprecate legacy `DatasetConfig` (39963)
- Remove references to `DatasetPipeline` (40159)
- Enable isort (40172)

Ray Tune
💫Enhancements:
- Separate storage checkpoint index bookkeeping (39927, 40003)
- Raise an error if `Tuner.restore()` is called on an instance (39676)
🏗 Architecture refactoring:
- Clean up legacy persistence mode code paths (39918, 40061, 40069, 40168, 40175, 40192, 40181, 40193)
- Migrate TuneController tests (39704)
- Remove TuneRichReporter (40169)
- Remove legacy Ray Client tests (40415)

Ray Serve
💫Enhancements:
- The single-app configuration format for the [Serve Config](https://docs.ray.io/en/master/serve/production-guide/config.html#serve-in-production-config-file) (i.e. the Serve Config without the ‘applications’ field) has been deprecated in favor of the new configuration format.
Both single-app configuration and DAG API will be removed in 2.9.
- The Serve REST API is now accessible through the dashboard port, which defaults to `8265`.
- Accessing the Serve REST API through the dashboard agent port (default `52365`) is deprecated. The support will be removed in a future version.
- Ray job error tracebacks are now logged in the job driver log for easier access when jobs fail during start up.
- Deprecated single-application config file
- Deprecated DAG API: `InputNode` and `DAGDriver`
- Removed deprecated Deployment 1.x APIs: `Deployment.deploy()`, `Deployment.delete()`, `Deployment.get_handle()`
- Removed deprecated 1.x API: `serve.get_deployment` and `serve.list_deployments`
- New Java API supported (aligns with Ray Serve 2.x API)

🔨 Fixes:
- The `dedicated_cpu` and `detached` options in `serve.start()` have been fully disallowed.
- An error is now raised early when users pass invalid gRPC service functions.
- The proxy’s readiness check now uses a linear backoff to avoid getting stuck in an infinite loop if it takes longer than usual to start.
- `grpc_options` on `serve.start()` only allowed a `gRPCOptions` object in Ray 2.7.0. Dictionaries are now also allowed as `grpc_options` in the `serve.start()` call.
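
A readiness check with linear backoff, as described in the proxy fix above, waits a little longer after each failed probe instead of retrying at a fixed rate. A simplified sketch, assuming a generic `is_ready` callable (not Serve's actual health-check code):

```python
import time

def wait_until_ready(is_ready, base_delay=0.1, step=0.1, max_delay=1.0, timeout=5.0):
    """Poll `is_ready()` with linearly increasing delays until it
    returns True or `timeout` elapses. Returns False on timeout
    instead of looping forever."""
    delay = base_delay
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if is_ready():
            return True
        time.sleep(delay)
        delay = min(delay + step, max_delay)
    return False
```

Bounding the total wait with a deadline is what prevents the infinite-loop behavior the fix describes.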

RLlib
💫Enhancements:
- `rllib_contrib` algorithms (A2C, A3C, AlphaStar 36584, AlphaZero 36736, ApexDDPG 36596, ApexDQN 36591, ARS 36607, Bandits 36612, CRR 36616, DDPG, DDPPO 36620, Dreamer(V1), DT 36623, ES 36625, LeelaChessZero 36627, MA-DDPG 36628, MAML, MB-MPO 36662, PG 36666, QMix 36682, R2D2, SimpleQ 36688, SlateQ 36710, and TD3 36726) all produce warnings now if used. See [here](https://github.com/ray-project/ray/tree/master/rllib_contrib#rllib-contrib) for more information on the `rllib_contrib` efforts. (36620, 36628, 3
- Provide msgpack checkpoint translation utility to convert checkpoint into msgpack format for being able to move in between python versions (38825).

🔨 Fixes:
- Issue 35440 (JSON output writer should include INFOS 39632)
- Issue 39453 (PettingZoo wrappers should use correct multi-agent dict spaces 39459)
- Issue 39421 (Multi-discrete action spaces not supported in new stack 39534)
- Issue 39234 (Multi-categorical distribution bug 39464)
- Other fixes: 39654, 35975, 39552, 38555
Ray Core and Ray Clusters
Ray Core
🎉 New Features:
- Python 3.7 support is officially deprecated from Ray.
- Supports profiling GPU tasks or actors using Nvidia Nsight. See [the doc](https://docs.ray.io/en/master/ray-observability/user-guides/profiling.html?highlight=nsight#nsight-system-profiler) for instructions.
- Ray on Spark autoscaling is officially supported from Ray 2.8. See the [REP](https://github.com/ray-project/enhancements/pull/43) for more details.
💫Enhancements:
- Detailed IDLE node information is available from `ray status -v` (39638)
- Adding a new accelerator to Ray is simplified with a new accelerator interface. See the in-flight [REP](https://github.com/ray-project/enhancements/blob/9936d231b4403cdbceb754a674395ffcf9a586e5/reps/2023-10-13-accelerator-support.md) for more details (#40286).
- `typing_extensions` is removed from the dependency requirements because Python 3.7 support is deprecated. (40336)
- Ray state API supports case-insensitive match. (34577)
- `ray start --runtime-env-agent-port` is officially supported. (39919)
- Driver exit code is available from job info (39675)

🔨 Fixes:
- Fixed a worker leak when Ray is used with placement groups because Ray didn’t handle SIGTERM properly (40182)
- Fixed an issue where the job page takes a very long time to load when a Ray HA cluster restarts a head node (40431)
- [core] loosen the check on release object (39570)
- [Core] ray init sigterm (39816)
- [Core] Non Unit Instance fractional value fix (39293)
- [Core]: Enable get_actor_name for actor runtime context (39347)
- [core][streaming][python] Fix asyncio.wait coroutines args deprecated warnings 40292

📖Documentation:
- The Ray streaming generator doc (alpha) is officially available at https://docs.ray.io/en/master/ray-core/ray-generator.html

Ray Clusters
💫Enhancements:
- Enable GPU support for vSphere cluster launcher (40667)

📖Documentation:
- Setup RBAC by KubeRay Helm chart
- KubeRay upgrade documentation
- RayService high availability

🔨 Fixes:
- Assorted fixes for vSphere cluster launcher (40487, 40516, 40655)

Dashboard
🎉 New Features:
- New metrics for Ray Data can be found in the Metrics tab.
🔨 Fixes:
- Fix bug where download log button did not download all logs for actors.

Thanks
Many thanks to all who contributed to this release!

scottjlee, chappidim, alexeykudinkin, ArturNiederfahrenhorst, stephanie-wang, chaowanggg, peytondmurray, maxpumperla, arvind-chandra, iycheng, JalinWang, matthewdeng, wfangchi, z4y1b2, alanwguo, Zandew, kouroshHakha, justinvyu, yuanchen8911, vitsai, hongchaodeng, allenwang28, caozy623, ijrsvt, omus, larrylian, can-anyscale, joncarter1, ericl, lejara, jjyao, Ox0400, architkulkarni, edoakes, raulchen, bveeramani, sihanwang41, WeichenXu123, zcin, Codle, dimakis, simonsays1980, cadedaniel, angelinalg, luv003, JingChen23, xwjiang2010, rynewang, Yicheng-Lu-llll, scrivy, michaelhly, shrekris-anyscale, xxnwj, avnishn, woshiyyya, aslonnie, amogkam, krfricke, pcmoritz, liuyang-my, jonathan-anyscale, rickyyx, scottsun94, richardliaw, rkooo567, stefanbschneider, kevin85421, c21, sven1977, GeneDer, matthew29tang, RocketRider, LaynePeng, samhallam-reverb, scv119, huchen2021

2.7.1

Release Highlights



* Ray Serve:
    * Added an `application` tag to the `ray_serve_num_http_error_requests` metric
    * Fixed a bug where no data shows up on the `Error QPS per Application` panel in the Ray Dashboard
* RLlib:
    * DreamerV3: Bug fix enabling support for continuous actions.
* Ray Train:
    * Fixed a bug where setting a local storage path on Windows errors ([39951](https://github.com/ray-project/ray/pull/39951))
* Ray Tune:
    * Fixed a broken `Trial.node_ip` property ([40028](https://github.com/ray-project/ray/pull/40028))
* Ray Core:
    * Fixed a segfault when a streaming generator and actor cancellation are used together
    * Fixed the autoscaler SDK accidentally initializing a Ray worker, leading to a leaked driver showing up in the dashboard.
    * Added a new user guide and fixes for the vSphere cluster launcher.
    * Fixed a bug where `ray start` would occasionally fail with `ValueError: acceleratorType should match v(generation)-(cores/chips)`.
* Dashboard:
    * Improvements to the cluster page UI
    * Fixed a bug where the overview page UI would crash
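
For the `acceleratorType` error mentioned under Ray Core, the expected format `v(generation)-(cores/chips)` (e.g. TPU types like `v4-8`) can be expressed as a regex. A hypothetical validator for illustration only, not Ray's actual check:

```python
import re

# Matches "v<generation>-<cores/chips>", e.g. "v2-8" or "v4-16".
# The optional letter suffix allows variants like "v5e" (an assumption,
# not taken from Ray's source).
ACCELERATOR_TYPE_RE = re.compile(r"^v(\d+)([a-z]*)-(\d+)$")

def is_valid_accelerator_type(value: str) -> bool:
    return ACCELERATOR_TYPE_RE.match(value) is not None

print(is_valid_accelerator_type("v4-8"))    # True
print(is_valid_accelerator_type("tpu-v4"))  # False
```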


Ray Libraries


Ray Serve

🔨 Fixes:



* Fixed a bug where no data shows up on the `Error QPS per Application` panel in the Ray Dashboard


RLlib

🔨 Fixes:



* DreamerV3: Bug fix enabling support for continuous actions ([39751](https://github.com/ray-project/ray/issues/39751)).


Ray Core and Ray Clusters

🔨 Fixes:



* Fixed Ray cluster stability on a high latency environment

Thanks

Many thanks to all those who contributed to this release!

chaowanggg, allenwang28, shrekris-anyscale, GeneDer, justinvyu, can-anyscale, edoakes, architkulkarni, rkooo567, rynewang, rickyyx, sven1977

2.7.0

Release Highlights

The Ray 2.7 release brings important stability improvements and enhancements to the Ray libraries, with Ray Train and Ray Serve becoming generally available. Ray 2.7 is accompanied by a GA release of KubeRay.



* Following user feedback, we are rebranding “Ray AI Runtime (AIR)” to “Ray AI Libraries”. Without reducing any of the underlying functionality of the original Ray AI runtime vision as put forth in Ray 2.0, the underlying namespace (ray.air) is consolidated into ray.data, ray.train, and ray.tune. This change reduces the friction for new machine learning (ML) practitioners to quickly understand and leverage Ray for their production machine learning use cases.
* With this release, Ray Serve and Ray Train’s PyTorch support are becoming Generally Available -- indicating that the core APIs have been marked stable and that both libraries have undergone significant production hardening.
* In Ray Serve, we are introducing a new backwards-compatible `DeploymentHandle` API to unify various existing Handle APIs, a high-performance gRPC proxy to serve gRPC requests through Ray Serve, along with various stability and usability improvements.
* In Ray Train, we are consolidating various PyTorch-based trainers into the TorchTrainer, reducing the amount of refactoring work new users need to scale existing training scripts. We are also introducing a new train.Checkpoint API, which provides a consolidated way of interacting with remote and local storage, along with various stability and usability improvements.
* In Ray Core, we’ve added initial integrations with TPUs and AWS accelerators, enabling Ray to natively detect these devices and schedule tasks/actors onto them. Ray Core also officially now supports actor task cancellation and has an experimental streaming generator that supports streaming response to the caller.

Take a look at our [refreshed documentation](https://docs.ray.io/en/releases-2.7.0) and the [Ray 2.7 migration guide](https://docs.google.com/document/d/1J-09US8cXc-tpl2A1BpOrlHLTEDMdIJp6Ah1ifBUw7Y/view#heading=h.3eeweptnwn6p) and let us know your feedback!


Ray Libraries


Ray AIR

🏗 Architecture refactoring:



* **Ray AIR namespace**: We are sunsetting the "Ray AIR" concept and namespace (39516, 38632, 38338, 38379, 37123, 36706, 37457, 36912, 37742, 37792, 37023). The changes follow the proposal outlined in [this REP](https://github.com/ray-project/enhancements/pull/36).
* **Ray Train Preprocessors, Predictors**: We now recommend using Ray Data instead of Preprocessors (38348, 38518, 38640, 38866) and Predictors (38209).


Ray Data

🎉 New Features:



* In this release, we’ve integrated the Ray Core streaming generator API by default, which allows us to reduce memory footprint throughout the data pipeline (37736).
* Avoid unnecessary data buffering between `Read` and `Map` operator (zero-copy fusion) (38789)
* Add `Dataset.write_images` to write images (38228)
* Add `Dataset.write_sql()` to write SQL databases (38544)
* Support sort on multiple keys (37124)
* Support reading and writing JSONL file format (37637)
* Support class constructor args for `Dataset.map()` and `flat_map()` (38606)
* Implement streamed read from Hugging Face Dataset (38432)
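
Among the new features above is JSONL support; the format itself is just one JSON object per line, which a short pure-Python sketch makes concrete (this is not Ray Data's datasource implementation):

```python
import io
import json

def write_jsonl(rows, fp):
    """Write each row dict as one JSON object per line."""
    for row in rows:
        fp.write(json.dumps(row) + "\n")

def read_jsonl(fp):
    """Parse one row dict per non-empty line."""
    return [json.loads(line) for line in fp if line.strip()]

buf = io.StringIO()
write_jsonl([{"id": 1, "x": "a"}, {"id": 2, "x": "b"}], buf)
buf.seek(0)
print(read_jsonl(buf))  # [{'id': 1, 'x': 'a'}, {'id': 2, 'x': 'b'}]
```

Because each line is an independent record, JSONL files can be split and read in parallel, which is what makes the format a natural fit for a distributed datasource.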

💫Enhancements:



* Read data with multi-threading for `FileBasedDataSource` (39493)
* Optimization to reduce `ArrowBlock` building time for blocks of size 1 (38988)
* Add `partition_filter` parameter to `read_parquet` (38479)
* Apply limit to `Dataset.take()` and related methods (38677)
* Postpone `reader.get_read_tasks` until execution (38373)
* Lazily construct metadata providers (38198)
* Support writing each block to a separate file (37986)
* Make `iter_batches` an Iterable (37881)
* Remove default limit on `Dataset.to_pandas()` (37420)
* Add `Dataset.to_dask()` parameter to toggle consistent metadata check (37163)
* Add `Datasource.on_write_start` (38298)
* Remove support for `DatasetDict` as input into `from_huggingface()` (37555)

🔨 Fixes:



* Backwards compatibility for `Preprocessor`s that have been fit in older versions (39488)
* Do not eagerly free root `RefBundles` (39085)
* Retry open files with exponential backoff (38773)
* Avoid passing `local_uri` to all non-Parquet data sources (38719)
* Add `ctx` parameter to `Datasource.write` (38688)
* Preserve block format on `map_batches` over empty blocks (38161)
* Fix args and kwargs passed to `ActorPool` `map_batches` (38110)
* Add `tif` file extension to `ImageDatasource` (38129)
* Raise error if PIL can't load image (38030)
* Allow automatic handling of string features as byte features during TFRecord serialization (37995)
* Remove unnecessary file system wrapping (38299)
* Remove `_block_udf` from `FileBasedDatasource` reads (38111)
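
The "retry open files with exponential backoff" fix above follows a common pattern: retry the operation with delays that double per attempt, plus jitter. A generic sketch under the assumption of an arbitrary `open_fn` callable, not Ray Data's code:

```python
import random
import time

def open_with_retries(open_fn, attempts=5, base=0.1, cap=2.0):
    """Call `open_fn()` and retry on OSError, sleeping roughly
    base * 2**i seconds (with jitter, capped at `cap`) between
    attempts. Re-raises the last error once attempts are exhausted."""
    for i in range(attempts):
        try:
            return open_fn()
        except OSError:
            if i == attempts - 1:
                raise
            delay = min(cap, base * (2 ** i))
            time.sleep(delay * random.uniform(0.5, 1.0))
```

The jitter factor spreads out retries from many concurrent read tasks so they do not hammer the storage backend in lockstep.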

📖Documentation:



* Standardize API references (37015, 36980, 37007, 36982, etc)


Ray Train

🤝 API Changes



* **Ray Train and Ray Tune Checkpoints:** Introduced a new `train.Checkpoint` class that unifies interaction with remote storage such as S3, GS, and HDFS. The changes follow the proposal in [[REP35] Consolidated persistence API for Ray Train/Tune](https://github.com/ray-project/enhancements/pull/35) (#38452, 38481, 38581, 38626, 38864, 38844)
* **Ray Train with PyTorch Lightning:** Moving away from the LightningTrainer in favor of the TorchTrainer as the recommended way of running distributed PyTorch Lightning. The changes follow the proposal outlined in [[REP37] [Train] Unify Torch based Trainers on the TorchTrainer API](https://github.com/ray-project/enhancements/pull/37) (#37989)
* **Ray Train with Hugging Face Transformers/Accelerate:** Moving away from the TransformersTrainer/AccelerateTrainer in favor of the TorchTrainer as the recommended way of running distributed Hugging Face Transformers and Accelerate. The changes follow the proposal outlined in [[REP37] [Train] Unify Torch based Trainers on the TorchTrainer API](https://github.com/ray-project/enhancements/pull/37) (#38083, 38295)
* Deprecated `preprocessor` arg to `Trainer` (38640)
* Removed deprecated `Result.log_dir` (38794)

💫Enhancements:



* Various improvements and fixes for the console output of Ray Train and Tune (37572, 37571, 37570, 37569, 37531, 36993)
* Raise actionable error message for missing dependencies (38497)
* Use posix paths throughout library code (38319)
* Group consecutive workers by IP (38490)
* Split all Ray Datasets by default (38694)
* Add static Trainer methods for getting tree-based models (38344)
* Don't set rank-specific local directories for Train workers (38007)
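
"Group consecutive workers by IP" can be illustrated with `itertools.groupby`, which merges only adjacent runs of equal keys. A simplified sketch with hypothetical `(ip, rank)` records rather than real Train worker handles:

```python
from itertools import groupby

# Hypothetical (ip, worker_rank) records; real Train workers carry more state.
workers = [
    ("10.0.0.1", 0), ("10.0.0.1", 1),
    ("10.0.0.2", 2), ("10.0.0.2", 3),
    ("10.0.0.1", 4),
]

# groupby only merges *consecutive* runs, so the second run of 10.0.0.1
# stays a separate group, preserving rank ordering within each node.
groups = [(ip, [rank for _, rank in grp])
          for ip, grp in groupby(workers, key=lambda w: w[0])]
print(groups)  # [('10.0.0.1', [0, 1]), ('10.0.0.2', [2, 3]), ('10.0.0.1', [4])]
```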

🔨 Fixes:



* Fix trainer restoration from S3 (38251)

🏗 Architecture refactoring:



* Updated internal usage of the new Checkpoint API (38853, 38804, 38697, 38695, 38757, 38648, 38598, 38617, 38554, 38586, 38523, 38456, 38507, 38491, 38382, 38355, 38284, 38128, 38143, 38227, 38141, 38057, 38104, 37888, 37991, 37962, 37925, 37906, 37690, 37543, 37475, 37142, 38855, 38807, 38818, 39515, 39468, 39368, 39195, 39105, 38563, 38770, 38759, 38767, 38715, 38709, 38478, 38550, 37909, 37613, 38876, 38868, 38736, 38871, 38820, 38457)

📖Documentation:



* Restructured the Ray Train documentation to make it easier to find relevant content (37892, 38287, 38417, 38359)
* Improved examples, references, and navigation items (38049, 38084, 38108, 37921, 38391, 38519, 38542, 38541, 38513, 39510, 37588, 37295, 38600, 38582, 38276, 38686, 38537, 38237, 37016)
* Removed outdated examples (38682, 38696, 38656, 38374, 38377, 38441, 37673, 37657, 37067)


Ray Tune

🤝 API Changes



* **Ray Train and Ray Tune Checkpoints:** Introduced a new `train.Checkpoint` class that unifies interaction with remote storage such as S3, GS, and HDFS. The changes follow the proposal in [[REP35] Consolidated persistence API for Ray Train/Tune](https://github.com/ray-project/enhancements/pull/35) (#38452, 38481, 38581, 38626, 38864, 38844)
* Removed deprecated `Result.log_dir` (38794)

💫Enhancements:



* Various improvements and fixes for the console output of Ray Train and Tune (37572, 37571, 37570, 37569, 37531, 36993)
* Raise actionable error message for missing dependencies (38497)
* Use posix paths throughout library code (38319)
* Improved the PyTorchLightning integration (38883, 37989, 37387, 37400)
* Improved the XGBoost/LightGBM integrations (38558, 38828)

🔨 Fixes:



* Fix hyperband r calculation and stopping (39157)
* Replace deprecated np.bool8 (38495)
* Miscellaneous refactors and fixes (38165, 37506, 37181, 37173)

🏗 Architecture refactoring:



* Updated internal usages of the new Checkpoint API (38853, 38804, 38697, 38695, 38757, 38648, 38598, 38617, 38554, 38586, 38523, 38456, 38507, 38491, 38382, 38355, 38284, 38128, 38143, 38227, 38141, 38057, 38104, 37888, 37991, 37962, 37925, 37906, 37690, 37543, 37475, 37142, 38855, 38807, 38818, 39515, 39468, 39368, 39195, 39105, 38563, 38770, 38759, 38767, 38715, 38709, 38478, 38550, 37909, 37613, 38876, 38868, 38736, 38871, 38820, 38457)
* Removed legacy TrialRunner/Executor (37927)


Ray Serve

🎉 New Features:



* Added `keep_alive_timeout_s` to the Serve config file to allow users to configure how long the HTTP proxy keeps idle connections alive when no requests are ongoing.
* Added a gRPC proxy to serve gRPC requests through Ray Serve. It has feature parity with HTTP while offering better performance, and replaces the previous experimental gRPC direct ingress.
* Ray 2.7 introduces a new `DeploymentHandle` API that will replace the existing `RayServeHandle` and `RayServeSyncHandle` APIs in a future release. You are encouraged to migrate to the new API to avoid breakages in the future. To opt in, either use `handle.options(use_new_handle_api=True)` or set the global environment variable `export RAY_SERVE_ENABLE_NEW_HANDLE_API=1`. See https://docs.ray.io/en/latest/serve/model_composition.html for more details.
* Added a new API `get_app_handle` that gets a handle used to send requests to an application. The API uses the new `DeploymentHandle` API.
* Added a new developer API `get_deployment_handle` that gets a handle that can be used to send requests to any deployment in any application.
* Added replica placement group support.
* Added a new API `serve.status` which can be used to get the status of proxies and Serve applications (and their deployments and replicas). This is the pythonic equivalent of the CLI `serve status`.
* A `--reload` option has been added to the `serve run` CLI.
* Support the `X-Request-ID` HTTP header

💫Enhancements:



* Downstream handlers will now be canceled when the HTTP client disconnects or an end-to-end timeout occurs.
* Ray Serve is now “generally available,” so the core APIs have been marked stable.
* `serve.start` and `serve.run` have a few small changes and deprecations in preparation for this, see [https://docs.ray.io/en/latest/serve/api/index.html](https://docs.ray.io/en/latest/serve/api/index.html) for details.
* Added a new metric (`ray_serve_num_ongoing_http_requests`) to track the number of ongoing requests in each proxy
* Add `RAY_SERVE_MULTIPLEXED_MODEL_ID_MATCHING_TIMEOUT_S` flag to wait for model matching.
* Reduce the publish interval for multiplexed model ID information.
* Add multiplex metrics to the dashboard
* Added metrics to track controller restarts and control loop progress:
    * [https://github.com/ray-project/ray/pull/38177](https://github.com/ray-project/ray/pull/38177)
    * [https://github.com/ray-project/ray/pull/38000](https://github.com/ray-project/ray/pull/38000)
* Various stability, flexibility, and performance enhancements to Ray Serve’s autoscaling:
    * [https://github.com/ray-project/ray/pull/38107](https://github.com/ray-project/ray/pull/38107)
    * [https://github.com/ray-project/ray/pull/38034](https://github.com/ray-project/ray/pull/38034)
    * [https://github.com/ray-project/ray/pull/38267](https://github.com/ray-project/ray/pull/38267)
    * [https://github.com/ray-project/ray/pull/38349](https://github.com/ray-project/ray/pull/38349)
    * [https://github.com/ray-project/ray/pull/38351](https://github.com/ray-project/ray/pull/38351)

🔨 Fixes:



* Fixed a memory leak in Serve components by upgrading gRPC: [https://github.com/ray-project/ray/issues/38591](https://github.com/ray-project/ray/issues/38591).
* Fixed a memory leak due to `asyncio.Event`s not being removed in the long poll host: [https://github.com/ray-project/ray/pull/38516](https://github.com/ray-project/ray/pull/38516).
* Fixed a bug where bound deployments could not be passed within custom objects: [https://github.com/ray-project/ray/issues/38809](https://github.com/ray-project/ray/issues/38809).
* Fixed a bug where all replica handles were unnecessarily broadcasted to all proxies every minute: [https://github.com/ray-project/ray/pull/38539](https://github.com/ray-project/ray/pull/38539).
* Fixed a bug where `ray_serve_deployment_queued_queries` wouldn’t decrement when clients disconnected: [https://github.com/ray-project/ray/pull/37965](https://github.com/ray-project/ray/pull/37965).

📖Documentation:



* Added docs for how to use keep_alive_timeout_s in the Serve config file.
* Added usage and examples for serving gRPC requests through Serve’s gRPC proxy.
* Added example for passing deployment handle responses by reference.
* Added a Ray Serve Autoscaling guide to the Ray Serve docs that goes over basic configurations and autoscaling examples. Also added an Advanced Ray Serve Autoscaling guide that goes over more advanced configurations and autoscaling examples.
* Added docs explaining how to debug memory leaks in Serve.
* Added docs that explain how Serve cancels disconnected requests and how to handle those disconnections.


RLlib

🎉 New Features:



* In Ray RLlib, we have implemented Google’s new [DreamerV3](https://github.com/ray-project/ray/tree/master/rllib/algorithms/dreamerv3), a sample-efficient, model-based, and hyperparameter hassle-free algorithm. It solves a wide variety of challenging reinforcement learning environments out-of-the-box (e.g. the MineRL diamond challenge), for arbitrary observation- and action-spaces as well as dense and sparse reward functions.

💫Enhancements:



* Added support for Gymnasium 0.28.1 ([35698](https://github.com/ray-project/ray/pull/35698))
* Dreamer V3 tuned examples and support for “XL” Dreamer models ([38461](https://github.com/ray-project/ray/pull/38461))
* Added an action masking example for RL Modules ([38095](https://github.com/ray-project/ray/pull/38095))

🔨 Fixes:



* Multiple fixes to DreamerV3 ([37979](https://github.com/ray-project/ray/pull/37979)) ([#38259](https://github.com/ray-project/ray/pull/38259)) ([#38461](https://github.com/ray-project/ray/pull/38461)) ([#38981](https://github.com/ray-project/ray/pull/38981))
* Fixed TorchBinaryAutoregressiveDistribution.sampled_action_logp() returning probs not log probs. ([37240](https://github.com/ray-project/ray/pull/37240))
* Fix a bug in Multi-Categorical distribution. It should use logp and not log_p. ([36814](https://github.com/ray-project/ray/pull/36814))
* Index tensors in slate epsilon greedy properly so SlateQ does not fail on multiple GPUs ([37481](https://github.com/ray-project/ray/pull/37481))
* Removed excessive deprecation warnings in exploration related files ([37404](https://github.com/ray-project/ray/pull/37404))
* Fixed missing agent index in policy input dict on environment reset ([37544](https://github.com/ray-project/ray/pull/37544))

📖Documentation:



* Added docs for DreamerV3 ([37978](https://github.com/ray-project/ray/pull/37978))
* Added docs on torch.compile usage ([37252](https://github.com/ray-project/ray/pull/37252))
* Added docs for the Learner API ([37729](https://github.com/ray-project/ray/pull/37729))
* Improvements to Catalogs and RL Modules docs + Catalogs improvements ([37245](https://github.com/ray-project/ray/pull/37245))
* Extended our metrics and callbacks example to showcase how to do custom summarisation on custom metrics ([37292](https://github.com/ray-project/ray/pull/37292))


Ray Core and Ray Clusters


Ray Core

🎉 New Features:
* [Actor task cancelation](https://docs.ray.io/en/master/ray-core/actors.html#cancelling-actor-tasks) is officially supported.
* The experimental streaming generator is now available. The yielded output is sent to the caller before the task finishes, which overcomes the [limitation of the `num_returns="dynamic"` generator](https://docs.ray.io/en/latest/ray-core/tasks/generators.html#limitations). The API can be used by specifying `num_returns="streaming"`. It is already used by Ray Data and Ray Serve to support streaming use cases. [See the test script](https://github.com/ray-project/ray/blob/0d6bc79bbba400e91346a021279501e05940b51e/python/ray/tests/test_streaming_generator.py#L123) to learn how to use the API. The documentation will be available in a few days.
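
The difference from `num_returns="dynamic"` is that each yielded value reaches the caller immediately rather than only after the task completes. A plain Python generator shows the consumer-side interleaving that streaming enables (this stand-in involves no Ray at all):

```python
events = []

def producer(n):
    """Yield n items, recording when each is produced."""
    for i in range(n):
        events.append(f"produced {i}")
        yield i

# The consumer handles each item before the producer makes the next one,
# which is the behavior a streaming generator gives across task boundaries.
for item in producer(3):
    events.append(f"consumed {item}")

print(events)
# ['produced 0', 'consumed 0', 'produced 1', 'consumed 1', 'produced 2', 'consumed 2']
```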

💫Enhancements:
* Minimal Ray installation `pip install ray` doesn't require the Python grpcio dependency anymore.
* [Breaking change] `ray job submit` now exits with `1` if the job fails instead of `0`. To get the old behavior back, you may use `ray job submit ... || true` . ([38390](https://github.com/ray-project/ray/pull/38390))
* [Breaking change] `get_assigned_resources` in placement groups will return the name of the original resources instead of the formatted name (37421)
* [Breaking change] Every env var specified via `${ENV_VAR}` can now be replaced. Previous versions only supported a limited number of env vars. (36187)
* [Java] Update Guava package (38424)
* [Java] Update Jackson Databind XML Parsing (38525)
* [Spark] Allow specifying CPU / GPU / memory resources for the head node of a Ray cluster on Spark (38056)
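
The `${ENV_VAR}` replacement described above can be sketched as a regex substitution over a string using the process environment; a generic illustration, not Ray's implementation:

```python
import os
import re

_ENV_PATTERN = re.compile(r"\$\{(\w+)\}")

def expand_env_vars(text, env=None):
    """Replace every ${NAME} with its value from `env` (default
    os.environ), leaving unknown names untouched."""
    env = os.environ if env is None else env
    return _ENV_PATTERN.sub(lambda m: env.get(m.group(1), m.group(0)), text)

print(expand_env_vars("dir=${HOME}/logs, keep=${MISSING}", env={"HOME": "/root"}))
# dir=/root/logs, keep=${MISSING}
```

Leaving unknown names untouched (rather than substituting an empty string) makes misconfigured variables visible instead of silently disappearing.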

🔨 Fixes:
* [Core] Internal gRPC version is upgraded from 1.46.6 to 1.50.2, which fixes the memory leak issue
* [Core] Bind jemalloc to raylet and GCS (38644) to fix memory fragmentation issue
* [Core] Previously, when Ray was started with `ray start --node-ip-address=...`, the driver also had to specify `ray.init(_node_ip_address)`. Now Ray finds the node IP address automatically. (37644)
* [Core] Child processes of workers are cleaned up automatically when a raylet dies (38439)
* [Core] Fix the issue where there are lots of threads created when using async actor (37949)
* [Core] Fixed a bug where tracing did not work when an actor/task was defined prior to calling `ray.init`: [https://github.com/ray-project/ray/issues/26019](https://github.com/ray-project/ray/issues/26019)
* Various other bug fixes
* [Core] loosen the check on release object (39570)
* [Core][agent] Fix the race condition where the worker process terminated during the `get_all_workers` call (37953)
* [Core] Fix PG leakage caused by GCS restart when the PG has not been successfully removed after the job died (35773)
* [Core] Fix `internal_kv` del API bug in client proxy mode (37031)
* [Core] Pass logs through if sphinx-doctest is running (36306)
* [Core][dashboard] Make intentional ray system exit from worker exit non task failing (38624)
* [Core][dashboard] Add worker pid to task info (36941)
* [Core] Use 1 thread for all fibers for an actor scheduling queue. (37949)
* [runtime env] Fix Ray hangs when nonexistent conda environment is specified 28105 (34956)

Ray Clusters

💫Enhancements:



* New Cluster Launcher for vSphere [37815](https://github.com/ray-project/ray/pull/37815)
* TPU pod support for cluster launcher [37934](https://github.com/ray-project/ray/pull/37934)

📖Documentation:



* The KubeRay documentation has been moved to [https://docs.ray.io/en/latest/cluster/kubernetes/index.html](https://docs.ray.io/en/latest/cluster/kubernetes/index.html) from its old location at [https://ray-project.github.io/kuberay/](https://ray-project.github.io/kuberay/).
* New guide: GKE Ingress on KubeRay ([39073](https://github.com/ray-project/ray/pull/39073))
* New tutorial: Cloud storage from GKE on KubeRay [38858](https://github.com/ray-project/ray/pull/38858)
* New tutorial: Batch inference tutorial using KubeRay RayJob CR [38857](https://github.com/ray-project/ray/pull/38857)
* New benchmarks for RayService custom resource on KubeRay [38647](https://github.com/ray-project/ray/pull/38647)
* New tutorial: Text summarizer using NLP with RayService [38647](https://github.com/ray-project/ray/pull/38647)

Thanks

Many thanks to all those who contributed to this release!

simran-2797, can-anyscale, akshay-anyscale, c21, EdwardCuiPeacock, rynewang, volks73, sven1977, alexeykudinkin, mattip, Rohan138, larrylian, DavidYoonsik, scv119, alpozcan, JalinWang, peterghaddad, rkooo567, avnishn, JoshKarpel, tekumara, zcin, jiwq, nikosavola, seokjin1013, shrekris-anyscale, ericl, yuxiaoba, vymao, architkulkarni, rickyyx, bveeramani, SongGuyang, jjyao, sihanwang41, kevin85421, ArturNiederfahrenhorst, justinvyu, pleaseupgradegrpcio, aslonnie, kukushking, 94929, jrosti, MattiasDC, edoakes, PRESIDENT810, cadedaniel, ddelange, alanwguo, noahjax, matthewdeng, pcmoritz, richardliaw, vitsai, Michaelvll, tanmaychimurkar, smiraldr, wfangchi, amogkam, crypdick, WeichenXu123, darthhexx, angelinalg, chaowanggg, GeneDer, xwjiang2010, peytondmurray, z4y1b2, scottsun94, chappidim, jovany-wang, jaidisido, krfricke, woshiyyya, Shubhamurkade, ijrsvt, scottjlee, kouroshHakha, allenwang28, raulchen, stephanie-wang, iycheng

2.6.3

The Ray 2.6.3 patch release contains fixes for Ray Serve, and Ray Core streaming generators.

Ray Core
🔨 Fixes:
* [Core][Streaming Generator] Fix memory leak from the end of the object stream (38152) (38206)

Ray Serve
🔨 Fixes:
* [Serve] Fix `serve run` help message (37859) (38018)
* [Serve] Decrement `ray_serve_deployment_queued_queries` when client disconnects (37965) (38020)


RLlib
📖 Documentation:
* [RLlib][docs] Learner API Docs (37729) (38137)

2.6.2

The Ray 2.6.2 patch release contains a critical fix for Ray's logging setup, as well as fixes for Ray Serve, Ray Data, and Ray Job.

Ray Core
🔨 Fixes:
* [Core] Pass logs through if sphinx-doctest is running (36306) (37879)
* [cluster-launcher] Pick GCP cluster launcher tests and fix (37797)


Ray Serve
🔨 Fixes:
* [Serve] Apply `request_timeout_s` from Serve config to the cluster (37884) (37903)

Ray Air
🔨 Fixes:
* [air] fix pyarrow lazy import (37670) (37883)
