Ray

Latest version: v2.42.1

Page 1 of 17

2.42.0

Not secure
Ray Libraries<a id="ray-libraries"></a>

Ray Data<a id="ray-data"></a>
🎉 New Features:
- Added read_audio and read_video ([50016](https://github.com/ray-project/ray/pull/50016))

💫 Enhancements:
- Optimized multi-column groupbys ([45667](https://github.com/ray-project/ray/pull/45667))
- Included Ray user-agent in BigQuery client construction ([49922](https://github.com/ray-project/ray/pull/49922))

🔨 Fixes:
- Fixed bug that made read tasks non-deterministic ([49897](https://github.com/ray-project/ray/pull/49897))

🗑️ Deprecations:
- Deprecated num_rows_per_file in favor of min_rows_per_file ([49978](https://github.com/ray-project/ray/pull/49978))

Ray Train<a id="ray-train"></a>
💫 Enhancements:
- Add Train v2 user-facing callback interface (49819)
- Add TuneReportCallback for propagating intermediate Train results to Tune (49927)

Ray Tune<a id="ray-tune"></a>
📖 Documentation:
- Fix BayesOptSearch docs (49848)

Ray Serve<a id="ray-serve"></a>
💫 Enhancements:
- Cache metrics in replica and report on an interval ([49971](https://github.com/ray-project/ray/pull/49971))
- Cache expensive calls to inspect.signature ([49975](https://github.com/ray-project/ray/pull/49975))
- Remove extra pickle serialization for gRPCRequest ([49943](https://github.com/ray-project/ray/pull/49943))
- Shared LongPollClient for Routers ([48807](https://github.com/ray-project/ray/pull/48807))
- DeploymentHandle API is now stable ([49840](https://github.com/ray-project/ray/pull/49840))
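
The `inspect.signature` caching item above can be illustrated with a minimal sketch. This is not Serve's actual implementation: `cached_signature` and `handler` are hypothetical names, and the sketch only assumes that the inspected callables are hashable.

```python
import functools
import inspect


@functools.lru_cache(maxsize=None)
def cached_signature(fn):
    """Return inspect.signature(fn), computing it once per callable."""
    return inspect.signature(fn)


def handler(request, timeout=5.0):  # hypothetical user callable
    return request


# The first call inspects the function; the second is served from the cache.
sig_first = cached_signature(handler)
sig_second = cached_signature(handler)
```

Because `lru_cache` keys on the callable itself, repeated lookups for the same function return the same `Signature` object without re-running the inspection.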

🔨 Fixes:
- Fix batched requests hanging after request cancellation bug ([50054](https://github.com/ray-project/ray/pull/50054))

RLlib<a id="rllib"></a>
💫 Enhancements:
- Add metrics to replay buffers. ([49822](https://github.com/ray-project/ray/pull/49822))
- Enhance node-failure tolerance (new API stack). ([50007](https://github.com/ray-project/ray/pull/50007))
- MetricsLogger cleanup throughput logic. ([49981](https://github.com/ray-project/ray/pull/49981))
- Split AddStates... connectors into 2 connector pieces (`AddTimeDimToBatchAndZeroPad` and `AddStatesFromEpisodesToBatch`) ([49835](https://github.com/ray-project/ray/pull/49835))

🔨 Fixes:
- Old API stack IMPALA/APPO: Re-introduce mixin-replay-buffer pass, even if `replay-ratio=0` (fixes a memory leak). ([49964](https://github.com/ray-project/ray/pull/49964))
- Fix MetricsLogger race conditions. ([49888](https://github.com/ray-project/ray/pull/49888))
- APPO/IMPALA: Bug fix for > 1 Learner actor. ([49849](https://github.com/ray-project/ray/pull/49849))

📖 Documentation:
- New MetricsLogger API rst page. ([49538](https://github.com/ray-project/ray/pull/49538))
- Move "new API stack" info box right below page titles for better visibility. ([49921](https://github.com/ray-project/ray/pull/49921))
- Add example script for how to log custom metrics in `training_step()`. ([49976](https://github.com/ray-project/ray/pull/49976))
- Enhance/redo autoregressive action distribution example. ([49967](https://github.com/ray-project/ray/pull/49967))
- Make the "tiny CNN" example RLModule run with APPO (by implementing `TargetNetAPI`) ([49825](https://github.com/ray-project/ray/pull/49825))

Ray Core and Ray Clusters

Ray Core<a id="ray-core"></a>
💫 Enhancements:
- Only get single node info rather than all when needed ([49727](https://github.com/ray-project/ray/pull/49727))
- Introduce with_tensor_transport API ([49753](https://github.com/ray-project/ray/pull/49753))

🔨 Fixes:
- Fix tqdm manager thread safety ([50040](https://github.com/ray-project/ray/pull/50040))

Ray Clusters <a id="ray-clusters"></a>
🔨 Fixes:
- Fix token expiration for ray autoscaler ([48481](https://github.com/ray-project/ray/pull/48481))

Thanks

Thank you to everyone who contributed to this release! 🥳
wingkitlee0, saihaj, win5923, justinvyu, kevin85421, edoakes, cristianjd, rynewang, richardliaw, LeoLiao123, alexeykudinkin, simonsays1980, aslonnie, ruisearch42, pcmoritz, fscnick, bveeramani, mattip, till-m, tswast, ujjawal-khare, wadhah101, nikitavemuri, akshay-anyscale, srinathk10, zcin, dayshah, dentiny, LydiaXwQ, matthewdeng, JoshKarpel, MortalHappiness, sven1977, omatthew98

2.41.0

Not secure
Highlights

- Major update of RLlib docs and example scripts for the new API stack.

Ray Libraries

Ray Data

🎉 New Features:

- Expression support for filters (49016)
- Support `partition_cols` in `write_parquet` (49411)
- Feature: implement multi-directional sort over Ray Data datasets (49281)
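
To make "multi-directional sort" concrete, here is a plain-Python sketch of sorting by several keys with a separate direction per key. It is an illustration of the idea only, not Ray Data's implementation or API; `multi_directional_sort` and the sample rows are hypothetical.

```python
from operator import itemgetter


def multi_directional_sort(rows, keys, descending):
    """Sort dict rows by several keys, each with its own direction."""
    if len(keys) != len(descending):
        raise ValueError("keys and descending must have the same length")
    result = list(rows)
    # Python's sort is stable, so sort by the least significant key first.
    for key, desc in reversed(list(zip(keys, descending))):
        result.sort(key=itemgetter(key), reverse=desc)
    return result


rows = [
    {"team": "a", "score": 1},
    {"team": "b", "score": 3},
    {"team": "a", "score": 2},
]
# Ascending by team, then descending by score within each team.
sorted_rows = multi_directional_sort(rows, ["team", "score"], [False, True])
```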

💫 Enhancements:

- Use dask 2022.10.2 (48898)
- Clarify schema validation error (48882)
- Raise `ValueError` when the data sort key is `None` (48969)
- Provide more informative messages when the webdataset format is invalid (48643)
- Upgrade Arrow version from 17 to 18 (48448)
- Update `hudi` version to 0.2.0 (48875)
- `webdataset`: expand JSON objects into individual samples (48673)
- Support passing kwargs to map tasks. (49208)
- Add `ExecutionCallback` interface (49205)
- Add seed for read files (49129)
- Make `select_columns` and `rename_columns` use Project operator (49393)

🔨 Fixes:

- Fix partial function name parsing in `map_groups` (48907)
- Always launch one task for `read_sql` (48923)
- Reimplement the pandas memory fix (48970)
- `webdataset`: flatten return args (48674)
- Handle `numpy > 2.0.0` behaviour in `_create_possibly_ragged_ndarray` (48064)
- Fix `DataContext` sealing for multiple datasets. (49096)
- Fix `to_tf` for `List` types (49139)
- Fix type mismatch error while mapping nullable column (49405)
- Datasink: support passing write results to `on_write_completes` (49251)
- Fix `groupby` hang when value contains `np.nan` (49420)
- Fix bug where `file_extensions` doesn't work with compound extensions (49244)
- Fix map operator fusion when concurrency is set (49573)
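
The compound-extension fix above is easier to picture with a small sketch. The helper below is hypothetical and is not Ray Data's code; it only shows why matching on the full suffix works where taking the last extension alone (as `os.path.splitext` does) would not.

```python
import os


def has_allowed_extension(path, file_extensions):
    """Return True if path ends with one of the extensions (case-insensitive)."""
    name = path.lower()
    return any(
        name.endswith("." + ext.lower().lstrip(".")) for ext in file_extensions
    )


paths = ["data/part-0.csv.gz", "data/part-1.csv", "data/notes.gz.txt"]

# splitext only sees the last component, so "csv.gz" can never equal it.
last_suffix = os.path.splitext("data/part-0.csv.gz")[1]

# Suffix matching on the whole name handles compound extensions.
matched = [p for p in paths if has_allowed_extension(p, ["csv.gz"])]
```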

Ray Train

🎉 New Features:

- Output JSON structured log files for system and application logs (49414)
- Add support for AMD ROCR_VISIBLE_DEVICES (49346)

💫 Enhancements:

- Implement Train Tune API Revamp REP (49376, 49467, 49317, 49522)

🏗 Architecture refactoring:

- LightGBM: Rewrite `get_network_params` implementation (49019)

Ray Tune

🎉 New Features:

- Update `optuna_search` to allow users to configure optuna storage (48547)

🏗 Architecture refactoring:

- Make changes to support Train Tune API Revamp REP (49308, 49317, 49519)

Ray Serve

💫 Enhancements:

- Improved request_id generation to reduce proxy CPU overhead (49537)
- Tune GC threshold by default in proxy (49720)
- Use `pickle.dumps` for faster serialization from `proxy` to `replica` (49539)

🔨 Fixes:

- Handle nested ‘=’ in serve run arguments (49719)
- Fix bug when `ray.init()` is called multiple times with different `runtime_envs` (49074)
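
For the nested ‘=’ fix above, the general technique is to split each `key=value` argument on the first `=` only, so that the value may itself contain `=`. The sketch below is a hypothetical parser written for illustration; it is not the `serve run` CLI code.

```python
def parse_key_value_args(args):
    """Parse ["key=value", ...] into a dict, splitting only on the first '='."""
    parsed = {}
    for arg in args:
        key, sep, value = arg.partition("=")
        if not sep or not key:
            raise ValueError(f"expected key=value, got {arg!r}")
        parsed[key] = value
    return parsed


# The second value keeps its embedded '=' intact.
parsed = parse_key_value_args(["model=llama", "env=MODE=debug"])
```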

🗑️ Deprecations:

- Adds a warning that the default behavior for sync methods will change in a future release. They will be run in a threadpool by default. You can opt into this behavior early by setting `RAY_SERVE_RUN_SYNC_IN_THREADPOOL=1`. (48897)
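
The practical effect of running sync methods in a thread pool can be shown with a generic `asyncio` sketch. This is not Serve's internal code: `blocking_handler` is a hypothetical stand-in for a synchronous deployment method, and the sketch uses the standard-library `asyncio.to_thread` (Python 3.9+). In Serve itself, the opt-in described above is the `RAY_SERVE_RUN_SYNC_IN_THREADPOOL=1` environment variable.

```python
import asyncio
import time


def blocking_handler(x):
    """A synchronous method that blocks its thread for a short time."""
    time.sleep(0.05)
    return x * 2


async def main():
    # asyncio.to_thread runs each sync call in the default thread pool,
    # so the event loop stays free while the calls block.
    return await asyncio.gather(
        *(asyncio.to_thread(blocking_handler, i) for i in range(4))
    )


results = asyncio.run(main())
```

Calling `blocking_handler` directly inside a coroutine would instead block the event loop for the full sleep on every call.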

RLlib

🎉 New Features:

- Add support for external Envs to new API stack: New example script and custom tcp-capable EnvRunner. (49033)

💫 Enhancements:

- Offline RL:
  - Add sequence sampling to `EpisodeReplayBuffer`. (48116)
  - Allow incomplete `SampleBatch` data and fully compressed observations. (48699)
  - Add option to customize `OfflineData`. (49015)
  - Enable offline training without specifying an environment. (49041)
  - Various fixes: 48309, 49194, 49195
- APPO/IMPALA acceleration (new API stack):
  - Add support for `AggregatorActors` per Learner. (49284)
  - Auto-sleep time AND thread-safety for MetricsLogger. (48868)
  - Activate APPO cont. actions release- and CI tests (HalfCheetah-v1 and Pendulum-v1 new in `tuned_examples`). (49068)
- Add "burn-in" period setting to the training of stateful RLModules. (49680)
- Callbacks API: Add support for individual lambda-style callbacks. (49511)
- Other enhancements: 49687, 49714, 49693, 49497, 49800, 49098

📖 Documentation:

- New example scripts:
  - How to write a custom algorithm (VPG) from scratch. (49536)
  - How to customize an offline data pipeline. (49046)
  - GPUs on EnvRunners. (49166)
  - Hierarchical training. (49127)
  - Async gym vector env. (49527)
  - Other fixes and enhancements: 48988, 49071
- New/rewritten html pages:
  - Rewrite checkpointing page. (49504)
  - New scaling guide. (49528)
  - New callbacks page. (49513)
  - Rewrite `RLModule` page. (49387)
  - New AlgorithmConfig page and redo `package_ref` page for algo configs. (49464)
  - Rewrite offline RL page. (48818)
  - Rewrite "key concepts" rst page. (49398)
  - Rewrite RL environments pages. (49165, 48542)
  - Fixes and enhancements: 49465, 49037, 49304, 49428, 49474, 49399, 49713, 49518

🔨 Fixes:

- Add `on_episode_created` callback to SingleAgentEnvRunner. (49487)
- Fix `train_batch_size_per_learner` problems. (49715)
- Various other fixes: 48540, 49363, 49418, 49191

🏗 Architecture refactoring:

- RLModule: Introduce `Default[algo]RLModule` classes (49366, 49368)
- Remove RLlib dependencies from setup.py; add `ormsgpack` (49489)

🗑️ Deprecations:

- 49488, 49144

Ray Core and Ray Clusters

Ray Core

💫 Enhancements:

- Add `task_name`, `task_function_name` and `actor_name` in Structured Logging (48703)
- Support redis/valkey authentication with username (48225)
- Add v6e TPU Head Resource Autoscaling Support (48201)
- compiled graphs: Support all driver and actor read combinations (48963)
- compiled graphs: Add ascii based CG visualization (48315)
- compiled graphs: Add ray[cg] pip install option (49220)
- Allow uv cache at installation (49176)
- Support != Filter in GCS for Task State API (48983)
- compiled graphs: Add CPU-based NCCL communicator for development (48440)
- Support gcs and raylet log rotation (48952)
- compiled graphs: Support `nsight.nvtx` profiling (49392)

🔨 Fixes:

- autoscaler: Health check logs are not visible in the autoscaler container's stdout (48905)
- Only publish `WORKER_OBJECT_EVICTION` when the object is out of scope or manually freed (47990)
- autoscaler: Autoscaler doesn't scale up correctly when the KubeRay RayCluster is not in the goal state (48909)
- autoscaler: Fix incorrectly terminating nodes misclassified as idle in autoscaler v1 (48519)
- compiled graphs: Fix the missing dependencies when num_returns is used (49118)
- autoscaler: Fuse scaling requests together to avoid overloading the Kubernetes API server (49150)
- Fix bug to support S3 pre-signed url for `.whl` file (48560)
- Fix data race on gRPC client context (49475)
- Make sure draining node is not selected for scheduling (49517)

Ray Clusters

💫 Enhancements:

- Azure: Enable accelerated networking as a flag in azure vms (47988)

📖 Documentation:

- Kuberay: Logging: Add Fluent Bit `DaemonSet` and Grafana Loki to "Persist KubeRay Operator Logs" (48725)
- Kuberay: Logging: Specify the Helm chart version in "Persist KubeRay Operator Logs" (48937)

Dashboard

💫 Enhancements:

- Add instance variable to many default dashboard graphs (49174)
- Display duration in milliseconds if under 1 second. (49126)
- Add `RAY_PROMETHEUS_HEADERS` env for carrying additional headers to Prometheus (49353)
- Document the `RAY_PROMETHEUS_HEADERS` env for carrying additional headers to Prometheus (49700)

🏗 Architecture refactoring:

- Move `memray` dependency from default to observability (47763)
- Move `StateHead`'s methods into free functions. (49388)

Thanks

raulchen, alanwguo, omatthew98, xingyu-long, tlinkin, yantzu, alexeykudinkin, andrewsykim, win5923, csy1204, dayshah, richardliaw, stephanie-wang, gueraf, rueian, davidxia, fscnick, wingkitlee0, KPostOffice, GeneDer, MengjinYan, simonsays1980, pcmoritz, petern48, kashiwachen, pfldy2850, zcin, scottjlee, Akhil-CM, Jay-ju, JoshKarpel, edoakes, ruisearch42, gorloffslava, jimmyxie-figma, bthananjeyan, sven1977, bnorick, jeffreyjeffreywang, ravi-dalal, matthewdeng, angelinalg, ivanthewebber, rkooo567, srinathk10, maresb, gvspraveen, akyang-anyscale, mimiliaogo, bveeramani, ryanaoleary, kevin85421, richardsliu, hartikainen, coltwood93, mattip, Superskyyy, justinvyu, hongpeng-guo, ArturNiederfahrenhorst, jecsand838, Bye-legumes, hcc429, WeichenXu123, martinbomio, HollowMan6, MortalHappiness, dentiny, zhe-thoughts, anyadontfly, smanolloff, richo-anyscale, khluu, xushiyan, rynewang, japneet-anyscale, jjyao, sumanthratna, saihaj, aslonnie

Many thanks to all those who contributed to this release!

2.40.0

Not secure
Ray Libraries
Ray Data
🎉 New Features:
- Added read_hudi (https://github.com/ray-project/ray/pull/46273)

💫 Enhancements:
- Improved performance of DelegatingBlockBuilder (https://github.com/ray-project/ray/pull/48509)
- Improved memory accounting of pandas blocks (https://github.com/ray-project/ray/pull/46939)

🔨 Fixes:
- Fixed bug where you can’t specify a schema with write_parquet (https://github.com/ray-project/ray/issues/48630)
- Fixed bug where to_pandas errors if your dataset contains Arrow and pandas blocks (https://github.com/ray-project/ray/pull/48583)
- Fixed bug where map_groups doesn’t work with pandas data (https://github.com/ray-project/ray/pull/48287)
- Fixed bug where write_parquet errors if your data contains nullable fields (https://github.com/ray-project/ray/pull/48478)
- Fixed bug where the “Iteration Blocked Time” chart looks incorrect (https://github.com/ray-project/ray/pull/48618)
- Fixed bug where unique fails with null values (https://github.com/ray-project/ray/pull/48750)
- Fixed bug where “Rows Outputted” is 0 in the Data dashboard (https://github.com/ray-project/ray/pull/48745)
- Fixed bug where methods like drop_columns cause spilling (https://github.com/ray-project/ray/pull/48140)
- Fixed bug where async map tasks hang (https://github.com/ray-project/ray/pull/48861)

🗑️ Deprecations:
- Deprecated read_parquet_bulk (https://github.com/ray-project/ray/pull/48691)
- Deprecated iter_tf_batches (https://github.com/ray-project/ray/pull/48693)
- Deprecated meta_provider parameter of read functions (https://github.com/ray-project/ray/pull/48690)
- Deprecated to_torch (https://github.com/ray-project/ray/pull/48692)

Ray Train
🔨 Fixes:
- Fix StartTracebackWithWorkerRank serialization (48548)

📖 Documentation:
- Add example for fine-tuning Llama3.1 with AWS Trainium (48768)

Ray Tune
🔨 Fixes:
- Remove the `clear_checkpoint` function during Trial restoration error handling. (48532)

Ray Serve
🎉 New Features:
- Initial version of local_testing_mode ([48477](https://github.com/ray-project/ray/pull/48477))

💫 Enhancements:
- Handle multiple changed objects per LongPollHost.listen_for_change RPC ([48803](https://github.com/ray-project/ray/pull/48803/files))
- Add more nuanced checks for http proxy status errors ([47896](https://github.com/ray-project/ray/pull/47896))
- Improve replica access log messages to include HTTP status info and better resemble standard log format ([48819](https://github.com/ray-project/ray/pull/48819))
- Propagate replica constructor error to deployment status message and print num retries left ([48531](https://github.com/ray-project/ray/pull/48531))

🔨 Fixes:
- Pending requests that are cancelled before they were assigned to a replica now also return a serve.RequestCancelledError ([48496](https://github.com/ray-project/ray/pull/48496))

RLlib
💫 Enhancements:
- Release test enhancements. ([45803](https://github.com/ray-project/ray/pull/45803), [#48681](https://github.com/ray-project/ray/pull/48681))
- Make opencv-python-headless default over opencv-python ([48776](https://github.com/ray-project/ray/pull/48776))
- Reverse learner queue behavior of IMPALA/APPO (consume oldest batches first, instead of newest, BUT drop oldest batches if queue full). ([48702](https://github.com/ray-project/ray/pull/48702))
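
The learner queue behavior described above (consume the oldest batch first, but drop the oldest batch when the queue is full) can be sketched with a bounded `collections.deque`. This is a minimal illustration under that reading of the release note, not RLlib's actual queue; `LearnerQueue` is a hypothetical class.

```python
from collections import deque


class LearnerQueue:
    """Bounded FIFO queue that evicts the oldest batch when full."""

    def __init__(self, maxsize):
        # Appending to a full deque(maxlen=...) discards the item at the
        # opposite end, which here is the oldest batch.
        self._batches = deque(maxlen=maxsize)

    def put(self, batch):
        self._batches.append(batch)

    def get(self):
        # Consume the oldest remaining batch first (FIFO).
        return self._batches.popleft()

    def __len__(self):
        return len(self._batches)


queue = LearnerQueue(maxsize=3)
for batch_id in range(5):  # batches 0 and 1 are dropped as the queue overflows
    queue.put(batch_id)
consumed = [queue.get() for _ in range(len(queue))]
```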

🔨 Fixes:
- Fix torch scheduler stepping and reporting. ([48125](https://github.com/ray-project/ray/pull/48125))
- Fix accumulation of results over n training_step calls within same iteration (new API stack). ([48136](https://github.com/ray-project/ray/pull/48136))
- Various other fixes: [48563](https://github.com/ray-project/ray/pull/48563), [#48314](https://github.com/ray-project/ray/pull/48314), [#48698](https://github.com/ray-project/ray/pull/48698), [#48869](https://github.com/ray-project/ray/pull/48869).

📖 Documentation:
- Upgrade examples script overview page (new API stack). ([48526](https://github.com/ray-project/ray/pull/48526))
- Enable RLlib + Serve example in CI and translate to new API stack. ([48687](https://github.com/ray-project/ray/pull/48687))

🏗 Architecture refactoring:
- Switch new API stack on by default for APPO, IMPALA, BC, MARWIL, and CQL. ([48516](https://github.com/ray-project/ray/pull/48516), [#48599](https://github.com/ray-project/ray/pull/48599))
- Various APPO enhancements (new API stack): Circular buffer ([#48798](https://github.com/ray-project/ray/pull/48798)), minor loss math fixes ([#48800](https://github.com/ray-project/ray/pull/48800)), target network update logic ([#48802](https://github.com/ray-project/ray/pull/48802)), smaller cleanups ([#48844](https://github.com/ray-project/ray/pull/48844)).
- Remove `rllib_contrib` from repo. ([48565](https://github.com/ray-project/ray/pull/48565))

Ray Core and Ray Clusters

Ray Core
🎉 New Features:
- [Core] uv runtime env support ([48479](https://github.com/ray-project/ray/pull/48479), [#48486](https://github.com/ray-project/ray/pull/48486), [#48611](https://github.com/ray-project/ray/pull/48611), [#48619](https://github.com/ray-project/ray/pull/48619), [#48632](https://github.com/ray-project/ray/pull/48632), [#48634](https://github.com/ray-project/ray/pull/48634), [#48637](https://github.com/ray-project/ray/pull/48637), [#48670](https://github.com/ray-project/ray/pull/48670), [#48731](https://github.com/ray-project/ray/pull/48731))
- [Core] GCS FT with redis sentinel (47335)

💫 Enhancements:
- [CompiledGraphs] Refine schedule visualization (48594)

🔨 Fixes:
- [CompiledGraphs] Don't persist input_nodes in _CollectiveOperation to avoid wrong understanding about DAGs (48463)
- [Core] Fix Ascend NPU discovery to support 8+ cards per node (48543)
- [Core] Make Placement Group Wildcard and Indexed Resource Assignments Consistent (48088)
- [Core] Stop the gRPC server before shutting down the Object Store (48572)

Ray Clusters
🔨 Fixes:
- [KubeRay]: Fix ConnectionError on Autoscaler CR lookups in K8s clusters with custom DNS for Kubernetes API. ([48541](https://github.com/ray-project/ray/pull/48541))

Dashboard
💫 Enhancements:
- Add global UTC timezone button in navbar with local storage (48510)
- Add memory graphs optimized for OOM debugging (48530)
- Improve tasks/actors metric naming and add graph for running tasks (48528)
- Add actor pid to dashboard (48791)

🔨 Fixes:
- Fix cell overflow in the Placement Group table (47323)
- Fix Rows Outputted being zero on Ray Data Dashboard (48745)
- Fix confusing dataset operator name (48805)

Thanks
Thanks to all those who contributed to this release!
rynewang, rickyyx, bveeramani, marwan116, simonsays1980, dayshah, dentiny, KepingYan, mimiliaogo, kevin85421, SeaOfOcean, stephanie-wang, mohitjain2504, azayz, xushiyan, richardliaw, can-anyscale, xingyu-long, kanwang, aslonnie, MortalHappiness, jjyao, SumanthRH, matthewdeng, alexeykudinkin, sven1977, raulchen, andrewsykim, zcin, nadongjun, hongpeng-guo, miguelteixeiraa, saihaj, khluu, ArturNiederfahrenhorst, ryanaoleary, ltbringer, pcmoritz, JoshKarpel, akyang-anyscale, frances720, BeingGod, edoakes, Bye-legumes, Superskyyy, liuxsh9, MengjinYan, ruisearch42, scottjlee, angelinalg

2.39.0

Not secure
Ray Libraries

Ray Data

🔨 Fixes:
- Fixed InvalidObjectError edge case with Dataset.split() (https://github.com/ray-project/ray/pull/48130)
- Made Concatenator preserve order of concatenated columns (https://github.com/ray-project/ray/pull/47997)

📖 Documentation:
- Improved documentation around Parquet column and predicate pushdown (https://github.com/ray-project/ray/pull/48095)
- Marked num_rows_per_file parameter of write APIs as experimental (https://github.com/ray-project/ray/pull/48208)
- One hot encoder now returns an encoded vector (https://github.com/ray-project/ray/pull/48173)
- transform_batch no longer fails on missing columns (https://github.com/ray-project/ray/pull/48137)

🏗 Architecture refactoring:
- Dataset.count() now uses a Count logical operator (https://github.com/ray-project/ray/pull/48126)

🗑️ Deprecations:
- Removed long-deprecated set_progress_bars (https://github.com/ray-project/ray/pull/48203)

Ray Train

🔨 Fixes:
- Safely check if the storage filesystem is `pyarrow.fs.S3FileSystem` (48216)

Ray Tune

🔨 Fixes:
- Safely check if the storage filesystem is `pyarrow.fs.S3FileSystem` (48216)

Ray Serve

💫 Enhancements:
- Cancelled requests now return a serve.RequestCancelledError (https://github.com/ray-project/ray/pull/48444)
- Exposed application source in app details model (https://github.com/ray-project/ray/pull/45522)

🔨 Fixes:
- Basic HTTP deployments will now return “Internal Server Error” instead of a traceback to match FastAPI behavior (https://github.com/ray-project/ray/pull/48491)
- Fixed an issue where high values of max_ongoing_requests couldn’t be reached due to an interaction with core’s max_concurrency (https://github.com/ray-project/ray/pull/48274)
- Fixed an edge case where pending requests were not canceled properly (https://github.com/ray-project/ray/pull/47873)
- Removed deprecated API to set route_prefix per-deployment (https://github.com/ray-project/ray/pull/48223)

📖 Documentation:
- Added ProxyStatus model to reference docs (https://github.com/ray-project/ray/pull/48299)
- Added ApplicationStatus model to reference docs (https://github.com/ray-project/ray/pull/48220)

RLlib

💫 Enhancements:
- Upgrade to gymnasium==1.0.0 (support new API for vector env resets). ([48443](https://github.com/ray-project/ray/pull/48443), [#45328](https://github.com/ray-project/ray/pull/45328))
- Add off-policy'ness metric to new API stack. ([48227](https://github.com/ray-project/ray/pull/48227))
- Validate episodes before adding them to the buffer. ([48083](https://github.com/ray-project/ray/pull/48083))

📖 Documentation:
- New example script for custom metrics on `EnvRunners` (using `MetricsLogger` API on the new stack). ([47969](https://github.com/ray-project/ray/pull/47969))
- Do-over: New RLlib index page. ([48285](https://github.com/ray-project/ray/pull/48285), [#48442](https://github.com/ray-project/ray/pull/48442))
- Do-over: Example script for AutoregressiveActionsRLM. ([47972](https://github.com/ray-project/ray/pull/47972))

🏗 Architecture refactoring:
- New API stack on by default for PPO. ([48284](https://github.com/ray-project/ray/pull/48284))
- Change config.fault_tolerance default behavior (from `recreate_failed_env_runners=False` to `True`). ([48286](https://github.com/ray-project/ray/pull/48286))

🔨 Fixes:
- Various bug and CI fixes: [47993](https://github.com/ray-project/ray/pull/47993), [#48450](https://github.com/ray-project/ray/pull/48450), [#48213](https://github.com/ray-project/ray/pull/48213)
- Cleanup `evaluation` folder ([48493](https://github.com/ray-project/ray/pull/48493))

Ray Core

🎉 New Features:
- [CompiledGraphs] Support all reduce collective in aDAG ([47621](https://github.com/ray-project/ray/pull/47621))
- [CompiledGraphs] Add visualization of compiled graphs ([47958](https://github.com/ray-project/ray/pull/47958))

💫 Enhancements:
- [**Distributed Debugger**] The distributed debugger can now be used without having to set RAY_DEBUG=1, see https://github.com/ray-project/ray/pull/48301 and https://docs.ray.io/en/latest/ray-observability/ray-distributed-debugger.html. If you want to restore the previous behavior and use the CLI based debugger, you need to set RAY_DEBUG=legacy.
- [Core] Add more info to each breakpoint for ray debug CLI ([48202](https://github.com/ray-project/ray/pull/48202))
- [Core] Add demands info to GCS debug state ([48115](https://github.com/ray-project/ray/pull/48115))
- [Core] Add PENDING_ACTOR_TASK_ARGS_FETCH and PENDING_ACTOR_TASK_ORDERING_OR_CONCURRENCY TaskStatus ([48242](https://github.com/ray-project/ray/pull/48242))
- [Core] Add metrics ray_io_context_event_loop_lag_ms. ([47989](https://github.com/ray-project/ray/pull/47989))
- [Core] Better log format when showing the disk size ([46869](https://github.com/ray-project/ray/pull/46869))
- [CompiledGraphs] Support asyncio.gather on multiple CompiledDAGFutures ([47860](https://github.com/ray-project/ray/pull/47860))
- [CompiledGraphs] Raise an exception if a leaf node is found during compilation ([47757](https://github.com/ray-project/ray/pull/47757))


🔨 Fixes:
- [Core] Posts CoreWorkerMemoryStore callbacks onto io_context to fix deadlock ([47833](https://github.com/ray-project/ray/pull/47833))

Dashboard

🔨 Fixes:
- [Dashboard] Reworking dashboard_max_actors_to_cache to RAY_maximum_gcs_destroyed_actor_cached_count ([48229](https://github.com/ray-project/ray/pull/48229))

Thanks
Many thanks to all those who contributed to this release!

akyang-anyscale, rkooo567, bveeramani, dayshah, martinbomio, khluu, justinvyu, slfan1989, alexeykudinkin, simonsays1980, vigneshka, ruisearch42, rynewang, scottjlee, jjyao, JoshKarpel, win5923, MengjinYan, MortalHappiness, ujjawal-khare-27, zcin, ccoulombe, Bye-legumes, dentiny, stephanie-wang, LeoLiao123, dengwxn, richo-anyscale, pcmoritz, sven1977, omatthew98, GeneDer, srinathk10, can-anyscale, edoakes, kevin85421, aslonnie, jeffreyjeffreywang, ArturNiederfahrenhorst

2.38.0

Not secure
Ray Libraries<a id="ray-libraries"></a>

Ray Data<a id="ray-data"></a>

🎉 New Features:
- Add `Dataset.rename_columns` (47906)
- Basic structured logging (47210)

💫 Enhancements:
- Add `partitioning` parameter to `read_parquet` (47553)
- Add `SERVICE_UNAVAILABLE` to list of retried transient errors (47673)
- Re-phrase the streaming executor current usage string (47515)
- Remove ray.kill in ActorPoolMapOperator (47752)
- Simplify and consolidate progress bar outputs (47692)
- Refactor `OpRuntimeMetrics` to support properties (47800)
- Refactor `plan_write_op` and `Datasink`s (47942)
- Link `PhysicalOperator` to its `LogicalOperator` (47986)
- Allow specifying both `num_cpus` and `num_gpus` for map APIs (47995)
- Allow specifying insertion index when registering custom plan optimization `Rule`s (48039)
- Adding in better framework for substituting logging handlers (48056)
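
Retrying on a list of transient errors, as in the `SERVICE_UNAVAILABLE` item above, usually follows a pattern like the sketch below. Everything here is an assumption made for illustration: the marker tuple, `call_with_retry`, its parameters, and the substring match on the error message are hypothetical and do not reflect Ray Data's retry code.

```python
import time

# Hypothetical list of substrings that mark an error as transient.
TRANSIENT_ERROR_MARKERS = ("SERVICE_UNAVAILABLE",)


def call_with_retry(fn, max_attempts=3, base_delay_s=0.01):
    """Call fn, retrying with exponential backoff on transient errors."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception as exc:
            is_transient = any(m in str(exc) for m in TRANSIENT_ERROR_MARKERS)
            # Re-raise non-transient errors, and transient ones on the last try.
            if not is_transient or attempt == max_attempts - 1:
                raise
            time.sleep(base_delay_s * 2**attempt)


attempts = []


def flaky_read():
    """Fail twice with a transient error, then succeed."""
    attempts.append(1)
    if len(attempts) < 3:
        raise OSError("SERVICE_UNAVAILABLE: try again later")
    return "ok"


result = call_with_retry(flaky_read)
```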

🔨 Fixes:
- Fix bug where Ray Data incorrectly emits progress bar warning (47680)
- Yield remaining results from async `map_batches` (47696)
- Fix event loop mismatch with async map (47907)
- Make sure `num_gpus` provided to Ray Data is appropriately passed to `ray.remote` call (47768)
- Fix unequal partitions when grouping by multiple keys (47924)
- Fix reading multiple parquet files with ragged ndarrays (47961)
- Removing unneeded test case (48031)
- Adding in better json checking in test logging (48036)
- Fix bug with inserting custom optimization rule at index 0 (48051)
- Fix logging output from `write_xxx` APIs (48096)

📖 Documentation:
- Add docs section for Ray Data progress bars (47804)
- Add reference to parquet predicate pushdown (47881)
- Add tip about how to understand map_batches format (47394)

Ray Train<a id="ray-train"></a>

🏗 Architecture refactoring:
- Remove deprecated mosaic and sklearn trainer code (47901)

Ray Tune<a id="ray-tune"></a>

🔨 Fixes:
- Fix WandbLoggerCallback to reuse actors upon restore (47985)

Ray Serve<a id="ray-serve"></a>

🔨 Fixes:
- Stop scheduling task early when requests have been canceled (47847)

RLlib<a id="rllib"></a>

🎉 New Features:
- Enable cloud checkpointing. (47682)

💫 Enhancements:
- PPO on new API stack now shuffles batches properly before each epoch. (47458)
- Other enhancements: 47705, 47501, 47731, 47451, 47830, 47970, 47157

🔨 Fixes:
- Fix spot node preemption problem (RLlib now runs stably with EnvRunner workers on spot nodes) (47940)
- Fix action masking example. (47817)
- Various other fixes: 47973, 46721, 47914, 47880, 47304, 47686

🏗 Architecture refactoring:
- Switch on new API stack by default for SAC and DQN. (47217)
- Remove Tf support on new API stack for PPO/IMPALA/APPO (only DreamerV3 on new API stack remains with tf now). (47892)
- Discontinue support for "hybrid" API stack (using RLModule + Learner, but still on RolloutWorker and Policy) (46085)
- RLModule (new API stack) refinements: 47884, 47885, 47889, 47908, 47915, 47965, 47775

📖 Documentation:
- Add new API stack migration guide. (47779)
- New API stack example script: BC pre training, then PPO finetuning using same RLModule class. (47838)
- New API stack: Autoregressive actions example. (47829)
- Remove old API stack connector docs entirely. (47778)

Ray Core and Ray Clusters
Ray Core <a id="ray-core"></a>

🎉 New Features:
- CompiledGraphs: support multi readers in multi node when DAG is created from an actor (47601)

💫 Enhancements:
- Add a flag to raise exception for out of band serialization of `ObjectRef` (47544)
- Store each GCS table in its own Redis Hash (46861)
- Decouple create worker vs pop worker request. (47694)
- Add metrics for GCS jobs (47793)

🔨 Fixes:
- Fix broken dashboard cluster page when there are dead nodes (47701)
- Fix the `ray_tasks{State="PENDING_ARGS_FETCH"}` metric counting (47770)
- Separate the attempt_number from the task_status in memory summary and object list (47818)
- Fix object reconstruction hang on arguments pending creation (47645)
- Fix check failure: `sync_reactors_.find(reactor->GetRemoteNodeID()) == sync_reactors_.end()` (47861)
- Fix check failure: `RAY_CHECK(it != current_tasks_.end());` (47659)

📖 Documentation:
- KubeRay docs: Add docs for YuniKorn Gang scheduling (47850)

Dashboard<a id="dashboard"></a>

💫 Enhancements:
- Performance improvements for large scale clusters (47617)

🔨 Fixes:
- Placement group and required resources not showing correctly in dashboard (47754)

Thanks

Many thanks to all those who contributed to this release!
GeneDer, rkooo567, dayshah, saihaj, nikitavemuri, bill-oconnor-anyscale, WeichenXu123, can-anyscale, jjyao, edoakes, kekulai-fredchang, bveeramani, alexeykudinkin, raulchen, khluu, sven1977, ruisearch42, dentiny, MengjinYan, Mark2000, simonsays1980, rynewang, PatricYan, zcin, sofianhnaide, matthewdeng, dlwh, scottjlee, MortalHappiness, kevin85421, win5923, aslonnie, prithvi081099, richardsliu, milesvant, omatthew98, Superskyyy, pcmoritz

2.37.0

Not secure
Ray Libraries<a id="ray-libraries"></a>

Ray Data<a id="ray-data"></a>
💫 Enhancements:
- Simplify custom metadata provider API (47575)
- Change counts of metrics to rates of metrics (47236)
- Throw exception for non-streaming HF datasets with "override_num_blocks" argument (47559)
- Refactor custom optimizer rules (47605)

🔨 Fixes:
- Remove ineffective retry code in `plan_read_op` (47456)
- Fix incorrect pending task size if outputs are empty (47604)

Ray Train<a id="ray-train"></a>
💫 Enhancements:
- Update run status and add stack trace to `TrainRunInfo` (46875)

Ray Serve<a id="ray-serve"></a>
💫 Enhancements:
- Allow control of some serve configuration via env vars ([47533](https://github.com/ray-project/ray/pull/47533))
- [serve] Faster detection of dead replicas ([47237](https://github.com/ray-project/ray/pull/47237))

🔨 Fixes:
- [Serve] fix component id logging field ([47609](https://github.com/ray-project/ray/pull/47609))

RLlib<a id="rllib"></a>
💫 Enhancements:
- New API stack:
  - Add restart-failed-env option to EnvRunners. ([47608](https://github.com/ray-project/ray/pull/47608))
  - Offline RL: Store episodes in state form. ([47294](https://github.com/ray-project/ray/pull/47294))
  - Offline RL: Replace GAE in MARWILOfflinePreLearner with `GeneralAdvantageEstimation` connector in learner pipeline. ([47532](https://github.com/ray-project/ray/pull/47532))
  - Off-policy algos: Add episode sampling to EpisodeReplayBuffer. ([47500](https://github.com/ray-project/ray/pull/47500))
  - RLModule APIs: Add `SelfSupervisedLossAPI` for RLModules that bring their own loss and `InferenceOnlyAPI`. ([#47581](https://github.com/ray-project/ray/pull/47581), [#47572](https://github.com/ray-project/ray/pull/47572))

Ray Core<a id="ray-core"></a>
💫 Enhancements:
- [aDAG] Allow custom NCCL group for aDAG (47141)
- [aDAG] support buffered input (47272)
- [aDAG] Support multi node multi reader (47480)
- [Core] Make is_gpu, is_actor, root_detached_id fields late bind to workers. (47212)
- [Core] Reconstruct actor to run lineage reconstruction triggered actor task (47396)
- [Core] Optimize GetAllJobInfo API for performance (47530)

🔨 Fixes:
- [aDAG] Fix ranks ordering for custom NCCL group (47594)

Ray Clusters<a id="ray-clusters"></a>
📖 Documentation:
- [KubeRay] add a guide for deploying vLLM with RayService (47038)

Thanks

Many thanks to all those who contributed to this release!
ruisearch42, andrewsykim, timkpaine, rkooo567, WeichenXu123, GeneDer, sword865, simonsays1980, angelinalg, sven1977, jjyao, woshiyyya, aslonnie, zcin, omatthew98, rueian, khluu, justinvyu, bveeramani, nikitavemuri, chris-ray-zhang, liuxsh9, xingyu-long, peytondmurray, rynewang

© 2025 Safety CLI Cybersecurity Inc. All Rights Reserved.