
Conversation


@red-hat-konflux[bot] commented Nov 23, 2024

This PR contains the following updates:

| Package | Change |
| --- | --- |
| ray | `~=2.47.1` -> `~=2.49.0` |
| ray | `~=2.47.0` -> `~=2.49.0` |

Warning

Some dependencies could not be looked up. Check the warning logs for more information.


Release Notes

ray-project/ray (ray)

v2.49.0

Compare Source

Release Highlights

Ray Data:

  • We’ve implemented a variety of performance enhancements, including improved actor/node autoscaling with budget-aware decisions; faster/more accurate shuffle accounting; reduced Parquet metadata footprint; and out-of-order execution for higher throughput.
  • We’ve also added anti/semi joins, stratified train_test_split, and Snowflake connectors (see the split sketch after this list).
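
A minimal sketch of the new stratified split. train_test_split is an existing Dataset method; passing the label column name to the new stratify parameter is an assumption about its exact shape:

```python
import ray

# A small labeled dataset with an imbalanced binary label.
ds = ray.data.from_items([{"x": i, "label": int(i % 4 == 0)} for i in range(100)])

# Per these notes, 2.49.0 adds a `stratify` parameter; the column-name
# argument shown here is an assumption about the exact signature.
train_ds, test_ds = ds.train_test_split(test_size=0.25, stratify="label")
# Both splits should now preserve the roughly 25/75 label ratio.
```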

Ray Core:

  • Performance/robustness cleanups around the GCS publish path and raylet internals; simpler OpenTelemetry flagging; a new user-facing API to wait for GPU tensor free; plus assorted test/infra tidy-ups.

Ray Train:

  • We’ve introduced a new JaxTrainer with SPMD support for TPUs.

Ray Serve:

  • Custom Autoscaling per Deployment: Serve now supports user-defined autoscaling policies via AutoscalingContext and AutoscalingPolicy, enabling fine-grained scaling logic at the deployment level. This is part of a larger effort to support autoscaling on custom metrics in Serve; see this RFC for more details.
  • Async Inference (Initial Support): Ray Serve introduces asynchronous inference execution, laying the foundation for better throughput and latency in async workloads. Please see this RFC for more details.
  • Major Performance Gains: This version of Ray Serve brings double-digit percentage improvements in both throughput and latency. See the release notes below for more details.

Ray Serve/Data LLM:

  • We’ve refactored Ray Serve LLM to be fully compatible with the default vllm serve frontend, and it now supports vLLM 0.10.
  • We’ve added a prefix cache-aware router (PrefixCacheAffinityRouter) for optimized cache utilization, dynamic cache management via reset prefix cache remote methods, and an enhanced LMCacheConnectorV1 with kv_transfer_config support.

Ray Libraries

Ray Data

🎉 New Features:

  • Wrapped batch indices in a BatchMetadata object to make per-batch metadata explicit. (#​55643)
  • Added support for Anti/Semi Join types. (#​55272)
  • Introduced an Issue Detection Framework. (#​55155)
  • Added an option to enable out-of-order execution for better performance. (#​54504)
  • Introduced a StreamingSplit logical operator for DAG rewrite. (#​54994)
  • Added a stratify parameter to train_test_split. (#​54624)
  • Added Snowflake connectors. (#​51429)
  • Updated Hudi integration to support incremental query. (#​54301)
  • Added an Actor location tracker. (#​54590)
  • Added BundleQueue.has_next. (#​54710)
  • Made DEFAULT_OBJECT_STORE_MEMORY_LIMIT_FRACTION configurable. (#​54873)
  • Added Expression support & a with_columns API; see the sketch after this list. (#54322)
  • Allocate GPU resources in ResourceManager. (#​54445)
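
As a rough sketch of the new expression support: the with_columns name comes from the list above (#54322), while the import path and the dict-of-expressions signature are assumptions:

```python
import ray
from ray.data.expressions import col, lit  # import path is an assumption

ds = ray.data.range(5)  # rows look like {"id": 0}, {"id": 1}, ...

# Derive a new column from an expression; the exact with_columns signature
# is assumed from the bullet above.
ds = ds.with_columns({"id_plus_one": col("id") + lit(1)})
print(ds.take(2))
```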

💫 Enhancements:

  • Decoupled actor and node autoscaling; autoscaling now also considers budget. (#​55673, #​54902)
  • Faster hash-shuffle resource usage calculation; more accurate shuffle progress totals. (#​55503, #​55543)
  • Reduced Parquet metadata storage usage. (#​54821)
  • Export API improvements: refresh dataset/operator state, sanitize metadata, and truncate exported metadata. (#​55355, #​55379, #​55216, #​54623)
  • Metrics & observability: task metric improvements, external-buffer block-count metric, row-based metrics, clearer operator names in logs, single debug log when aggregators are ready. (#​55429, #​55022, #​54693, #​52949, #​54483)
  • Dashboard: added “Max Bytes to Read” panel/budget, panels for blocks-per-task and bytes-per-block, and streaming executor duration. (#​55024, #​55020, #​54614)
  • Planner/execution & infra cleanups: ExecutionResources and StatsManager cleanup, planner interface refactor, node trackers init, removed ray.get in _MapWorker ctor, removed target_shuffle_max_block_size. (#​54694, #​55400, #​55018, #​54665, #​54734, #​55158)
  • Behavior/interop tweaks: map_batches defaults to row_modification=False and avoids pushing past limit; limited operator pushdown; prefetch for PandasJSONDatasource; use cloudpickle for Arrow tensor extension ser/des; bumped Arrow to 21.0; schema warning tone change. (#​54992, #​54457, #​54667, #​54831, #​55426, #​54630)
  • Removed randomize-blocks reorder rule for more stable behavior. (#​55278)

🔨 Fixes:

  • AutoscalingActorPool now properly downscales after execution. (#​55565)
  • StatsManager handles StatsActor loss on disconnect. (#​55163)
  • Handle missing chunks key when Databricks UC query returns zero rows. (#​54526)
  • Handle empty fragments in sampling when num_row_groups=0. (#​54822)
  • Restored handling of PyExtensionType to keep compatibility with previously written datasets. (#​55498)
  • Prevent negative resource budget when concurrency exceeds the global limit; fixed resource-manager log calculation. (#​54986, #​54878)
  • Default write_parquet warning removed; handled unhashable types in OneHotEncoding. (#​54864, #​54863)
  • Overwrite mode now maps to the correct Arrow behavior for parallel writes. (#​55118)
  • Added back from_daft Arrow-version checks. (#​54907)
  • Pandas chained in-place assignment warning resolved. (#​54486)
  • Test stability/infra: fixed flaky tests, adjusted bounds and sizes, added additional release tests/chaos variants for image workloads, increased join test size, adjusted sorting release test to produce 1 GB blocks. (#​55485, #​55489, #​54806, #​55120, #​54716, #​55402, #​54971)

📖 Documentation:

  • Added a user guide for aggregations. (#​53568)
  • Added a code snippet in docs for partitioned writes. (#​55002)
  • Updated links to Lance documentation. (#​54836)

Ray Train

🎉 New Features:

  • Introduced JaxTrainer with SPMD support on TPUs (#55207); a hedged sketch follows below
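
A hedged sketch of how the new trainer might be used, modeled on the existing TorchTrainer pattern; the import path and the TPU-related ScalingConfig field are assumptions, not the released API:

```python
from ray.train import ScalingConfig
from ray.train.v2.jax import JaxTrainer  # import path is an assumption


def train_loop_per_worker():
    # JAX SPMD training code would run here on each TPU worker.
    ...


# Field names mirror TorchTrainer; use_tpu is an assumed option.
trainer = JaxTrainer(
    train_loop_per_worker,
    scaling_config=ScalingConfig(num_workers=4, use_tpu=True),
)
result = trainer.fit()
```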

💫 Enhancements:

  • ray.train.get_dataset_shard now lazily configures dataset sharding for better startup behavior (#55230); see the usage sketch after this list
  • Clearer worker error logging (#​55222)
  • Fail fast when placement group requirements can never be satisfied (#​54402)
  • New ControllerError surfaced and handled via failure policy for improved resiliency (#​54801, #​54833)
  • TrainStateActor periodically checks controller health and aborts when necessary (#​53818)
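
For context, the standard sharding pattern this change affects; get_dataset_shard and iter_batches are existing Ray Train APIs:

```python
import ray
from ray.train import ScalingConfig
from ray.train.torch import TorchTrainer


def train_loop_per_worker():
    # As of 2.49.0, sharding is configured lazily on first access (#55230),
    # so workers pay the setup cost only when they actually use a shard.
    shard = ray.train.get_dataset_shard("train")
    for batch in shard.iter_batches(batch_size=32):
        ...  # one training step per batch


trainer = TorchTrainer(
    train_loop_per_worker,
    datasets={"train": ray.data.range(1_000)},
    scaling_config=ScalingConfig(num_workers=2),
)
trainer.fit()
```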

🔨 Fixes:

  • Resolve circular import in ray.train.v2.lightning.lightning_utils (#​55668)
  • Fix XGBoost v2 callback behavior (#​54787)
  • Suppress a spurious type error (#​50994)
  • Reduce test flakiness: remove randomness and bump a data-integration test size (#​55315, #​55633)

📖 Documentation:

  • New LightGBMTrainer user guide (#​54492)
  • Fix code-snippet syntax highlighting (#​54909)
  • Minor correction in experiment-tracking guide comment (#​54605)

🏗 Architecture refactoring:

  • Public Train APIs routed through TrainFnUtils for consistency (#​55226)
  • LoggingManager utility for Train logging (#​55121)
  • Convert DEFAULT variables from strings to bools (#​55581)

Ray Tune

🎉 New Features:

  • Add video FPS support to WandbLoggerCallback (#​53638)

💫 Enhancements:

  • Typing: reset_config now explicitly returns bool (#​54581)
  • CheckpointManager supports recording scoring metric only (#​54642)

🔨 Fixes:

  • Fix XGBoost v2 callback integration (#​54787)
  • Correct type for RunConfig.progress_reporter (#​48439)

📖 Documentation:

Ray Serve

🎉 New Features:

  • Async inference support in Ray Serve (initial phase). Provides basic asynchronous inference execution, with follow-up work planned for failed/unprocessed queues and additional tests. #​54824
  • Per-deployment custom autoscaling controls. Introduces AutoscalingContext and AutoscalingPolicy classes, enabling user-defined autoscaling strategies at the deployment level; a shape sketch follows this list. #55253
  • Same event loop router. Adds option to run the Serve router in the same event loop as the proxy, yielding ~17% throughput improvement. #​55030
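
A shape sketch only: the AutoscalingContext and AutoscalingPolicy names come from these notes, but the context fields, the return convention, and how a policy attaches to a deployment are assumptions pending the RFC:

```python
# Hypothetical policy: add a replica when queued requests per replica
# exceed a threshold. The ctx field names are assumptions, not the
# released AutoscalingContext interface.
def queue_depth_policy(ctx) -> int:
    if ctx.total_num_requests > 10 * max(ctx.current_num_replicas, 1):
        return ctx.current_num_replicas + 1
    return ctx.current_num_replicas
```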

💫 Enhancements:

  • Async get_current_servable_instance(). Converts the FastAPI dependency to async def, removing threadpool overhead and boosting performance: 35% higher RPS and reduced latency. #​55457
  • Access log optimization. Cached contexts in request path logging improved request throughput by ~16% with lower average latency. #​55166
  • Batching improvements. Default batch wait timeout increased from 0.0 s to 0.01 s (10 ms) to enable meaningful batching; see the example after this list. #55126
  • HTTP receive refactor. Cleaned up handling of replica-side HTTP receive tasks. #​54543 / #​54565
  • Configurable replica router backoff. Added knobs for retry/backoff control when routing to replicas. #​54723
  • Autoscaling ergonomics. Marked per-deployment autoscaling metrics push interval config as deprecated for consistency. #​55102
  • Health check & env var safety. Introduced warnings for invalid/zero/negative environment variable values, with migration path planned for Ray 2.50.0. #​55464, #​54944
  • Improved CLI UX. serve config now prints "No configuration was found." instead of an empty string. #54767
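
For reference, the knob the new 10 ms default applies to; @serve.batch and batch_wait_timeout_s are existing Serve APIs:

```python
from typing import List

from ray import serve


@serve.deployment
class BatchedModel:
    # batch_wait_timeout_s=0.01 matches the new default noted above; requests
    # arriving within that window are coalesced into a single batch call.
    @serve.batch(max_batch_size=8, batch_wait_timeout_s=0.01)
    async def __call__(self, inputs: List[str]) -> List[str]:
        return [s.upper() for s in inputs]


app = BatchedModel.bind()  # serve.run(app) would deploy it locally
```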

🔨 Fixes:

📖 Documentation:

  • Unexpected queuing behavior. Documented quirks in handle request queuing. #​54542

🏗 Architecture refactoring:

  • Router/handle internals refactored for clarity and future feature expansion. #​55635
  • Model composition benchmarks. Added benchmarking to track performance of common composition patterns. #​55549
  • Constants refactor. Utility functions moved out of constants.py for better readability and stricter env var validation. #​54944, #​55464
  • Ray internals migration. Moved usage, ray_option_utils, and selected constants from _private to _common. #​54915, #​54578

Ray Serve/Data LLM

🎉 New Features:

  • Prefix cache-aware router with PrefixCacheAffinityRouter for optimized cache utilization. (#​55218, #​55588)
  • Reset prefix cache remote method for dynamic cache management. (#​55658)
  • LMCacheConnectorV1 support for kv_transfer_config to enhance key-value transfer configurations. (#​54579)
  • LLMServer and LLMEngine major refactor for 100% vLLM serve frontend compatibility; a minimal config sketch follows this list. (#54554)
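
For orientation, a minimal ray.serve.llm deployment of the kind these changes target. LLMConfig and build_openai_app predate this release; the model values are illustrative, and wiring up the new PrefixCacheAffinityRouter is not shown:

```python
from ray import serve
from ray.serve.llm import LLMConfig, build_openai_app

# Model IDs and replica counts are illustrative; running this requires
# GPUs and a vLLM installation.
llm_config = LLMConfig(
    model_loading_config=dict(
        model_id="qwen-0.5b",
        model_source="Qwen/Qwen2.5-0.5B-Instruct",
    ),
    deployment_config=dict(
        autoscaling_config=dict(min_replicas=1, max_replicas=2),
    ),
)

# Exposes an OpenAI-compatible HTTP endpoint backed by vLLM.
serve.run(build_openai_app({"llm_configs": [llm_config]}))
```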

💫 Enhancements:

  • vLLM engine upgrade to version 0.10.0 with improved performance and compatibility. (#​55067)
  • Enhanced error handling for invalid model_id parameters with clearer error messages. (#​55589)
  • Improved telemetry handling with better race condition management for push operations. (#​55558)
  • Optimized deployment defaults with better configuration values to prevent bottlenecks. (#​54696)
  • LoRA workflow improvements with refactored downloading and utility functions. (#​54946)
  • LLMServer refactor to synchronous initialization for better reliability. (#​54835)
  • Mistral tokenizer support for tekken tokenizer compatibility. (#​54666)
  • Smart batching logic that skips batching when batch_interval_ms == 0. (#​54751)
  • Dashboard enhancements with improved LLM metrics and monitoring capabilities. (#​54797)

🔨 Fixes:

  • Pyright linting corrections for Ray Serve LLM examples. (#​55284)
  • Test stability improvements for DeepSeek model and vLLM engine processor tests. (#​55401, #​55120)
  • Serialization fixes for ChatCompletionRequest tool_calls ValidatorIterator objects. (#​55538)

📖 Documentation:

  • Prefix cache router documentation with comprehensive usage examples. (#​55218)
  • Multi-LoRA documentation improvements with clearer setup instructions. (#​54788)
  • STRICT_PACK strategy FAQ documentation explaining data.llm packing behavior. (#​55505)

🏗 Architecture refactoring:

  • Docker image optimizations with UCX and NCCL updates, plus GKE GPU operator compatibility paths. (#​54598, #​55206)

RLlib

🎉 New Features:

💫 Enhancements:

  • Upgraded RLlink protocol for external env/simulator training. (#​53550)
  • Performance improvements in Offline RL API through switching to iter_torch_batches. (#​54277)
  • Added an example for curriculum learning in Atari Pong. (#​55304)

🔨 Fixes:

  • Corrected TensorType handling. (#​55694)
  • Fixed a bug with multi-learner setups in Offline RL API. (#​55693)
  • Addressed ImportError in Atari examples. (#​54967)
  • Fixed some bugs in the docs for IQL and CQL. (#​55614)
  • Increased default timesteps on two experiments. (#​54185)
  • Fixed TorchMultiCategorical.to_deterministic when having different number of categories and logits with time dimension. (#​54414)
  • Added missing documentation for SACConfig's training(). (#​53918)
  • Fixed bug in restore_from_path such that connector states are also restored on remote EnvRunners. (#​54672)
  • Fixed missing support for config.count_steps_by = "agent_steps". (#​54885)
  • Added missing colon to CUBLAS_WORKSPACE_CONFIG. (#​53913)
  • Removed rllib_contrib completely from RLlib. (#​55182)

🏗 Architecture refactoring:

  • Deprecated TensorFlow support from new API stack. (#​55042)
  • Deprecated input/output specs from RLModule. (#​55141)
  • Deprecated --enable-new-api-stack flag from all scripts. (#​54853, #​54702)

Ray Core

🎉 New Features:

💫 Enhancements:

🔨 Fixes:

📖 Documentation:

  • Added a guide on using type hints with Ray Core (#55013); see the example below
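
A small example of the pattern such a guide covers; the generic annotation assumes Ray's type stubs expose ObjectRef as generic (written as a string so it is harmless at runtime):

```python
import ray


@ray.remote
def double(x: int) -> int:
    return x * 2


# Remote calls return ObjectRefs; with typed stubs a checker can verify
# that ray.get(ref) yields an int here.
ref: "ray.ObjectRef[int]" = double.remote(21)
value: int = ray.get(ref)
assert value == 42
```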

🏗 Architecture refactoring:

Dashboard

💫 Enhancements:

  • Grafana: new Operator filter for Data; Prometheus adds a RayNodeType label for nodes. (#55493, #55192)

🔨 Fixes:

  • Removed references to a deleted Data metrics panel. (#​55478)

Ray Images

🎉 New Features:

💫 Enhancements:

Docs

💫 Enhancements:

  • KubeRay docs: added InteractiveMode quick-start details; expanded Core type-hints guidance; added Serve LLM example coverage and a Data LLM batching FAQ. (#55570, #55284)

🔨 Fixes:

  • Various formatting/mis-highlighting and lints across Train/Tune/Serve LLM docs. (#​55284, #​54763)

Thanks!

Thank you to everyone who contributed to this release!
@​pavitrabhalla, @​Daraan, @​Sparks0219, @​daiping8, @​abrarsheikh, @​sven1977, @​Toshaksha, @​bveeramani, @​MengjinYan, @​GokuMohandas, @​codope, @​nadongjun, @​SolitaryThinker, @​matthewdeng, @​elliot-barn, @​isimluk, @​avibasnet31, @​OneSizeFitsQuorum, @​Future-Outlier, @​marosset, @​jackfrancis, @​kshanmol, @​eicherseiji, @​dayshah, @​iamjustinhsu, @​Qiaolin-Yu, @​goutamvenkat-anyscale, @​Yicheng-Lu-llll, @​yantarou, @​rclough, @​zcin, @​NeilGirdhar, @​VarunBhandary, @​400Ping, @​akshay-anyscale, @​vickytsang, @​xushiyan, @​JasonLi1909, @​n-elia, @​simonsays1980, @​dragongu, @​Kishanthan, @​ruisearch42, @​jectpro7, @​TimothySeah, @​liulehui, @​rueian, @​HollowMan6, @​akyang-anyscale, @​axreldable, @​czgdp1807, @​alanwguo, @​justinvyu, @​ok-scale, @​my-vegetable-has-exploded, @​landscapepainter, @​fscnick, @​machichima, @​mpashkovskii, @​ZacAttack, @​gvspraveen, @​sword865, @​lmsh7, @​Ziy1-Tan, @​rebel-scottlee, @​sampan-s-nayak, @​coqian, @​can-anyscale, @​Bye-legumes, @​win5923, @​MortalHappiness, @​angelinalg, @​khluu, @​aslonnie, @​krishnakalyan3, @​minosvasilias, @​x-tong, @​xinyuangui2, @​raulchen, @​Yangruipis, @​edoakes, @​kevin85421, @​wingkitlee0, @​Fokko, @​cristianjd, @​srinathk10, @​owenowenisme, @​JoshKarpel, @​MengqingCao, @​leopardracer, @​westonpace, @​LeslieWongCV, @​VassilisVassiliadis, @​crypdick, @​alexeykudinkin, @​mjacar, @​kunling-anyscale, @​saihaj, @​kouroshHakha, @​ema-pe, @​markjm, @​avigyabb, @​dshepelev15, @​mauvilsa, @​omatthew98, @​nrghosh, @​ryanaoleary, @​Aydin-ab, @​lk-chen, @​stephanie-wang, @​harshit-anyscale, @​jjyao, @​bullgom, @​Yevet, @​israbbani

v2.48.0

Compare Source

Release Highlights

  • Ray Data: This release features a new Delta Lake and Unity Catalog integration and performance improvements to various reading/writing operators.
  • Ray Core: Enhanced GPU object support with intra-process communication and improved Autoscaler v2 functionality
  • Ray Train: Improved hardware metrics integration with Grafana and enhanced collective operations support
  • Ray Serve LLM: This release features an early proof of concept for prefill-decode disaggregation deployment and LLM-aware request routing, such as prefix-cache-aware routing.
  • Ray Data LLM: Improved throughput and CPU memory utilization for Ray Data workers.

Ray Libraries

Ray Data

🎉 New Features:

  • Add reading from Delta Lake tables and Unity Catalog integration (#53701)
  • Added pin_memory support to iter_torch_batches (#53792); see the sketch after this list
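
A minimal sketch of the new flag; iter_torch_batches is an existing API, and combining pin_memory with device follows the bullet above:

```python
import ray

ds = ray.data.range(1_000)

# pin_memory (#53792) pins host staging buffers so host-to-GPU copies can
# be faster; batch_size and device are existing arguments.
for batch in ds.iter_torch_batches(batch_size=64, device="cuda", pin_memory=True):
    ...  # batch is a dict of torch Tensors
```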

💫 Enhancements:

  • Re-enabled sorting in Ray Data tests with performance improvements (#​54475)
  • Enhanced handling of mismatched columns and pandas.NA values (#​53861, #​53859)
  • Improved read_text trailing newline semantics (#​53860)
  • Optimized backpressure handling with policy-based resource management (#​54376)
  • Enhanced write_parquet with support for both partition_by and row limits (#​53930)
  • Prevent filename collisions on write operations (#​53890)
  • Improved execution performance for One Hot encoding in preprocessors (#​54022)

🔨 Fixes:

  • Fixed map_groups issues (#54462)
  • Prevented Op fusion for streaming repartition to avoid performance degradation (#​54469)
  • Fixed ActorPool autoscaler scaling up logic (#​53983)
  • Resolved empty dataset repartitioning issues (#​54107)
  • Fixed PyArrow overflow handling in data processing (#​53971, #​54390)
  • Fixed IcebergDatasink to properly generate individual file uuids (#​52956)
  • Avoid OOMs with read_json(..., lines=True) (#​54436)
  • Handle HuggingFace parquet dataset resolve URLs (#​54146)
  • Fixed BlockMetadata derivation for Read operator (#​53908)

📖 Documentation:

  • Updated AggregateFnV2 documentation to clarify finalize method (#​53835)
  • Improved preprocessor and vectorizer API documentation

Ray Train

🎉 New Features:

  • Added broadcast_from_rank_zero and barrier collective operations (#54066); see the sketch after this list
  • Enhanced hardware metrics integration with Grafana dashboards (#​53218)
  • Added support for dynamically loading callbacks via environment variables (#​54233)
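
A hedged sketch of the new collectives inside a training loop; the ray.train.collective module path follows the docs entry later in this section, while the exact signatures are assumptions:

```python
import ray.train
from ray.train.collective import barrier, broadcast_from_rank_zero  # path per docs entry


def train_loop_per_worker():
    # Rank 0 computes a value once; every worker receives the same copy.
    rank = ray.train.get_context().get_world_rank()
    config = {"lr": 1e-3} if rank == 0 else None
    config = broadcast_from_rank_zero(config)  # exact signature assumed

    # Block until all workers reach this point.
    barrier()
```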

💫 Enhancements:

  • Improved checkpoint population from before_init_train_context (#​54453)
  • Enhanced controller state logging and metrics (#​52805)
  • Added structured logging environment variable support (#​52952)
  • Improved handling of Noop scaling decisions for smoother scaling logic (#​53180)
  • Logging of controller state transitions to aid in debugging and analysis (#​53344)

🔨 Fixes:

  • Fixed GPU tensor reporting in ray.train.report (#​53725)
  • Enhanced move_tensors_to_device utility for complex tensor structures (#​53109)
  • Improved worker health check error handling with trace information (#​53626)
  • Fixed GPU transfer support for non-contiguous tensors (#​52548)
  • Force abort on SIGINT spam and do not abort finished runs (#​54188)

📖 Documentation:

  • Updated beginner PyTorch example (#​54124)
  • Added documentation for ray.train.collective APIs (#​54340)
  • Added a note about PyTorch DataLoader's multiprocessing and forkserver usage (#​52924)
  • Fixed various docstring format and indentation issues (#​52855, #​52878)
  • Added note that ray.train.report API docs should mention optional checkpoint_dir_name (#​54391)

🏗 Architecture refactoring:

  • Removed subclass relationship between RunConfig and RunConfigV1 (#​54293)
  • Enhanced error handling for finished training runs (#​54188)
  • Deduplicated ML doctest runners in CI for efficiency (#​53157)
  • Converted isort configuration to Ruff for consistency (#​52869)

Ray Tune

💫 Enhancements:

  • Updated test_train_v2_integration to use the correct RunConfig (#​52882)

🔨 Fixes:

  • Fixed RayTaskError serialization logic (#​54396)
  • Improved experiment restore timeout handling (#​53387)

📖 Documentation:

  • Replaced session.report with tune.report and corrected import paths (#​52801)
  • Removed outdated graphics cards reference in docs (#​52922)
  • Fixed various docstring format issues (#​52879)

Ray Serve

🎉 New Features:

  • Added RouterConfig field to DeploymentConfig for custom RequestRouter configuration (#​53870)
  • Added support for implementing custom request routing algorithms (#53251); a shape sketch follows this list
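
A shape sketch only: the custom-router feature comes from the bullets above, but the base class, hook name, and replica attributes here are assumptions rather than the released interface:

```python
from typing import Any, List


# Hypothetical router that prefers the replica with the fewest queued
# requests. In the real API this would subclass the Serve-provided router
# base class and be referenced from the deployment's RouterConfig; the
# hook and attribute names are assumptions.
class LeastQueuedRouter:
    def choose_replicas(self, candidates: List[Any], pending_request: Any) -> List[Any]:
        ranked = sorted(candidates, key=lambda r: r.num_queued_requests)
        return ranked[:1]
```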

💫 Enhancements:

  • Enhanced FastAPI ingress deployment validation for multiple deployments (#​53647)
  • Optimized get_live_deployments performance (#​54454)
  • Progress towards making ray.serve.llm compatible with vLLM serve frontend (#​54481, #​54443, #​54440)

🔨 Fixes:

  • Fixed deployment scheduler issues with component scheduling (#​54479)
  • Fixed runtime_env validation for py_modules (#​53186)
  • Added descriptive error message when deployment name is not found (#​45181)

📖 Documentation:

  • Added troubleshooting guide for DeepSeek/multi-node GPU deployment on KubeRay (#​54229)
  • Updated the guide on serving models with Triton Server in Ray Serve
  • Added documentation for custom request routing algorithms
  • Added custom request router docs (#​53511)

🏗 Architecture refactoring:

  • Remove indirection layers of node initialization (#​54481)
  • Incremental refactor of LLMEngine (#​54443)
  • Remove random v0 logic from serve endpoints (#​54440)
  • Remove usage of internal_api.memory_summary() (#​54417)
  • Remove usage of ray._private.state (#​54140)

Ray Serve/Data LLM

🎉 New Features

  • Support separate deployment config for PDProxy in PrefixAwareReplicaSet (#​53935)
  • Support for prefix-aware request router (#​52725)

💫 Enhancements

  • Log engine stats after each batch task is done. (#​54360)
  • Decouple max_tasks_in_flight from max_concurrent_batches (#​54362)
  • Make llm serve endpoints compatible with vLLM serve frontend, including streaming, tool_code, and health check support (#​54440)
  • Remove botocore dependency in Ray Serve LLM (#​54156)
  • Update vLLM version to 0.9.2 (#​54407)

🔨 Fixes

  • Fix health check in prefill disagg (#​53937)
  • Fix doc to only support int concurrency (#​54196)
  • Fix vLLM batch test by changing to Pixtral (#​53744)
  • Fix pickle error with remote code models in vLLM Ray workloads (#​53868)
  • Adapted to the change in vllm.PoolingOutput (#54467)

📖 Documentation

  • Ray serve/lora doc fix (#​53553)
  • Add Ray serve/LLM doc (#​52832)
  • Add a doc snippet to inform users about existing diffs between vLLM and Ray Serve LLM behavior in some APIs like streaming, tool_code, and health check (#​54123)
  • Troubleshooting DeepSeek/multi-node GPU deployment on KubeRay (#​54229)

🏗 Architecture refactoring

  • Make llm serve endpoints compatible with vLLM serve frontend, including streaming, tool_code, and health check support (#​54490)
  • Prefix-aware scheduler [2/N]: Configure PrefixAwareReplicaSet to correctly handle the number of available GPUs for each worker and ensure efficient GPU utilization in vLLM (#53192)
  • Organize spread out utils.py (#​53722)
  • Remove ImageRetriever class and related tests from the LLM serving codebase. (#​54018)
  • Return a batch of rows in the udf instead of row by row (#​54329)

RLlib

🎉 New Features:

  • Implemented Offline Policy Evaluation (OPE) via Importance Sampling (#​53702)
  • Enhanced ConnectorV2 ObservationPreprocessor APIs with multi-agent support (#​54209)
  • Add GPU inference to offline evaluation (#​52718)

💫 Enhancements:

  • Enhanced MetricsLogger to handle tensors in state management (#​53514)
  • Improved env seeding in EnvRunners with deterministic training example rewrite (#​54039)
  • Cleanup of meta learning classes and examples (#​52680)

🔨 Fixes:

  • Fixed EnvRunner restoration when no local EnvRunner is available (#​54091)
  • Fixed shapes in explained_variance for recurrent policies (#​54005)
  • Resolved device check issues in Learner implementation (#​53706)
  • Enhanced numerical stability in MeanStdFilter (#​53484)
  • Fixed weight synching in offline evaluation (#​52757)
  • Fixed bug in split_and_zero_pad utility function (#​52818)

📖 Documentation:

  • Do-over of examples for connector pipelines (#​52604)
  • Remove "new API stack" banner from all RLlib docs pages as it's now the default (#​54282)

Ray Core

🎉 New Features:

  • Enhanced GPU object support with intra-process communication (#​53798)
  • Integrated single-controller collective APIs with GPU objects (#​53720)
  • Added support for ray.get on driver process for GPU objects (#​53902)
  • Supporting allreduce on list of input nodes in compiled graphs (#​51047)
  • Add single-controller API for ray.util.collective and torch gloo backend (#​53319)

💫 Enhancements:

  • Improved autoscaler v2 functionality with cloud instance ID reusing (#​54397)
  • Enhanced cluster task manager with better resource management (#​54413)
  • Upgraded OpenTelemetry SDK for better observability (#​53745)
  • Improved actor scheduling to prevent deadlocks in ordered actors (#​54034)
  • Enhanced get_max_resources_from_cluster_config functionality (#​54455)
  • Use std::move in cluster task manager constructor (#​54413)
  • Improve status messages and add comments about stale seq_no handling (#​54470)
  • uv run integration is now enabled by default (#​53060)

🔨 Fixes:

  • Fixed race conditions in object eviction and repinning for recovery (#​53934)
  • Resolved GCS crash issues on duplicate MarkJobFinished RPCs (#​53951)
  • Enhanced actor restart handling on node failures (#​54088)
  • Improved reference counting during worker graceful shutdown (#​53002)
  • Fix race condition when canceling task that hasn't started yet (#​52703)
  • Fix the issue where a valid RestartActor rpc is ignored (#​53330)
  • Fixed "Check failed: it->second.num_retries_left == -1" error (#​54116)
  • Fix detached actor being unexpectedly killed (#​53562)

📖 Documentation:

  • Enhanced troubleshooting guides and API documentation
  • Updated reStructuredText

Configuration

📅 Schedule: Branch creation - "after 5am on saturday" (UTC), Automerge - At any time (no schedule defined).

🚦 Automerge: Disabled by config. Please merge this manually once you are satisfied.

Rebasing: Whenever PR becomes conflicted, or you tick the rebase/retry checkbox.

🔕 Ignore: Close this PR and you won't be reminded about these updates again.


  • If you want to rebase/retry this PR, check this box

To execute skipped test pipelines, write the comment /ok-to-test.

This PR has been generated by MintMaker (powered by Renovate Bot).

red-hat-konflux[bot] force-pushed the konflux/mintmaker/main/ray-2.x branch from 138adcc to b1f9f3b (Dec 7, 2024)
red-hat-konflux[bot] changed the title from "Update dependency ray to ~=2.39.0" to "Update dependency ray to ~=2.40.0" (Dec 7, 2024)
red-hat-konflux[bot] force-pushed the konflux/mintmaker/main/ray-2.x branch from b1f9f3b to 27eb9e4 (Jan 25, 2025)
red-hat-konflux[bot] changed the title from "Update dependency ray to ~=2.40.0" to "chore(deps): update dependency ray to ~=2.41.0" (Jan 25, 2025)
red-hat-konflux[bot] force-pushed the konflux/mintmaker/main/ray-2.x branch from 27eb9e4 to 1b21afe (Feb 8, 2025)
red-hat-konflux[bot] changed the title from "chore(deps): update dependency ray to ~=2.41.0" to "Update dependency ray to ~=2.42.0" (Feb 8, 2025)
red-hat-konflux[bot] force-pushed the konflux/mintmaker/main/ray-2.x branch from 1b21afe to dd4cb72 (Feb 15, 2025)
red-hat-konflux[bot] changed the title from "Update dependency ray to ~=2.42.0" to "Update dependency ray to ~=2.42.1" (Feb 15, 2025)
red-hat-konflux[bot] force-pushed the konflux/mintmaker/main/ray-2.x branch from dd4cb72 to 960f764 (Mar 1, 2025)
red-hat-konflux[bot] changed the title from "Update dependency ray to ~=2.42.1" to "Update dependency ray to ~=2.43.0" (Mar 1, 2025)
red-hat-konflux[bot] force-pushed the konflux/mintmaker/main/ray-2.x branch from 960f764 to ce04e7e (Mar 22, 2025)
red-hat-konflux[bot] changed the title from "Update dependency ray to ~=2.43.0" to "Update dependency ray to ~=2.44.0" (Mar 22, 2025)
red-hat-konflux[bot] force-pushed the konflux/mintmaker/main/ray-2.x branch from ce04e7e to b47304d (Mar 29, 2025)
red-hat-konflux[bot] changed the title from "Update dependency ray to ~=2.44.0" to "chore(deps): update dependency ray to ~=2.44.1" (Mar 29, 2025)
red-hat-konflux[bot] changed the title from "chore(deps): update dependency ray to ~=2.44.1" to "Update dependency ray to ~=2.44.1" (Apr 26, 2025)
red-hat-konflux[bot] force-pushed the konflux/mintmaker/main/ray-2.x branch from b47304d to 8007d36 (May 3, 2025)
red-hat-konflux[bot] changed the title from "Update dependency ray to ~=2.44.1" to "chore(deps): update dependency ray to ~=2.45.0" (May 3, 2025)
red-hat-konflux[bot] force-pushed the konflux/mintmaker/main/ray-2.x branch from 8007d36 to a0a8415 (May 10, 2025)
red-hat-konflux[bot] changed the title from "chore(deps): update dependency ray to ~=2.45.0" to "chore(deps): update dependency ray to ~=2.46.0" (May 10, 2025)
red-hat-konflux[bot] changed the title from "chore(deps): update dependency ray to ~=2.46.0" to "Update dependency ray to ~=2.46.0" (May 24, 2025)
red-hat-konflux[bot] force-pushed the konflux/mintmaker/main/ray-2.x branch from a0a8415 to daf8ba4 (Jun 14, 2025)
red-hat-konflux[bot] changed the title from "Update dependency ray to ~=2.46.0" to "Update dependency ray to ~=2.47.0" (Jun 14, 2025)
red-hat-konflux[bot] force-pushed the konflux/mintmaker/main/ray-2.x branch from daf8ba4 to 1ca04c7 (Jun 21, 2025)
red-hat-konflux[bot] changed the title from "Update dependency ray to ~=2.47.0" to "chore(deps): update dependency ray to ~=2.47.1" (Jun 21, 2025)
red-hat-konflux[bot] force-pushed the konflux/mintmaker/main/ray-2.x branch from 1ca04c7 to 57d6dca (Jul 19, 2025)
red-hat-konflux[bot] changed the title from "chore(deps): update dependency ray to ~=2.47.1" to "Update dependency ray to ~=2.48.0" (Jul 19, 2025)
red-hat-konflux[bot] force-pushed the konflux/mintmaker/main/ray-2.x branch from 57d6dca to 83034d0 (Aug 30, 2025)
red-hat-konflux[bot] changed the title from "Update dependency ray to ~=2.48.0" to "chore(deps): update dependency ray to ~=2.49.0" (Aug 30, 2025)