[Data] Support strict=False mode for StreamingRepartition #60295
machichima wants to merge 26 commits into ray-project:master
Conversation
Code Review
This pull request introduces a strict parameter to StreamingRepartition, allowing for a non-strict mode. In non-strict mode, repartitioning doesn't stitch blocks, which enables more operator fusion opportunities. The changes are well-implemented across the logical planning, fusion rules, and physical planning layers. The default for repartition is now non-strict, which is a good choice for performance. The added tests are comprehensive and cover both the new non-strict behavior and the fusion logic. My main feedback is to add documentation for the new strict parameter in the user-facing Dataset.repartition method to ensure users understand how to use it.
num_blocks: Optional[int] = None,
target_num_rows_per_block: Optional[int] = None,
*,
strict: bool = False,
The new strict parameter should be documented in the repartition method's docstring. Explaining the difference between strict=True (the old behavior) and strict=False (the new default) is important for users to understand its impact on block sizes and fusion.
You could add something like this to the Args section:
strict: If ``True``, ``repartition`` guarantees that all output blocks, except for the last one, will have ``target_num_rows_per_block`` rows. If ``False``, ``repartition`` is more relaxed and may produce blocks smaller than ``target_num_rows_per_block`` without stitching them. This is only used with ``target_num_rows_per_block``. Defaults to ``False``.
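A minimal usage sketch of the new parameter, assuming it lands as proposed in this PR (the block sizes are purely illustrative):

import ray

ds = ray.data.range(100)

# Non-strict (new default): blocks may be smaller than the target, but no
# stitching happens, so the repartition can fuse with upstream map operators.
relaxed = ds.repartition(target_num_rows_per_block=30)

# Strict (previous behavior): every output block except the last has exactly
# 30 rows, at the cost of stitching and fewer fusion opportunities.
exact = ds.repartition(target_num_rows_per_block=30, strict=True)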
@owenowenisme PTAL. Thank you!
owenowenisme left a comment
test_operator_fusion is failing, could you please take a look?
input_physical_dag,
data_context,
name=op.name,
compute_strategy=compute,
I think we need min_rows_per_bundle = op.target_num_rows_per_block here if strict=False?
Seems like when we set min_rows_per_bundle here, the BlockRefBundler will try to stitch the output (see ray/python/ray/data/_internal/execution/operators/map_operator.py, lines 828 to 835 in 68d01c4). Therefore, I think we should keep it as None here to prevent stitching.
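A tiny sketch (not the actual BlockRefBundler) of why a minimum rows-per-bundle setting leads to stitching, for illustration only:

# Illustrative: small blocks are held back and grouped until the minimum
# row threshold is reached, i.e. they get "stitched" into one bundle.
def bundle_blocks(block_sizes, min_rows_per_bundle):
    bundles, current, rows = [], [], 0
    for size in block_sizes:
        current.append(size)
        rows += size
        if rows >= min_rows_per_bundle:
            bundles.append(current)
            current, rows = [], 0
    if current:
        bundles.append(current)
    return bundles

# Blocks of 10 rows get grouped into bundles of >= 30 rows.
assert bundle_blocks([10, 10, 10, 10], 30) == [[10, 10, 10], [10]]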
strict: If ``True``, ``repartition`` guarantees that all output blocks,
    except for the last one, will have exactly ``target_num_rows_per_block`` rows.
    If ``False``, ``repartition`` is more relaxed and may produce blocks smaller
    than ``target_num_rows_per_block`` without stitching them together.
    This parameter is only used with ``target_num_rows_per_block``.
    Defaults to ``False``.
Might be better to say that it will only produce at most one block that is < target_num_rows_per_block per input block when strict is False.
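As a sanity check on that wording, here's a small sketch (not the actual implementation) of the per-input-block splitting that non-strict mode implies:

# Illustrative only: each input block yields full-size chunks plus at most one
# smaller remainder, and remainders are never stitched across input blocks.
def split_block_non_strict(num_rows: int, target: int) -> list[int]:
    full, rest = divmod(num_rows, target)
    return [target] * full + ([rest] if rest else [])

assert split_block_non_strict(70, 30) == [30, 30, 10]  # one block < target
assert split_block_non_strict(60, 30) == [30, 30]      # no small block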
@pytest.mark.parametrize("batch_size", [30, 35, 45])
def test_streaming_repartition_fusion_non_strict(
I think the fusion test should be in python/ray/data/tests/test_operator_fusion.py.
There are existing fusion and streaming-repartition related tests in this file, so I think we can put this here as it aligns with the existing tests. WDYT?
ref_bundler = StreamingRepartitionRefBundler(batch_size)
# No further fusion because StreamingRepartitionRefBundler is stateful
# and maintains internal buffering state across bundles.
supports_fusion = False
Will this prevent fusion when batch_size == target_num_rows_per_block ?
Yes, but I think it's intended, as the original code (strict mode) hard-coded supports_fusion=False to prevent further fusion:

# For now, we don't want to over-fuse StreamingRepartition with other map operators,
# so the result operator does not support further fusion.
supports_fusion=False,

strict: If True, guarantees that all output blocks, except for the last one,
    will have exactly target_num_rows_per_block rows. If False, is more relaxed
    and may produce blocks smaller than target_num_rows_per_block without
    stitching them together. Defaults to False.
Ditto with the comment in dataset.py
if not (
    isinstance(up_logical_op, MapBatches)
    and up_logical_op._batch_size is not None
    and down_logical_op.target_num_rows_per_block is not None
    and down_logical_op.target_num_rows_per_block > 0
    # When the batch_size is a multiple of target_num_rows_per_block, fusing would still produce exactly identical sequence of blocks.
    # See `_fuse_streaming_repartition_operators_in_dag` docstring for details.
    # TODO: when the StreamingRepartition supports none_strict_mode, we can fuse
    # `MapBatches -> StreamingRepartition` no matter what the `batch_size` and `target_num_rows` are.
    # https://anyscale1.atlassian.net/browse/DATA-1731
    and up_logical_op._batch_size
    % down_logical_op.target_num_rows_per_block
):
    return False
I don't think this logic is correct -- if _batch_size is None, we'd still allow fusing StreamingRepartition.
Hi @alexeykudinkin,
I was following the original logic here, which also returns False when _batch_size is None (see ray/python/ray/data/_internal/logical/rules/operator_fusion.py, lines 280 to 294 in 8e2e0aa).
Also, since we use StreamingRepartitionRefBundler(batch_size), based on the class definition the batch_size cannot be None (see ray/python/ray/data/_internal/streaming_repartition.py, lines 34 to 37 in 68d01c4).
Therefore, I think we should keep this here?
Well, you're relaxing this, right?
There should now be 2 modes:
- StreamingRepartition(strict=True): batch_size needs to be an exact multiple of target_num_rows_per_block to produce correct results.
- StreamingRepartition(strict=False): batch_size could be anything (even null).
Makes sense! Thank you for pointing this out. Updated in c77787c.
Can we simplify the logic here like this?

if (
    not isinstance(up_logical_op, MapBatches)
    or not down_logical_op.target_num_rows_per_block
):
    return False

And add a check at the dataset API that raises an error when target_num_rows_per_block is not None but is negative.
(Maybe move this: ray/python/ray/data/_internal/streaming_repartition.py, lines 35 to 37 in f5a53c4)
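For reference, a hedged sketch of what the relaxed fusibility check could look like once both modes are accounted for (the function and parameter names below mirror the discussion, not the final code; the real check lives in the operator fusion rule):

def can_fuse_map_batches_into_streaming_repartition(
    batch_size, target_num_rows_per_block, strict
):
    if not target_num_rows_per_block:
        return False
    if strict:
        # Strict mode: output blocks stay exact only if the upstream batch
        # size is a whole multiple of the target.
        return batch_size is not None and batch_size % target_num_rows_per_block == 0
    # Non-strict mode: any batch size (even None) works, since blocks are
    # only ever split, never stitched.
    return True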
ref_bundler = StreamingRepartitionRefBundler(batch_size)
# No further fusion because StreamingRepartitionRefBundler is stateful
# and maintains internal buffering state across bundles.
supports_fusion = False
We shouldn't be blocking any subsequent fusion like that.
Let's add a test that we're able to fuse multiple ops like this:
- Map > Map > SR
- Map > SR > SR
While the comment is on line 338 (supports_fusion=False), I want to make sure: do we want to support fusion for strict mode, or just add a test for non-strict mode? I think it's the latter?
The Map > SR > SR case cannot work here because after the first Map > SR fusion, the logical operator becomes AbstractUDFMap rather than MapBatches. The current implementation only allows MapBatches > SR fusion (see ray/python/ray/data/_internal/logical/rules/operator_fusion.py, lines 355 to 369 in f3d444a).
To support Map > SR > SR fusion, we will need more changes, which I think is a bit out of scope for this PR.
Let's keep it MapBatches then. Map > SR > SR needs to work
I looked into it more, and it seems like Map > SR > SR already works, but it's CombineShuffles._combine() combining the two SRs into one, so the result will just be Map > SR.
Updated the test in 8552ff9.
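A rough sketch of what such a test could look like (the asserted operator-name substring follows the naming used elsewhere in this PR and is an assumption, as is checking it via the stats string):

import ray

def test_map_batches_then_double_repartition_fuses():
    ds = (
        ray.data.range(90)
        .map_batches(lambda batch: batch, batch_size=45)
        .repartition(target_num_rows_per_block=30)
        .repartition(target_num_rows_per_block=30)
    )
    # CombineShuffles collapses the two StreamingRepartition ops into one,
    # and non-strict mode then lets it fuse into the upstream MapBatches.
    stats = ds.materialize().stats()
    assert "MapBatches(<lambda>)->StreamingRepartition" in stats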
assert (
-   f"MapBatches(<lambda>)->StreamingRepartition[num_rows_per_block={target_rows}]"
+   f"MapBatches(<lambda>)->StreamingRepartition[num_rows_per_block={target_rows},strict=False]"
Test name misleading after non-strict mode behavior change
Low Severity
The test function test_streaming_repartition_no_further_fuse has a name and docstring that describe strict mode behavior ("doesn't fuse further"), but the test uses default strict=False (line 811). In non-strict mode, the fused operator has supports_fusion=True which DOES allow further fusion. The test assertions still pass because they check for substring matches, but the test name no longer accurately describes what it tests.


Description
Currently, the StreamingRepartition operator is essentially strict=True. We want to relax this to allow a non-strict mode with the following guarantees:
- Every output block has at most target_num_rows rows.
- At most one block smaller than target_num_rows is produced per input block (i.e. it wouldn't do any stitching).
This mode will be the default and would allow StreamingRepartition to be fused into the previous operator.
Related issues
Closes #60026
Additional information
- Added strict: bool = False parameter to repartition()
- In _get_fused_streaming_repartition_operator() and plan_streaming_repartition_op():
  - Strict: uses ref_bundler=StreamingRepartitionRefBundler
  - Non-strict: uses ref_bundler=None (default BlockRefBundler)
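A condensed sketch of the planner-side branch described above (hypothetical shape; the actual change lives in plan_streaming_repartition_op() and may differ, and op.strict is the attribute this PR is assumed to add):

# Strict mode keeps the dedicated re-slicing bundler; non-strict mode falls
# back to the default BlockRefBundler and sets no min_rows_per_bundle so
# that no stitching is introduced.
if op.strict:
    ref_bundler = StreamingRepartitionRefBundler(op.target_num_rows_per_block)
else:
    ref_bundler = None  # default BlockRefBundler, no stitching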