Skip to content

Fix: skip irrelevant sharding conditions to prevent route explosion#38535

Open
Senrian wants to merge 2 commits intoapache:masterfrom
Senrian:master
Open

Fix: skip irrelevant sharding conditions to prevent route explosion#38535
Senrian wants to merge 2 commits intoapache:masterfrom
Senrian:master

Conversation

@Senrian
Copy link
Copy Markdown

@Senrian Senrian commented Mar 20, 2026

Summary

Fix issue #38456: Subquery with non-binding sharding tables causes full-route broadcast.

Root Cause

In ShardingStandardRouteEngine, when routing a table, the code iterates over ALL sharding conditions and accumulates results via result.addAll(dataNodes).

For subqueries, conditions are split per table. When routing um_merchant_user_tag, the condition for um_tag_user (a different table) is not relevant, but since getShardingValuesFromShardingConditions returns empty for non-matching conditions, route0 treats it as "no condition" and returns full broadcast (512 DataNodes).

Fix

Added isConditionRelevantToTable() method to check if a condition is relevant to the current routing table (either matches the table directly or belongs to a binding table group).

In routeByShardingConditionsWithCondition() and routeByMixedConditionsWithCondition(), we now skip irrelevant conditions before calling route0(). If no relevant conditions are found, we fall back to full routing (empty sharding values) to preserve original behavior.

This ensures that mixed relevant and irrelevant conditions do not trigger full routing, while correct fallback behavior is maintained when no relevant conditions exist.

Testing

  • Added SubqueryNonBindingTableRouteTest with regression coverage
  • Verified all existing sharding route tests pass

Fixes #38456

Copy link
Copy Markdown
Member

@terrymanu terrymanu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Decision

Merge Verdict: Not Mergeable
Reviewed Scope: latest PR head 9c112ad, production diff in ShardingStandardRouteEngine.java (line 132), and existing related tests in ShardingStandardRouteEngineTest:118 (line 118) and SubqueryRouteTest:59 (line 59).
Not Reviewed Scope: full JDBC/Proxy runtime reproduction, end-to-end federation behavior outside the existing unit/integration tests, and broader dialect-specific SQL semantics beyond this shared route-engine path.
Need Expert Review: yes. A sharding-routing maintainer should recheck after the fix because this touches a shared routing path used across JDBC/Proxy and multiple dialects.
Positive Feedback

The direction is correct. In the pure condition-routing branch, filtering irrelevant ShardingConditions first and then falling back to full routing when none are relevant does address the route-explosion symptom described in issue 38456.
Major Issues

[P1] The PR introduces a new regression on the mixed-routing path.
Symptom:
ShardingStandardRouteEngine.java (line 164) always enters routeByMixedConditionsWithCondition() as long as any ShardingCondition exists in the statement. After this patch, ShardingStandardRouteEngine.java (line 171) skips every condition irrelevant to the current table, and then ShardingStandardRouteEngine.java (line 183) returns an empty result directly.
Risk:
For mixed sharding rules, the pre-patch behavior still routed through the hint side when the current table had no relevant condition. The new code can leave the current table with no RouteUnit at all. That is not part of the original issue fix; it is a newly introduced regression on a shared path. The existing tests already show that mixed routing is a supported scenario, see ShardingStandardRouteEngineTest:118 (line 118) and ShardingStandardRouteEngineTest:156 (line 156).
Suggested action:
Please do not copy the pure-condition filtering logic directly into the mixed branch. When all conditions are skipped in the mixed branch, the original hint-fallback semantics still need to be preserved instead of returning an empty collection.

[P2] The latest version does not provide regression tests mapped one-to-one to the fix points, so the validation evidence is still insufficient.
Symptom:
The latest diff changes only one production file and does not include any test file. Existing coverage in SubqueryRouteTest:59 (line 59) through SubqueryRouteTest:89 (line 89) covers same-table, binding-table, and non-sharding-table subqueries, but there is still no one-to-one regression case for the reported non-binding sharding-table subquery. There is also no counterexample test for the mixed path.
Risk:
Per CODE_OF_CONDUCT.md (line 18) and CODE_OF_CONDUCT.md (line 20), this kind of shared routing fix needs successful build evidence and coverage evidence that is not weaker than the master branch. Even though the current related tests pass, they still do not prove that both the original symptom and the newly introduced mixed-path regression are covered.
Suggested action:
Please add two categories of tests:
one for the issue-specific non-binding subquery regression, directly covering the original object and SQL shape;
one for the mixed-hint counterexample, at least covering t_hint_ds_test and t_hint_table_test when there is a condition for another table but no relevant condition for the current table, and the current table should still route correctly.

Newly Introduced Issues

The mixed-path empty-routing issue is a blocker newly introduced in the latest head, not a pre-existing problem. It comes from the behavior change in ShardingStandardRouteEngine.java (line 175) through ShardingStandardRouteEngine.java (line 183).
Next Steps

Keep the irrelevant-condition filtering in the pure-condition branch.
Restore the original hint-fallback semantics in the mixed branch when the current table has no relevant condition.
Add dedicated regression tests for both the original issue scenario and the mixed-path counterexample.
After that, I suggest another focused review specifically on whether these two points are fully closed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Subquery with non-binding sharding tables causes full-route broadcast

2 participants