@xudong963
Member

No description provided.

Collaborator

@suremarc suremarc left a comment

I have been thinking about this, and it is really unfortunate that some joins will be converted into cross joins. It means we can't efficiently support join filters like this:

SELECT t1.year FROM t1 JOIN t2 ON (
    t1.year BETWEEN t2.year AND t2.year + 5
)

which is a valid and common pattern. Instead, with this change, the t1.year BETWEEN t2.year AND t2.year + 5 filter will be marked as irrelevant and the dependency relation will become a cross join, so maintenance will not be incremental at all.
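To make the cost concrete, here is a toy model of the dependency relation, assuming both tables are partitioned by year. The partition values and the helper names `range_deps` and `cross_deps` are made up for illustration and are not part of the MV analysis code.

```python
# Toy model: each table is partitioned by year. A "dependency" is a pair
# (t1 partition, t2 partition) whose rows can meet in the join output.
# All values and names here are illustrative assumptions.
t1_years = [2018, 2020, 2024]
t2_years = [2010, 2015, 2019, 2023]


def range_deps(left, right, width=5):
    """Pairs kept by: left.year BETWEEN right.year AND right.year + width."""
    return [(l, r) for l in left for r in right if r <= l <= r + width]


def cross_deps(left, right):
    """Pairs kept when the join filter is discarded (cross join)."""
    return [(l, r) for l in left for r in right]


filtered = range_deps(t1_years, t2_years)
unfiltered = cross_deps(t1_years, t2_years)
```

With the range filter, 5 of the 12 possible partition pairs are dependencies. With the filter discarded, all 12 are, so a change to any t2 partition would invalidate every output partition.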

I know we did this to materialize a table at work, but I think I'd rather keep this commit in our fork. The proper way to fix it would be to analyze which columns in the source scans are partition columns (or, more generally, RowMetadata columns) and traverse the plan bottom-up, marking which columns in the plan can be computed from the row metadata columns.
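A rough sketch of that bottom-up marking, on a simplified plan tree rather than DataFusion's actual LogicalPlan. The `Scan`, `Project`, and `Join` classes and the `metadata_derived` function are hypothetical stand-ins, and expressions are reduced to the set of input columns they read.

```python
from dataclasses import dataclass


@dataclass
class Scan:
    table: str
    columns: list
    metadata_columns: set  # partition / row-metadata columns of this source


@dataclass
class Project:
    input: object
    exprs: dict  # output column name -> set of input columns the expr reads


@dataclass
class Join:
    left: object
    right: object


def metadata_derived(node):
    """Return the output columns computable from row-metadata columns only."""
    if isinstance(node, Scan):
        return {
            f"{node.table}.{c}"
            for c in node.columns
            if c in node.metadata_columns
        }
    if isinstance(node, Project):
        known = metadata_derived(node.input)
        # An output is marked only if every column it reads is already marked.
        return {out for out, refs in node.exprs.items() if refs <= known}
    if isinstance(node, Join):
        return metadata_derived(node.left) | metadata_derived(node.right)
    raise TypeError(f"unsupported plan node: {node!r}")


t1 = Scan("t1", ["year", "value"], {"year"})
t2 = Scan("t2", ["year", "name"], {"year"})
plan = Project(
    Join(t1, t2),
    {
        "year": {"t1.year"},
        "upper": {"t2.year"},
        "label": {"t2.name"},
        "mixed": {"t1.year", "t1.value"},
    },
)
marked = metadata_derived(plan)
```

Here `marked` contains `year` and `upper` but not `label` or `mixed`, since the latter two read at least one column that is not row metadata. A filter could then be kept for the dependency analysis whenever all of its columns are marked.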

We could also discuss shorter-term fixes that don't have this problem. For example, we could add another stage to the MV analysis that removes expressions that reference non-row-metadata columns.
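As a sketch of what such a stage might do, assume the join filter has already been split into AND-ed conjuncts, each paired with the set of columns it references. The `prune_conjuncts` function and the second predicate below are hypothetical, not existing code.

```python
def prune_conjuncts(conjuncts, metadata_columns):
    """Split AND-ed conjuncts into those usable for dependency analysis
    (all referenced columns are row metadata) and those to drop."""
    kept, dropped = [], []
    for text, cols in conjuncts:
        (kept if cols <= metadata_columns else dropped).append(text)
    return kept, dropped


conjuncts = [
    ("t1.year BETWEEN t2.year AND t2.year + 5", {"t1.year", "t2.year"}),
    ("t1.value > t2.threshold", {"t1.value", "t2.threshold"}),
]
metadata_columns = {"t1.year", "t2.year"}

kept, dropped = prune_conjuncts(conjuncts, metadata_columns)
```

Dropping a conjunct from an AND can only widen the set of matching partition pairs, so this pruning stays conservative: the range filter on year survives, and only the predicate on non-metadata columns is ignored.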

@xudong963
Member Author

Thank you, @suremarc. Let's close the PR and fix it in a more robust way later.

@xudong963 xudong963 closed this May 23, 2025