Skip to content

Conversation

@teskje
Copy link
Contributor

@teskje teskje commented Jan 15, 2026

The previous apply update sorting logic assumed that ID order equals dependency order for "derived items". That's not correct anymore: An altered materialized view can depend on items with greater catalog IDs.

To ensure Materialize doesn't panic during startup when trying to parse such altered materialized views, this PR changes the item update sorting to use a topological sort for derived items instead.

Additionally, I noticed that the existing topo sort implementation was inefficient, which seemed problematic given the larger amount of items it now has to process. Thus I added a second commit that replaces it by a more efficient implementation using Kahn's algorithm.

Motivation

  • This PR fixes a recognized bug.

Fixes https://github.com/MaterializeInc/database-issues/issues/10018

Tips for reviewer

Checklist

  • This PR has adequate test coverage / QA involvement has been duly considered. (trigger-ci for additional test/nightly runs)
  • This PR has an associated up-to-date design doc, is a design doc (template), or is sufficiently small to not require a design.
  • If this PR evolves an existing $T ⇔ Proto$T mapping (possibly in a backwards-incompatible way), then it is tagged with a T-proto label.
  • If this PR will require changes to cloud orchestration or tests, there is a companion cloud PR to account for those changes that is tagged with the release-blocker label (example).
  • If this PR includes major user-facing behavior changes, I have pinged the relevant PM to schedule a changelog post.

The previous apply update sorting logic assumed that ID order equals
dependency order for "derived items". That's not correct anymore: An
altered materialized view can depend on items with greater catalog IDs.

To ensure Materialize doesn't panic during startup when trying to parse
such altered materialized views, this commit changes the item update
sorting to use a topological sort for derived items instead.
The topological sorting in apply was using an inefficient quadratic
algorithm. That was fine when we were only sorting connection items but
is not fine anymore now that all most catalog items need to be sorted
topologically.

This commit changes the topo sort implementation to use Kahn's algorithm
(https://en.wikipedia.org/wiki/Topological_sorting#Kahn's_algorithm),
which does the job in time linear in the number of items and dependency
edges.
@teskje teskje force-pushed the apply-derived-item-topo-sort branch from 384632f to eab89da Compare January 15, 2026 14:53
@teskje teskje marked this pull request as ready for review January 16, 2026 15:48
@teskje teskje requested a review from a team as a code owner January 16, 2026 15:48
@teskje teskje requested a review from aljoscha January 16, 2026 15:48
sort_items_topological(&mut connections);
sort_items_topological(&mut derived_items);

// Other groups we can simply sort by ID.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will this bite us again at some point? 😅

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be! Also the grouping might become wrong in the future. As the comment at the top says, ideally we would just throw everything into a single topo search. The comment says we don't know how to find the dependencies, which I'm not quite sure what that means. But also we'd have to do parse the SQL of each object for that, which makes me hesitant to try doing more than we need here.

(Random thought: Would be nice if we didn't store the SQL but a parsed format that we don't have to re-parse all the time...)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've had that random thought as well, but then again you're now stuck with a format that you have to stick to or have an evolvability problem.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd imagine this would be encoded as part of mz-catalog-protos, for which we already have a migration framework in place to evolve that format.

@teskje
Copy link
Contributor Author

teskje commented Jan 19, 2026

TFTR!

@teskje teskje merged commit c9c0593 into MaterializeInc:main Jan 19, 2026
338 checks passed
@teskje teskje deleted the apply-derived-item-topo-sort branch January 19, 2026 14:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants