[repr types] eliminate noop casts (lowering impl) #34350

Merged
mgree merged 8 commits into MaterializeInc:main from mgree:repr-types-lower-eliminate-varchar_to_text on Jan 7, 2026

@mgree mgree commented Dec 3, 2025

The varchar_to_text cast is functionally a noop, but exists to satisfy the typechecker: varchar(8) and varchar and text are different SQL types.

But they are not different representation types---all of these will be Datum::String at runtime. So: in MIR, we can eliminate this cast.

This PR alters lowering (gated behind the feature flag enable_cast_elimination) to not bother generating varchar_to_text when lowering HIR to MIR.
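A minimal sketch of the idea, using toy stand-ins rather than Materialize's real definitions. The `Hir` and `Mir` enums and the `lower` function are hypothetical; only the variant name `CastVarCharToString` and the flag name `enable_cast_elimination` appear in this PR.

```rust
// Toy stand-ins for HIR/MIR scalar expressions (hypothetical, for illustration only).
#[derive(Debug, Clone, PartialEq)]
enum UnaryFunc {
    CastVarCharToString, // no-op at runtime: input and output are both strings
    Upper,               // an ordinary function that lowering must keep
}

#[derive(Debug, Clone, PartialEq)]
enum Hir {
    Column(usize),
    CallUnary { func: UnaryFunc, expr: Box<Hir> },
}

#[derive(Debug, Clone, PartialEq)]
enum Mir {
    Column(usize),
    CallUnary { func: UnaryFunc, expr: Box<Mir> },
}

/// Lower a toy HIR expression to MIR. When the flag is on, the no-op cast is
/// skipped and its argument is lowered in its place.
fn lower(hir: Hir, enable_cast_elimination: bool) -> Mir {
    match hir {
        Hir::Column(i) => Mir::Column(i),
        Hir::CallUnary { func: UnaryFunc::CastVarCharToString, expr }
            if enable_cast_elimination =>
        {
            lower(*expr, enable_cast_elimination)
        }
        Hir::CallUnary { func, expr } => Mir::CallUnary {
            func,
            expr: Box::new(lower(*expr, enable_cast_elimination)),
        },
    }
}

fn main() {
    // upper(varchar_to_text(#0))
    let hir = Hir::CallUnary {
        func: UnaryFunc::Upper,
        expr: Box::new(Hir::CallUnary {
            func: UnaryFunc::CastVarCharToString,
            expr: Box::new(Hir::Column(0)),
        }),
    };
    println!("flag off: {:?}", lower(hir.clone(), false));
    println!("flag on:  {:?}", lower(hir, true));
}
```

With the flag off the cast survives lowering unchanged, so the existing behavior is preserved by default.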

Alternative implementation to #34348, which fails because the non-repr typechecker gets upset (and because I didn't rewrite the tests yet).

Both PRs will need to have only the repr typechecker running to avoid inconsistencies (#34351).

Motivation

Tips for reviewers

There's a lot of noise in the commits, since there are (it turns out) two lowering functions, one for possible correlated expressions and one for uncorrelated ones. The latter (lower_uncorrelated) didn't take any configuration options, so there are many changes to accommodate the boolean flag enable_cast_elimination.

Checklist

  • This PR has adequate test coverage / QA involvement has been duly considered. (trigger-ci for additional test/nightly runs)
  • This PR has an associated up-to-date design doc, is a design doc (template), or is sufficiently small to not require a design.
  • If this PR evolves an existing $T ⇔ Proto$T mapping (possibly in a backwards-incompatible way), then it is tagged with a T-proto label.
  • If this PR will require changes to cloud orchestration or tests, there is a companion cloud PR to account for those changes that is tagged with the release-blocker label (example).
  • If this PR includes major user-facing behavior changes, I have pinged the relevant PM to schedule a changelog post.

@mgree mgree requested review from a team as code owners December 3, 2025 21:18
@mgree mgree requested a review from ggevay December 3, 2025 21:18
@mgree mgree changed the title alter lowering (with feature flag enable_cast_elimination) to elimi… [repr types] eliminate noop costs (lowering impl) Dec 3, 2025

ggevay commented Dec 4, 2025

My earlier attempt to do this (#27029) failed in part due to the adapter crate also calling MIR's typ in several places, to get RelationDescs. Has the repr types workstream progressed far enough in the meantime to have eliminated this issue?

mgree commented Dec 10, 2025

I think so, but we won't know for sure until I rebase onto #34351.

ggevay commented Dec 10, 2025

Well, I'm seeing 6 calls to MirRelationExpr::typ from the adapter crate. Maybe I'm misunderstanding something, but wouldn't some of these break from losing the distinction between the types involved in the no-op casts? Seeing adapter crate makes me think that these calls want SQL-level types instead of repr types. This is what I was referring to in the RelationDesc comment on the design doc.

mgree commented Dec 10, 2025

> Well, I'm seeing 6 calls to MirRelationExpr::typ from the adapter crate. Maybe I'm misunderstanding something, but wouldn't some of these break from losing the distinction between the types involved in the no-op casts? Seeing adapter crate makes me think that these calls want SQL-level types instead of repr types. This is what I was referring to in the RelationDesc comment on the design doc.

Got it, thank you! I'll prepare a separate PR that uses the HIR type, to avoid SQL/repr type confusion.

@mgree mgree force-pushed the repr-types-lower-eliminate-varchar_to_text branch from b53d8e1 to 0174ba9 Compare December 10, 2025 19:28
@mgree mgree force-pushed the repr-types-lower-eliminate-varchar_to_text branch from ded0f5f to 6f91e23 Compare December 10, 2025 21:15
@mgree mgree requested a review from a team as a code owner December 17, 2025 21:41

mgree commented Dec 17, 2025

@ggevay I think this hits the notes we talked about on Wednesday morning---feature flag defaults to off, but we have some deliberate exposure to the flag being turned on in the freshmart SLT. (We could do more, if you think that's worth it!)

@ggevay ggevay left a comment

Looks great! I wrote some minor comments.

Literal(datum, typ, _name) => SS::Literal(Ok(datum), typ),
CallUnmaterializable(func, _name) => SS::CallUnmaterializable(func),
CallUnary {
    func: func::UnaryFunc::CastVarCharToString(_),

Later, when we have more casts that we want to remove, we could make this a configuration on the sqlfunc macro. But this is not so important for now.

- if let Ok(mut mir) = hir.lower_uncorrelated() {
+ if let Ok(mut mir) =
+     hir.lower_uncorrelated(catalog.system_config().enable_cast_elimination())
+ {

Could the assert_eq!(mir_typ.scalar_type, return_styp); call a few lines below fail due to eliminating a cast here? For example when the function being tested is the cast function that we are removing.

@mgree

I turned it into a soft_assert_eq_or_log! with a message to help us figure things out if it fails. But also, this is a test that we were already passing. (Is the smoketest generating random queries?)
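For readers unfamiliar with the pattern, here is a rough sketch of what a soft equality assertion does in general: panic only when soft assertions are switched on, otherwise log and continue. The macro name `soft_assert_eq_or_log_sketch!`, the `SOFT_ASSERTIONS` switch, and the boolean return value are all hypothetical, and this is not the actual `soft_assert_eq_or_log!` implementation.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

// Hypothetical global switch; a real codebase would typically enable this in tests/CI.
static SOFT_ASSERTIONS: AtomicBool = AtomicBool::new(false);

// Hypothetical macro: compares two values, and on mismatch either panics
// (switch on) or logs to stderr (switch off). Evaluates to `true` on equality.
macro_rules! soft_assert_eq_or_log_sketch {
    ($left:expr, $right:expr, $($arg:tt)+) => {{
        let (l, r) = (&$left, &$right);
        if l != r {
            let msg = format!($($arg)+);
            if SOFT_ASSERTIONS.load(Ordering::Relaxed) {
                panic!("soft assertion failed: {:?} != {:?}: {}", l, r, msg);
            } else {
                eprintln!("soft assertion failed: {:?} != {:?}: {}", l, r, msg);
            }
            false
        } else {
            true
        }
    }};
}

fn main() {
    let ok = soft_assert_eq_or_log_sketch!("text", "text", "types should agree");
    // A mismatch is logged rather than aborting, because the switch is off.
    let bad = soft_assert_eq_or_log_sketch!("varchar", "text", "after lowering {}", "f(x)");
    println!("{ok} {bad}");
}
```

The trade-off discussed in this thread is visible here: a mismatch stays noticeable in logs without turning an already-passing test into a hard failure.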

@ggevay ggevay Jan 7, 2026

smoketest_fn generates very simple HIR expressions that just call a single function on some random parameters. It does this for almost all functions. I don't understand how we are passing this test: at some point it should generate an expression that simply calls this cast function that is being eliminated here, and then observe a type mismatch in this assert. Maybe it misses this function due to some unknown reason, or I am misunderstanding some type stuff.

Note that this kind of Rust test uses the default of the flag from the Rust code, not the Python stuff. But I did try turning on your flag there and running this test, and it passed. :confused-psyduck:

@mgree

It's because varchar_to_text is not a real function name---it's just generated from casts. So when we test every function in the catalog, we won't find varchar_to_text in PG_CATALOG_BUILTINS, so we won't test it.

I can change the test to ensure it's repr type correct, or I can leave it with the soft assert... the former shouldn't fail as we make more changes here, but it means things will stay silent. I'm tempted to leave the soft assert?
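The coverage gap can be illustrated with a toy model. Everything below is hypothetical (the `Func` enum, `catalog_builtins`, and `smoketested_funcs` are invented names); it only mirrors the reasoning that a test driven by catalog entries never reaches a function that exists solely as the implementation of a cast.

```rust
// Toy set of scalar functions; the last one is only ever produced by planning a cast.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Func {
    Upper,
    Length,
    CastVarCharToString,
}

/// Hypothetical catalog: only functions that users can call by name are listed.
fn catalog_builtins() -> Vec<(&'static str, Func)> {
    vec![("upper", Func::Upper), ("length", Func::Length)]
}

/// The functions a catalog-driven smoketest would exercise.
fn smoketested_funcs() -> Vec<Func> {
    catalog_builtins().into_iter().map(|(_name, func)| func).collect()
}

fn main() {
    let tested = smoketested_funcs();
    // The cast-only function is never visited, so its type mismatch is never observed.
    println!(
        "cast covered by smoketest: {}",
        tested.contains(&Func::CastVarCharToString)
    );
}
```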

@ggevay ggevay Jan 7, 2026

Ok! Thanks for figuring it out!

Edit: I've added this here: https://github.com/MaterializeInc/database-issues/issues/8562

@mgree mgree force-pushed the repr-types-lower-eliminate-varchar_to_text branch from f489d82 to d863d4e Compare January 7, 2026 18:31
@mgree mgree changed the title [repr types] eliminate noop costs (lowering impl) [repr types] eliminate noop casts (lowering impl) Jan 7, 2026
@mgree mgree merged commit 17d21f7 into MaterializeInc:main Jan 7, 2026
330 of 336 checks passed
SangJunBak pushed a commit to SangJunBak/materialize that referenced this pull request Jan 23, 2026
The `varchar_to_text` cast is functionally a noop, but exists to satisfy
the typechecker: `varchar(8)` and `varchar` and `text` are different SQL
types.

But they are _not_ different representation types---all of these will be
`Datum::String` at runtime. So: in MIR, we can eliminate this cast.

This PR alters lowering (gated behind the feature flag
`enable_cast_elimination`) to not bother generating `varchar_to_text`
when lowering HIR to MIR.

### Motivation

  * This PR adds a known-desirable feature.
  MaterializeInc#27239
mgree added a commit that referenced this pull request Jan 28, 2026
As MIR moves to representation types (#34351), we must be careful to
compute SQL types while we still have HIR. If we don't, we run the risk
of confusing clients who expected a `VARCHAR` column and got a `TEXT`
column. (To be clear: the `Datum`s on the wire will always be the same,
but the type metadata might be different. BI tools may get confused.)
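As a rough illustration of the distinction (a sketch with invented names, not Materialize's actual types): `SqlType`, `ReprType`, `repr_of`, `Plan`, and `plan` below are hypothetical. The point is that several SQL types collapse to one representation type, so the SQL-level column types have to be recorded before that information is lost.

```rust
// Hypothetical SQL-level types, as a client would see them in result metadata.
#[derive(Debug, Clone, PartialEq)]
enum SqlType {
    Text,
    VarChar { max_length: Option<u32> },
    Int4,
}

// Hypothetical representation types, describing only the runtime datum shape.
#[derive(Debug, Clone, PartialEq)]
enum ReprType {
    String,
    Int32,
}

/// Many-to-one mapping: text, varchar, and varchar(n) all share a representation.
fn repr_of(t: &SqlType) -> ReprType {
    match t {
        SqlType::Text | SqlType::VarChar { .. } => ReprType::String,
        SqlType::Int4 => ReprType::Int32,
    }
}

/// Hypothetical plan that keeps the SQL types captured while HIR is still available.
struct Plan {
    sql_column_types: Vec<SqlType>,   // what clients are told
    repr_column_types: Vec<ReprType>, // what the optimizer works with
}

fn plan(hir_column_types: Vec<SqlType>) -> Plan {
    let repr_column_types = hir_column_types.iter().map(repr_of).collect();
    Plan { sql_column_types: hir_column_types, repr_column_types }
}

fn main() {
    let p = plan(vec![SqlType::VarChar { max_length: Some(8) }, SqlType::Text]);
    println!("sql:  {:?}", p.sql_column_types);
    println!("repr: {:?}", p.repr_column_types);
}
```

Recovering `sql_column_types` from `repr_column_types` alone is not possible, which is why the order of operations matters.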

This PR increases confidence in the types we're presenting to users (and
so confidence in repr-type optimizations like #34350).

I'm happy to take suggestions on how best to add tests for this
change---I think the right level is pgwire or even through a BI tool,
but I'm not certain of the best way to build such a test.

### Motivation

  * This PR adds a known-desirable feature.
  #27239