Conversation

@jtuyls (Contributor) commented on Jan 29, 2026:

Implements phase 4 of adding specialization support for dynamic values in data-tiling (#22370). This carries the encoding dynamic values through the Flow and Stream dialects.

Assisted-by: Claude

@jtuyls force-pushed the users/jtuyls/encoding-dims-flow-stream-conversion branch from aff97cf to bcc050b on Jan 29, 2026 at 21:19.
@hanhanW (Contributor) left a comment:

TLDR: I think this will break encoding unification; I need to re-think the design. See the inline comments for a bit more detail.

Comment on lines +1661 to +1676
// Verify encoding_dims count matches what the encoding expects.
// Check both operand (for decode) and result (for encode) encodings.
for (RankedTensorType type : {operandType, resultType}) {
if (IREE::Encoding::SerializableAttr encodingAttr =
dyn_cast_or_null<IREE::Encoding::SerializableAttr>(
type.getEncoding())) {
std::optional<int64_t> expectedDims =
encodingAttr.getNumDynamicEncodingDims();
if (expectedDims.has_value() &&
static_cast<int64_t>(getEncodingDims().size()) != *expectedDims) {
return emitOpError() << "encoding expects " << *expectedDims
<< " encoding dim(s), but "
<< getEncodingDims().size() << " provided";
}
}
}
Contributor:

Maybe we can create a method in EncodingTypes.h and reuse it for Encoding/Flow/Stream ops.

Comment on lines +1663 to +1668
for (RankedTensorType type : {operandType, resultType}) {
if (IREE::Encoding::SerializableAttr encodingAttr =
dyn_cast_or_null<IREE::Encoding::SerializableAttr>(
type.getEncoding())) {
std::optional<int64_t> expectedDims =
encodingAttr.getNumDynamicEncodingDims();
Contributor:

I've seen this pattern at least three times. Should we consider introducing a method in the Encoding dialect, e.g. IREE::Encoding::getNumDynamicEncodingDims(Type)?
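
A minimal sketch of what such a shared helper could look like, covering both this comment and the EncodingTypes.h suggestion above. The function name follows the reviewer's example; its exact placement and signature are assumptions, not code from the PR:

// Hypothetical helper sketched from the review suggestion; not part of the PR.
// Returns the number of dynamic encoding dims the type's encoding expects, or
// std::nullopt if the type carries no serializable encoding (or the encoding
// does not constrain the count).
std::optional<int64_t> getNumDynamicEncodingDims(Type type) {
  auto rankedType = dyn_cast<RankedTensorType>(type);
  if (!rankedType)
    return std::nullopt;
  auto encodingAttr = dyn_cast_or_null<IREE::Encoding::SerializableAttr>(
      rankedType.getEncoding());
  if (!encodingAttr)
    return std::nullopt;
  return encodingAttr.getNumDynamicEncodingDims();
}

The verifiers in the Encoding/Flow/Stream ops could then share one check of this result against getEncodingDims().size().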

rewriter.replaceOpWithNewOp<IREE::Stream::AsyncDispatchOp>(
    encodeOp, exportOp,
-   /*workload=*/dynamicDims, encodeOp.getResult().getType(), operands,
+   /*workload=*/workload, encodeOp.getResult().getType(), operands,
Contributor:

nit: remove the argument comment.

Suggested change:
-   /*workload=*/workload, encodeOp.getResult().getType(), operands,
+   workload, encodeOp.getResult().getType(), operands,

Contributor:

We need a test.

Contributor:

I think this is broken when one encoding op has encoding_dims while the other does not (assuming the encoding attributes are otherwise identical).

The encodingSignature does not take encoding_dims into account.

IRRewriter rewriter(ctx);
int executableId = 0;
for (auto encodeOp : encodeOps) {
ArrayAttr encodingSignature = getEncodingSignature(rewriter, encodeOp);
if (!cachedExecutables.contains(encodingSignature)) {
cachedExecutables[encodingSignature] =
createExecutableAndExport(rewriter, encodeOp, executableId++);
}
auto [executableOp, exportOp] = cachedExecutables[encodingSignature];
replaceEncodeOpWithDispatchOp(rewriter, encodeOp, executableOp, exportOp);
}
}
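
One way to make the issue concrete, sketched here rather than proposed as the fix: fold the encoding_dims count into the cached signature so two encode ops with otherwise identical encodings but different dim counts never share an executable. The op type and the shape of getEncodingSignature are assumptions inferred from the surrounding code:

// Hypothetical variant of getEncodingSignature; not part of the PR. Appends
// the encoding_dims count so cache lookups distinguish ops that carry a
// different number of dynamic encoding dims.
static ArrayAttr
getEncodingSignatureWithDims(IRRewriter &rewriter,
                             IREE::Stream::TensorEncodeOp encodeOp) {
  ArrayAttr base = getEncodingSignature(rewriter, encodeOp);
  SmallVector<Attribute> signature(base.begin(), base.end());
  signature.push_back(rewriter.getI64IntegerAttr(
      static_cast<int64_t>(encodeOp.getEncodingDims().size())));
  return rewriter.getArrayAttr(signature);
}

This only prevents incorrect sharing of executables; it does not answer the unification question raised below.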

Comment on lines +820 to +822
// encoding1 has dynamic M, encoding2 has static M=4. The GPU resolver's
// getUnifiedEncoding returns the first encoding, so both should be unified
// to #encoding1 (with iteration_sizes = [?, 4096, 4096]).
Contributor:

It looks broken to me. How do the encoding dimensions get propagated to other functions (outside the initializer)?

I feel that we'll need to create new globals to store these values and update the dispatch ops accordingly. However, the executables are already outlined, which breaks this pass when the numbers of encoding dims do not match.

I don't have a solution atm. We probably can't do encoding unification if the number of encoding_dims is not the same.

The next design question that pops into my mind is whether we have to carry all the values (including c128) in this case. I need to think more about it. Thanks for raising the PR and providing the examples.

Contributor Author (@jtuyls):

> It looks broken to me. How do the encoding dimensions get propagated to other functions (outside the initializer)?

Do we really need that? This runs before SpecializeEncodings, so we have unserialized encodings at this point, and we don't update the internals of the executables, right? After SpecializeEncodings we have serialized encodings and shouldn't need the encoding dims anymore. Maybe I'm missing something, though, and this may need a higher-bandwidth discussion?

Contributor:

I had the same question when I saw the result of the MaterializeEncoding pass:

stream.async.dispatch on(#{{.+}}) @[[$EX]]::@[[$ENTRY]][%[[D0]], %[[D1]], %[[D0]], %[[D1]], %[[M]], %[[N]], %[[K]]]

I forgot that we'll need to pass the encoding dims to the dispatch op. If the unified encoding does not carry the same encoding dims, the dispatch op is broken, right?

In this example, the dispatch op that uses #encoding1 takes %M and the dispatch op that uses #encoding2 takes zero encoding dims. If the result of encoding unification is #encoding1, the latter dispatch op has to be updated to take one more encoding dimension. Otherwise, you're not able to do your specialization later on.

I may be missing something. Your specialization (about layouts) is exercised inside the executable, no?
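
To illustrate the update described above: a minimal sketch, assuming unification picks #encoding1 (one dynamic encoding dim) and that encoding dims travel with the dispatch workload as in the MaterializeEncoding output quoted earlier. The helper and the ODS-generated accessor names are assumptions, not code from the PR:

// Hypothetical fix-up sketched from the discussion; not part of the PR.
// Appends the unified encoding's dynamic dim values to a dispatch op that
// previously carried fewer encoding dims, assuming ODS-generated
// getWorkload()/getWorkloadMutable() accessors for the workload operands.
static void appendUnifiedEncodingDims(IREE::Stream::AsyncDispatchOp dispatchOp,
                                      ValueRange unifiedEncodingDims) {
  auto workload = llvm::to_vector(dispatchOp.getWorkload());
  llvm::append_range(workload, unifiedEncodingDims);
  dispatchOp.getWorkloadMutable().assign(workload);
}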
