Skip to content

Conversation

farzonl
Copy link
Member

@farzonl farzonl commented Oct 13, 2025

fixes #159438

HLSL supports 0 based index and one based index accessors that are equivalent to the column row bracket subscripting operators.

This change adds support for these accessors by hooking into LookupMemberExpr and adding them similar to how CheckExtVectorComponent and the hlsl specific scalar to vectorsplat accessors work.

Since these accessors are HLSL specific The implementation details are kept to SemaHLSL via tryBuildHLSLMatrixElementAccessor.

Comment on lines +4351 to +4376
if (IsNumericAccessor) {
Row--;
Col--;
// Note: we add diagnostic errors here because otherwise we will return -1
if (!isOneBasedIndex(Name[0])) {
Diag(MemberLoc, diag::err_typecheck_subscript_not_in_bounds)
<< /*row*/ 0 << /*one*/ 1;
HasIndexingError = true;
}
if (!isOneBasedIndex(Name[1])) {
Diag(MemberLoc, diag::err_typecheck_subscript_not_in_bounds)
<< /*col*/ 1 << /*one*/ 1;
HasIndexingError = true;
}
} else {
if (!isZeroBasedIndex(Name[0])) {
Diag(MemberLoc, diag::err_typecheck_subscript_not_in_bounds)
<< /*row*/ 0 << /*zero*/ 0;
HasIndexingError = true;
}
if (!isZeroBasedIndex(Name[1])) {
Diag(MemberLoc, diag::err_typecheck_subscript_not_in_bounds)
<< /*col*/ 1 << /*zero*/ 0;
HasIndexingError = true;
}
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure what the language should be on these errors but DXC today has a way of erroring if you are out of bounds of the indexing scheme

I could exactly match this:
https://godbolt.org/z/nWvf5EM47

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure if we need separate *not_in_bounds vs. *out_of_bounds errors. Maybe parseHLSLMatrixAccessor could take the actual matrix dimensions and check directly against them? I like the *not_in_bounds version of the error better because it includes what kind of indexing it is (0-based or 1-based).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Checking actual matrix dimensions is needed because we can't do correct codegen unless we know it. Checking index base is also needed because of accessor restrictions. They do overlap, but given the potential of user misuse of ._m vs ._ it is important to let users know which one is 0-based vs 1-based. How we present that information is the real question here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, I was just suggesting that we could check all of these in one step if parseHLSLMatrixAccessor is aware of the actual dimensions, including letting users know if the indexing is 0-based or 1-based.

fixes llvm#159438

HLSL supports 0 based index and one based index accessors that
are equivalent to the column row bracket subscripting operators.

This change adds support for these accessors by hooking into
LookupMemberExpr and adding them similar to how `CheckExtVectorComponent`
and the hlsl specific scalar to vectorsplat accessors work.

Since these accessors are HLSL specific The implementation details
are kept to SemaHLSL via `tryBuildHLSLMatrixElementAccessor`.
@farzonl farzonl force-pushed the feature/issue-159438 branch from 4a15512 to 48d63d3 Compare October 13, 2025 16:45
@farzonl farzonl self-assigned this Oct 13, 2025
Copy link

github-actions bot commented Oct 13, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

@farzonl farzonl force-pushed the feature/issue-159438 branch from 18351d7 to 4091565 Compare October 13, 2025 17:09
if (IsNumericAccessor) {
Row--;
Col--;
// Note: we add diagnostic errors here because otherwise we will return -1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Row & Col are unsigned, so this'll underflow rather than return -1.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the case where it would be -1 we emit an expression error. the -1 will never make it to be consumed by CreateBuiltinMatrixSubscriptExpr let alone llvm::APInt.

Also the RowIdx and ColIdx values to MatrixSubscriptExpr are suppose to be Context.UnsignedIntTy. Thats how the other uses have been doing it. And is how I set it up when I do IntegerLiteral::Create. I think it should be fine to keep this unsigned.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I wasn't suggesting we change the type of the variable. Just that it is inaccurate to say that we'll return -1, because that's not a value than can be expressed in an unsigned.

[this, MemberLoc, MemberId](unsigned &Row, unsigned &Col) -> ExprResult {
StringRef Name = MemberId->getNameStart();
bool IsMaccessor = Name.consume_front("_m");
// consume numeric accessor second so if _m exist we don't consume _ to
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// consume numeric accessor second so if _m exist we don't consume _ to
// consume numeric accessor second so if _m exist we don't consume _ too

def err_typecheck_subscript_not_in_bounds : Error<
"%select{row|column}0 subscript is out of bounds of %select{zero|one}1 based indexing">;
def err_typecheck_subscript_out_of_bounds : Error<
"%select{row|column}0 subscripted is out of bounds">;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"%select{row|column}0 subscripted is out of bounds">;
"%select{row|column}0 subscript is out of bounds">;

Comment on lines +7504 to +7505
def err_typecheck_subscript_not_in_bounds : Error<
"%select{row|column}0 subscript is out of bounds of %select{zero|one}1 based indexing">;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this an HLSL-specific error?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. i can add hlsl to the err name. also still working out what the text should be for this. If it should exactly match the dxc string or not.

Comment on lines +7504 to +7507
def err_typecheck_subscript_not_in_bounds : Error<
"%select{row|column}0 subscript is out of bounds of %select{zero|one}1 based indexing">;
def err_typecheck_subscript_out_of_bounds : Error<
"%select{row|column}0 subscripted is out of bounds">;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
def err_typecheck_subscript_not_in_bounds : Error<
"%select{row|column}0 subscript is out of bounds of %select{zero|one}1 based indexing">;
def err_typecheck_subscript_out_of_bounds : Error<
"%select{row|column}0 subscripted is out of bounds">;
def err_typecheck_subscript_not_in_bounds : Error<
"matrix %select{row|column}0 subscript is out of bounds of %select{zero|one}1 based indexing">;
def err_typecheck_subscript_out_of_bounds : Error<
"matrix %select{row|column}0 subscript is out of bounds">;

@@ -225,6 +225,9 @@ class SemaHLSL : public SemaBase {
bool transformInitList(const InitializedEntity &Entity, InitListExpr *Init);
bool handleInitialization(VarDecl *VDecl, Expr *&Init);
void deduceAddressSpace(VarDecl *Decl);
ExprResult tryBuildHLSLMatrixElementAccessor(Expr *Base,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
ExprResult tryBuildHLSLMatrixElementAccessor(Expr *Base,
ExprResult tryBuildMatrixElementAccessor(Expr *Base,

There is already HLSL in class name SemaHLSL.

Comment on lines +8 to +13
// CHECK: BinaryOperator 0x{{[0-9a-fA-F]+}} <line:{{[0-9]+}}:{{[0-9]+}}, col:{{[0-9]+}}> 'float' lvalue matrixcomponent '='
// CHECK-NEXT: MatrixSubscriptExpr 0x{{[0-9a-fA-F]+}} <col:{{[0-9]+}}, col:{{[0-9]+}}> 'float' lvalue matrixcomponent
// CHECK-NEXT: DeclRefExpr 0x{{[0-9a-fA-F]+}} <col:{{[0-9]+}}> 'float3x3':'matrix<float, 3, 3>' lvalue Var 0x{{[0-9a-fA-F]+}} 'A' 'float3x3':'matrix<float, 3, 3>'
// CHECK-NEXT: IntegerLiteral 0x{{[0-9a-fA-F]+}} <col:{{[0-9]+}}> 'unsigned int' 1
// CHECK-NEXT: IntegerLiteral 0x{{[0-9a-fA-F]+}} <col:{{[0-9]+}}> 'unsigned int' 2
// CHECK-NEXT: FloatingLiteral 0x{{[0-9a-fA-F]+}} <col:{{[0-9]+}}> 'float' 3.140000e+00
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// CHECK: BinaryOperator 0x{{[0-9a-fA-F]+}} <line:{{[0-9]+}}:{{[0-9]+}}, col:{{[0-9]+}}> 'float' lvalue matrixcomponent '='
// CHECK-NEXT: MatrixSubscriptExpr 0x{{[0-9a-fA-F]+}} <col:{{[0-9]+}}, col:{{[0-9]+}}> 'float' lvalue matrixcomponent
// CHECK-NEXT: DeclRefExpr 0x{{[0-9a-fA-F]+}} <col:{{[0-9]+}}> 'float3x3':'matrix<float, 3, 3>' lvalue Var 0x{{[0-9a-fA-F]+}} 'A' 'float3x3':'matrix<float, 3, 3>'
// CHECK-NEXT: IntegerLiteral 0x{{[0-9a-fA-F]+}} <col:{{[0-9]+}}> 'unsigned int' 1
// CHECK-NEXT: IntegerLiteral 0x{{[0-9a-fA-F]+}} <col:{{[0-9]+}}> 'unsigned int' 2
// CHECK-NEXT: FloatingLiteral 0x{{[0-9a-fA-F]+}} <col:{{[0-9]+}}> 'float' 3.140000e+00
// CHECK: BinaryOperator {{.*}} 'float' lvalue matrixcomponent '='
// CHECK-NEXT: MatrixSubscriptExpr {{.*}} 'float' lvalue matrixcomponent
// CHECK-NEXT: DeclRefExpr {{.*}} 'float3x3':'matrix<float, 3, 3>' lvalue Var {{.*}} 'A' 'float3x3':'matrix<float, 3, 3>'
// CHECK-NEXT: IntegerLiteral {{.*}} 'unsigned int' 1
// CHECK-NEXT: IntegerLiteral {{.*}} 'unsigned int' 2
// CHECK-NEXT: FloatingLiteral {{.*}} 'float' 3.140000e+00

If you don't care about the line & column values, the check lines can be simplified to be easier to read.

@@ -0,0 +1,39 @@
// RUN: %clang_cc1 -triple dxil-pc-shadermodel6.0-compute -x hlsl -ast-dump -o - %s | FileCheck %s

typedef float float3x3 __attribute__((matrix_type(3,3)));
Copy link
Member

@hekota hekota Oct 13, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be good to use using asymmetrical matrix and subscripts to make sure rows don't get mixed up with columns.

Comment on lines +9 to +16
float bad0 = A._m44; // expected-error {{row subscript is out of bounds of zero based indexing}} expected-error {{column subscript is out of bounds of zero based indexing}}
float bad1 = A._m33; // expected-error {{row subscripted is out of bounds}} expected-error {{column subscripted is out of bounds}}
float bad2 = A._mA2; // expected-error {{matrix subscript is not an integer}}
float bad3 = A._m2F; // expected-error {{matrix subscript is not an integer}}

float bad4 = A._00; // expected-error {{row subscript is out of bounds of one based indexing}} expected-error {{column subscript is out of bounds of one based indexing}}
float bad5 = A._44; // expected-error {{row subscripted is out of bounds}} expected-error {{column subscripted is out of bounds}}
float bad6 = A._55; // expected-error {{row subscript is out of bounds of one based indexing}} expected-error {{column subscript is out of bounds of one based indexing}}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please include tests for incomplete access (A._m1) and too many indices (A._m12345). It might also be good if some of these were testing that just one dimension is out of bounds. Also an asymmetrical case where column & row order is mixed up (i.e. that accessing ._23 on float2x3 is ok but ._32 is not).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at dxc we need a new error to cover the too few\many indicies case https://godbolt.org/z/o6PGjdYe3
error: invalid format for matrix subscript.

return LookupMemberExpr(S, R, BaseExpr, IsArrow, OpLoc, SS, ObjCImpDecl,
HasTemplateArgs, TemplateKWLoc);
if (S.getLangOpts().HLSL) {
if (BaseType->isScalarType()) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Move the comment from lines 1671-2 here.


auto parseHLSLMatrixAccessor =
[this, MemberLoc, MemberId](unsigned &Row, unsigned &Col) -> ExprResult {
StringRef Name = MemberId->getNameStart();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
StringRef Name = MemberId->getNameStart();
StringRef Name = MemberId->getName();

Would getName() be better here? It returns StringRef directly while getNameStart() returns zero-terminated const char *.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair. I was using getNameStart because thats what vector_ext were using.

auto isDigit = [](char c) { return c >= '0' && c <= '9'; };
auto isZeroBasedIndex = [](char c) { return c >= '0' && c <= '3'; };
auto isOneBasedIndex = [](char c) { return c >= '1' && c <= '4'; };
if (Name.empty() || !isDigit(Name[0]) || !isDigit(Name[1])) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Name[1] will index beyond the string if the index is complete (_m0). And too many indices are not diagnosed (_0123).

Comment on lines +4351 to +4376
if (IsNumericAccessor) {
Row--;
Col--;
// Note: we add diagnostic errors here because otherwise we will return -1
if (!isOneBasedIndex(Name[0])) {
Diag(MemberLoc, diag::err_typecheck_subscript_not_in_bounds)
<< /*row*/ 0 << /*one*/ 1;
HasIndexingError = true;
}
if (!isOneBasedIndex(Name[1])) {
Diag(MemberLoc, diag::err_typecheck_subscript_not_in_bounds)
<< /*col*/ 1 << /*one*/ 1;
HasIndexingError = true;
}
} else {
if (!isZeroBasedIndex(Name[0])) {
Diag(MemberLoc, diag::err_typecheck_subscript_not_in_bounds)
<< /*row*/ 0 << /*zero*/ 0;
HasIndexingError = true;
}
if (!isZeroBasedIndex(Name[1])) {
Diag(MemberLoc, diag::err_typecheck_subscript_not_in_bounds)
<< /*col*/ 1 << /*zero*/ 0;
HasIndexingError = true;
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if (IsNumericAccessor) {
Row--;
Col--;
// Note: we add diagnostic errors here because otherwise we will return -1
if (!isOneBasedIndex(Name[0])) {
Diag(MemberLoc, diag::err_typecheck_subscript_not_in_bounds)
<< /*row*/ 0 << /*one*/ 1;
HasIndexingError = true;
}
if (!isOneBasedIndex(Name[1])) {
Diag(MemberLoc, diag::err_typecheck_subscript_not_in_bounds)
<< /*col*/ 1 << /*one*/ 1;
HasIndexingError = true;
}
} else {
if (!isZeroBasedIndex(Name[0])) {
Diag(MemberLoc, diag::err_typecheck_subscript_not_in_bounds)
<< /*row*/ 0 << /*zero*/ 0;
HasIndexingError = true;
}
if (!isZeroBasedIndex(Name[1])) {
Diag(MemberLoc, diag::err_typecheck_subscript_not_in_bounds)
<< /*col*/ 1 << /*zero*/ 0;
HasIndexingError = true;
}
}
if (IsNumericAccessor) {
Row--;
Col--;
}
// Note: we add diagnostic errors here because otherwise we will return -1
for (unsigned I = 0; I < 2; I++) {
bool IsValid = IsNumericAccessor ? isOneBasedIndex(Name[I]) : isZeroBasedIndex(Name[I]);
if (!IsValid) {
Diag(MemberLoc, diag::err_typecheck_subscript_not_in_bounds)
<< /*row*/ I << /*one*/ (unsigned)IsNumericAccessor;
HasIndexingError = true;
}
}

Just a suggestion - the code seems to be very repetitive.

Comment on lines +4351 to +4376
if (IsNumericAccessor) {
Row--;
Col--;
// Note: we add diagnostic errors here because otherwise we will return -1
if (!isOneBasedIndex(Name[0])) {
Diag(MemberLoc, diag::err_typecheck_subscript_not_in_bounds)
<< /*row*/ 0 << /*one*/ 1;
HasIndexingError = true;
}
if (!isOneBasedIndex(Name[1])) {
Diag(MemberLoc, diag::err_typecheck_subscript_not_in_bounds)
<< /*col*/ 1 << /*one*/ 1;
HasIndexingError = true;
}
} else {
if (!isZeroBasedIndex(Name[0])) {
Diag(MemberLoc, diag::err_typecheck_subscript_not_in_bounds)
<< /*row*/ 0 << /*zero*/ 0;
HasIndexingError = true;
}
if (!isZeroBasedIndex(Name[1])) {
Diag(MemberLoc, diag::err_typecheck_subscript_not_in_bounds)
<< /*col*/ 1 << /*zero*/ 0;
HasIndexingError = true;
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure if we need separate *not_in_bounds vs. *out_of_bounds errors. Maybe parseHLSLMatrixAccessor could take the actual matrix dimensions and check directly against them? I like the *not_in_bounds version of the error better because it includes what kind of indexing it is (0-based or 1-based).

Copy link
Collaborator

@llvm-beanz llvm-beanz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs a new AST node, it can't be done as the existing MatrixSubscriptExpr.

Consider this example:

export void fn() {
    float2x2 M = {1,2,3,4};
    float2 V = M._m00_m01; // produces a vector.
    M._m10_m01 = 1.xx; // prodces a swizzled vector l-value!
}

Compiler Explorer

Expr *ColIdx = mkIdx(Col);

// Build A[Row][Col], reusing the existing matrix subscript machinery.
return SemaRef.CreateBuiltinMatrixSubscriptExpr(Base, RowIdx, ColIdx,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The other problem with using a subscript expression here is that the AST won't actually retain source information accurately enough to drive AST-driven tooling. I would expect a rewriter with this AST representation to rewrite A._m00 to A[0][0]. The AST node needs to know the indices and the spelling in order to be able to recreate the original syntax without referring back to the source.

@farzonl
Copy link
Member Author

farzonl commented Oct 14, 2025

This needs a new AST node, it can't be done as the existing MatrixSubscriptExpr.

Consider this example:

export void fn() {
    float2x2 M = {1,2,3,4};
    float2 V = M._m00_m01; // produces a vector.
    M._m10_m01 = 1.xx; // prodces a swizzled vector l-value!
}

Compiler Explorer

I was thinking this wouldn't be to hard to do by generating multiple MatrixSubscriptExpr(s) but your second point about retain source information to recreate this construct would still not be possible.

@llvm-beanz
Copy link
Collaborator

I was thinking this wouldn't be to hard to do by generating multiple MatrixSubscriptExpr(s) but your second point about retain source information to recreate this construct would still not be possible.

You can't do it with multiple AST nodes. You need a single l-value of vector type that represents elements arbitrarily dispersed across the matrix.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[HLSL] Add _m and _<numeric> based accessors to hlsl::matrix.

4 participants