Commit b6e65ae

explain what 'segments' are in a comment
1 parent 064fc78 commit b6e65ae

1 file changed: +37 -0 lines changed

clang/lib/CIR/CodeGen/CIRGenStmtOpenACC.cpp

Lines changed: 37 additions & 0 deletions
@@ -125,6 +125,43 @@ class OpenACCClauseCIREmitter final
  // As some clauses, such as 'num_gangs' or 'wait' require a 'segments' list to
  // be maintained, this takes a list of segments that will be updated with the
  // proper counts as 'argument' elements are added.
+ //
+ // In MLIR, the 'operands' are stored as a large array, with a separate array
+ // of 'segments' that gives the number of operands for each 'operand-kind',
+ // such as a 'num_workers' operand-kind or a 'vector_length' operand-kind.
+ //
+ // So the operands array might have 4 elements, but the 'segments' array will
+ // be something like:
+ //
+ // {0, 0, 0, 2, 0, 1, 1, 0, 0, ...}
+ //
+ // where each position belongs to a specific 'operand-kind'. This specifies
+ // that whichever operand-kind corresponds to index '3' has 2 elements, and
+ // should take the first 2 operands off the list (since all preceding values
+ // are 0). The operand-kinds corresponding to indices '5' and '6' each have
+ // 1 element.
+ //
+ // Fortunately, the `MutableOperandRange` append function actually takes care
+ // of that for us at the 'top level'.
+ //
+ // However, in cases like 'num_gangs' or 'wait', where each individual
+ // 'element' might itself be array-like, there is a separate 'segments' array
+ // for them. So in the case of:
+ //
+ // device_type(nvidia, radeon) num_gangs(1, 2, 3)
+ //
+ // We have to emit that as TWO arrays into the IR (where the device_type is an
+ // attribute), so they look like:
+ //
+ // num_gangs({One : i32, Two : i32, Three : i32} [#acc.device_type<nvidia>],
+ // {One : i32, Two : i32, Three : i32} [#acc.device_type<radeon>])
+ //
+ // When stored in the 'operands' list, the top-level 'segment' for
+ // 'num_gangs' just shows 6 elements. In order to get the array-like
+ // appearance, the 'numGangsSegments' list is kept as well. In the above case,
+ // we've inserted 6 operands, so the 'numGangsSegments' must contain 2
+ // elements, 1 per array, and each will have a value of 3. The verifier will
+ // ensure that the collection counts are correct.
  mlir::ArrayAttr
  handleDeviceTypeAffectedClause(mlir::ArrayAttr existingDeviceTypes,
  mlir::ValueRange argument,
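
The top-level 'segments' layout described in the comment can be illustrated with a small standalone sketch. It does not use MLIR at all: `operandSliceFor` is a hypothetical helper invented for this illustration, and plain `std::vector` stands in for the operand and segment storage. It only shows how a per-kind count array maps each operand-kind to a slice of one flat operand array.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical helper (not an MLIR API): returns the half-open range
// [begin, end) of the flat operand array owned by operand-kind `kindIndex`.
inline std::pair<std::size_t, std::size_t>
operandSliceFor(const std::vector<std::int32_t> &segments,
                std::size_t kindIndex) {
  assert(kindIndex < segments.size() && "operand-kind index out of range");
  std::size_t begin = 0;
  // Every earlier operand-kind consumes its own count of operands first.
  for (std::size_t i = 0; i < kindIndex; ++i)
    begin += static_cast<std::size_t>(segments[i]);
  return {begin, begin + static_cast<std::size_t>(segments[kindIndex])};
}
```

With the segments `{0, 0, 0, 2, 0, 1, 1, 0, 0}` from the comment, index 3 maps to operands `[0, 2)`, index 5 to `[2, 3)`, and index 6 to `[3, 4)`, which accounts for all 4 operands.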

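
The nested bookkeeping for `device_type(nvidia, radeon) num_gangs(1, 2, 3)` can be modeled the same way. This is a minimal sketch under stated assumptions, not the CIR implementation: `GangClauseState`, `appendForDeviceTypes`, and `segmentsMatchOperands` are hypothetical names, integers stand in for `mlir::Value` operands, and strings stand in for the device-type attributes. It shows the argument list being appended once per device type, with one inner segment entry recorded per group, plus a check of the kind the comment attributes to the verifier.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for one array-like clause; no name here is from MLIR.
struct GangClauseState {
  std::vector<int> operands;            // flat list shared by all device types
  std::vector<std::int32_t> segments;   // size of each per-device-type group
  std::vector<std::string> deviceTypes; // parallel to `segments`
};

// Appends `argument` once per device type and records one segment per group.
inline void appendForDeviceTypes(GangClauseState &state,
                                 const std::vector<int> &argument,
                                 const std::vector<std::string> &deviceTypes) {
  for (const std::string &deviceType : deviceTypes) {
    state.operands.insert(state.operands.end(), argument.begin(),
                          argument.end());
    state.segments.push_back(static_cast<std::int32_t>(argument.size()));
    state.deviceTypes.push_back(deviceType);
  }
}

// Mirrors the consistency check described in the comment: the inner segment
// sizes must add up to the number of operands stored for the clause.
inline bool segmentsMatchOperands(const GangClauseState &state) {
  std::size_t total = 0;
  for (std::int32_t size : state.segments)
    total += static_cast<std::size_t>(size);
  return total == state.operands.size() &&
         state.segments.size() == state.deviceTypes.size();
}
```

Calling `appendForDeviceTypes(state, {1, 2, 3}, {"nvidia", "radeon"})` on an empty state leaves 6 operands and the inner segments `{3, 3}`, matching the counts given in the comment.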