@@ -125,6 +125,43 @@ class OpenACCClauseCIREmitter final
   // As some clauses, such as 'num_gangs' or 'wait' require a 'segments' list to
   // be maintained, this takes a list of segments that will be updated with the
   // proper counts as 'argument' elements are added.
+  //
+  // In MLIR, the 'operands' are stored as a large array, with a separate array
+  // of 'segments' that show which 'operand' applies to which 'operand-kind'.
+  // An operand-kind is, for example, 'num_workers' or 'vector_length'.
+  //
+  // So the operands array might have 4 elements, but the 'segments' array will
+  // be something like:
+  //
+  //   {0, 0, 0, 2, 0, 1, 1, 0, 0...}
+  //
+  // where each position belongs to a specific 'operand-kind'. This says that
+  // whichever operand-kind corresponds to index 3 has 2 elements, and takes
+  // the first 2 operands off the list (since all preceding values are 0). The
+  // operand-kinds corresponding to indices 5 and 6 each have 1 element.
+  //
+  // Fortunately, the `MutableOperandRange` append function takes care of that
+  // for us at the 'top level'.
+  //
+  // However, in cases like 'num_gangs' or 'wait', where each individual
+  // 'element' might itself be array-like, there is a separate 'segments' array
+  // for them. So in the case of:
+  //
+  //   device_type(nvidia, radeon) num_gangs(1, 2, 3)
+  //
+  // We have to emit that as TWO arrays into the IR (where the device_type is an
+  // attribute), so they look like:
+  //
+  //   num_gangs({One : i32, Two : i32, Three : i32} [#acc.device_type<nvidia>],\
+  //             {One : i32, Two : i32, Three : i32} [#acc.device_type<radeon>])
+  //
+  // When stored in the 'operands' list, the top-level 'segment' for
+  // 'num_gangs' just shows 6 elements. In order to get the array-like
+  // appearance, the 'numGangsSegments' list is kept as well. In the above case,
+  // we've inserted 6 operands, so the 'numGangsSegments' must contain 2
+  // elements, 1 per array, and each will have a value of 3. The verifier will
+  // ensure that the collection counts are correct.
   mlir::ArrayAttr
   handleDeviceTypeAffectedClause(mlir::ArrayAttr existingDeviceTypes,
                                  mlir::ValueRange argument,
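The segment bookkeeping described in the comment above can be sketched outside of MLIR. This is only an illustration of the idea, not the emitter's or MLIR's actual code: `Operand` and `splitBySegments` are hypothetical names, and operands are modeled as plain integers rather than SSA values. The same helper covers both the top-level 'segments' array and the nested per-device-type one.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <vector>

// Hypothetical stand-in for an operand; in MLIR this would be an SSA value.
using Operand = int;

// Splits a flat operand list into consecutive groups, one group per entry in
// 'segments'. A segment size of 0 yields an empty group.
std::vector<std::vector<Operand>>
splitBySegments(const std::vector<Operand> &operands,
                const std::vector<int32_t> &segments) {
  // The segment sizes must account for every operand exactly once. This is
  // the kind of consistency the comment says the verifier enforces.
  assert(std::accumulate(segments.begin(), segments.end(), 0) ==
         static_cast<int>(operands.size()));

  std::vector<std::vector<Operand>> groups;
  std::ptrdiff_t next = 0; // Index of the first operand not yet consumed.
  for (int32_t size : segments) {
    groups.emplace_back(operands.begin() + next,
                        operands.begin() + next + size);
    next += size;
  }
  return groups;
}
```

With 4 operands and the segments `{0, 0, 0, 2, 0, 1, 1, 0, 0}`, group 3 receives the first two operands and groups 5 and 6 receive one each. For the `num_gangs` example, the 6 operands split with `{3, 3}` give one group of three per device type.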