@@ -237,3 +237,59 @@ corresponding operation, except if it is explicitly skipped as described
237237[above](#overriding-clause-inherited-properties). This way, in case of a later
238238tablegen failure while processing OpenMP dialect operations, earlier messages
239239triggered by that pass can point to a likely solution.
240+
241+ ## Loop-Associated Directives
242+
243+ Loop-associated OpenMP constructs are represented in the dialect as loop wrapper
244+ operations. These implement the `LoopWrapperInterface`, which enforces a series
245+ of restrictions upon the operation:
246+ - It contains a single region with a single block; and
247+ - Its block contains exactly two operations: either another loop wrapper or an
248+ `omp.loop_nest` operation, followed by a terminator.
249+
250+ This approach splits the representation for a loop nest and the loop-associated
251+ constructs that specify how its iterations are executed, possibly across various
252+ SIMD lanes (`omp.simd`), threads (`omp.wsloop`), teams of threads
253+ (`omp.distribute`) or tasks (`omp.taskloop`). The ability to directly nest
254+ multiple loop wrappers to impact the execution of a single loop nest is used to
255+ represent composite constructs in a modular way.
256+
257+ The `omp.loop_nest` operation represents a collapsed rectangular loop nest that
258+ must always be wrapped by at least one loop wrapper, which defines how it is
259+ intended to be executed. It serves as a simpler and more restrictive
260+ representation of OpenMP loops until a more general approach, based on a new
261+ `omp.canonical_loop` definition, is developed to support non-rectangular loop
262+ nests, loop transformations and non-perfectly nested loops.
263+
264+ The following example shows how a `parallel {do,for}` construct would be
265+ represented:
266+ ``` mlir
267+ omp.parallel ... {
268+   ...
269+   omp.wsloop ... {
270+     omp.loop_nest (%i) : index = (%lb) to (%ub) step (%step) {
271+       %a_val = memref.load %a[%i] : memref<?xf32>
272+       %b_val = memref.load %b[%i] : memref<?xf32>
273+       %sum = arith.addf %a_val, %b_val : f32
274+       memref.store %sum, %c[%i] : memref<?xf32>
275+       omp.yield
276+     }
277+     omp.terminator
278+   }
279+   ...
280+   omp.terminator
281+ }
282+ ```
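
Building on the example above, directly nesting loop wrappers could represent a composite `{do,for} simd` construct roughly as follows. This is only a sketch in the same elided style: the wrapper order (`omp.wsloop` around `omp.simd`) and the `omp.terminator` closing each wrapper are assumptions derived from the `LoopWrapperInterface` restrictions listed earlier, not output taken from the compiler.

```mlir
// Sketch of a composite `{do,for} simd` construct (assumed nesting order).
omp.wsloop ... {
  // Each wrapper block holds exactly two operations:
  // a nested wrapper (or omp.loop_nest) and a terminator.
  omp.simd ... {
    omp.loop_nest (%i) : index = (%lb) to (%ub) step (%step) {
      ...
      omp.yield
    }
    omp.terminator
  }
  omp.terminator
}
```

Only the innermost wrapper contains the `omp.loop_nest`; every outer wrapper contains another wrapper, so each additional construct adds one level of nesting without changing the loop body.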
283+
284+ ### Loop Transformations
285+
286+ In addition to the worksharing loop-associated constructs described above, the
287+ OpenMP specification also defines a set of loop transformation constructs. They
288+ replace the associated loop(s) before worksharing constructs are executed on the
289+ generated loop(s). Some examples of such constructs are `tile` and `unroll`.
290+
291+ A general approach for representing these types of OpenMP constructs has not yet
292+ been implemented, but it is closely linked to the `omp.canonical_loop` work.
293+ Nevertheless, the loop transformation defined by the `collapse` clause for
294+ loop-associated worksharing constructs can be represented by introducing
295+ multiple bounds, steps and induction variables to the `omp.loop_nest` operation.
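
As an illustration, a `{do,for} collapse(2)` construct might be represented along these lines. This is a hedged sketch rather than verified compiler output: the value names (`%lb0`, `%ub1`, `%step0`, and so on) are placeholders, and the parenthesized multi-value form is assumed to extend the single-variable syntax used in the earlier example.

```mlir
// Sketch: two collapsed loops expressed as a single omp.loop_nest.
// One induction variable, lower bound, upper bound and step per loop.
omp.wsloop ... {
  omp.loop_nest (%i, %j) : index = (%lb0, %lb1) to (%ub0, %ub1) step (%step0, %step1) {
    ...
    omp.yield
  }
  omp.terminator
}
```

The wrapper structure is unchanged by `collapse`; only the `omp.loop_nest` operation gains additional induction variables and bounds.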