|
| 1 | +<!--===- docs/FIRArrayOperations.md |
| 2 | + |
| 3 | + Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. |
| 4 | + See https://llvm.org/LICENSE.txt for license information. |
| 5 | + SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception |
| 6 | + |
| 7 | +--> |
| 8 | + |
| 9 | +# Design: FIR Array operations |
| 10 | + |
| 11 | +```eval_rst |
| 12 | +.. contents:: |
| 13 | + :local: |
| 14 | +``` |
| 15 | + |
| 16 | +## General |
| 17 | + |
| 18 | +The array operations in FIR model the copy-in/copy-out semantics over Fortran |
| 19 | +statements. |
| 20 | + |
| 21 | +They are currently 6 array operations: |
| 22 | +- `fir.array_load` |
| 23 | +- `fir.array_merge_store` |
| 24 | +- `fir.array_fetch` |
| 25 | +- `fir.array_update` |
| 26 | +- `fir.array_access` |
| 27 | +- `fir.array_amend` |
| 28 | + |
| 29 | +`array_load`(s) and `array_merge_store` are a pairing that brackets the lifetime |
| 30 | +of the array copies. |
| 31 | + |
| 32 | +`array_fetch` and `array_update` are defined to work as getter/setter pairs on |
| 33 | +values of elements from loaded array copies. These have "GEP-like" syntax and |
| 34 | +semantics. |
| 35 | + |
| 36 | +`array_access` and `array_amend` are defined to work as getter/setter pairs on |
| 37 | +references to elements in loaded array copies. `array_access` has "GEP-like" |
| 38 | +syntax. `array_amend` annotates which loaded array copy is being written to. |
| 39 | +It is invalid to update an array copy without array_amend; doing so will result |
| 40 | +in undefined behavior. |
| 41 | + |
| 42 | +## array_load |
| 43 | + |
| 44 | +This operation taken with `array_merge_store` captures Fortran's |
| 45 | +copy-in/copy-out semantics. One way to think of this is that array_load |
| 46 | +creates a snapshot copy of the entire array. This copy can then be used |
| 47 | +as the "original value" of the array while the array's new value is |
| 48 | +computed. The `array_merge_store` operation is the copy-out semantics, which |
| 49 | +merge the updates with the original array value to produce the final array |
| 50 | +result. This abstracts the copy operations as opposed to always creating |
| 51 | +copies or requiring dependence analysis be performed on the syntax trees |
| 52 | +and before lowering to the IR. |
| 53 | + |
| 54 | +Load an entire array as a single SSA value. |
| 55 | + |
| 56 | +```fortran |
| 57 | + real :: a(o:n,p:m) |
| 58 | + ... |
| 59 | + ... = ... a ... |
| 60 | +``` |
| 61 | + |
| 62 | +One can use `fir.array_load` to produce an ssa-value that captures an |
| 63 | +immutable value of the entire array `a`, as in the Fortran array expression |
| 64 | +shown above. Subsequent changes to the memory containing the array do not |
| 65 | +alter its composite value. This operation let's one load an array as a |
| 66 | +value while applying a runtime shape, shift, or slice to the memory |
| 67 | +reference, and its semantics guarantee immutability. |
| 68 | + |
| 69 | +```mlir |
| 70 | +%s = fir.shape_shift %o, %n, %p, %m : (index, index, index, index) -> !fir.shape<2> |
| 71 | +// load the entire array 'a' |
| 72 | +%v = fir.array_load %a(%s) : (!fir.ref<!fir.array<?x?xf32>>, !fir.shape<2>) -> !fir.array<?x?xf32> |
| 73 | +// a fir.store here into array %a does not change %v |
| 74 | +``` |
| 75 | + |
| 76 | +# array_merge_store |
| 77 | + |
| 78 | +The `array_merge_store` operation store a merged array value to memory. |
| 79 | + |
| 80 | + |
| 81 | +```fortran |
| 82 | + real :: a(n,m) |
| 83 | + ... |
| 84 | + a = ... |
| 85 | +``` |
| 86 | + |
| 87 | +One can use `fir.array_merge_store` to merge/copy the value of `a` in an |
| 88 | +array expression as shown above. |
| 89 | + |
| 90 | +```mlir |
| 91 | + %v = fir.array_load %a(%shape) : ... |
| 92 | + %r = fir.array_update %v, %f, %i, %j : (!fir.array<?x?xf32>, f32, index, index) -> !fir.array<?x?xf32> |
| 93 | + fir.array_merge_store %v, %r to %a : !fir.ref<!fir.array<?x?xf32>> |
| 94 | +``` |
| 95 | + |
| 96 | +This operation merges the original loaded array value, `%v`, with the |
| 97 | +chained updates, `%r`, and stores the result to the array at address, `%a`. |
| 98 | + |
| 99 | +## array_fetch |
| 100 | + |
| 101 | +The `array_fetch` operation fetches the value of an element in an array value. |
| 102 | + |
| 103 | +```fortran |
| 104 | + real :: a(n,m) |
| 105 | + ... |
| 106 | + ... a ... |
| 107 | + ... a(r,s+1) ... |
| 108 | +``` |
| 109 | + |
| 110 | +One can use `fir.array_fetch` to fetch the (implied) value of `a(i,j)` in |
| 111 | +an array expression as shown above. It can also be used to extract the |
| 112 | +element `a(r,s+1)` in the second expression. |
| 113 | + |
| 114 | +```mlir |
| 115 | + %s = fir.shape %n, %m : (index, index) -> !fir.shape<2> |
| 116 | + // load the entire array 'a' |
| 117 | + %v = fir.array_load %a(%s) : (!fir.ref<!fir.array<?x?xf32>>, !fir.shape<2>) -> !fir.array<?x?xf32> |
| 118 | + // fetch the value of one of the array value's elements |
| 119 | + %1 = fir.array_fetch %v, %i, %j : (!fir.array<?x?xf32>, index, index) -> f32 |
| 120 | +``` |
| 121 | + |
| 122 | +It is only possible to use `array_fetch` on an `array_load` result value. |
| 123 | + |
| 124 | +## array_update |
| 125 | + |
| 126 | +The `array_update` operation is used to update the value of an element in an |
| 127 | +array value. A new array value is returned where all element values of the input |
| 128 | +array are identical except for the selected element which is the value passed in |
| 129 | +the update. |
| 130 | + |
| 131 | +```fortran |
| 132 | + real :: a(n,m) |
| 133 | + ... |
| 134 | + a = ... |
| 135 | +``` |
| 136 | + |
| 137 | +One can use `fir.array_update` to update the (implied) value of `a(i,j)` |
| 138 | +in an array expression as shown above. |
| 139 | + |
| 140 | +```mlir |
| 141 | + %s = fir.shape %n, %m : (index, index) -> !fir.shape<2> |
| 142 | + // load the entire array 'a' |
| 143 | + %v = fir.array_load %a(%s) : (!fir.ref<!fir.array<?x?xf32>>, !fir.shape<2>) -> !fir.array<?x?xf32> |
| 144 | + // update the value of one of the array value's elements |
| 145 | + // %r_{ij} = %f if (i,j) = (%i,%j), %v_{ij} otherwise |
| 146 | + %r = fir.array_update %v, %f, %i, %j : (!fir.array<?x?xf32>, f32, index, index) -> !fir.array<?x?xf32> |
| 147 | + fir.array_merge_store %v, %r to %a : !fir.ref<!fir.array<?x?xf32>> |
| 148 | +``` |
| 149 | + |
| 150 | +An array value update behaves as if a mapping function from the indices |
| 151 | +to the new value has been added, replacing the previous mapping. These |
| 152 | +mappings can be added to the ssa-value, but will not be materialized in |
| 153 | +memory until the `fir.array_merge_store` is performed. |
| 154 | + |
| 155 | +## array_access |
| 156 | + |
| 157 | +The `array_access` operationis used to fetch the memory reference of an element |
| 158 | +in an array value. |
| 159 | + |
| 160 | +```fortran |
| 161 | + real :: a(n,m) |
| 162 | + ... |
| 163 | + ... a ... |
| 164 | + ... a(r,s+1) ... |
| 165 | +``` |
| 166 | + |
| 167 | +One can use `fir.array_access` to recover the implied memory reference to |
| 168 | +the element `a(i,j)` in an array expression `a` as shown above. It can also |
| 169 | +be used to recover the reference element `a(r,s+1)` in the second |
| 170 | +expression. |
| 171 | + |
| 172 | +```mlir |
| 173 | + %s = fir.shape %n, %m : (index, index) -> !fir.shape<2> |
| 174 | + // load the entire array 'a' |
| 175 | + %v = fir.array_load %a(%s) : (!fir.ref<!fir.array<?x?xf32>>, !fir.shape<2>) -> !fir.array<?x?xf32> |
| 176 | + // fetch the value of one of the array value's elements |
| 177 | + %1 = fir.array_access %v, %i, %j : (!fir.array<?x?xf32>, index, index) -> !fir.ref<f32> |
| 178 | +``` |
| 179 | + |
| 180 | +It is only possible to use `array_access` on an `array_load` result value. |
| 181 | + |
| 182 | +## array_amend |
| 183 | + |
| 184 | +The `array_amend` operation marks an array value as having been changed via its |
| 185 | +reference. The reference into the array value is obtained via a |
| 186 | +`fir.array_access` op. |
| 187 | + |
| 188 | +```mlir |
| 189 | + // fetch the value of one of the array value's elements |
| 190 | + %1 = fir.array_access %v, %i, %j : (!fir.array<?x?xT>, index, index) -> !fir.ref<T> |
| 191 | + // modify the element by storing data using %1 as a reference |
| 192 | + %2 = ... %1 ... |
| 193 | + // mark the array value |
| 194 | + %new_v = fir.array_amend %v, %2 : (!fir.array<?x?xT>, !fir.ref<T>) -> !fir.array<?x?xT> |
| 195 | +``` |
| 196 | + |
| 197 | +## Array value copy pass |
| 198 | + |
| 199 | +One of the main purpose of the array operations present in FIR is to be able to |
| 200 | +perform the dependence analysis and elide copies where possible with a MLIR |
| 201 | +pass. This pass is called the `array-value-copy` pass. |
| 202 | +The analysis detects if there are any conflicts. A conflicts is when one of the |
| 203 | +following cases occurs: |
| 204 | + |
| 205 | +1. There is an `array_update`/`array_amend` to an array value/reference, a_j, |
| 206 | + such that a_j was loaded from the same array memory reference (array_j) but |
| 207 | + with a different shape as the other array values a_i, where i != j. |
| 208 | + [Possible overlapping arrays.] |
| 209 | +2. There is either an `array_fetch`/`array_access` or |
| 210 | + `array_update`/`array_amend` of a_j with a different set of index values. |
| 211 | + [Possible loop-carried dependence.] |
| 212 | + |
| 213 | +`array_update` writes an entire element in the loaded array value. So an |
| 214 | +`array_update` that does not change any of the arrays fetched from or updates |
| 215 | +the exact same element that was read on the current iteration does not |
| 216 | +introduce a dependence. |
| 217 | + |
| 218 | +`array_amend` may be a partial update to an element, such as a substring. In |
| 219 | +that case, there is no dependence if all the other `array_access` ops are |
| 220 | +referencing other arrays. We conservatively assume there may be an |
| 221 | +overlap like in s(:)(1:4) = s(:)(3:6) where s is an array of characters. |
| 222 | + |
| 223 | +If none of the array values overlap in storage and the accesses are not |
| 224 | +loop-carried, then the arrays are conflict-free and no copies are required. |
| 225 | + |
| 226 | +Below is an example of the FIR/MLIR code before and after the `array-value-copy` |
| 227 | +pass. |
| 228 | + |
| 229 | +```fortran |
| 230 | +subroutine s(a,l,u) |
| 231 | + type t |
| 232 | + integer m |
| 233 | + end type t |
| 234 | + type(t) :: a(:) |
| 235 | + integer :: l, u |
| 236 | + forall (i=l:u) |
| 237 | + a(i) = a(u-i+1) |
| 238 | + end forall |
| 239 | +end |
| 240 | +``` |
| 241 | + |
| 242 | +``` |
| 243 | +func @_QPs(%arg0: !fir.box<!fir.array<?x!fir.type<_QFsTt{m:i32}>>>, %arg1: !fir.ref<i32>, %arg2: !fir.ref<i32>) { |
| 244 | + %0 = fir.alloca i32 {adapt.valuebyref, bindc_name = "i"} |
| 245 | + %1 = fir.load %arg1 : !fir.ref<i32> |
| 246 | + %2 = fir.convert %1 : (i32) -> index |
| 247 | + %3 = fir.load %arg2 : !fir.ref<i32> |
| 248 | + %4 = fir.convert %3 : (i32) -> index |
| 249 | + %c1 = arith.constant 1 : index |
| 250 | + %5 = fir.array_load %arg0 : (!fir.box<!fir.array<?x!fir.type<_QFsTt{m:i32}>>>) -> !fir.array<?x!fir.type<_QFsTt{m:i32}>> |
| 251 | + %6 = fir.array_load %arg0 : (!fir.box<!fir.array<?x!fir.type<_QFsTt{m:i32}>>>) -> !fir.array<?x!fir.type<_QFsTt{m:i32}>> |
| 252 | + %7 = fir.do_loop %arg3 = %2 to %4 step %c1 unordered iter_args(%arg4 = %5) -> (!fir.array<?x!fir.type<_QFsTt{m:i32}>>) { |
| 253 | + %8 = fir.convert %arg3 : (index) -> i32 |
| 254 | + fir.store %8 to %0 : !fir.ref<i32> |
| 255 | + %c1_i32 = arith.constant 1 : i32 |
| 256 | + %9 = fir.load %arg2 : !fir.ref<i32> |
| 257 | + %10 = fir.load %0 : !fir.ref<i32> |
| 258 | + %11 = arith.subi %9, %10 : i32 |
| 259 | + %12 = arith.addi %11, %c1_i32 : i32 |
| 260 | + %13 = fir.convert %12 : (i32) -> i64 |
| 261 | + %14 = fir.convert %13 : (i64) -> index |
| 262 | + %15 = fir.array_access %6, %14 {Fortran.offsets} : (!fir.array<?x!fir.type<_QFsTt{m:i32}>>, index) -> !fir.ref<!fir.type<_QFsTt{m:i32}>> |
| 263 | + %16 = fir.load %0 : !fir.ref<i32> |
| 264 | + %17 = fir.convert %16 : (i32) -> i64 |
| 265 | + %18 = fir.convert %17 : (i64) -> index |
| 266 | + %19 = fir.array_access %arg4, %18 {Fortran.offsets} : (!fir.array<?x!fir.type<_QFsTt{m:i32}>>, index) -> !fir.ref<!fir.type<_QFsTt{m:i32}>> |
| 267 | + %20 = fir.load %15 : !fir.ref<!fir.type<_QFsTt{m:i32}>> |
| 268 | + fir.store %20 to %19 : !fir.ref<!fir.type<_QFsTt{m:i32}>> |
| 269 | + %21 = fir.array_amend %arg4, %19 : (!fir.array<?x!fir.type<_QFsTt{m:i32}>>, !fir.ref<!fir.type<_QFsTt{m:i32}>>) -> !fir.array<?x!fir.type<_QFsTt{m:i32}>> |
| 270 | + fir.result %21 : !fir.array<?x!fir.type<_QFsTt{m:i32}>> |
| 271 | + } |
| 272 | + fir.array_merge_store %5, %7 to %arg0 : !fir.array<?x!fir.type<_QFsTt{m:i32}>>, !fir.array<?x!fir.type<_QFsTt{m:i32}>>, !fir.box<!fir.array<?x!fir.type<_QFsTt{m:i32}>>> |
| 273 | + return |
| 274 | +} |
| 275 | +``` |
| 276 | + |
| 277 | +``` |
| 278 | +func @_QPs(%arg0: !fir.box<!fir.array<?x!fir.type<_QFsTt{m:i32}>>>, %arg1: !fir.ref<i32>, %arg2: !fir.ref<i32>) { |
| 279 | + %0 = fir.alloca i32 {adapt.valuebyref, bindc_name = "i"} |
| 280 | + %1 = fir.load %arg1 : !fir.ref<i32> |
| 281 | + %2 = fir.convert %1 : (i32) -> index |
| 282 | + %3 = fir.load %arg2 : !fir.ref<i32> |
| 283 | + %4 = fir.convert %3 : (i32) -> index |
| 284 | + %c1 = arith.constant 1 : index |
| 285 | + %c0 = arith.constant 0 : index |
| 286 | + %5:3 = fir.box_dims %arg0, %c0 : (!fir.box<!fir.array<?x!fir.type<_QFsTt{m:i32}>>>, index) -> (index, index, index) |
| 287 | + %6 = fir.shape %5#1 : (index) -> !fir.shape<1> |
| 288 | + // Allocate copy |
| 289 | + %7 = fir.allocmem !fir.array<?x!fir.type<_QFsTt{m:i32}>>, %5#1 |
| 290 | + %8 = fir.convert %5#1 : (index) -> index |
| 291 | + %c0_0 = arith.constant 0 : index |
| 292 | + %c1_1 = arith.constant 1 : index |
| 293 | + %9 = arith.subi %8, %c1_1 : index |
| 294 | + // Initialize copy |
| 295 | + fir.do_loop %arg3 = %c0_0 to %9 step %c1_1 { |
| 296 | + %c1_4 = arith.constant 1 : index |
| 297 | + %15 = arith.addi %arg3, %c1_4 : index |
| 298 | + %16 = fir.array_coor %arg0(%6) %15 : (!fir.box<!fir.array<?x!fir.type<_QFsTt{m:i32}>>>, !fir.shape<1>, index) -> !fir.ref<!fir.type<_QFsTt{m:i32}>> |
| 299 | + %c1_5 = arith.constant 1 : index |
| 300 | + %17 = arith.addi %arg3, %c1_5 : index |
| 301 | + %18 = fir.array_coor %7(%6) %17 : (!fir.heap<!fir.array<?x!fir.type<_QFsTt{m:i32}>>>, !fir.shape<1>, index) -> !fir.ref<!fir.type<_QFsTt{m:i32}>> |
| 302 | + %19 = fir.field_index m, !fir.type<_QFsTt{m:i32}> |
| 303 | + %20 = fir.coordinate_of %16, %19 : (!fir.ref<!fir.type<_QFsTt{m:i32}>>, !fir.field) -> !fir.ref<i32> |
| 304 | + %21 = fir.coordinate_of %18, %19 : (!fir.ref<!fir.type<_QFsTt{m:i32}>>, !fir.field) -> !fir.ref<i32> |
| 305 | + %22 = fir.load %20 : !fir.ref<i32> |
| 306 | + fir.store %22 to %21 : !fir.ref<i32> |
| 307 | + } |
| 308 | + %10 = fir.undefined !fir.array<?x!fir.type<_QFsTt{m:i32}>> |
| 309 | + %11 = fir.undefined !fir.array<?x!fir.type<_QFsTt{m:i32}>> |
| 310 | + // Perform the actual work |
| 311 | + %12 = fir.do_loop %arg3 = %2 to %4 step %c1 unordered iter_args(%arg4 = %10) -> (!fir.array<?x!fir.type<_QFsTt{m:i32}>>) { |
| 312 | + %15 = fir.convert %arg3 : (index) -> i32 |
| 313 | + fir.store %15 to %0 : !fir.ref<i32> |
| 314 | + %c1_i32 = arith.constant 1 : i32 |
| 315 | + %16 = fir.load %arg2 : !fir.ref<i32> |
| 316 | + %17 = fir.load %0 : !fir.ref<i32> |
| 317 | + %18 = arith.subi %16, %17 : i32 |
| 318 | + %19 = arith.addi %18, %c1_i32 : i32 |
| 319 | + %20 = fir.convert %19 : (i32) -> i64 |
| 320 | + %21 = fir.convert %20 : (i64) -> index |
| 321 | + %22 = fir.array_coor %arg0 %21 : (!fir.box<!fir.array<?x!fir.type<_QFsTt{m:i32}>>>, index) -> !fir.ref<!fir.type<_QFsTt{m:i32}>> |
| 322 | + %23 = fir.load %0 : !fir.ref<i32> |
| 323 | + %24 = fir.convert %23 : (i32) -> i64 |
| 324 | + %25 = fir.convert %24 : (i64) -> index |
| 325 | + %26 = fir.array_coor %7(%6) %25 : (!fir.heap<!fir.array<?x!fir.type<_QFsTt{m:i32}>>>, !fir.shape<1>, index) -> !fir.ref<!fir.type<_QFsTt{m:i32}>> |
| 326 | + %27 = fir.load %22 : !fir.ref<!fir.type<_QFsTt{m:i32}>> |
| 327 | + fir.store %27 to %26 : !fir.ref<!fir.type<_QFsTt{m:i32}>> |
| 328 | + %28 = fir.undefined !fir.array<?x!fir.type<_QFsTt{m:i32}>> |
| 329 | + fir.result %28 : !fir.array<?x!fir.type<_QFsTt{m:i32}>> |
| 330 | + } |
| 331 | + %13 = fir.convert %5#1 : (index) -> index |
| 332 | + %c0_2 = arith.constant 0 : index |
| 333 | + %c1_3 = arith.constant 1 : index |
| 334 | + %14 = arith.subi %13, %c1_3 : index |
| 335 | + // Move tha value back from the copy to the original array |
| 336 | + fir.do_loop %arg3 = %c0_2 to %14 step %c1_3 { |
| 337 | + %c1_4 = arith.constant 1 : index |
| 338 | + %15 = arith.addi %arg3, %c1_4 : index |
| 339 | + %16 = fir.array_coor %7(%6) %15 : (!fir.heap<!fir.array<?x!fir.type<_QFsTt{m:i32}>>>, !fir.shape<1>, index) -> !fir.ref<!fir.type<_QFsTt{m:i32}>> |
| 340 | + %c1_5 = arith.constant 1 : index |
| 341 | + %17 = arith.addi %arg3, %c1_5 : index |
| 342 | + %18 = fir.array_coor %arg0(%6) %17 : (!fir.box<!fir.array<?x!fir.type<_QFsTt{m:i32}>>>, !fir.shape<1>, index) -> !fir.ref<!fir.type<_QFsTt{m:i32}>> |
| 343 | + %19 = fir.field_index m, !fir.type<_QFsTt{m:i32}> |
| 344 | + %20 = fir.coordinate_of %16, %19 : (!fir.ref<!fir.type<_QFsTt{m:i32}>>, !fir.field) -> !fir.ref<i32> |
| 345 | + %21 = fir.coordinate_of %18, %19 : (!fir.ref<!fir.type<_QFsTt{m:i32}>>, !fir.field) -> !fir.ref<i32> |
| 346 | + %22 = fir.load %20 : !fir.ref<i32> |
| 347 | + fir.store %22 to %21 : !fir.ref<i32> |
| 348 | + } |
| 349 | + fir.freemem %7 : !fir.heap<!fir.array<?x!fir.type<_QFsTt{m:i32}>>> |
| 350 | + return |
| 351 | + } |
| 352 | +``` |
0 commit comments