@@ -95,29 +95,49 @@ struct MapInfoOpConversion
     fir::unwrapPassByRefType(typeAttr.getValue()))) &&
     !characterWithDynamicLen(
     fir::unwrapPassByRefType(typeAttr.getValue()))) {
-    // Characters with a LEN param are represented as char
-    // arrays/strings, the initial lowering doesn't generate
-    // bounds for these, however, we require them to map the
-    // data appropriately in the later lowering stages. This
-    // is to prevent the need for unecessary caveats
-    // specific to Flang. We also strip the array from the
-    // type so that all variations of strings are treated
-    // identically and there's no caveats or specialisations
-    // required in the later stages. As an example, Boxed
-    // char strings will emit a single char array no matter
-    // the number of dimensions caused by additional array
-    // dimensions which needs specialised for, as it differs
-    // from the non-box variation which will emit each array
-    // wrapping the character array, e.g. given a type of
-    // the same dimensions, if one is boxed, the types would
-    // end up:
+    // Characters with a LEN param are represented as strings
+    // (array of characters), the lowering to LLVM dialect
+    // doesn't generate bounds for these (and this is not
+    // done at the initial lowering either) and there are
+    // minor inconsistencies in the variable types we
+    // create for the map without this step when converting
+    // to the LLVM dialect.
     //
-    // array<i8 x 16>
-    // vs
-    // array<10 x array< 10 x array<i8 x 16>>>
+    // For example, given the types:
     //
-    // This means we have to treat one specially in the
-    // lowering. So we try to "canonicalize" it here.
+    // 1) CHARACTER(LEN=16), dimension(:,:), allocatable :: char_arr
+    // 2) CHARACTER(LEN=16), dimension(10,10) :: char_arr
+    //
+    // We get the FIR types (note for 1: we already peeled off the
+    // dynamic extents from the type at this stage, but the conversion
+    // to LLVM dialect does that in any case, so the final result
+    // is the same):
+    //
+    // 1) !fir.char<1,16>
+    // 2) !fir.array<10x10x!fir.char<1,16>>
+    //
+    // Which are converted to the LLVM dialect types:
+    //
+    // 1) !llvm.array<16 x i8>
+    // 2) !llvm.array<10 x array<10 x array<16 x i8>>>
+    //
+    // And in both cases, we are missing the innermost bounds for
+    // the !fir.char<1,16> which is expanded into a 16 x i8 array
+    // in the conversion to LLVM dialect.
+    //
+    // The problem with this is that we would like to treat these
+    // cases identically and not have to create specialised
+    // lowerings for either of these in the lowering to LLVM-IR
+    // and treat them like any other array that passes through.
+    //
+    // To do so below, we generate an extra bound for the
+    // innermost array (the char type/string) using the LEN
+    // parameter of the character type. And we "canonicalize"
+    // the type, stripping it down to the base element type,
+    // which in this case is an i8. This effectively allows
+    // the lowering to treat this as a 1-D array with multiple
+    // bounds which it is capable of handling without any special
+    // casing.
     // TODO: Handle dynamic LEN characters.
     if (auto ct = mlir::dyn_cast_or_null<fir::CharacterType>(
     fir::unwrapSequenceType(typeAttr.getValue()))) {
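The canonicalization described in the new comment can be illustrated with a small standalone model. This is only a sketch and not the code from this change: `CharArrayType`, `Bound`, `CanonicalMap` and `canonicalize` are hypothetical names that stand in for the FIR type and the map bounds, and it uses no MLIR or Flang APIs. It assumes a kind-1 character (one `i8` per character), zero-based inclusive bounds, and that the extra LEN bound is placed first; the real conversion may order or represent the bounds differently.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for a constant-LEN character type, optionally wrapped
// in a fixed-shape array, e.g. !fir.array<10x10x!fir.char<1,16>>.
struct CharArrayType {
  std::vector<std::int64_t> extents; // outer array extents, e.g. {10, 10}
  std::int64_t len;                  // constant LEN parameter, e.g. 16
};

// Hypothetical stand-in for one map bound (zero-based, inclusive).
struct Bound {
  std::int64_t lower;
  std::int64_t upper;
};

// Result of the canonicalization: a base element type plus a flat bound list.
struct CanonicalMap {
  std::string elementType;
  std::vector<Bound> bounds;
};

// Strip the type down to its base element type and add one extra bound for
// the innermost (character) dimension, derived from LEN.
CanonicalMap canonicalize(const CharArrayType &type) {
  // Dynamic LEN is out of scope here, mirroring the TODO in the source.
  assert(type.len > 0 && "sketch only handles a constant, positive LEN");

  CanonicalMap result;
  result.elementType = "i8"; // assumption: kind-1 character

  // Extra bound covering the characters of the string itself.
  // Assumption: it goes first; the real ordering may differ.
  result.bounds.push_back({0, type.len - 1});

  // One bound per outer array dimension, if any.
  for (std::int64_t extent : type.extents)
    result.bounds.push_back({0, extent - 1});

  return result;
}
```

With this model, both example declarations end up with the same element type: the `dimension(10,10)` case yields three bounds, and a bare `CHARACTER(LEN=16)` yields one, so a later stage only needs to understand "element type plus list of bounds".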