Commit a2ae84c

Add [parallel.simd.traits]
1 parent 1ab4b9f commit a2ae84c

1 file changed

+174
-0
lines changed

data_parallel_types.html

Lines changed: 174 additions & 0 deletions
@@ -201,6 +201,180 @@ <h1><ins>Header <code>&lt;experimental/simd&gt;</code> synopsis</ins></h1>
201201
</p>
202202
</cxx-section>
203203

204+
<cxx-section id="parallel.simd.abi">
205+
<h1><ins><code>simd</code> ABI tags</ins></h1>
206+
<ins>
207+
<pre>
208+
209+
namespace std::experimental {
210+
inline namespace parallelism_v2 {
211+
namespace simd_abi {
212+
struct scalar {};
213+
template&lt;int N&gt; struct fixed_size {};
214+
template&lt;class T&gt; inline constexpr int max_fixed_size = <em>implementation-defined</em>;
215+
template&lt;class T&gt; using compatible = <em>implementation-defined</em>;
216+
template&lt;class T&gt; using native = <em>implementation-defined</em>;
217+
}
218+
}
219+
}
220+
</pre>
221+
</ins>
222+
223+
<p>
224+
<ins>
225+
An <em>ABI tag</em> is a type in the <code>std::experimental::parallelism_v2::simd_abi</code> namespace that indicates a choice of size and binary representation for objects of data-parallel type. <cxx-note>The intent is for the size and binary representation to depend on the target architecture.</cxx-note> The ABI tag, together with a given element type, implies a number of elements. ABI tag types are used as the second template argument to <code>simd</code> and <code>simd_mask</code>. <cxx-note>The ABI tag is orthogonal to selecting the machine instruction set. However, the selected machine instruction set limits the usable ABI tag types (see <cxx-ref to="parallel.simd.overview"></cxx-ref>). The ABI tags enable users to safely pass objects of data-parallel type across translation unit boundaries (e.g. function calls or I/O).</cxx-note>
226+
</ins>
227+
</p>
228+
229+
<p>
230+
<ins>
231+
Use of the <code>scalar</code> tag type requires data-parallel types to store a single element (i.e., <code>simd&lt;T, simd_abi::scalar&gt;::size()</code> returns 1). <cxx-note><code>scalar</code> is not an alias for <code>fixed_size&lt;1&gt;</code>.</cxx-note>
232+
</ins>
233+
</p>
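The <code>scalar</code> rule above can be checked at compile time. The following is a minimal sketch, not normative text; it assumes a toolchain that ships <code>&lt;experimental/simd&gt;</code> (for example libstdc++ since GCC 11) and C++17. The alias and variable names are illustrative only.

```cpp
#include <cstddef>
#include <experimental/simd>
#include <type_traits>

namespace stdx = std::experimental;

// Illustrative aliases (names are not part of the proposal).
using scalar_float = stdx::simd<float, stdx::simd_abi::scalar>;
using fixed1_float = stdx::simd<float, stdx::simd_abi::fixed_size<1>>;

// Both specializations store exactly one element.
constexpr std::size_t scalar_lanes = scalar_float::size();
constexpr std::size_t fixed1_lanes = fixed1_float::size();

// scalar is not an alias for fixed_size<1>, so the two tags are distinct
// types and the two simd specializations are distinct types as well.
constexpr bool tags_are_distinct =
    !std::is_same_v<stdx::simd_abi::scalar, stdx::simd_abi::fixed_size<1>>;
```

Because the tags differ, an overload set can distinguish <code>scalar_float</code> from <code>fixed1_float</code> even though both hold a single element.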
234+
235+
<p>
236+
<ins>
237+
The value of <code>max_fixed_size&lt;T&gt;</code> is at least 32.
238+
</ins>
239+
</p>
240+
241+
<p>
242+
<ins>
243+
Use of the <code>simd_abi::fixed_size&lt;N&gt;</code> tag type requires data-parallel types to store <code>N</code> elements (i.e. <code>simd&lt;T, simd_abi::fixed_size&lt;N&gt;&gt;::size()</code> is <code>N</code>). <code>simd&lt;T, fixed_size&lt;N&gt;&gt;</code> and <code>simd_mask&lt;T, fixed_size&lt;N&gt;&gt;</code> with <code>N &gt; 0</code> and <code>N &lt;= max_fixed_size&lt;T&gt;</code> are supported. Additionally, for every supported <code>simd&lt;T, Abi&gt;</code> (see <cxx-ref to="parallel.simd.overview"></cxx-ref>), where <code>Abi</code> is an ABI tag that is not a specialization of <code>simd_abi::fixed_size</code>, <code>N == simd&lt;T, Abi&gt;::size()</code> is supported.
244+
245+
<cxx-note>It is unspecified whether <code>simd&lt;T, fixed_size&lt;N&gt;&gt;</code> with <code>N &gt; max_fixed_size&lt;T&gt;</code> is supported. The value of <code>max_fixed_size&lt;T&gt;</code> can depend on compiler flags and can change between different compiler versions.</cxx-note>
246+
247+
<cxx-note>An implementation may forgo ABI compatibility between differently compiled translation units for <code>simd</code> and <code>simd_mask</code> specializations using the same <code>simd_abi::fixed_size&lt;N&gt;</code> tag. Otherwise, the efficiency of <code>simd&lt;T, Abi&gt;</code> is likely to be better than for <code>simd&lt;T, fixed_size&lt;simd_size_v&lt;T, Abi&gt;&gt;&gt;</code> (with <code>Abi</code> not a specialization of <code>simd_abi::fixed_size</code>).</cxx-note>
248+
</ins>
249+
</p>
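The <code>fixed_size&lt;N&gt;</code> guarantees above can be sketched as compile-time checks. This is an illustration under assumptions, not normative text: it presumes a toolchain providing <code>&lt;experimental/simd&gt;</code> (for example libstdc++ since GCC 11) and C++17, and every alias and variable name below is made up for the example.

```cpp
#include <cstddef>
#include <experimental/simd>

namespace stdx = std::experimental;
namespace abi = stdx::simd_abi;

// Eight floats, independent of the target's native register width.
using float8 = stdx::simd<float, abi::fixed_size<8>>;
constexpr std::size_t float8_lanes = float8::size();

// The section guarantees max_fixed_size<T> is at least 32.
constexpr bool has_32_lanes = (abi::max_fixed_size<float>) >= 32;

// For a supported non-fixed_size ABI (here native<float>), a fixed_size
// tag with the same element count is supported as well.
using native_float = stdx::simd<float, abi::native<float>>;
constexpr int native_lanes = static_cast<int>(native_float::size());
using native_width_fixed = stdx::simd<float, abi::fixed_size<native_lanes>>;
constexpr std::size_t native_width_fixed_lanes = native_width_fixed::size();
```

The value of <code>native_lanes</code> depends on the target and compiler flags, so the sketch only compares it with the matching <code>fixed_size</code> width rather than assuming a particular number.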
250+
251+
<p>
252+
<ins>
253+
An implementation may define additional <em>extended ABI tag</em> types in the <code>std::experimental::parallelism_v2::simd_abi</code> namespace, to support other forms of data-parallel computation.
254+
</ins>
255+
</p>
256+
257+
<p>
258+
<ins>
259+
<code>compatible&lt;T&gt;</code> is an implementation-defined alias for an ABI tag. <cxx-note>The intent is to use the ABI tag producing the most efficient data-parallel execution for the element type <code>T</code> that ensures ABI compatibility between translation units on the target architecture.</cxx-note>
260+
261+
<br>
262+
<br>
263+
264+
<cxx-example>
265+
Consider a target architecture supporting the extended ABI tags <code>__simd128</code> and <code>__simd256</code>, where the <code>__simd256</code> type requires an optional ISA extension on said architecture. Also, the target architecture does not support <code>long double</code> with either ABI tag. The implementation therefore defines
266+
267+
<bl>
268+
<li>
269+
<ins>
270+
<code>compatible&lt;T&gt;</code> as an alias for <code>__simd128</code> for all vectorizable <code>T</code>, except <code>long double</code>, and
271+
</ins>
272+
</li>
273+
274+
<li>
275+
<ins>
276+
<code>compatible&lt;long double&gt;</code> as an alias for <code>scalar</code>.
277+
</ins>
278+
</li>
279+
</bl>
280+
</cxx-example>
281+
</ins>
282+
</p>
283+
284+
<p>
285+
<ins>
286+
<code>native&lt;T&gt;</code> is an implementation-defined alias for an ABI tag. <cxx-note>The intent is to use the ABI tag producing the most efficient data-parallel execution for the element type <code>T</code> that is supported on the currently targeted system. For target architectures without ISA extensions, the <code>native&lt;T&gt;</code> and <code>compatible&lt;T&gt;</code> aliases will likely be the same. For target architectures with ISA extensions, compiler flags may influence the <code>native&lt;T&gt;</code> alias while <code>compatible&lt;T&gt;</code> will be the same independent of such flags.</cxx-note>
287+
288+
<br>
289+
<br>
290+
291+
<cxx-example>
292+
Consider a target architecture supporting the extended ABI tags <code>__simd128</code> and <code>__simd256</code>, where hardware support for <code>__simd256</code> only exists for floating-point types. The implementation therefore defines <code>native&lt;T&gt;</code> as an alias for
293+
294+
<bl>
295+
<li>
296+
<ins>
297+
<code>__simd256</code> if <code>T</code> is a floating-point type, and
298+
</ins>
299+
</li>
300+
301+
<li>
302+
<ins>
303+
<code>__simd128</code> otherwise.
304+
</ins>
305+
</li>
306+
</bl>
307+
</cxx-example>
308+
</ins>
309+
</p>
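The split between <code>compatible&lt;T&gt;</code> and <code>native&lt;T&gt;</code> suggests one possible usage pattern: <code>compatible</code> for types that cross translation unit boundaries and <code>native</code> for code local to one translation unit. The sketch below assumes a toolchain providing <code>&lt;experimental/simd&gt;</code> (for example libstdc++ since GCC 11) and C++17; the alias names are hypothetical and the lane counts are target-dependent.

```cpp
#include <cstddef>
#include <experimental/simd>
#include <type_traits>

namespace stdx = std::experimental;
namespace abi = stdx::simd_abi;

// Intended for public interfaces: the tag should not change with
// ISA-related compiler flags on a given target architecture.
using api_float = stdx::simd<float, abi::compatible<float>>;

// Intended for code local to one translation unit: the tag may change
// with the ISA extensions enabled for this compilation.
using fast_float = stdx::simd<float, abi::native<float>>;

constexpr std::size_t api_lanes = api_float::size();
constexpr std::size_t fast_lanes = fast_float::size();

// True if the flags used for this compilation selected a different tag
// for native<float> than for compatible<float>.
constexpr bool native_differs_from_compatible =
    !std::is_same_v<abi::native<float>, abi::compatible<float>>;
```

Whether <code>native_differs_from_compatible</code> is true depends entirely on the implementation and the flags in use, so portable code should not rely on either outcome.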
310+
311+
<ins>
312+
<pre>
313+
314+
namespace std::experimental {
315+
inline namespace parallelism_v2 {
316+
namespace simd_abi {
317+
318+
template&lt;class T, size_t N&gt; struct deduce { using type = <em>see-below</em>; };
319+
}
320+
}
321+
}
322+
</pre>
323+
</ins>
324+
325+
<p>
326+
<ins>
327+
The member <code>type</code> is present if and only if
328+
329+
<bl>
330+
<li>
331+
<ins>
332+
<code>T</code> is a vectorizable type, and
333+
</ins>
334+
</li>
335+
336+
<li>
337+
<ins>
338+
<code>simd_abi::fixed_size&lt;N&gt;</code> is supported (see <cxx-ref to="parallel.simd.abi"></cxx-ref>).
339+
</ins>
340+
</li>
341+
</bl>
342+
</ins>
343+
</p>
344+
345+
<p>
346+
<ins>
347+
Where present, the member typedef <code>type</code> shall name an ABI tag type that satisfies
348+
</ins>
349+
350+
<bl>
351+
<li>
352+
<ins>
353+
<code>simd_size_v&lt;T, type&gt; == N</code>, and
354+
</ins>
355+
</li>
356+
357+
<li>
358+
<ins>
359+
<code>simd&lt;T, type&gt;</code> is default constructible (see <cxx-ref to="parallel.simd.overview"></cxx-ref>).
360+
</ins>
361+
</li>
362+
</bl>
363+
364+
<br>
365+
366+
<ins>
367+
If <code>N</code> is <code>1</code>, the member typedef <code>type</code> is <code>simd_abi::scalar</code>. Otherwise, if there are multiple ABI tag types that satisfy the constraints, the member typedef <code>type</code> is implementation-defined. <cxx-note>It is expected that extended ABI tags can produce better optimizations and thus are preferred over <code>simd_abi::fixed_size&lt;N&gt;</code>.</cxx-note>
368+
</ins>
369+
</p>
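The constraints on <code>deduce</code> can be illustrated with a small helper alias. This is a sketch under assumptions, not part of the proposed wording: it presumes a toolchain providing <code>&lt;experimental/simd&gt;</code> (for example libstdc++ since GCC 11) and C++17, and <code>deduced_simd</code> is a hypothetical name introduced only for this example.

```cpp
#include <cstddef>
#include <experimental/simd>
#include <type_traits>

namespace stdx = std::experimental;
namespace abi = stdx::simd_abi;

// Hypothetical helper: a simd type with exactly N elements whose ABI tag
// is chosen by the implementation through deduce.
template <class T, std::size_t N>
using deduced_simd = stdx::simd<T, typename abi::deduce<T, N>::type>;

// Whatever tag is chosen, the element count must equal the requested N.
constexpr std::size_t four_lanes = deduced_simd<float, 4>::size();
constexpr std::size_t one_lane = deduced_simd<int, 1>::size();

// Per the rule for N == 1, the deduced tag is expected to be scalar.
constexpr bool one_lane_is_scalar =
    std::is_same_v<typename abi::deduce<float, 1>::type, abi::scalar>;
```

For <code>N</code> greater than 1 the chosen tag is implementation-defined, so code should depend only on the element count, not on the identity of <code>type</code>.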
370+
371+
<p>
372+
<ins>
373+
The behavior of a program that adds specializations for <code>deduce</code> is undefined.
374+
</ins>
375+
</p>
376+
</cxx-section>
377+
204378
<cxx-section id="parallel.simd.overview">
205379
</cxx-section>
206380
</cxx-clause>
