Commit 760605f

Merge pull request itanium-cxx-abi#131 from rjmccall/dependent-mangling
Editorial improvements, and write a section about dependent mangling
2 parents f0e63ac + b625bb7 commit 760605f

2 files changed

+232
-71
lines changed

abi.html

Lines changed: 230 additions & 71 deletions
@@ -304,10 +304,12 @@ <h3><a href="#definitions"> 1.1 Definitions </a></h3>
304304
<dd>
305305
<p>
306306
In general, a type is considered a POD for the purposes of layout if
307-
it is a POD type (in the sense of ISO C++ [basic.types]). However, a
307+
it is a POD type (in the sense of ISO C++
308+
<span class=cxxref>[basic.types]</span>). However, a
308309
type is not considered to be a POD for the purpose of layout if it is:
309310
<ul>
310-
<li>a POD-struct or POD-union (in the sense of ISO C++ [class]) with a
311+
<li>a POD-struct or POD-union (in the sense of ISO C++
312+
<span class=cxxref>[class]</span>) with a
311313
bit-field whose declared width is wider than the declared type
312314
of the bit-field, or
313315
<li>an array type whose element type is not a POD for the purpose of layout, or
@@ -4319,22 +4321,34 @@ <h4><a href="#mangling-general"> 5.1.1 General </a></h4>
43194321

43204322
<p>
43214323
This section specifies the <i>mangling</i>, i.e. encoding,
4322-
of external names
4323-
(external in the sense of being visible outside the object file where
4324-
they occur).
4325-
The encoding is formalized as a derivation grammar along with the
4326-
explanatory text,
4327-
in a modified BNF with the following conventions:
4324+
of external names (external in the sense of being visible
4325+
outside the object file where they occur). The encoding is
4326+
formalized as a derivation grammar along with the explanatory
4327+
text, in a modified BNF with the following conventions:
43284328
<ul>
4329-
<li> Non-terminals are delimited by diamond braces: "&lt;>".
4330-
<li> Italics in non-terminals are modifiers to be ignored,
4331-
e.g. &lt;<i>function</i> <a href="#mangle.name">name</a>&gt; is the same as &lt;<a href="#mangle.name">name</a>&gt;.
4329+
<li> Alternatives are given on separate lines.
4330+
<li> References to non-terminals are delimited by angle brackets
4331+
<code class=mangle>&lt;&gt;</code>.
4332+
<li> Italicized text in references to non-terminals describes or
4333+
limits what is mangled by the reference, but it does not
4334+
affect the formal grammar. For example,
4335+
<code class=mangle>&lt;<i>function</i> <a href="#mangle.name">name</a>&gt;</code> is the same as
4336+
<code class=mangle>&lt;<a href="#mangle.name">name</a>&gt;</code>,
4337+
but it means this derivation rule should be used only for
4338+
the names of functions.
43324339
<li> Spaces are to be ignored.
4333-
<li> Text beginning with '#' is comments, to be ignored.
4334-
<li> Tokens in square brackets "[]" are optional.
4335-
<li> Tokens are placed in parentheses "()" for grouping purposes.
4336-
<li> '*' repeats the preceding item 0 or more times.
4337-
<li> '+' repeats the preceding item 1 or more times.
4340+
<li> Text beginning with <code class=mangle>#</code> is a comment,
4341+
to be ignored until the end of the line. Comments are often
4342+
used to describe in what case an alternative should be used.
4343+
<li> Sequences of items in square brackets
4344+
<code class=mangle>[]</code> are optional.
4345+
<li> Sequences of items in parentheses
4346+
<code class=mangle>()</code> are groups for the purposes of
4347+
<code class=mangle>*</code> and <code class=mangle>+</code>.
4348+
<li> An asterisk <code class=mangle>*</code> allows the preceding
4349+
item to repeat 0 or more times.
4350+
<li> A plus sign <code class=mangle>+</code> allows the preceding
4351+
item to repeat 1 or more times.
43384352
<li> All other characters are terminals, representing themselves.
43394353
</ul>
43404354

@@ -4351,9 +4365,10 @@ <h4><a href="#mangling-general"> 5.1.1 General </a></h4>
43514365
or <code>Type?</code> for an unknown data type.
43524366

43534367
<p>
4354-
Mangled names containing '<tt>$</tt>' or '<tt>.</tt>' are reserved for
4355-
private implementation use. Names produced using such extensions are
4356-
inherently non-portable and should be given internal linkage where possible.
4368+
Mangled names containing <code class=mangle>$</code> or
4369+
<code class=mangle>.</code> are reserved for private implementation
4370+
use. Names produced using such extensions are inherently non-portable
4371+
and should be given internal linkage where possible.
43574372

43584373
<p>
43594374
<a name="mangling-structure">
@@ -4389,6 +4404,101 @@ <h4><a href="#mangling-structure"> 5.1.2 General Structure </a></h4>
43894404
prior to the first period. There is no restriction on the characters
43904405
that may be used in the suffix following the period.
43914406

4407+
<p>
4408+
ABI mangling is designed to ensure that entities receive the
4409+
same mangling if and only if they are the same entity according
4410+
to the C++ standard's one-definition rule (ODR) and the
4411+
various rules for declaration matching (such as
4412+
<span class=cxxref>[over.dcl]</span> and
4413+
<span class=cxxref>[temp.over]</span>). Those rules are quite
4414+
complex, and they dictate the results of mangling, and so it
4415+
should not be surprising that the mangling rules are also complex.
4416+
The ABI must be closely involved with the evolution of those
4417+
language rules to ensure that they remain implementable with
4418+
mangling. When the rules say that an ODR violation has
4419+
undefined behavior, that is often because it is impractical to
4420+
ensure that the entities involved will have different manglings.
4421+
Similarly, when the rules forbid certain constructs from the
4422+
signature of a declaration, that is often because that construct
4423+
would create unreasonable problems for mangling.
4424+
4425+
<p>
4426+
Mangling must sometimes be able to distinguish entities that
4427+
are not equivalent under the ODR and declaration-matching
4428+
rules. This is true even if the entities would not be
4429+
distinguishable by C++ code because, say, every name lookup
4430+
which included both of them would be ambiguous. For example,
4431+
different translation units might declare similar but not
4432+
equivalent function templates in the same namespace:
4433+
4434+
<pre>// a.cpp:
4435+
template &lt;int&gt; void foo() {}
4436+
template &lt;&gt; void foo&lt;0&gt;();
4437+
4438+
// b.cpp:
4439+
template &lt;long&gt; void foo() {}
4440+
template &lt;&gt; void foo&lt;0&gt;();</pre>
4441+
4442+
<p>
4443+
The C++ standard grants implementations broad flexibility to
4444+
ignore certain kinds of differences. For example, the rules
4445+
in <span class=cxxref>[temp.over.link]</span> for
4446+
functionally-equivalent function templates could be used to
4447+
shorten manglings in certain cases where instantiation-dependence
4448+
provably has no effect. This ABI generally does not take
4449+
advantage of that flexibility.
4450+
4451+
<a name="mangling.dependent">
4452+
<h5><a href="#mangling.dependent">Dependent constructs in templates</a></h5>
4453+
4454+
<p>
4455+
It is sometimes necessary to mangle unresolved and uninstantiated
4456+
language constructs such as types and expressions that appear
4457+
within templates. This accounts for a lot of the complexity of
4458+
entity mangling in this ABI.
4459+
4460+
<p>
4461+
In many places, the mangling grammar formally allows a single
4462+
construct to be mangled in one of several different ways.
4463+
Usually there is one production which allows a fully-resolved
4464+
value or entity reference, and there is another production that
4465+
allows an expression or unresolved entity reference. As an
4466+
example, this can be clearly seen in the mangling for
4467+
<a href="#mangle.array-type">array types</a>, which gives
4468+
one mangling for a constant bound and another for an expression.
4469+
4470+
<p>
4471+
There are two reasons for this. First, manglings using the
4472+
fully-resolved case are often significantly more compact.
4473+
More importantly, though, the language often treats dependent
4474+
and non-dependent constructs differently. For example,
4475+
<span class=cxxref>[temp.over.link]</span> gives rules for
4476+
when two expressions that involve template parameters are
4477+
considered equivalent, and those rules are reflected in
4478+
this ABI's expression mangling rules. Conversely, expressions
4479+
that don't involve template parameters but are used in
4480+
constant-evaluated contexts (such as an array length) are
4481+
considered to be equivalent if and only if they resolve
4482+
to the same value. Mangling a non-dependent expression using
4483+
its expression structure could incorrectly produce different
4484+
manglings for different expressions that resolve to the same
4485+
value, and it could incorrectly produce the same mangling for
4486+
expressions that resolve to different values but happen to be
4487+
spelled the same.
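<p>
The two failure modes described above can be observed at the language level, independently of any particular mangler. The following is a minimal sketch; the namespaces <code>a</code> and <code>b</code> and the two constant names are hypothetical and exist only for illustration.

```cpp
#include <type_traits>

// Differently spelled, non-dependent bounds that resolve to the same value
// name the same array type, so they need the same mangling.
constexpr bool kSameValueSameType =
    std::is_same<int[2 + 1], int[3]>::value;
static_assert(kSameValueSameType, "int[2 + 1] and int[3] are one type");

// Identically spelled bounds that resolve to different values name
// different array types, so they need different manglings.
namespace a { constexpr int N = 2; using Array = int[N]; }
namespace b { constexpr int N = 3; using Array = int[N]; }
constexpr bool kSameSpellingDifferentType =
    !std::is_same<a::Array, b::Array>::value;
static_assert(kSameSpellingDifferentType, "int[N] differs when N differs");
```

Both assertions hold in any conforming compiler, which is why a non-dependent bound has to be mangled by its value rather than by its spelling.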
4488+
4489+
<p>
4490+
It is therefore important to use the right production given
4491+
the dependence of the construct in question. The standard
4492+
defines several different kinds of dependence, such as
4493+
<i>value dependence</i> and <i>type dependence</i>. In
4494+
general, the rule that should be used in mangling is
4495+
<a href="#instantiation-dependent"><i>instantiation
4496+
dependence</i></a>: if a construct is instantiation-dependent,
4497+
it should use the general production, and otherwise it
4498+
should use the narrow production. The grammar below will
4499+
state clearly when certain productions are only for
4500+
instantiation-dependent cases.
4501+
43924502
<a name="mangle.anonymous">
43934503
<h5><a href="#mangling.anonymous">Anonymous entities</a></h5>
43944504

@@ -4966,18 +5076,20 @@ <h6>Known exceptions to the extended qualifier rules</h6>
49665076

49675077
<ul compact>
49685078
<li>
4969-
The GNU <code>address_space(N)</code> qualifier is mangled using the
4970-
name <code>AS</code>. For historical reasons, if the argument expression
4971-
is not instantiation-dependent, its value is incorporated directly into
4972-
the &lt;<a href="#mangle.source-name">source-name</a>&gt; of the qualifier;
4973-
otherwise it is encoded as a qualifier argument.
5079+
The GNU <code>address_space(N)</code> qualifier is mangled
5080+
using the name <code>AS</code>. For historical reasons, if
5081+
the argument expression is not
5082+
<a href="#instantiation-dependent">instantiation-dependent</a>,
5083+
its value is incorporated directly into the
5084+
&lt;<a href="#mangle.source-name">source-name</a>&gt; of the
5085+
qualifier; otherwise it is encoded as a qualifier argument.
49745086

49755087
<p>
49765088
For example, <code>int __attribute__((address_space(3))) *</code> is
49775089
encoded as <code>PU3AS3i</code>, but
4978-
<code>int __attribute__((address_space(K))) *</code> (given that
4979-
<code>K</code> is a dependent reference to the first template parameter) is
4980-
encoded as <code>PU2ASIT_Ei</code>.
5090+
<code>int __attribute__((address_space(K))) *</code>
5091+
(in which <code>K</code> is a reference to the first
5092+
template parameter) is encoded as <code>PU2ASIT_Ei</code>.
49815093

49825094
</ul>
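<p>
The two <code>address_space</code> encodings in the example above can be reproduced with a small string-building sketch. It assumes the vendor-qualifier form <code>U &lt;source-name&gt; [&lt;template-args&gt;]</code> from the extended qualifier rules, which are not part of this diff. The helper names are hypothetical, and the dependent argument is passed in already mangled.

```cpp
#include <string>

// <source-name> ::= <positive length number> <identifier>
std::string source_name(const std::string& identifier) {
    return std::to_string(identifier.size()) + identifier;
}

// Hypothetical helper: non-instantiation-dependent address_space(N).
// The value is folded directly into the qualifier's source-name.
std::string address_space_constant(unsigned n) {
    return "U" + source_name("AS" + std::to_string(n));
}

// Hypothetical helper: instantiation-dependent argument. The argument,
// already mangled by the caller, is wrapped as a qualifier argument I ... E.
std::string address_space_dependent(const std::string& mangled_argument) {
    return "U" + source_name("AS") + "I" + mangled_argument + "E";
}
```

Prefixing <code>P</code> (pointer) and appending <code>i</code> (int) to the results for <code>3</code> and <code>T_</code> yields <code>PU3AS3i</code> and <code>PU2ASIT_Ei</code>, the two encodings quoted above.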
49835095

@@ -5218,25 +5330,26 @@ <h5><a href="#mangling.named">5.1.5.5 Class, union, and enum types</a></h5>
52185330
<h5><a href="#mangle.array-type">5.1.5.6 Array types</a></h5>
52195331

52205332
<p>
5221-
Array types encode the dimension (number of elements) and the element type.
5222-
Note that "array" parameters to functions are encoded as pointer types.
5223-
For variable length arrays (C99 VLAs),
5224-
the dimension (but not the '_' separator) is omitted.
5333+
Array types encode their array bound and element type. Note that
5334+
"array" parameters to functions are encoded as pointer types.
5335+
The array bound (but not the <code class=mangle>_</code> separator)
5336+
is omitted for incomplete array types (e.g. <code>int[]</code>)
5337+
and C99 variable-length array types.
52255338

52265339
<pre><font color=blue><code>
5227-
&lt;array-type&gt; ::= A &lt;<i>positive dimension</i> <a href="#mangle.number">number</a>&gt; _ &lt;<i>element</i> <a href="#mangle.type">type</a>&gt;
5228-
::= A [&lt;<i>dimension</i> <a href="#mangle.expression">expression</a>&gt;] _ &lt;<i>element</i> <a href="#mangle.type">type</a>&gt;
5229-
5340+
&lt;array-type&gt; ::= A [&lt;<i>array bound</i> <a href="#mangle.number">number</a>&gt;] _ &lt;<i>element</i> <a href="#mangle.type">type</a>&gt;
5341+
::= A &lt;<i>instantiation-dependent array bound</i> <a href="#mangle.expression">expression</a>&gt; _ &lt;<i>element</i> <a href="#mangle.type">type</a>&gt;
52305342
</pre></font></code>
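<p>
As a rough illustration of the first production only, the sketch below assembles an array-type mangling from a constant bound and an element type that has already been mangled. The function names are hypothetical, and the instantiation-dependent production is deliberately not modelled.

```cpp
#include <string>

// A <array bound number> _ <element type>, for a known constant bound.
std::string mangle_bounded_array(unsigned long bound,
                                 const std::string& mangled_element) {
    return "A" + std::to_string(bound) + "_" + mangled_element;
}

// A _ <element type>, for incomplete array types such as int[] and for
// C99 variable-length arrays: the bound is omitted, the separator is kept.
std::string mangle_unbounded_array(const std::string& mangled_element) {
    return "A_" + mangled_element;
}
```

With <code>i</code> as the mangling of <code>int</code>, <code>int[3]</code> becomes <code>A3_i</code>, <code>int[]</code> becomes <code>A_i</code>, and <code>int[2][3]</code> nests as <code>A2_A3_i</code>.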
52315343

52325344
<p>
5233-
When the dimension is an expression involving template parameters,
5234-
the second production is used.
5235-
Thus, the declarations:
5345+
The second rule is used when the array bound is an
5346+
<a href="#instantiation-dependent">instantiation-dependent</a>
5347+
expression. For example:
52365348
<pre><code> template&lt;int I&gt; void foo (int (&amp;)[I + 1]) { }
5349+
5350+
// Mangled as _Z3fooILi2EEvRAplT_Li1E_i
52375351
template void foo&lt;2&gt; (int (&amp;)[3]);
52385352
</pre></code>
5239-
produce the mangled name "<code>_Z3fooILi2EEvRAplT_Li1E_i</code>".
52405353

52415354
<a name="mangle.pointer-to-member-type">
52425355
<h5><a href="#mangle.pointer-to-member-type">5.1.5.7 Pointer-to-member types</a></h5>
@@ -5254,31 +5367,67 @@ <h5><a href="#mangle.pointer-to-member-type">5.1.5.7 Pointer-to-member types</a>
52545367
<h5><a href="#mangle.template-param">5.1.5.8 Template parameters</a></h5>
52555368

52565369
<p>
5257-
When function and member function template instantiations reference
5258-
the template parameters in their parameter or result types,
5259-
the template parameter number is encoded,
5260-
with the sequence T_, T0_, ... For example:
5370+
A reference to a template parameter is mangled using the index
5371+
of the parameter, with a special mangling for the first parameter.
5372+
The sequence of parameters is therefore <code class=mangle>T_</code>,
5373+
<code class=mangle>T0_</code>, <code class=mangle>T1_</code>, and so on.
5374+
5375+
<pre><code><font color=blue>
5376+
&lt;template-param&gt; ::= T_ # first template parameter
5377+
::= T &lt;<i>parameter-2 non-negative</i> <a href="#mangle.number">number</a>&gt; _
5378+
&lt;<a name="mangle.template-template-param">template-template-param</a>&gt; ::= &lt;<a href="#mangle.template-param">template-param</a>&gt;
5379+
::= &lt;<a href="#mangle.substitution">substitution</a>&gt;
5380+
</font></code></pre>
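<p>
The off-by-one in the numbering is easy to get wrong, so here is a minimal sketch of the <code>&lt;template-param&gt;</code> rule as given in the grammar above. The function name is hypothetical, and <code>index</code> is assumed to be the zero-based position of the parameter.

```cpp
#include <string>

// <template-param> ::= T_                           (first parameter)
//                  ::= T <parameter-2 number> _     (all later parameters)
// The number is written in decimal and is one less than the zero-based
// index, so indices 0, 1, 2 map to T_, T0_, T1_.
std::string mangle_template_param(unsigned index) {
    if (index == 0) {
        return "T_";
    }
    return "T" + std::to_string(index - 1) + "_";
}
```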
5381+
5382+
<p>
5383+
For example:
52615384
<pre><code>
5262-
template&lt;class T> void f(T) {}
5385+
template&lt;class T&gt; void f(T) {}
5386+
5387+
// Mangled as "_Z1fIiEvT_"
52635388
template void f(int);
5264-
// Mangled as "_Z1fIiEvT_".
5389+
52655390
</code></pre>
52665391

5267-
Class template parameter references are mangled using the standard
5268-
mangling for the actual parameter type,
5269-
typically a substitution.
5270-
Note that a template parameter reference is a substitution candidate,
5271-
distinct from the type (or other substitutible entity)
5272-
that is the actual parameter.
5273-
</p>
5392+
<p>
5393+
Note that a template parameter reference is a
5394+
<a href="#mangling-compression">substitution candidate</a>.
5395+
As a substitution, it is treated as distinct from the actual
5396+
template argument, including in recursive positions.
5397+
For example, in the mangling of the following function template
5398+
specialization, the first occurrence of <code>T*</code> is not
5399+
substituted despite being known (in this specialization) to be
5400+
the same type as <code>int*</code>, and the second occurrence
5401+
is substituted with the substitution derived from the first
5402+
occurrence, not that from the occurrence of <code>int*</code>.
52745403

5275-
<pre><code><font color=blue>
5276-
&lt;template-param&gt; ::= T_ # first template parameter
5277-
::= T &lt;<i>parameter-2 non-negative</i> <a href="#mangle.number">number</a>&gt; _
5278-
&lt;<a name="mangle.template-template-param">template-template-param</a>&gt; ::= &lt;<a href="#mangle.template-param">template-param</a>&gt;
5279-
::= &lt;<a href="#mangle.substitution">substitution</a>&gt;
5404+
<pre><code>
5405+
template&lt;class T&gt; void f(int*, T*, T*) {}
52805406

5281-
</font></code></pre>
5407+
// Mangled as "_Z1fIiEvPiPT_S2_"
5408+
template void f(int*, int*, int*);
5409+
5410+
</code></pre>
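<p>
To see where <code>S2_</code> comes from, it helps to number the substitution candidates in the order they are mangled. Assuming the compression rules from the substitution section (not shown in this diff), the candidates here are the template name <code>f</code>, then <code>int*</code>, then <code>T</code>, then <code>T*</code>, at indices 0 through 3. The sketch below encodes such an index, assuming the <code>S_</code>, <code>S0_</code>, <code>S1_</code>, ... sequence with a base-36 sequence number; the function name is hypothetical.

```cpp
#include <string>

// <substitution> ::= S_               (candidate 0)
//                ::= S <seq-id> _     (candidate n > 0; seq-id is n - 1
//                                      in base 36 using 0-9 then A-Z)
std::string mangle_substitution(unsigned index) {
    if (index == 0) {
        return "S_";
    }
    const char* digits = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ";
    unsigned n = index - 1;
    std::string seq_id;
    do {
        seq_id.insert(seq_id.begin(), digits[n % 36]);
        n /= 36;
    } while (n != 0);
    return "S" + seq_id + "_";
}
```

Under that numbering, candidate 3 (<code>T*</code>) encodes as <code>S2_</code>, whereas candidate 1 (<code>int*</code>) would have encoded as <code>S0_</code>.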
5411+
5412+
<p>
5413+
Typically, only references to function template parameters occurring
5414+
within the dependent signature of the template are mangled this way.
5415+
In other contexts, template instantiation replaces references
5416+
to template parameters with the actual template arguments, and mangling
5417+
should mangle such references exactly as if they were that template
5418+
argument. For example:
5419+
5420+
<pre><code>
5421+
template&lt;class T&gt; class A {
5422+
template&lt;class U&gt; void f(T, U) {}
5423+
};
5424+
5425+
// Mangled as "_ZN1AIiE1fIfEEviT_"
5426+
template void A&lt;int&gt;::f(int, float);
5427+
5428+
</code></pre>
5429+
5430+
</p>
52825431

52835432
<a name="mangle.function-param">
52845433
<h5><a href="#mangle.function-param">5.1.5.9 Function parameter references</a></h5>
@@ -5358,18 +5507,29 @@ <h5><a href="#mangle.template-arg">5.1.5.10 Template Arguments</a></h5>
53585507
<h4><a href="#expressions">5.1.6 Expressions</a></h4>
53595508

53605509
<p>
5361-
Expressions must be mangled in several contexts. When mangling the
5362-
name of a specialized template, non-type template arguments are
5363-
mangled as an expression; these expressions are typically very simple.
5364-
However, when mangling the signature of a function template, any
5365-
<a href="#instantiation-dependent">instantiation-dependent</a> expressions
5366-
(e.g. in an array bound,
5367-
<code>decltype</code> type, or template argument) must be mangled in
5368-
order to properly distinguish templates that are different under the
5369-
ODR and that can legally be differentiated by substitution failures.
5370-
Therefore, nearly the entire expression grammar of C++ is subject
5371-
to mangling, with only a few exceptions (like lambdas) that are
5372-
explicitly disallowed in function signatures.
5510+
Expressions must be mangled in several contexts.
5511+
5512+
<p>
5513+
When mangling the name of a specialized template, non-type
5514+
template arguments are mangled as expressions. These
5515+
expressions are typically very simple, and they do not
5516+
necessarily reflect any argument expression that was used
5517+
in source. See the section on mangling
5518+
<a href="#mangle.template-args">template arguments</a> for
5519+
more detail.
5520+
5521+
<p>
5522+
More generally, when mangling the signature of a function
5523+
template, any
5524+
<a href="#instantiation-dependent">instantiation-dependent</a>
5525+
expressions (e.g. in an array bound, <code>decltype</code>,
5526+
or template argument) must be mangled in order to properly
5527+
distinguish templates that are different under the ODR.
5528+
See the section on <a href="#mangling.dependent">dependent
5529+
mangling</a>. As a result, nearly the entire expression
5530+
grammar of C++ is subject to mangling, with only a few
5531+
exceptions (like lambdas) that are explicitly disallowed
5532+
in function signatures.
53735533

53745534
<p>
53755535
In general, expression manglings reflect a prefix traversal of the
@@ -5380,8 +5540,7 @@ <h4><a href="#expressions">5.1.6 Expressions</a></h4>
53805540
must be mangled differently because the parentheses act to suppress
53815541
argument-dependent lookup.) Unless explicitly stated otherwise, the
53825542
expression is mangled without constant folding or other
5383-
simplification. Therefore this mangling is quite similar to the
5384-
source token stream. (C++ Standard reference 14.5.5.1p5.)
5543+
simplification.
53855544

53865545
<p>
53875546
Each expression mangling begins with a code (typically two letters)
