
Commit 5b41ac1

DougGregor and jckarter committed
[ABI] Introduce indirect symbolic references to context descriptors.
Extend the mangling of symbolic references to also include indirect symbolic references. This allows mangled names to refer to context descriptors (both type and protocol) not in the current source file. For now, only permit indirect symbolic references within the current module, because remote mirrors (among other things) are unable to handle relocations.

Co-authored-by: Joe Groff <[email protected]>
1 parent 8d3da66 commit 5b41ac1

23 files changed, +610 -233 lines changed

docs/ABI/Mangling.rst

Lines changed: 61 additions & 1 deletion
@@ -46,6 +46,48 @@ mangled name will start with the module name (after the ``_S``).
 In the following, productions which are only _part_ of an operator, are
 named with uppercase letters.

+Symbolic references
+~~~~~~~~~~~~~~~~~~~
+
+The Swift compiler emits mangled names into binary images to encode
+references to types for runtime instantiation and reflection. In a binary,
+these mangled names may embed pointers to runtime data
+structures in order to more efficiently represent locally-defined types.
+We call these pointers **symbolic references**.
+These references will be introduced by a control character in the range
+`\x01` ... `\x1F`, which indicates the kind of symbolic reference, followed by
+some number of arbitrary bytes *which may include null bytes*. Code that
+processes mangled names out of Swift binaries needs to be aware of symbolic
+references in order to properly terminate strings; a null terminator may be
+part of a symbolic reference.
+
+::
+
+  symbolic-reference ::= [\x01-\x17] .{4} // Relative symbolic reference
+   #if sizeof(void*) == 8
+     symbolic-reference ::= [\x18-\x1F] .{8} // Absolute symbolic reference
+   #elif sizeof(void*) == 4
+     symbolic-reference ::= [\x18-\x1F] .{4} // Absolute symbolic reference
+   #endif
+
+Symbolic references are only valid in compiler-emitted metadata structures
+and must only appear in read-only parts of a binary image. APIs and tools
+that interpret Swift mangled names from potentially uncontrolled inputs must
+refuse to interpret symbolic references.
+
+The following symbolic reference kinds are currently implemented:
+
+::
+
+  {any-generic-type, protocol} ::= '\x01' .{4} // Reference points directly to context descriptor
+  {any-generic-type, protocol} ::= '\x02' .{4} // Reference points indirectly to context descriptor
+  // The grammatical role of the symbolic reference is determined by the
+  // kind of context descriptor referenced
+
+  protocol-conformance-ref ::= '\x03' .{4} // Reference points directly to protocol conformance descriptor (NOT IMPLEMENTED)
+  protocol-conformance-ref ::= '\x04' .{4} // Reference points indirectly to protocol conformance descriptor (NOT IMPLEMENTED)
+
+
 Globals
 ~~~~~~~

@@ -553,18 +595,36 @@ Generics

 ::

-  protocol-conformance ::= type protocol module generic-signature?
+  protocol-conformance-context ::= protocol module generic-signature?
+
+  protocol-conformance ::= type protocol-conformance-context

 ``<protocol-conformance>`` refers to a type's conformance to a protocol. The
 named module is the one containing the extension or type declaration that
 declared the conformance.

 ::

+  protocol-conformance ::= type protocol
+
+If ``type`` is a generic parameter or associated type of one, then no module
+is mangled, because the conformance must be resolved from the generic
+environment.
+
   protocol-conformance ::= context identifier protocol identifier generic-signature? // Property behavior conformance

 Property behaviors are implemented using private protocol conformances.

+::
+
+  concrete-protocol-conformance ::= type protocol-conformance-ref
+  protocol-conformance-ref ::= protocol module?
+
+A compact representation used to represent mangled protocol conformance witness
+arguments at runtime. The ``module`` is only specified for conformances that
+are "retroactive", meaning that the context in which the conformance is defined
+is in neither the protocol or type module.
+
 ::

   generic-signature ::= requirement* 'l' // one generic parameter
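
Review note on the new "Symbolic references" section: because a symbolic reference payload may contain null bytes, a consumer cannot use `strlen` to find the end of a mangled name. A minimal sketch of a grammar-aware scan follows. It assumes trusted, compiler-emitted input, and `mangledNameLength` is a hypothetical helper name chosen for illustration, not a function in the Swift runtime.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not part of the Swift runtime): returns the number of
// bytes in the mangled name starting at `name`, excluding the terminating
// null. Assumes the bytes come from compiler-emitted, read-only metadata.
inline std::size_t mangledNameLength(const char *name) {
  assert(name && "mangled name must be non-null");
  const char *p = name;
  while (*p != '\0') {
    auto c = static_cast<unsigned char>(*p);
    if (c >= 0x01 && c <= 0x17) {
      // Relative symbolic reference: control byte plus a 4-byte payload,
      // which may itself contain null bytes.
      p += 1 + 4;
    } else if (c >= 0x18 && c <= 0x1F) {
      // Absolute symbolic reference: control byte plus a pointer-sized payload.
      p += 1 + sizeof(void *);
    } else {
      // Ordinary mangling character.
      ++p;
    }
  }
  return static_cast<std::size_t>(p - name);
}
```

For input that may be attacker-controlled, the rule stated in the documentation still applies: such a scanner should refuse the name when it sees a control character rather than skip over the payload.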

include/swift/AST/ASTMangler.h

Lines changed: 17 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -46,12 +46,22 @@ class ASTMangler : public Mangler {
4646
/// If disabled, it is an error to try to mangle such an entity.
4747
bool AllowNamelessEntities = false;
4848

49-
/// If nonnull, provides a callback to encode symbolic references to
50-
/// type contexts.
51-
std::function<bool (const DeclContext *Context)>
52-
CanSymbolicReference;
53-
54-
std::vector<std::pair<const DeclContext *, unsigned>> SymbolicReferences;
49+
/// If enabled, some entities will be emitted as symbolic reference
50+
/// placeholders. The offsets of these references will be stored in the
51+
/// `SymbolicReferences` vector, and it is up to the consumer of the mangling
52+
/// to fill these in.
53+
bool AllowSymbolicReferences = false;
54+
55+
public:
56+
using SymbolicReferent = llvm::PointerUnion<const NominalTypeDecl *,
57+
const ProtocolConformance *>;
58+
protected:
59+
60+
/// If set, the mangler calls this function to determine whether to symbolic
61+
/// reference a given entity. Defaults to always returning true.
62+
std::function<bool (SymbolicReferent)> CanSymbolicReference;
63+
64+
std::vector<std::pair<SymbolicReferent, unsigned>> SymbolicReferences;
5565

5666
public:
5767
enum class SymbolKind {
@@ -292,7 +302,7 @@ class ASTMangler : public Mangler {
292302

293303
void appendOpParamForLayoutConstraint(LayoutConstraint Layout);
294304

295-
void appendSymbolicReference(const DeclContext *context);
305+
void appendSymbolicReference(SymbolicReferent referent);
296306

297307
std::string mangleTypeWithoutPrefix(Type type) {
298308
appendType(type);

include/swift/Demangling/Demangle.h

Lines changed: 7 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -33,6 +33,8 @@ namespace llvm {
3333
namespace swift {
3434
namespace Demangle {
3535

36+
enum class SymbolicReferenceKind : uint8_t;
37+
3638
struct DemangleOptions {
3739
bool SynthesizeSugarOnTypes = false;
3840
bool DisplayDebuggerGeneratedModule = true;
@@ -473,7 +475,9 @@ void mangleIdentifier(const char *data, size_t length,
473475
/// This should always round-trip perfectly with demangleSymbolAsNode.
474476
std::string mangleNode(const NodePointer &root);
475477

476-
using SymbolicResolver = llvm::function_ref<Demangle::NodePointer (const void *)>;
478+
using SymbolicResolver =
479+
llvm::function_ref<Demangle::NodePointer (SymbolicReferenceKind,
480+
const void *)>;
477481

478482
/// \brief Remangle a demangled parse tree, using a callback to resolve
479483
/// symbolic references.
@@ -537,6 +541,8 @@ class DemanglerPrinter {
537541
return std::move(*this << std::forward<T>(x));
538542
}
539543

544+
DemanglerPrinter &writeHex(unsigned long long n) &;
545+
540546
std::string &&str() && { return std::move(Stream); }
541547

542548
llvm::StringRef getStringRef() const { return Stream; }

include/swift/Demangling/DemangleNodes.def

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -143,6 +143,7 @@ NODE(PrefixOperator)
143143
NODE(PrivateDeclName)
144144
NODE(PropertyDescriptor)
145145
CONTEXT_NODE(Protocol)
146+
CONTEXT_NODE(ProtocolSymbolicReference)
146147
NODE(ProtocolConformance)
147148
NODE(ProtocolDescriptor)
148149
NODE(ProtocolConformanceDescriptor)
@@ -172,13 +173,13 @@ NODE(SpecializationIsFragile)
172173
CONTEXT_NODE(Static)
173174
CONTEXT_NODE(Structure)
174175
CONTEXT_NODE(Subscript)
175-
CONTEXT_NODE(SymbolicReference)
176176
NODE(Suffix)
177177
NODE(ThinFunctionType)
178178
NODE(Tuple)
179179
NODE(TupleElement)
180180
NODE(TupleElementName)
181181
NODE(Type)
182+
CONTEXT_NODE(TypeSymbolicReference)
182183
CONTEXT_NODE(TypeAlias)
183184
NODE(TypeList)
184185
NODE(TypeMangling)
@@ -192,7 +193,6 @@ NODE(TypeMetadataLazyCache)
192193
NODE(UncurriedFunctionType)
193194
#define REF_STORAGE(Name, ...) NODE(Name)
194195
#include "swift/AST/ReferenceStorage.def"
195-
CONTEXT_NODE(UnresolvedSymbolicReference)
196196
CONTEXT_NODE(UnsafeAddressor)
197197
CONTEXT_NODE(UnsafeMutableAddressor)
198198
NODE(ValueWitness)

include/swift/Demangling/Demangler.h

Lines changed: 15 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -280,6 +280,17 @@ class CharVector : public Vector<char> {
280280
}
281281
};
282282

283+
/// Kinds of symbolic reference supported.
284+
enum class SymbolicReferenceKind : uint8_t {
285+
/// A symbolic reference to a context descriptor, representing the
286+
/// (unapplied generic) context.
287+
Context,
288+
};
289+
290+
using SymbolicReferenceResolver_t = NodePointer (SymbolicReferenceKind,
291+
Directness,
292+
int32_t, const void *);
293+
283294
/// The demangler.
284295
///
285296
/// It de-mangles a string and it also owns the returned node-tree. This means
@@ -301,7 +312,7 @@ class Demangler : public NodeFactory {
301312
StringRef Words[MaxNumWords];
302313
int NumWords = 0;
303314

304-
std::function<NodePointer (int32_t, const void *)> SymbolicReferenceResolver;
315+
std::function<SymbolicReferenceResolver_t> SymbolicReferenceResolver;
305316

306317
bool nextIf(StringRef str) {
307318
if (!Text.substr(Pos).startswith(str)) return false;
@@ -472,7 +483,8 @@ class Demangler : public NodeFactory {
472483

473484
NodePointer demangleObjCTypeName();
474485
NodePointer demangleTypeMangling();
475-
NodePointer demangleSymbolicReference(const void *at);
486+
NodePointer demangleSymbolicReference(unsigned char rawKind,
487+
const void *at);
476488

477489
void dump();
478490

@@ -483,7 +495,7 @@ class Demangler : public NodeFactory {
483495

484496
/// Install a resolver for symbolic references in a mangled string.
485497
void setSymbolicReferenceResolver(
486-
std::function<NodePointer (int32_t, const void*)> resolver) {
498+
std::function<SymbolicReferenceResolver_t> resolver) {
487499
SymbolicReferenceResolver = resolver;
488500
}
489501

include/swift/Demangling/TypeDecoder.h

Lines changed: 10 additions & 9 deletions
Original file line numberDiff line numberDiff line change
@@ -114,7 +114,7 @@ class TypeDecoder {
114114
case NodeKind::Enum:
115115
case NodeKind::Structure:
116116
case NodeKind::TypeAlias: // This can show up for imported Clang decls.
117-
case NodeKind::SymbolicReference:
117+
case NodeKind::TypeSymbolicReference:
118118
{
119119
BuiltNominalTypeDecl typeDecl = BuiltNominalTypeDecl();
120120
BuiltType parent = BuiltType();
@@ -228,7 +228,8 @@ class TypeDecoder {
228228
IsClassBound);
229229
}
230230

231-
case NodeKind::Protocol: {
231+
case NodeKind::Protocol:
232+
case NodeKind::ProtocolSymbolicReference: {
232233
if (auto Proto = decodeMangledProtocolType(Node)) {
233234
return Builder.createProtocolCompositionType(Proto, BuiltType(),
234235
/*IsClassBound=*/false);
@@ -473,14 +474,14 @@ class TypeDecoder {
473474
}
474475

475476
private:
476-
bool decodeMangledNominalType(const Demangle::NodePointer &node,
477+
bool decodeMangledNominalType(Demangle::NodePointer node,
477478
BuiltNominalTypeDecl &typeDecl,
478479
BuiltType &parent) {
479480
if (node->getKind() == NodeKind::Type)
480481
return decodeMangledNominalType(node->getChild(0), typeDecl, parent);
481482

482483
Demangle::NodePointer nominalNode;
483-
if (node->getKind() == NodeKind::SymbolicReference) {
484+
if (node->getKind() == NodeKind::TypeSymbolicReference) {
484485
// A symbolic reference can be directly resolved to a nominal type.
485486
nominalNode = node;
486487
} else {
@@ -519,19 +520,19 @@ class TypeDecoder {
519520
return true;
520521
}
521522

522-
BuiltProtocolDecl decodeMangledProtocolType(
523-
const Demangle::NodePointer &node) {
523+
BuiltProtocolDecl decodeMangledProtocolType(Demangle::NodePointer node) {
524524
if (node->getKind() == NodeKind::Type)
525525
return decodeMangledProtocolType(node->getChild(0));
526526

527-
if (node->getNumChildren() < 2 || node->getKind() != NodeKind::Protocol)
527+
if ((node->getNumChildren() < 2 || node->getKind() != NodeKind::Protocol)
528+
&& node->getKind() != NodeKind::ProtocolSymbolicReference)
528529
return BuiltProtocolDecl();
529530

530531
return Builder.createProtocolDecl(node);
531532
}
532533

533534
bool decodeMangledFunctionInputType(
534-
const Demangle::NodePointer &node,
535+
Demangle::NodePointer node,
535536
std::vector<FunctionParam<BuiltType>> &params,
536537
bool &hasParamFlags) {
537538
// Look through a couple of sugar nodes.
@@ -542,7 +543,7 @@ class TypeDecoder {
542543
}
543544

544545
auto decodeParamTypeAndFlags =
545-
[&](const Demangle::NodePointer &typeNode,
546+
[&](Demangle::NodePointer typeNode,
546547
FunctionParam<BuiltType> &param) -> bool {
547548
Demangle::NodePointer node = typeNode;
548549

include/swift/Reflection/TypeRefBuilder.h

Lines changed: 23 additions & 8 deletions
Original file line numberDiff line numberDiff line change
@@ -354,15 +354,30 @@ class TypeRefBuilder {
354354
// demangling out of the referenced context descriptors in the target
355355
// process.
356356
Dem.setSymbolicReferenceResolver(
357-
[this, &reader](int32_t offset, const void *base) -> Demangle::NodePointer {
358-
// Resolve the reference to a remote address.
359-
auto remoteAddress = getRemoteAddrOfTypeRefPointer(base);
360-
if (remoteAddress == 0)
357+
[this, &reader](SymbolicReferenceKind kind,
358+
Directness directness,
359+
int32_t offset, const void *base) -> Demangle::NodePointer {
360+
// Resolve the reference to a remote address.
361+
auto remoteAddress = getRemoteAddrOfTypeRefPointer(base);
362+
if (remoteAddress == 0)
363+
return nullptr;
364+
365+
auto address = remoteAddress + offset;
366+
if (directness == Directness::Indirect) {
367+
if (auto indirectAddress = reader.readPointerValue(address)) {
368+
address = *indirectAddress;
369+
} else {
361370
return nullptr;
362-
363-
return reader.readDemanglingForContextDescriptor(remoteAddress + offset,
364-
Dem);
365-
});
371+
}
372+
}
373+
374+
switch (kind) {
375+
case Demangle::SymbolicReferenceKind::Context:
376+
return reader.readDemanglingForContextDescriptor(address, Dem);
377+
}
378+
379+
return nullptr;
380+
});
366381
}
367382

368383
TypeConverter &getTypeConverter() { return TC; }
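
Review note on the resolver change above: a direct reference resolves to `base + offset`, while an indirect reference needs one extra pointer load from that address. The same logic for an in-process reader can be sketched as below. `resolveRelativeReference` and the local `Directness` enum are hypothetical stand-ins written for this illustration; they are not the runtime's or the reflection library's actual API.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for the Directness enum named in the diff.
enum class Directness { Direct, Indirect };

// Hypothetical in-process resolver: `base` is the address of the 4-byte
// relative offset field, and `offset` is the value stored in it.
inline const void *resolveRelativeReference(const void *base,
                                            std::int32_t offset,
                                            Directness directness) {
  assert(base && "reference field address must be non-null");
  // Sign-extend the offset before adding it to the field's own address.
  auto address = reinterpret_cast<std::uintptr_t>(base) +
                 static_cast<std::uintptr_t>(static_cast<std::intptr_t>(offset));
  if (directness == Directness::Indirect) {
    // The offset points at a pointer-sized slot holding the real target.
    const void *target = nullptr;
    std::memcpy(&target, reinterpret_cast<const void *>(address),
                sizeof(target));
    return target;
  }
  return reinterpret_cast<const void *>(address);
}
```

The remote version in the diff differs in that the extra load goes through `reader.readPointerValue` and can fail, in which case the resolver returns `nullptr`.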
