Commit 43533e2

docs: Describe how we index dependent names (#314)
1 parent 2d8d6b9 commit 43533e2

1 file changed

+99
-9
lines changed

docs/Design.md

Lines changed: 99 additions & 9 deletions
@@ -1,12 +1,18 @@
1-
# Design sketch for scip-clang
2-
3-
There are three main things to discuss here:
4-
- What the overall indexer architecture will be
5-
- How we can avoid redundant work for headers across TUs
6-
(["claiming"](https://github.com/kythe/kythe/blob/master/kythe/cxx/indexer/cxx/claiming.md)
7-
in the Kythe docs)
8-
- How the indexer will map C++ to SCIP
9-
1+
# scip-clang design notes
2+
3+
- [Architecture](#architecture)
4+
- [Handling slow, hung or crashed workers](#handling-slow-hung-or-crashed-workers)
5+
- [Disk I/O](#disk-io)
6+
- [Bazel and distributed builds](#bazel-and-distributed-builds)
7+
- [Reducing work across headers](#reducing-work-across-headers)
8+
- [Checking well-behavedness of headers](#checking-well-behavedness-of-headers)
9+
- [Indexing templates](#indexing-templates)
10+
- [Mapping C++ to SCIP](#mapping-c-to-scip)
11+
- [Symbol names for macros](#symbol-names-for-macros)
12+
- [Symbol names for declarations](#symbol-names-for-declarations)
13+
- [Symbol names for enum cases](#symbol-names-for-enum-cases)
14+
- [Method disambiguator](#method-disambiguator)
15+
- [Forward declarations](#forward-declarations)
1016
## Architecture
1117

1218
When working on a compilation database (a `compile_commands.json` file),
@@ -216,6 +222,90 @@ entities for equality (or content-hash the AST,
216222
which would also be error-prone).
217223
</details>
218224

225+
## Indexing templates
226+
227+
Recommended reading: [cppreference page on dependent names](https://en.cppreference.com/w/cpp/language/dependent_name)
228+
229+
In the context of templates, C++ has two kinds of names:
230+
- Dependent names, where the result of name lookup may
231+
depend on the substitutions for template parameters.
232+
- Non-dependent names, where the result of name lookup
233+
is not allowed to depend on substitutions,
234+
even if considering substitutions would lead to a better match.
235+
236+
Consequently, it is not always possible to get the correctly
237+
name-resolved result for a name without template substitution.
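A minimal sketch of the non-dependent case, with illustrative names (`g`, `S`) that are not part of scip-clang: the call `g(1)` does not depend on `T`, so it is bound where the template is defined, and the better-matching overload declared later is never considered.

```cpp
#include <string>

inline std::string g(double) { return "g(double)"; }

template <typename T>
struct S {
    // `g(1)` is non-dependent: lookup happens here, at the point of
    // definition, where only g(double) is visible.
    std::string call() const { return g(1); }
};

// Declared after the template; an exact match for `g(1)`, but it is not
// considered because the name was already bound above.
inline std::string g(int) { return "g(int)"; }
```

Under standard two-phase lookup, `S<int>().call()` selects `g(double)` regardless of the template argument.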
238+
239+
For example, prefixing method calls with `this->`
240+
inside a templated class makes a name dependent.
241+
So in the presence of templates,
242+
omitting `this->` for method calls is not always permitted,
243+
and adding `this->` can make "obviously wrong" code compile.
244+
Here's a code example:
245+
246+
```cpp
247+
template <typename T>
248+
struct Q0 {
249+
void f0() {}
250+
};
251+
252+
template <typename T>
253+
struct Q1: Q0<T> {
254+
void f1() {
255+
// f0 is dependent here, since `this` has type Q1<T>*
256+
this->f0(); // OK
257+
// f0 is non-dependent here due to absence of explicit `this`
258+
f0(); // error: use of undeclared identifier 'f0'
259+
this->non_existent(); // OK: no template instantiation => no error
260+
}
261+
};
262+
```
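To illustrate why a dependent name cannot always be resolved without substitution, here is a small sketch with hypothetical names (`Base`, `Derived`): an explicit specialization of the base makes `this->name()` resolve to a different declaration depending on the template argument.

```cpp
#include <string>

template <typename T>
struct Base {
    std::string name() const { return "Base<T>::name"; }
};

// Explicit specialization with its own, unrelated `name` member.
template <>
struct Base<int> {
    std::string name() const { return "Base<int>::name"; }
};

template <typename T>
struct Derived : Base<T> {
    // Dependent: which `name` this refers to is only known once T is fixed.
    std::string call() const { return this->name(); }
};
```

Here `Derived<char>` reaches the primary template's member while `Derived<int>` reaches the specialization's, so no single declaration is the correct target for the uninstantiated body.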
263+
264+
For indexing templates, there are roughly 3 possible options
265+
from an indexer's point of view:
266+
267+
1. Pedantic:
268+
- Traverse uninstantiated template bodies once,
269+
and collect information about non-dependent names.
270+
- For every template instantiation, traverse the template body
271+
once, and collect information about dependent names.
272+
This can be de-duplicated on-the-fly.
273+
2. Generalizing:
274+
- Traverse uninstantiated template bodies once,
275+
and collect information about non-dependent names.
276+
- Randomly select a single instantiation. Traverse the template body
277+
for this instantiation, and collect information about dependent names.
278+
3. Optimistic:
279+
   - Traverse uninstantiated template bodies once,
280+
and collect information about both non-dependent and dependent names.
281+
For dependent names, rely on some way of performing approximate name lookup.
282+
283+
We go with the Optimistic approach in scip-clang for the following reasons:
284+
285+
- Performance: It is the only approach
286+
compatible with the optimization of indexing a header only once (per transcript).
287+
Otherwise, if a template in a header is included in a TU,
288+
but not instantiated,
289+
then indexing the header will not fully index the body of the template.
290+
Going for the Pedantic approach would likely lead to a large amount
291+
of redundant work across TUs due to repeated traversals of the same or
292+
similar template instantiations. The extra information would also
293+
increase the time for index merging.
294+
- Good enough: Based on experience, most dependent names in practice
295+
  behave like non-dependent names anyway.
296+
This is reflected in clangd's index also using imprecise name lookup
297+
for dependent names:
298+
299+
```cpp
300+
/// Performs an imprecise lookup of a dependent name in this class.
301+
///
302+
/// This function does not follow strict semantic rules and should be used
303+
/// only when lookup rules can be relaxed, e.g. indexing.
304+
std::vector<const NamedDecl *>
305+
lookupDependentName(DeclarationName Name,
306+
llvm::function_ref<bool(const NamedDecl *ND)> Filter);
307+
```
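The idea behind approximate lookup can be sketched with a toy model. Everything below (`ToyTemplate`, `ToyProgram`, `approxLookup`) is hypothetical and for illustration only; it is neither Clang's API nor scip-clang's implementation. The sketch searches the primary template and the primary templates of its bases, and ignores specializations entirely.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical toy model of a class template: the names of its own
// members and the names of its (possibly dependent) base templates.
struct ToyTemplate {
    std::vector<std::string> members;
    std::vector<std::string> bases;
};

using ToyProgram = std::map<std::string, ToyTemplate>;

// Approximate lookup: returns the names of the templates that declare
// `member`, searching `klass` first and then its bases recursively.
// Specializations are deliberately not modeled.
std::vector<std::string> approxLookup(const ToyProgram &program,
                                      const std::string &klass,
                                      const std::string &member) {
    std::vector<std::string> found;
    auto it = program.find(klass);
    if (it == program.end()) return found;
    for (const auto &m : it->second.members) {
        if (m == member) {
            found.push_back(klass);
            break;
        }
    }
    for (const auto &base : it->second.bases) {
        auto inherited = approxLookup(program, base, member);
        found.insert(found.end(), inherited.begin(), inherited.end());
    }
    return found;
}
```

With `Q0` declaring `f0` and `Q1` deriving from `Q0`, looking up `f0` in `Q1` yields `Q0`, while looking up `non_existent` yields no candidates. The trade-off is that a member present only in a specialization is missed.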
308+
219309
## Mapping C++ to SCIP
220310

221311
(FQN = Fully Qualified Name)
