228 changes: 227 additions & 1 deletion docs/design/interoperability/README.md
@@ -12,6 +12,27 @@ SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception

- [Philosophy and goals](#philosophy-and-goals)
- [Overview](#overview)
- [C++ Interoperability Model: Introduction and Principles](#c-interoperability-model-introduction-and-principles)
- [The "Successor Language" Mandate](#the-successor-language-mandate)
- [The `Cpp` Associated Type: The Interop Trigger](#the-cpp-associated-type-the-interop-trigger)
- [Importing C++ APIs into Carbon](#importing-c-apis-into-carbon)
- [Importing C++ Libraries (Header-Based)](#importing-c-libraries-header-based)
- [TODO: Importing C++ Code (Inline)](#todo-importing-c-code-inline)
- [Accessing Built-in C++ Entities (File-less)](#accessing-built-in-c-entities-file-less)
- [The `Cpp` Namespace](#the-cpp-namespace)
- [TODO: Importing C++ Macros](#todo-importing-c-macros)
- [Calling C++ Code from Carbon](#calling-c-code-from-carbon)
- [Function Call Syntax and Semantics](#function-call-syntax-and-semantics)
- [TODO: Overload Resolution](#todo-overload-resolution)
- [TODO: Thunks](#todo-thunks)
- [TODO: Constructors](#todo-constructors)
- [TODO: Struct Literals](#todo-struct-literals)
- [TODO: Accessing C++ Classes, Structs, and Members](#todo-accessing-c-classes-structs-and-members)
- [TODO: Accessing Global Variables](#todo-accessing-global-variables)
- [TODO: Bi-directional Type Mapping: Primitives and Core Types](#todo-bi-directional-type-mapping-primitives-and-core-types)
- [TODO: Advanced Type Mapping: Pointers, References, and `const`](#todo-advanced-type-mapping-pointers-references-and-const)
- [TODO: Bi-directional Type Mapping: Standard Library Types](#todo-bi-directional-type-mapping-standard-library-types)
- [TODO: The Operator Interoperability Model](#todo-the-operator-interoperability-model)

<!-- tocstop -->

@@ -29,4 +50,209 @@ more detail.

## Overview

TODO
Carbon's bidirectional interoperability with C++ is
[a cornerstone of its design](/docs/project/goals.md#interoperability-with-and-migration-from-existing-c-code),
enabling a gradual transition from existing C++ codebases. The goal is not just
a foreign function interface (FFI), but a seamless, high-fidelity integration
that supports advanced C++ features, from templates to class hierarchies.

C++ APIs are imported into Carbon using an `import Cpp` directive, which makes
C++ declarations available within a dedicated `Cpp` namespace in Carbon. This
prevents name collisions and makes the origin of symbols explicit. Carbon code
can then call C++ functions, instantiate C++ classes, and use C++ types, while
respecting C++'s semantics, including its complex overload resolution rules.

Similarly, Carbon APIs can be designed to be callable from C++. The
interoperability layer is designed to be zero-cost, avoiding unnecessary
allocations or copies when calling between the two languages. This is achieved
through a deep semantic co-design, where the Carbon compiler embeds a C++
compiler frontend (Clang) to understand and map C++ constructs with high
fidelity. This includes preserving the nominal distinctions between C++ types
like `long` and `long long`, or `T*` and `T&`, which is critical for correct
overload resolution and template instantiation.
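To illustrate why these nominal distinctions matter, here is a minimal C++ sketch. The function names `IntegerKind` and `Indirection` are hypothetical and exist only for this illustration; they are not part of any real API or of the Carbon toolchain.

```cpp
#include <string>

// Hypothetical overload set: `long` and `long long` are distinct types even
// on targets where both are 64 bits wide, so each call picks its own overload.
std::string IntegerKind(long) { return "long"; }
std::string IntegerKind(long long) { return "long long"; }

// Likewise, `int*` and `int&` select different overloads for the same object.
std::string Indirection(int*) { return "pointer"; }
std::string Indirection(int&) { return "reference"; }
```

With these declarations, `IntegerKind(1L)` selects the `long` overload and `IntegerKind(1LL)` selects the `long long` overload, while `IntegerKind(1)` is ambiguous because both conversions from `int` rank equally. An importer that collapsed the two types into one 64-bit integer could not reproduce that behavior.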

## C++ Interoperability Model: Introduction and Principles

### The "Successor Language" Mandate

The design of Carbon's C++ interoperability is governed by its foundational
goal: [to be a successor language](/README.md), not merely a language with a
foreign function interface (FFI). This mandate dictates a design that moves
beyond the C-style FFI adopted by most modern languages and instead provides
"seamless, bidirectional interoperability". The objective is to support deep
integration with existing C++ code, encompassing its most complex features,
"from inheritance to templates".

This goal has profound implications for the Carbon compiler and language
semantics. It requires that C++ is not treated as a "foreign" entity. Instead,
Carbon's semantic model must be _co-designed_ to understand, map, and interact
with C++'s semantic constructs—including templates, class hierarchies, and
complex overload resolution—with high fidelity. The interoperability layer must,
therefore, operate at the semantic analysis (SemIR) level, not just at the
linking (ABI) level. This document specifies the design of this semantic
contract.

### The `Cpp` Associated Type: The Interop Trigger

A core mechanism in this design is the `Cpp` associated type. This concept
defines the "trigger" that activates C++-specific semantic rules within the
Carbon compiler. Any operation involving a type that is designated as a `Cpp`
associated type invokes the specialized interoperability logic, such as the
operator model to be specified in
[The Operator Interoperability Model](#todo-the-operator-interoperability-model).

A type is considered a `Cpp` associated type if its definition involves an
imported C++ type in any of the following ways:

1. The C++ type itself (for example, `Cpp.Widget`).
2. A pointer to a C++ type (for example, `Cpp.Widget*`).
3. A Carbon generic type parameterized with a C++ type (for example,
`MyCarbonVector(Cpp.Widget)`).
4. A Carbon struct or class containing a C++ type as a member (for example,
`MyCarbonStruct { x: Cpp.Widget }`).

This "pervasive" model of C++-awareness is a fundamental design choice. The C++
semantics are not confined to a specific `unsafe` or `extern "C++"` block; they
"infect" any Carbon type that composes them. For example, when the Carbon
compiler instantiates a _Carbon_ generic type like `MyCarbonVector(Cpp.Widget)`,
its type system must be aware that the `Cpp.Widget` parameter carries
C++-specific rules. Carbon's own generic system, struct layout logic, and
operator lookup must therefore query the type system for the presence of a
`Cpp` associated type. If one is present, the compiler takes a different, more
complex path (for example, C++-aware overload resolution). This design
prioritizes the goal of a "seamless" and "intuitive" user experience over
implementation simplicity.

## Importing C++ APIs into Carbon

### Importing C++ Libraries (Header-Based)

The primary mechanism for importing existing, user-defined C++ code is through
header file inclusion. The Carbon toolchain must be able to parse and analyze
C++ header files to make their declarations available within Carbon.

**Syntax:** The syntax for this operation is `import Cpp library "header_name"`.
This syntax is used for both C-style standard libraries and C++ headers:

- **C Standard Library:**
Contributor


Suggested change
This syntax is used for both C-style standard libraries and C++ headers:
- **C Standard Library:**
This syntax is used for both standard library headers and user-defined headers:
- **Standard Library:**

I think we should try to avoid mentioning C here.

Contributor Author


Done.


```carbon
import Cpp library "<cstdio>";
```

This import makes entities like `putchar` available.

- **C++ User-Defined Header:**
```carbon
import Cpp library "circle.h";
```
This import makes user-defined declarations and definitions available.
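
For concreteness, a user-defined header such as `circle.h` might look like the sketch below. Its contents are an assumption made for illustration: the source names only the file and the types `Circle` and `Point`, so every member shown here is hypothetical.

```cpp
// circle.h (hypothetical contents; the members are assumptions)
#ifndef CIRCLE_H_
#define CIRCLE_H_

// A plain aggregate, which Carbon code would reach as `Cpp.Point`.
struct Point {
  double x;
  double y;
};

// A class with constructors and accessors, reached as `Cpp.Circle`.
class Circle {
 public:
  Circle() = default;
  Circle(Point center, double radius) : center_(center), radius_(radius) {}

  double Radius() const { return radius_; }
  Point Center() const { return center_; }

 private:
  Point center_{0.0, 0.0};
  double radius_ = 1.0;
};

#endif  // CIRCLE_H_
```

Both the declarations and the inline definitions in such a header are what the import is described as making available.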

**Mechanism:** This file-based import model implies a co-compilation strategy.
The Carbon toolchain must embed a full C++ compiler front-end (specifically,
Clang). When the Carbon compiler encounters an `import Cpp library` directive,
it does not attempt to parse the C++ header itself. Instead, it must:

1. Invoke the embedded Clang library to parse, preprocess, and perform semantic
analysis on the specified header file.
2. Request that Clang build an Abstract Syntax Tree (AST) of the public
declarations within that header.
3. Perform a complex "AST-to-SemIR" translation on demand, "bridging" the C++
declarations from Clang's AST into Carbon's own semantic representation
(SemIR).

This model allows Carbon to leverage Clang's mature and correct implementation
of C++'s complex parsing and semantic rules, including template instantiation,
without having to reimplement them.

### TODO: Importing C++ Code (Inline)

### Accessing Built-in C++ Entities (File-less)

Some C++ entities, particularly built-in primitive types, are not defined in any
header file. They are "intrinsic" to the C++ compiler. These entities are
available in Carbon without an explicit `import` declaration.

**Mechanism:** When Carbon code accesses a built-in C++ entity, the compiler
checks whether the necessary C++ Abstract Syntax Tree (AST) has already been
generated. If not, it generates one on demand. This ensures that types like
`long` or `float` are available within the `Cpp` namespace the first time they
are used, without requiring developers to import them explicitly. This approach
provides a clean, file-less way to access the foundational types required for
C++ interoperability.
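
The C++ side of this can be seen in a translation unit that includes no headers at all. The sketch below is illustrative only; `Twice` and `Half` are hypothetical names.

```cpp
// No #include directives appear here: `long`, `long long`, and `float` are
// built into the compiler rather than declared in any header file.
long Twice(long x) { return 2 * x; }
float Half(float x) { return x / 2.0f; }

// The language guarantees this size relationship for the built-in types.
static_assert(sizeof(long long) >= sizeof(long),
              "long long is at least as wide as long");
```

Because nothing in this code comes from a file, there is no header for an `import Cpp library` directive to name, which is why a file-less path is needed.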

### The `Cpp` Namespace

A critical design choice for managing C++ imports is the mandatory use of a
containing namespace, `Cpp`. All imported C++ entities—functions, classes,
types, and operators—are accessed by way of this prefix.

- **Functions:** `Cpp.putchar(...)`
- **Classes/Types:** `Cpp.Circle`, `Cpp.Point`
- **Constructors:** `Cpp.Circle.Circle()`

**Rationale:** This prefix acts as a "firewall" and is essential for adhering to
Carbon's core design principles of
[Namespace cleanliness](/docs/project/principles/namespace_cleanliness.md) and
[Low context-sensitivity](/docs/project/principles/low_context_sensitivity.md).

C++ codebases, particularly older ones, are often "polluted" with
global-namespace functions and, most problematically, thousands of un-namespaced
preprocessor macros (for example, in standard platform headers). A "naive"
import that dumps these symbols into the Carbon global namespace would be
disastrous, leading to rampant name collisions and high context-sensitivity (for
example, "Does `Foo` refer to Carbon's `Foo` or an imported C++ `Foo`?").

The `Cpp.` prefix makes the _origin_ of every symbol explicit and unambiguous.
It ensures that C++ entities cannot collide with Carbon code, thereby "learning
from" one of C++'s most significant legacy design issues.
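
The kind of pollution described above can be sketched in C++ as follows. The header contents are hypothetical and chosen only to show the two failure modes, a global-namespace function and an un-namespaced macro.

```cpp
// Hypothetical legacy header contents.
#define BUFFER_SIZE 256    // Macro: ignores namespaces, applies to all later text.
int Foo() { return 1; }    // Function in the global namespace.

namespace app {
// A second `Foo` can coexist only because it lives in its own namespace.
int Foo() { return 2; }

// The next line would not compile if uncommented, because the preprocessor
// rewrites it to `int 256 = 64;` before the compiler sees it.
// int BUFFER_SIZE = 64;
}  // namespace app
```

Here `::Foo()` and `app::Foo()` stay distinct thanks to the namespace, but no namespace can protect the identifier `BUFFER_SIZE` from the macro.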

### TODO: Importing C++ Macros

## Calling C++ Code from Carbon

### Function Call Syntax and Semantics

Once imported, C++ functions are invoked using standard Carbon function call
syntax, prefixed with the `Cpp` namespace. The Carbon compiler is responsible
for mapping the Carbon arguments to the types expected by the C++ function's
signature.

This often requires explicit casting on the Carbon side, using the `as` keyword,
to satisfy the C++ function's parameter types.

**Example:** The following example imports `cstdio` and calls the C function
`putchar`. The Carbon `Core.Char` variable `n` must be cast first to `u8` and
then to `i32` to match the `int` parameter expected by `putchar`.

```carbon
import Cpp library "<cstdio>";

fn Run() {
  let hello: array(Core.Char, 6) = ('H', 'e', 'l', 'l', 'o', '!');
  for (n: Core.Char in hello) {
    // Carbon 'as' casting is used to match the C++ signature
    Cpp.putchar((n as u8) as i32);
  }
}
```
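
For comparison, the same loop written directly in C++ is sketched below. `PrintHello` is a hypothetical name, and the explicit casts are written out only to mirror the `(n as u8) as i32` chain; C++ would otherwise convert `char` to `int` implicitly.

```cpp
#include <cstdio>

// Writes "Hello!" to stdout and returns how many characters were written.
int PrintHello() {
  const char hello[] = {'H', 'e', 'l', 'l', 'o', '!'};
  int written = 0;
  for (char c : hello) {
    // `putchar` takes an `int` and writes it converted to `unsigned char`.
    // Going through `unsigned char` first keeps the value non-negative.
    int as_int = static_cast<int>(static_cast<unsigned char>(c));
    if (std::putchar(as_int) != EOF) {
      ++written;
    }
  }
  return written;
}
```

The intermediate unsigned step matters for characters at or above 0x80 on platforms where plain `char` is signed, since a direct conversion would yield a negative `int`.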

### TODO: Overload Resolution

### TODO: Thunks

### TODO: Constructors

### TODO: Struct Literals

## TODO: Accessing C++ Classes, Structs, and Members

## TODO: Accessing Global Variables

## TODO: Bi-directional Type Mapping: Primitives and Core Types

## TODO: Advanced Type Mapping: Pointers, References, and `const`

## TODO: Bi-directional Type Mapping: Standard Library Types

## TODO: The Operator Interoperability Model