Currently headings are a mix of sentence case ("The quick brown fox")
and title case ("The Quick Brown Fox"). Title case is extremely formal,
so sentence case feels more natural here.
guide/src/guide/compute_capabilities.md: 16 additions & 16 deletions
@@ -1,9 +1,9 @@
-# Compute Capability Gating
+# Compute capability gating

 This section covers how to write code that adapts to different CUDA compute capabilities
 using conditional compilation.

-## What are Compute Capabilities?
+## What are compute capabilities?

 CUDA GPUs have different "compute capabilities" that determine which features they
 support. Each capability is identified by a version number like `3.5`, `5.0`, `6.1`,
@@ -17,7 +17,7 @@ For example:

 For comprehensive details, see [NVIDIA's CUDA documentation on GPU architectures](https://docs.nvidia.com/cuda/cuda-compiler-driver-nvcc/#gpu-compilation).

-## Virtual vs Real Architectures
+## Virtual vs real Architectures

 In CUDA terminology:

@@ -28,7 +28,7 @@ In CUDA terminology:
 Rust CUDA works exclusively with virtual architectures since it only generates PTX. The
 `NvvmArch::ComputeXX` enum values correspond to CUDA's virtual architectures.

-## Using Target Features
+## Using target features

 When building your kernel, the `NvvmArch::ComputeXX` variant you choose enables specific
 `target_feature` flags. These can be used with `#[cfg(...)]` to conditionally compile
@@ -51,12 +51,12 @@ which `NvvmArch::ComputeXX` is used to build the kernel, there is a different an
 These features let you write optimized code paths for specific GPU generations while
 still supporting older ones.

-## Specifying Compute Capabilites
+## Specifying compute capabilites

 Starting with CUDA 12.9, NVIDIA introduced architecture suffixes that affect
@@ -142,7 +142,7 @@ Note: While the 'a' variant enables all these features during compilation (allow

 For more details on suffixes, see [NVIDIA's blog post on family-specific architecture features](https://developer.nvidia.com/blog/nvidia-blackwell-and-nvidia-cuda-12-9-introduce-family-specific-architecture-features/).

-### Manual Compilation (Without cuda_builder)
+### Manual compilation (without cuda_builder)

 If you're invoking rustc directly instead of using cuda_builder, you only need to specify the architecture through LLVM args:
guide/src/guide/kernel_abi.md: 5 additions & 5 deletions
@@ -1,7 +1,7 @@
 # Kernel ABI

-This section details how parameters are passed to GPU kernels by the Codegen at the current time.
-In other words, how the codegen expects you to pass different types to GPU kernels from the CPU.
+This section details how parameters are passed to GPU kernels by the codegen backend. In other
+words, how the codegen backend expects you to pass different types to GPU kernels from the CPU.

 ⚠️ If you find any bugs in the ABI please report them. ⚠️

@@ -15,7 +15,7 @@ other ABI we override purely to avoid footguns.

 Functions marked as `#[kernel]` are enforced to be `extern "C"` by the kernel macro, and it is expected
 that __all__ GPU kernels be `extern "C"`, not that you should be declaring any kernels without the `#[kernel]` macro,
-because the codegen/cuda_std is allowed to rely on the behavior of `#[kernel]` for correctness.
+because the codegen backend/cuda_std is allowed to rely on the behavior of `#[kernel]` for correctness.

 ## Structs

@@ -119,7 +119,7 @@ unsafe {
 }
 ```

-You may get warnings about slices being an improper C-type, but the warnings are safe to ignore, the codegen guarantees
+You may get warnings about slices being an improper C-type, but the warnings are safe to ignore, the codegen backend guarantees
 that slices are passed as pairs of params.

 You cannot however pass mutable slices, this is because it would violate aliasing rules, each thread receiving a copy of the mutable
@@ -135,7 +135,7 @@ ZSTs (zero-sized types) are ignored and become nothing in the final PTX.
 Primitive types are passed directly by value, same as structs. They map to the special PTX types `.s8`, `.s16`, `.s32`, `.s64`, `.u8`, `.u16`, `.u32`, `.u64`, `.f32`, and `.f64`.
 With the exception that `u128` and `i128` are passed as byte arrays (but this has no impact on how they are passed from the CPU).

-## References And Pointers
+## References And pointers

 References and Pointers are both passed as expected, as pointers. It is therefore expected that you pass such parameters using device memory:
guide/src/nvvm/backends.md: 10 additions & 10 deletions
@@ -1,25 +1,25 @@
-# Custom rustc Backends
+# Custom rustc backends

-Before we get into the details of rustc_codegen_nvvm, we obviously need to explain what a codegen is!
+Before we get into the details of rustc_codegen_nvvm, we obviously need to explain what a codegen backend is!

-Custom codegens are rustc's answer to "well what if I want Rust to compile to X?". This is a problem
+Custom codegen backends are rustc's answer to "well what if I want Rust to compile to X?". This is a problem
 that comes up in many situations, especially conversations of "well LLVM cannot target this, so we are screwed".
 To solve this problem, rustc decided to incrementally decouple itself from being attached/reliant on LLVM exclusively.

-Previously, rustc only had a single codegen, the LLVM codegen. The LLVM codegen translated MIR directly to LLVM IR.
+Previously, rustc only had a single codegen backend, the LLVM codegen backed. This translated MIR directly to LLVM IR.
 This is great if you just want to support LLVM, but LLVM is not perfect, and inevitably you will hit limits to what LLVM
 is able to do. Or, you may just want to stop using LLVM, LLVM is not without problems (it is often slow, clunky to deal with,
 and does not support a lot of targets).

-Nowadays, rustc is almost fully decoupled from LLVM and it is instead generic over the "codegen" backend used.
+Nowadays, rustc is almost fully decoupled from LLVM and it is instead generic over the codegen backend used.
 rustc instead uses a system of codegen backends that implement traits and then get loaded as dynamically linked libraries.
 This allows Rust to compile to virtually anything with a surprisingly small amount of work. At the time of writing, there are
-five publicly known codegens that exist:
+five publicly known codegen backends that exist:
 - rustc_codegen_cranelift
 - rustc_codegen_llvm
 - rustc_codegen_gcc
 - rustc_codegen_spirv
-- rustc_codegen_nvvm, obviously the best codegen ;)
+- rustc_codegen_nvvm, obviously the best backend ;)

 rustc_codegen_cranelift targets the cranelift backend, which is a codegen backend written in Rust that is faster than LLVM but does not have many optimizations
 compared to LLVM. rustc_codegen_llvm is obvious, it is the backend almost everybody uses which targets LLVM. rustc_codegen_gcc targets GCC (GNU Compiler Collection)
@@ -32,9 +32,9 @@ What NVVM IR/libNVVM are has been covered in the [CUDA section](../../cuda/pipel

 # rustc_codegen_ssa

-rustc_codegen_ssa is the central crate behind every single codegen and does much of the hard work.
-It abstracts away the MIR lowering logic so that custom codegens only have to implement some
-traits and the SSA codegen does everything else. For example:
+rustc_codegen_ssa is the central crate behind every single codegen backend and does much of the
+hard work. It abstracts away the MIR lowering logic so that custom codegen backends only have to
+implement some traits and the SSA codegen does everything else. For example:
 - A trait for getting a type like an integer type.
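For readers unfamiliar with the design that the backends.md text describes, the idea of shared lowering logic written once against traits, with each backend supplying only the trait implementations, can be sketched roughly as follows. This is a toy model only: `Backend`, `lower_sum`, `TextBackend`, and `EvalBackend` are hypothetical names invented for illustration and do not mirror the real rustc_codegen_ssa API.

```rust
// Toy sketch of a trait-based codegen split. Every name here is hypothetical
// and chosen for illustration; none of it is the real rustc_codegen_ssa API.

/// What a backend must provide: a way to name types and to build values.
trait Backend {
    type Type;
    type Value;

    /// Analogous to "a trait for getting a type like an integer type".
    fn type_int(&self, bits: u32) -> Self::Type;
    /// Build a constant of the given type.
    fn const_int(&self, ty: &Self::Type, value: i64) -> Self::Value;
    /// Build an addition of two values.
    fn add(&self, lhs: Self::Value, rhs: Self::Value) -> Self::Value;
}

/// Shared "lowering" logic, written once against the trait and reused by
/// every backend.
fn lower_sum<B: Backend>(backend: &B, a: i64, b: i64) -> B::Value {
    let ty = backend.type_int(32);
    let lhs = backend.const_int(&ty, a);
    let rhs = backend.const_int(&ty, b);
    backend.add(lhs, rhs)
}

/// A backend that emits textual pseudo-IR.
struct TextBackend;

impl Backend for TextBackend {
    type Type = String;
    type Value = String;

    fn type_int(&self, bits: u32) -> String {
        format!("i{bits}")
    }
    fn const_int(&self, ty: &String, value: i64) -> String {
        format!("{ty} {value}")
    }
    fn add(&self, lhs: String, rhs: String) -> String {
        format!("add({lhs}, {rhs})")
    }
}

/// A backend that evaluates eagerly instead of emitting anything.
struct EvalBackend;

impl Backend for EvalBackend {
    type Type = u32;
    type Value = i64;

    fn type_int(&self, bits: u32) -> u32 {
        bits
    }
    fn const_int(&self, _ty: &u32, value: i64) -> i64 {
        value
    }
    fn add(&self, lhs: i64, rhs: i64) -> i64 {
        lhs + rhs
    }
}
```

Calling `lower_sum(&TextBackend, 2, 3)` yields the string `add(i32 2, i32 3)`, while `lower_sum(&EvalBackend, 2, 3)` yields `5`. The same generic function drives both, which is the property the guide attributes to the shared SSA crate.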