Many names are written sometimes with backticks and sometimes without. It's not 100% clear-cut, but I think "without" is better for the following:
- rustc_codegen_*
- cuda_builder
- rustc
- cuda_std
- rust-gpu (which will become "Rust GPU" in a subsequent commit)
In contrast, `lib.rs` is a file name and should use backticks.
guide/src/guide/compute_capabilities.md (3 additions, 3 deletions)
@@ -142,9 +142,9 @@ Note: While the 'a' variant enables all these features during compilation (allow
 For more details on suffixes, see [NVIDIA's blog post on family-specific architecture features](https://developer.nvidia.com/blog/nvidia-blackwell-and-nvidia-cuda-12-9-introduce-family-specific-architecture-features/).
-### Manual Compilation (Without `cuda_builder`)
+### Manual Compilation (Without cuda_builder)
-If you're invoking `rustc` directly instead of using `cuda_builder`, you only need to specify the architecture through LLVM args:
+If you're invoking rustc directly instead of using cuda_builder, you only need to specify the architecture through LLVM args:
 ```bash
 rustc --target nvptx64-nvidia-cuda \
@@ -210,7 +210,7 @@ These patterns work when using base architectures (no suffix), which enable all
 If you encounter errors about missing functions or features:
-1. Check the compute capability you're targeting in `cuda_builder`
+1. Check the compute capability you're targeting in cuda_builder
 2. Verify your GPU supports the features you're using
 3. Use `nvidia-smi` to check your GPU's compute capability
 4. Add appropriate `#[cfg]` guards or increase the target architecture
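Item 4 of the checklist above can be sketched as follows. This is an illustrative pattern, not code from the guide: the function names and strings are made up, and only `target_arch = "nvptx64"` is the real target name. Gating on the target lets the same crate build both for the device and for the host CPU.

```rs
// Hypothetical example of item 4: gate GPU-only code behind the nvptx64
// target so the crate still compiles on the host.

#[cfg(target_arch = "nvptx64")]
fn reduce_hint() -> &'static str {
    // On the device, this path could use GPU-specific intrinsics.
    "device path"
}

#[cfg(not(target_arch = "nvptx64"))]
fn reduce_hint() -> &'static str {
    // On the host, fall back to portable code.
    "host path: plain scalar fallback"
}

fn main() {
    println!("{}", reduce_hint());
}
```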
guide/src/guide/getting_started.md (6 additions, 6 deletions)
@@ -1,6 +1,6 @@
 # Getting Started
-This section covers how to get started writing GPU crates with `cuda_std` and `cuda_builder`.
+This section covers how to get started writing GPU crates with cuda_std and cuda_builder.
 ## Required Libraries
@@ -53,12 +53,12 @@ edition = "2021"
 +cuda_std = "XX"
 ```
-Where `XX` is the latest version of `cuda_std`.
+Where `XX` is the latest version of cuda_std.
 We changed our crate's crate types to `cdylib` and `rlib`. We specified `cdylib` because the nvptx targets do not support binary crate types.
 `rlib` is so that we will be able to use the crate as a dependency, such as if we would like to use it on the CPU.
-## lib.rs
+## `lib.rs`
 Before we can write any GPU kernels, we must add a few directives to our `lib.rs` which are required by the codegen:
@@ -86,7 +86,7 @@ If you would like to use `alloc` or things like printing from GPU kernels (which
 extern crate alloc;
 ```
-Finally, if you would like to use types such as slices or arrays inside of GPU kernels you must allow `improper_ctypes_definitions` either on the whole crate or the individual GPU kernels. This is because on the CPU, such types are not guaranteed to be passed a certain way, so they should not be used in `extern "C"` functions (which is what kernels are implicitly declared as). However, `rustc_codegen_nvvm` guarantees the way in which things like structs, slices, and arrays are passed. See [Kernel ABI](./kernel_abi.md).
+Finally, if you would like to use types such as slices or arrays inside of GPU kernels you must allow `improper_ctypes_definitions` either on the whole crate or the individual GPU kernels. This is because on the CPU, such types are not guaranteed to be passed a certain way, so they should not be used in `extern "C"` functions (which is what kernels are implicitly declared as). However, rustc_codegen_nvvm guarantees the way in which things like structs, slices, and arrays are passed. See [Kernel ABI](./kernel_abi.md).
 ```rs
 #![allow(improper_ctypes_definitions)]
@@ -161,7 +161,7 @@ It also applies `#[no_mangle]` so the name of the kernel is the same as it is de
 ## Building the GPU crate
-Now that you have some kernels defined in a crate, you can build them easily using `cuda_builder`.
+Now that you have some kernels defined in a crate, you can build them easily using cuda_builder.
 which builds GPU crates while passing everything needed by rustc.
 To use it you can simply add it as a build dependency in your CPU crate (the crate running the GPU kernels):
@@ -173,7 +173,7 @@ To use it you can simply add it as a build dependency in your CPU crate (the cra
 Where `XX` is the current version of cuda_builder.
-Then, you can simply invoke it in the build.rs of your CPU crate:
+Then, you can simply invoke it in the `build.rs` of your CPU crate:
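The diff excerpt ends before the guide's build-script listing. For context, a cuda_builder invocation in a CPU crate's build.rs typically looks like the sketch below; the kernel-crate path and output file name are illustrative, not taken from the guide:

```rs
// build.rs of the CPU crate (sketch; "../kernels" and the output path are
// hypothetical). CudaBuilder compiles the GPU crate to PTX at build time.
use cuda_builder::CudaBuilder;

fn main() {
    CudaBuilder::new("../kernels")            // path to the GPU kernel crate
        .copy_to("../resources/kernels.ptx")  // where to place the built PTX
        .build()
        .unwrap();
}
```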
guide/src/nvvm/backends.md (5 additions, 5 deletions)
@@ -21,18 +21,18 @@ five publicly known codegens that exist:
 - rustc_codegen_spirv
 - rustc_codegen_nvvm, obviously the best codegen ;)
-`rustc_codegen_cranelift` targets the cranelift backend, which is a codegen backend written in rust that is faster than LLVM but does not have many optimizations
-compared to LLVM. `rustc_codegen_llvm` is obvious, it is the backend almost everybody uses which targets LLVM. `rustc_codegen_gcc` targets GCC (GNU Compiler Collection)
-which is able to target more exotic targets than LLVM, especially for embedded. `rustc_codegen_spirv` targets the SPIR-V (Standard Portable Intermediate Representation 5)
+rustc_codegen_cranelift targets the cranelift backend, which is a codegen backend written in rust that is faster than LLVM but does not have many optimizations
+compared to LLVM. rustc_codegen_llvm is obvious, it is the backend almost everybody uses which targets LLVM. rustc_codegen_gcc targets GCC (GNU Compiler Collection)
+which is able to target more exotic targets than LLVM, especially for embedded. rustc_codegen_spirv targets the SPIR-V (Standard Portable Intermediate Representation 5)
 format, which is a format mostly used for compiling shader languages such as GLSL or WGSL to a standard representation that Vulkan/OpenGL can use, the reasons
 why SPIR-V is not an alternative to CUDA/rustc_codegen_nvvm have been covered in the [FAQ](../../faq.md).
-Finally, we come to the star of the show, `rustc_codegen_nvvm`. This backend targets NVVM IR for compiling rust to GPU kernels that can be run by CUDA.
+Finally, we come to the star of the show, rustc_codegen_nvvm. This backend targets NVVM IR for compiling rust to GPU kernels that can be run by CUDA.
 What NVVM IR/libNVVM are has been covered in the [CUDA section](../../cuda/pipeline.md).
 # rustc_codegen_ssa
-`rustc_codegen_ssa` is the central crate behind every single codegen and does much of the hard work.
+rustc_codegen_ssa is the central crate behind every single codegen and does much of the hard work.
 It abstracts away the MIR lowering logic so that custom codegens only have to implement some
 traits and the SSA codegen does everything else. For example:
 - A trait for getting a type like an integer type.