The existing text uses "codegen" frequently as a shorthand for "codegen
backend". I found this confusing and distracting. ("Codegens" is even
worse.) This commit replaces these uses with "codegen backend" (or
occasionally something else more appropriate).
The commit preserves the use of "codegen" for the act of code generation,
e.g. "during codegen we do XYZ", because that's more standard.
guide/src/faq.md: 3 additions & 3 deletions
@@ -14,8 +14,8 @@ This can be circumvented by building LLVM in a special way, but this is far beyo
 which yield considerable performance differences (especially on more complex kernels with more information in the IR).
 - For some reason (either rustc giving weird LLVM IR or the LLVM PTX backend being broken) the LLVM PTX backend often
 generates completely invalid PTX for trivial programs, so it is not an acceptable workflow for a production pipeline.
-- GPU and CPU codegen is fundamentally different, creating a codegen that is only for the GPU allows us to
-seamlessly implement features which would have been impossible or very difficult to implement in the existing codegen, such as:
+- GPU and CPU codegen is fundamentally different, creating a codegen backend that is only for the GPU allows us to
+seamlessly implement features which would have been impossible or very difficult to implement in the existing codegen backend, such as:
 - Shared memory, this requires some special generation of globals with custom addrspaces, its just not possible to do without backend explicit handling.
 - Custom linking logic to do dead code elimination so as to not end up with large PTX files full of dead functions/globals.
 - Stripping away everything we do not need, no complex ABI handling, no shared lib handling, control over how function calls are generated, etc.
@@ -33,7 +33,7 @@ Long answer, there are a couple of things that make this impossible:
 - NVVM IR is a __subset__ of LLVM IR, there are tons of things that NVVM will not accept. Such as a lot of function attrs not being allowed.
 This is well documented and you can find the spec [here](https://docs.nvidia.com/cuda/nvvm-ir-spec/index.html). Not to mention
 many bugs in libNVVM that I have found along the way, the most infuriating of which is nvvm not accepting integer types that arent `i1, i8, i16, i32, or i64`.
-This required special handling in the codegen to convert these "irregular" types into vector types.
+This required special handling in the codegen backend to convert these "irregular" types into vector types.

 ## What is the point of using Rust if a lot of things in kernels are unsafe?