guide/src/faq.md (9 additions, 9 deletions)
@@ -21,7 +21,7 @@ seamlessly implement features which would have been impossible or very difficult
 - Stripping away everything we do not need, no complex ABI handling, no shared lib handling, control over how function calls are generated, etc.
 
 So overall, the LLVM PTX backend is fit for smaller kernels/projects/proofs of concept.
-It is however not fit for compiling an entire language (core is __very__ big) with dependencies and more. The end goal is for rust to be able to be used
+It is however not fit for compiling an entire language (core is __very__ big) with dependencies and more. The end goal is for Rust to be able to be used
 over CUDA C/C++ with the same (or better!) performance and features, therefore, we must take advantage of all optimizations NVCC has over us.
 
 ## If NVVM IR is a subset of LLVM IR, can we not give rustc-generated LLVM IR to NVVM?
@@ -117,22 +117,22 @@ no control over it and no 100% reliable way to fix it, therefore we must shift t
 
 Moreover, the CUDA GPU kernel model is entirely based on trust, trusting each thread to index into the correct place in buffers,
 trusting the caller of the kernel to uphold some dimension invariants, etc. This is once again, completely incompatible with how
-rust does things. We can provide wrappers to calculate an index that always works, and macros to index a buffer automatically, but
+Rust does things. We can provide wrappers to calculate an index that always works, and macros to index a buffer automatically, but
 indexing in complex ways is a core operation in CUDA and it is impossible for us to prove that whatever the developer is doing is correct.
 
 Finally, we would love to be able to use mut refs in kernel parameters, but this would be unsound. Because
 each kernel function is *technically* called multiple times in parallel with the same parameters, we would be
-aliasing the mutable ref, which Rustc declares as unsound (aliasing mechanics). So raw pointers or slightly-less-unsafe
+aliasing the mutable ref, which rustc declares as unsound (aliasing mechanics). So raw pointers or slightly-less-unsafe
 need to be used. However, they are usually only used for the initial buffer indexing, after which you can turn them into a
 mutable reference just fine (because you indexed in a way where no other thread will index that element). Also note
 that shared refs can be used as parameters just fine.
 
-Now that we have outlined why this is a thing, why is using rust a benefit if we still need to use unsafe?
+Now that we have outlined why this is a thing, why is using Rust a benefit if we still need to use unsafe?
 
 Well, it's simple: eliminating most of the things that a developer needs to think about to have a safe program
 is still exponentially safer than leaving __everything__ to the developer to think about.
 
-By using rust, we eliminate:
+By using Rust, we eliminate:
 - The forgotten/unhandled CUDA errors problem (yay results!).
 - The uninitialized memory problem.
 - The forgetting to dealloc memory problem.
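The index-then-reborrow pattern this hunk describes (raw pointer in, unique index computed, only then a `&mut` to one element) can be sketched as a CPU-side analogue. Everything here is illustrative, not a real rust-cuda API: `fake_kernel` stands in for a GPU kernel, and the sequential loop stands in for the parallel grid of threads.

```rust
// CPU analogue of the kernel indexing pattern described above.
// `fake_kernel` is a hypothetical stand-in for a real GPU kernel: it
// receives a raw pointer (as a kernel parameter would be), bounds-checks
// its unique index, and only then reborrows that single element as &mut.

fn fake_kernel(out: *mut f32, len: usize, idx: usize) {
    // Guard against out-of-range "threads", like the extra threads
    // in a partially filled last block of a real launch.
    if idx < len {
        // SAFETY: each invocation uses a distinct `idx`, so no two
        // mutable references ever alias the same element.
        let elem: &mut f32 = unsafe { &mut *out.add(idx) };
        *elem *= 2.0;
    }
}

fn double_in_place(buf: &mut [f32]) {
    let ptr = buf.as_mut_ptr();
    let len = buf.len();
    // Sequential stand-in for launching one thread per element.
    for idx in 0..len {
        fake_kernel(ptr, len, idx);
    }
}

fn main() {
    let mut buf = vec![1.0f32, 2.0, 3.0, 4.0];
    double_in_place(&mut buf);
    assert_eq!(buf, [2.0, 4.0, 6.0, 8.0]);
}
```

Note that the soundness argument lives entirely in the `SAFETY` comment: nothing stops a developer from computing an overlapping index, which is exactly why the FAQ says this cannot be proven correct in general.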
@@ -156,15 +156,15 @@ The reasoning for this is the same reasoning as to why you would use CUDA over o
 - rust-gpu does not perform many optimizations, and with rustc_codegen_ssa's less than ideal codegen, the optimizations by LLVM and libNVVM are needed.
 - SPIR-V is arguably still not suitable for serious GPU kernel codegen, it is underspecced, complex, and does not mention many things which are needed.
 While libNVVM (which uses a well documented subset of LLVM IR) and the PTX ISA are very thoroughly documented/specified.
-- rust-gpu is primarily focused on graphical shaders, compute shaders are secondary, which the rust ecosystem needs, but it also
+- rust-gpu is primarily focused on graphical shaders, compute shaders are secondary, which the Rust ecosystem needs, but it also
 needs a project 100% focused on computing, and computing only.
 - SPIR-V cannot access many useful CUDA libraries such as OptiX, cuDNN, cuBLAS, etc.
 - SPIR-V debug info is still very young and rust-gpu cannot generate it. While rustc_codegen_nvvm does, which can be used
 for profiling kernels in something like NSight Compute.
 
 Moreover, CUDA is the primary tool used in big computing industries such as VFX and scientific computing. Therefore
-it is much easier for CUDA C++ users to use rust for GPU computing if most of the concepts are still the same. Plus,
-we can interface with existing CUDA code by compiling it to PTX then linking it with our rust code using the CUDA linker
+it is much easier for CUDA C++ users to use Rust for GPU computing if most of the concepts are still the same. Plus,
+we can interface with existing CUDA code by compiling it to PTX then linking it with our Rust code using the CUDA linker
 API (which is exposed in a high level wrapper in cust).
 
 ## Why use the CUDA Driver API over the Runtime API?
@@ -289,5 +289,5 @@ Changes that are currently in progress but not done/experimental:
 Just like RustaCUDA, cust makes no assumptions of what language was used to generate the PTX/cubin. It could be
 C, C++, futhark, or best of all, Rust!
 
-Cust's name is literally just rust + CUDA mashed together in a horrible way.
+Cust's name is literally just Rust + CUDA mashed together in a horrible way.
 Or you can pretend it stands for custard if you really like custard.
guide/src/guide/getting_started.md (1 addition, 1 deletion)
@@ -232,4 +232,4 @@ You can use it as follows (assuming your clone of Rust CUDA is at the absolute p
 2. despite using Docker, your machine will still need to be running a compatible driver, in this case for CUDA 11.4.1 it is >=470.57.02
 3. if you have issues within the container, it can help to start ensuring your GPU is recognized
 - ensure `nvidia-smi` provides meaningful output in the container
--NVidia provides a number of samples https://github.com/NVIDIA/cuda-samples. In particular, you may want to try `make`ing and running the [`deviceQuery`](https://github.com/NVIDIA/cuda-samples/tree/ba04faaf7328dbcc87bfc9acaf17f951ee5ddcf3/Samples/deviceQuery) sample. If all is well you should see many details about your GPU
+-NVIDIA provides a number of samples https://github.com/NVIDIA/cuda-samples. In particular, you may want to try `make`ing and running the [`deviceQuery`](https://github.com/NVIDIA/cuda-samples/tree/ba04faaf7328dbcc87bfc9acaf17f951ee5ddcf3/Samples/deviceQuery) sample. If all is well you should see many details about your GPU
guide/src/nvvm/backends.md (7 additions, 7 deletions)
@@ -1,8 +1,8 @@
-# Custom Rustc Backends
+# Custom rustc Backends
 
 Before we get into the details of rustc_codegen_nvvm, we obviously need to explain what a codegen is!
 
-Custom codegens are rustc's answer to "well what if I want rust to compile to X?". This is a problem
+Custom codegens are rustc's answer to "well what if I want Rust to compile to X?". This is a problem
 that comes up in many situations, especially conversations of "well LLVM cannot target this, so we are screwed".
 To solve this problem, rustc decided to incrementally decouple itself from being attached/reliant on LLVM exclusively.
 
@@ -11,23 +11,23 @@ This is great if you just want to support LLVM, but LLVM is not perfect, and ine
 is able to do. Or, you may just want to stop using LLVM, LLVM is not without problems (it is often slow, clunky to deal with,
 and does not support a lot of targets).
 
-Nowadays, Rustc is almost fully decoupled from LLVM and it is instead generic over the "codegen" backend used.
-Rustc instead uses a system of codegen backends that implement traits and then get loaded as dynamically linked libraries.
-This allows rust to compile to virtually anything with a surprisingly small amount of work. At the time of writing, there are
+Nowadays, rustc is almost fully decoupled from LLVM and it is instead generic over the "codegen" backend used.
+rustc instead uses a system of codegen backends that implement traits and then get loaded as dynamically linked libraries.
+This allows Rust to compile to virtually anything with a surprisingly small amount of work. At the time of writing, there are
 five publicly known codegens that exist:
 - rustc_codegen_cranelift
 - rustc_codegen_llvm
 - rustc_codegen_gcc
 - rustc_codegen_spirv
 - rustc_codegen_nvvm, obviously the best codegen ;)
 
-rustc_codegen_cranelift targets the cranelift backend, which is a codegen backend written in rust that is faster than LLVM but does not have many optimizations
+rustc_codegen_cranelift targets the cranelift backend, which is a codegen backend written in Rust that is faster than LLVM but does not have many optimizations
 compared to LLVM. rustc_codegen_llvm is obvious, it is the backend almost everybody uses which targets LLVM. rustc_codegen_gcc targets GCC (GNU Compiler Collection)
 which is able to target more exotic targets than LLVM, especially for embedded. rustc_codegen_spirv targets the SPIR-V (Standard Portable Intermediate Representation 5)
 format, which is a format mostly used for compiling shader languages such as GLSL or WGSL to a standard representation that Vulkan/OpenGL can use, the reasons
 why SPIR-V is not an alternative to CUDA/rustc_codegen_nvvm have been covered in the [FAQ](../../faq.md).
 
-Finally, we come to the star of the show, rustc_codegen_nvvm. This backend targets NVVM IR for compiling rust to GPU kernels that can be run by CUDA.
+Finally, we come to the star of the show, rustc_codegen_nvvm. This backend targets NVVM IR for compiling Rust to GPU kernels that can be run by CUDA.
 What NVVM IR/libNVVM are has been covered in the [CUDA section](../../cuda/pipeline.md).
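The dylib-loading mechanism this file describes is driven by rustc's unstable `-Zcodegen-backend` flag. A hedged sketch of how a custom backend might be selected (the dylib path is a placeholder, not a real build artifact, and the exact invocation a given project uses may differ):

```sh
# Nightly-only: point rustc at a codegen backend dylib instead of the
# built-in LLVM backend. The path below is a placeholder for illustration.
RUSTFLAGS="-Zcodegen-backend=/path/to/librustc_codegen_nvvm.so" \
    cargo +nightly build
```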
guide/src/nvvm/debugging.md (1 addition, 1 deletion)
@@ -47,7 +47,7 @@ If that doesn't work, then it might be a bug inside of CUDA itself, but that sho
 is to set up the crate for debug (and see if it still happens in debug). Then you can run your executable under NSight Compute, go to the source tab, and
 examine the SASS (basically an assembly lower than PTX) to see if ptxas miscompiled it.
 
-If you set up the codegen for debug, it should give you a mapping from rust code to SASS which should hopefully help to see what exactly is breaking.
+If you set up the codegen for debug, it should give you a mapping from Rust code to SASS which should hopefully help to see what exactly is breaking.