guide/src/faq.md (9 additions, 9 deletions)
@@ -21,7 +21,7 @@ seamlessly implement features which would have been impossible or very difficult
 - Stripping away everything we do not need, no complex ABI handling, no shared lib handling, control over how function calls are generated, etc.
 
 So overall, the LLVM PTX backend is fit for smaller kernels/projects/proofs of concept.
-It is however not fit for compiling an entire language (core is __very__ big) with dependencies and more. The end goal is for rust to be able to be used
+It is however not fit for compiling an entire language (core is __very__ big) with dependencies and more. The end goal is for Rust to be able to be used
 over CUDA C/C++ with the same (or better!) performance and features, therefore, we must take advantage of all optimizations NVCC has over us.
 
 ## If NVVM IR is a subset of LLVM IR, can we not give rustc-generated LLVM IR to NVVM?
@@ -117,22 +117,22 @@ no control over it and no 100% reliable way to fix it, therefore we must shift t
 
 Moreover, the CUDA GPU kernel model is entirely based on trust, trusting each thread to index into the correct place in buffers,
 trusting the caller of the kernel to uphold some dimension invariants, etc. This is once again, completely incompatible with how
-rust does things. We can provide wrappers to calculate an index that always works, and macros to index a buffer automatically, but
+Rust does things. We can provide wrappers to calculate an index that always works, and macros to index a buffer automatically, but
 indexing in complex ways is a core operation in CUDA and it is impossible for us to prove that whatever the developer is doing is correct.
 
 Finally, we would love to be able to use mut refs in kernel parameters, but this would be unsound. Because
 each kernel function is *technically* called multiple times in parallel with the same parameters, we would be
-aliasing the mutable ref, which Rustc declares as unsound (aliasing mechanics). So raw pointers or slightly-less-unsafe
+aliasing the mutable ref, which rustc declares as unsound (aliasing mechanics). So raw pointers or slightly-less-unsafe
 need to be used. However, they are usually only used for the initial buffer indexing, after which you can turn them into a
 mutable reference just fine (because you indexed in a way where no other thread will index that element). Also note
 that shared refs can be used as parameters just fine.
 
-Now that we have outlined why this is a thing, why is using rust a benefit if we still need to use unsafe?
+Now that we have outlined why this is a thing, why is using Rust a benefit if we still need to use unsafe?
 
 Well, it's simple: eliminating most of the things that a developer needs to think about to have a safe program
 is still exponentially safer than leaving __everything__ to the developer to think about.
 
-By using rust, we eliminate:
+By using Rust, we eliminate:
 - The forgotten/unhandled CUDA errors problem (yay results!).
 - The uninitialized memory problem.
 - The forgetting to dealloc memory problem.
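The index-then-reborrow pattern this hunk describes (raw pointer in, unique index computed, only then a `&mut` to one element) can be sketched as a CPU-side analogue. Everything here is illustrative, not a real rust-cuda API: `fake_kernel` stands in for a GPU kernel, and the sequential loop stands in for the parallel grid of threads.

```rust
// CPU analogue of the kernel indexing pattern described above.
// `fake_kernel` is a hypothetical stand-in for a real GPU kernel: it
// receives a raw pointer (as a kernel parameter would be), bounds-checks
// its unique index, and only then reborrows that single element as &mut.

fn fake_kernel(out: *mut f32, len: usize, idx: usize) {
    // Guard against out-of-range "threads", like the extra threads
    // in a partially filled last block of a real launch.
    if idx < len {
        // SAFETY: each invocation uses a distinct `idx`, so no two
        // mutable references ever alias the same element.
        let elem: &mut f32 = unsafe { &mut *out.add(idx) };
        *elem *= 2.0;
    }
}

fn double_in_place(buf: &mut [f32]) {
    let ptr = buf.as_mut_ptr();
    let len = buf.len();
    // Sequential stand-in for launching one thread per element.
    for idx in 0..len {
        fake_kernel(ptr, len, idx);
    }
}

fn main() {
    let mut buf = vec![1.0f32, 2.0, 3.0, 4.0];
    double_in_place(&mut buf);
    assert_eq!(buf, [2.0, 4.0, 6.0, 8.0]);
}
```

Note that the soundness argument lives entirely in the `SAFETY` comment: nothing stops a developer from computing an overlapping index, which is exactly why the FAQ says this cannot be proven correct in general.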
@@ -156,15 +156,15 @@ The reasoning for this is the same reasoning as to why you would use CUDA over o
 - rust-gpu does not perform many optimizations, and with rustc_codegen_ssa's less than ideal codegen, the optimizations by LLVM and libNVVM are needed.
 - SPIR-V is arguably still not suitable for serious GPU kernel codegen, it is underspecced, complex, and does not mention many things which are needed.
 While libNVVM (which uses a well documented subset of LLVM IR) and the PTX ISA are very thoroughly documented/specified.
-- rust-gpu is primarily focused on graphical shaders, compute shaders are secondary, which the rust ecosystem needs, but it also
+- rust-gpu is primarily focused on graphical shaders, compute shaders are secondary, which the Rust ecosystem needs, but it also
 needs a project 100% focused on computing, and computing only.
 - SPIR-V cannot access many useful CUDA libraries such as OptiX, cuDNN, cuBLAS, etc.
 - SPIR-V debug info is still very young and rust-gpu cannot generate it. While rustc_codegen_nvvm does, which can be used
 for profiling kernels in something like NSight Compute.
 
 Moreover, CUDA is the primary tool used in big computing industries such as VFX and scientific computing. Therefore
-it is much easier for CUDA C++ users to use rust for GPU computing if most of the concepts are still the same. Plus,
-we can interface with existing CUDA code by compiling it to PTX then linking it with our rust code using the CUDA linker
+it is much easier for CUDA C++ users to use Rust for GPU computing if most of the concepts are still the same. Plus,
+we can interface with existing CUDA code by compiling it to PTX then linking it with our Rust code using the CUDA linker
 API (which is exposed in a high level wrapper in cust).
 
 ## Why use the CUDA Driver API over the Runtime API?
@@ -289,5 +289,5 @@ Changes that are currently in progress but not done/experimental:
 Just like RustaCUDA, cust makes no assumptions of what language was used to generate the PTX/cubin. It could be
 C, C++, futhark, or best of all, Rust!
 
-Cust's name is literally just rust + CUDA mashed together in a horrible way.
+Cust's name is literally just Rust + CUDA mashed together in a horrible way.
 Or you can pretend it stands for custard if you really like custard.
guide/src/guide/getting_started.md (1 addition, 1 deletion)
@@ -232,4 +232,4 @@ You can use it as follows (assuming your clone of Rust CUDA is at the absolute p
 2. despite using Docker, your machine will still need to be running a compatible driver, in this case for CUDA 11.4.1 it is >=470.57.02
 3. if you have issues within the container, it can help to start ensuring your GPU is recognized
 - ensure `nvidia-smi` provides meaningful output in the container
--NVidia provides a number of samples https://github.com/NVIDIA/cuda-samples. In particular, you may want to try `make`ing and running the [`deviceQuery`](https://github.com/NVIDIA/cuda-samples/tree/ba04faaf7328dbcc87bfc9acaf17f951ee5ddcf3/Samples/deviceQuery) sample. If all is well you should see many details about your GPU
+-NVIDIA provides a number of samples https://github.com/NVIDIA/cuda-samples. In particular, you may want to try `make`ing and running the [`deviceQuery`](https://github.com/NVIDIA/cuda-samples/tree/ba04faaf7328dbcc87bfc9acaf17f951ee5ddcf3/Samples/deviceQuery) sample. If all is well you should see many details about your GPU
guide/src/nvvm/backends.md (7 additions, 7 deletions)
@@ -1,8 +1,8 @@
-# Custom Rustc Backends
+# Custom rustc Backends
 
 Before we get into the details of rustc_codegen_nvvm, we obviously need to explain what a codegen is!
 
-Custom codegens are rustc's answer to "well what if I want rust to compile to X?". This is a problem
+Custom codegens are rustc's answer to "well what if I want Rust to compile to X?". This is a problem
 that comes up in many situations, especially conversations of "well LLVM cannot target this, so we are screwed".
 To solve this problem, rustc decided to incrementally decouple itself from being attached/reliant on LLVM exclusively.
 
@@ -11,23 +11,23 @@ This is great if you just want to support LLVM, but LLVM is not perfect, and ine
 is able to do. Or, you may just want to stop using LLVM, LLVM is not without problems (it is often slow, clunky to deal with,
 and does not support a lot of targets).
 
-Nowadays, Rustc is almost fully decoupled from LLVM and it is instead generic over the "codegen" backend used.
-Rustc instead uses a system of codegen backends that implement traits and then get loaded as dynamically linked libraries.
-This allows rust to compile to virtually anything with a surprisingly small amount of work. At the time of writing, there are
+Nowadays, rustc is almost fully decoupled from LLVM and it is instead generic over the "codegen" backend used.
+rustc instead uses a system of codegen backends that implement traits and then get loaded as dynamically linked libraries.
+This allows Rust to compile to virtually anything with a surprisingly small amount of work. At the time of writing, there are
 five publicly known codegens that exist:
 - rustc_codegen_cranelift
 - rustc_codegen_llvm
 - rustc_codegen_gcc
 - rustc_codegen_spirv
 - rustc_codegen_nvvm, obviously the best codegen ;)
 
-rustc_codegen_cranelift targets the cranelift backend, which is a codegen backend written in rust that is faster than LLVM but does not have many optimizations
+rustc_codegen_cranelift targets the cranelift backend, which is a codegen backend written in Rust that is faster than LLVM but does not have many optimizations
 compared to LLVM. rustc_codegen_llvm is obvious, it is the backend almost everybody uses which targets LLVM. rustc_codegen_gcc targets GCC (GNU Compiler Collection)
 which is able to target more exotic targets than LLVM, especially for embedded. rustc_codegen_spirv targets the SPIR-V (Standard Portable Intermediate Representation 5)
 format, which is a format mostly used for compiling shader languages such as GLSL or WGSL to a standard representation that Vulkan/OpenGL can use, the reasons
 why SPIR-V is not an alternative to CUDA/rustc_codegen_nvvm have been covered in the [FAQ](../../faq.md).
 
-Finally, we come to the star of the show, rustc_codegen_nvvm. This backend targets NVVM IR for compiling rust to GPU kernels that can be run by CUDA.
+Finally, we come to the star of the show, rustc_codegen_nvvm. This backend targets NVVM IR for compiling Rust to GPU kernels that can be run by CUDA.
 What NVVM IR/libNVVM are has been covered in the [CUDA section](../../cuda/pipeline.md).
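The dylib-loading mechanism this file describes is driven by rustc's unstable `-Zcodegen-backend` flag. A hedged sketch of how a custom backend might be selected (the dylib path is a placeholder, not a real build artifact, and the exact invocation a given project uses may differ):

```sh
# Nightly-only: point rustc at a codegen backend dylib instead of the
# built-in LLVM backend. The path below is a placeholder for illustration.
RUSTFLAGS="-Zcodegen-backend=/path/to/librustc_codegen_nvvm.so" \
    cargo +nightly build
```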
guide/src/nvvm/debugging.md (1 addition, 1 deletion)
@@ -47,7 +47,7 @@ If that doesn't work, then it might be a bug inside of CUDA itself, but that sho
 is to set up the crate for debug (and see if it still happens in debug). Then you can run your executable under NSight Compute, go to the source tab, and
 examine the SASS (basically an assembly lower than PTX) to see if ptxas miscompiled it.
 
-If you set up the codegen for debug, it should give you a mapping from rust code to SASS which should hopefully help to see what exactly is breaking.
+If you set up the codegen for debug, it should give you a mapping from Rust code to SASS which should hopefully help to see what exactly is breaking.