Commit 5083885

[libc][docs] Update NVPTX using documentation now that linking works
Summary: I added a wrapper linker awhile back but this still says it doesn't work.
1 parent 9df94e2 commit 5083885

1 file changed: +11 -13 lines changed
libc/docs/gpu/using.rst

Lines changed: 11 additions & 13 deletions
@@ -34,10 +34,10 @@ described in the `clang documentation
 by the OpenMP toolchain, but is currently opt-in for the CUDA and HIP toolchains
 through the ``--offload-new-driver``` and ``-fgpu-rdc`` flags.
 
-In order or link the GPU runtime, we simply pass this library to the embedded
-device linker job. This can be done using the ``-Xoffload-linker`` option, which
-forwards an argument to a ``clang`` job used to create the final GPU executable.
-The toolchain should pick up the C libraries automatically in most cases, so
+In order or link the GPU runtime, we simply pass this library to the embedded
+device linker job. This can be done using the ``-Xoffload-linker`` option, which
+forwards an argument to a ``clang`` job used to create the final GPU executable.
+The toolchain should pick up the C libraries automatically in most cases, so
 this shouldn't be necessary.
 
 .. code-block:: sh
@@ -189,7 +189,7 @@ final executable.
 
   #include <stdio.h>
 
-  int main() { fputs("Hello from AMDGPU!\n", stdout); }
+  int main() { printf("Hello from AMDGPU!\n"); }
 
 This program can then be compiled using the ``clang`` compiler. Note that
 ``-flto`` and ``-mcpu=`` should be defined. This is because the GPU
@@ -227,28 +227,26 @@ Building for NVPTX targets
 ^^^^^^^^^^^^^^^^^^^^^^^^^^
 
 The infrastructure is the same as the AMDGPU example. However, the NVPTX binary
-utilities are very limited and must be targeted directly. There is no linker
-support for static libraries so we need to link in the ``libc.bc`` bitcode and
-inform the compiler driver of the file's contents.
+utilities are very limited and must be targeted directly. A utility called
+``clang-nvlink-wrapper`` instead wraps around the standard link job to give the
+illusion that ``nvlink`` is a functional linker.
 
 .. code-block:: c++
 
   #include <stdio.h>
 
   int main(int argc, char **argv, char **envp) {
-    fputs("Hello from NVPTX!\n", stdout);
+    printf("Hello from NVPTX!\n");
   }
 
 Additionally, the NVPTX ABI requires that every function signature matches. This
 requires us to pass the full prototype from ``main``. The installation will
 contain the ``nvptx-loader`` utility if the CUDA driver was found during
-compilation.
+compilation. Using link time optimization will help hide this.
 
 .. code-block:: sh
 
-  $> clang hello.c --target=nvptx64-nvidia-cuda -march=native \
-       -x ir <install>/lib/nvptx64-nvidia-cuda/libc.bc \
-       -x ir <install>/lib/nvptx64-nvidia-cuda/crt1.o
+  $> clang hello.c --target=nvptx64-nvidia-cuda -mcpu=native -flto -lc <install>/lib/nvptx64-nvidia-cuda/crt1.o
   $> nvptx-loader --threads 2 --blocks 2 a.out
   Hello from NVPTX!
   Hello from NVPTX!
