Commit e112d1c

Merge branch 'main' into improved-debugging
2 parents 6728df5 + 6f9f75d commit e112d1c

1 file changed: +5 -6 lines changed

_posts/2025-08-11-cuda-debugging.md

Lines changed: 5 additions & 6 deletions
@@ -96,14 +96,13 @@ __global__ void illegalMemoryAccessKernel(int* data, int size) {
     }
 }
 
-// Kernel with illegal memory access - accesses memory beyond allocated bounds
+// Simple kernel with no errors
 __global__ void normalKernel(int* data, int size) {
     int idx = blockIdx.x * blockDim.x + threadIdx.x;
 
-    // This will cause illegal memory access - accessing beyond allocated memory
-    // We allocate 'size' elements but access up to size * 2
-    if (idx < size) { // Access twice the allocated size
-        data[idx] = idx; //
+
+    if (idx < size) {
+        data[idx] = idx;
     }
 }
 
@@ -152,7 +151,7 @@ int main() {
 }
 ```
 
-This code launches two kernels consecutively (`illegalMemoryAccessKernel` and `normalKernel`). During normal execution, you would encounter an error message: `CUDA Error at test.cu:62 - cudaMemcpy(h_data, d_data, size * sizeof(int), cudaMemcpyDeviceToHost): an illegal memory access was encountered`, and the error would only be detected in the return value of `cudaMemcpy`. Even with `CUDA_LAUNCH_BLOCKING=1`, it is still impossible to identify the specific kernel that caused the error.
+This code launches two kernels consecutively (`illegalMemoryAccessKernel` and `normalKernel`). During execution, you would encounter an error message: `CUDA Error at test.cu:62 - cudaMemcpy(h_data, d_data, size * sizeof(int), cudaMemcpyDeviceToHost): an illegal memory access was encountered`, and the error would only be detected in the return value of `cudaMemcpy`. Even with `CUDA_LAUNCH_BLOCKING=1`, it is still impossible to identify the specific kernel that caused the error.
 
 By adding the CUDA core dump-related environment variables, we can observe:
 