| 1 | +To check for kernel stack overflow, set the [configuration flag](https://github.com/intel/intel-graphics-compiler/blob/master/documentation/configuration_flags.md) `IGC_StackOverflowDetection=1`. |
| 2 | +Stack overflow can happen if the kernel uses non-inlined stack calls (such as recursion) or the Variable Length Array (VLA) feature. |
| 3 | + |
| 4 | +This adds checks on each stack pointer modification that catch kernel private memory overflow. Note that enabling this feature will reduce kernel performance, so it should be used only for debugging. |
| 5 | + |
| 6 | +If an overflow is detected, the message "Stack overflow detected!" is printed to the console and the kernel throws an assert. |
| 7 | + |
| 8 | +You can change the default stack memory size with the `IGC_ForcePerThreadPrivateMemorySize` flag, using a value between 1024 and 20480, for example `IGC_ForcePerThreadPrivateMemorySize=4096`. Setting a higher value may impact performance. |
| 9 | + |
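The documented range can be checked before the flag is exported as an environment variable. The following is a minimal sketch, not part of IGC: the helper name `set_igc_private_mem` is hypothetical, and whether IGC itself rejects out-of-range values is not stated here, so the check only mirrors the range given above.

```shell
# Hypothetical helper (not provided by IGC): export
# IGC_ForcePerThreadPrivateMemorySize only if the value lies in the
# documented range [1024:20480].
set_igc_private_mem() {
    size="$1"
    # Reject empty or non-numeric input.
    case "$size" in
        ''|*[!0-9]*) echo "not a number: $size" >&2; return 1 ;;
    esac
    if [ "$size" -lt 1024 ] || [ "$size" -gt 20480 ]; then
        echo "out of range [1024:20480]: $size" >&2
        return 1
    fi
    export IGC_ForcePerThreadPrivateMemorySize="$size"
}
```

Calling `set_igc_private_mem 4096` exports the flag; a value such as `512` is refused and leaves the environment unchanged.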
| 10 | +## Usage examples |
| 11 | +### Recursive stack calls |
| 12 | +This OpenCL C kernel overflows the stack for sufficiently large `n`. The overflow is detected if the kernel is compiled with the `IGC_StackOverflowDetection=1` flag. |
| 13 | +```c |
| 14 | +int fact(int n) { |
| 15 | +    return n < 2 ? 1 : n * fact(n - 1); |
| 16 | +} |
| 17 | +kernel void test_recursive(global int* out, int n) { |
| 18 | +    out[0] = fact(n); |
| 19 | +} |
| 20 | +``` |
| 21 | +### VLA feature |
| 22 | +This example uses the VLA feature via OpenMP Fortran. Increasing `m` will lead to stack overflow. |
| 23 | +```fortran |
| 24 | +program test_vla |
| 25 | +    implicit none |
| 26 | +    integer, parameter :: n = 1000, m = 600 |
| 27 | +    real, allocatable :: a(:), b(:) |
| 28 | +    integer :: i, j |
| 29 | + |
| 30 | +    allocate(a(n), b(m)) |
| 31 | +    !$omp target teams distribute parallel do private(b) |
| 32 | +    do i = 1, n |
| 33 | +        b(1) = i |
| 34 | +        do j = 2, m |
| 35 | +            b(j) = b(j-1) + 1 |
| 36 | +        enddo |
| 37 | +        a(i) = b(m) |
| 38 | +    enddo |
| 39 | + |
| 40 | +    do i = 1, n |
| 41 | +        write(*,*) 'Index ', i, ' Computed ', a(i) |
| 42 | +    enddo |
| 43 | + |
| 44 | +    deallocate(a, b) |
| 45 | +end program test_vla |
| 46 | +``` |
| 47 | + |
| 48 | +Compiling this code with `ifx -fiopenmp -fopenmp-targets=spir64_gen -Xopenmp-target-backend "-device pvc" ./test.F90 -o dynam.AOT` emits the following warning: |
| 49 | +``` |
| 50 | +warning: VLA has been detected, the private memory size is set to 4240B. |
| 51 | +You can change this size by setting environmental variable IGC_ForcePerThreadPrivateMemorySize to a value in range [1024:20480]. |
| 52 | +Greater values can affect performance, and lower ones may lead to incorrect results of your program. |
| 53 | +To make sure your program runs correctly you can set environmental variable IGC_StackOverflowDetection=1. |
| 54 | +This flag will print "Stack overflow detected!" if insufficient memory value has led to stack overflow. |
| 55 | +It should be used for debugging only as it affects performance. |
| 56 | +``` |
| 57 | + |
| 58 | +Compiling with `IGC_StackOverflowDetection=1 ifx -fiopenmp -fopenmp-targets=spir64_gen -Xopenmp-target-backend "-device pvc" ./test.F90 -o dynam.AOT` and then executing the program produces the following output: |
| 59 | +``` |
| 60 | +Stack overflow detected! |
| 61 | +... |
| 62 | +Stack overflow detected! |
| 63 | +Stack overflow detected! |
| 64 | +Stack overflow detected! |
| 65 | +AssertHandler::printMessage |
| 66 | +forrtl: error (76): Abort trap signal |
| 67 | +``` |
| 68 | + |
| 69 | +We can adjust the kernel private memory size with `IGC_ForcePerThreadPrivateMemorySize` so that the VLA no longer overflows the stack. Compiling with `IGC_StackOverflowDetection=1 IGC_ForcePerThreadPrivateMemorySize=8192 ifx -fiopenmp -fopenmp-targets=spir64_gen -Xopenmp-target-backend "-device pvc" ./test.F90 -o dynam.AOT` results in proper program execution: |
| 70 | + |
| 71 | +``` |
| 72 | + Index 1 Computed 600.0000 |
| 73 | +... |
| 74 | + Index 1000 Computed 1599.000 |
| 75 | +``` |
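When scripting the debugging workflow above, the captured program output can be searched for the detection message. This is a minimal sketch under stated assumptions: the helper name `has_stack_overflow` and the log file name `run.log` are hypothetical, and because it is not stated which stream the message goes to, the usage comment captures both stdout and stderr.

```shell
# Hypothetical helper: succeed (exit status 0) if the captured output
# contains the message printed by IGC_StackOverflowDetection=1.
has_stack_overflow() {
    # -q: no output, status only; -F: match the message as a fixed string.
    grep -qF 'Stack overflow detected!' "$1"
}

# Usage sketch (assumes dynam.AOT was built as shown above):
#   ./dynam.AOT > run.log 2>&1
#   if has_stack_overflow run.log; then
#       echo "overflow: rebuild with a larger IGC_ForcePerThreadPrivateMemorySize"
#   fi
```

A clean log only shows that no check fired during that particular run, so the result applies to the inputs that were actually exercised.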