Skip to content

Commit f924f3a

Browse files
yifancomalexdeucher
authored andcommitted
drm/amdkfd: fix random KFDSVMRangeTest.SetGetAttributesTest test failure
KFDSVMRangeTest.SetGetAttributesTest randomly fails in stress test. Note: Google Test filter = KFDSVMRangeTest.* [==========] Running 18 tests from 1 test case. [----------] Global test environment set-up. [----------] 18 tests from KFDSVMRangeTest [ RUN ] KFDSVMRangeTest.BasicSystemMemTest [ OK ] KFDSVMRangeTest.BasicSystemMemTest (30 ms) [ RUN ] KFDSVMRangeTest.SetGetAttributesTest [ ] Get default atrributes /home/yifan/brahma/libhsakmt/tests/kfdtest/src/KFDSVMRangeTest.cpp:154: Failure Value of: expectedDefaultResults[i] Actual: 4294967295 Expected: outputAttributes[i].value Which is: 0 /home/yifan/brahma/libhsakmt/tests/kfdtest/src/KFDSVMRangeTest.cpp:154: Failure Value of: expectedDefaultResults[i] Actual: 4294967295 Expected: outputAttributes[i].value Which is: 0 /home/yifan/brahma/libhsakmt/tests/kfdtest/src/KFDSVMRangeTest.cpp:152: Failure Value of: expectedDefaultResults[i] Actual: 4 Expected: outputAttributes[i].type Which is: 2 [ ] Setting/Getting atrributes [ FAILED ] the root cause is that svm work queue has not finished when svm_range_get_attr is called, thus some garbage svm interval tree data make svm_range_get_attr get wrong result. Flush work queue before iterate svm interval tree. Signed-off-by: Yifan Zhang <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
1 parent 93c5701 commit f924f3a

File tree

1 file changed

+8
-0
lines changed

1 file changed

+8
-0
lines changed

drivers/gpu/drm/amd/amdkfd/kfd_svm.c

Lines changed: 8 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -3026,6 +3026,14 @@ svm_range_get_attr(struct kfd_process *p, uint64_t start, uint64_t size,
30263026
pr_debug("svms 0x%p [0x%llx 0x%llx] nattr 0x%x\n", &p->svms, start,
30273027
start + size - 1, nattr);
30283028

3029+
/* Flush pending deferred work to avoid racing with deferred actions from
3030+
* previous memory map changes (e.g. munmap). Concurrent memory map changes
3031+
* can still race with get_attr because we don't hold the mmap lock. But that
3032+
* would be a race condition in the application anyway, and undefined
3033+
* behaviour is acceptable in that case.
3034+
*/
3035+
flush_work(&p->svms.deferred_list_work);
3036+
30293037
mmap_read_lock(mm);
30303038
if (!svm_range_is_valid(mm, start, size)) {
30313039
pr_debug("invalid range\n");

0 commit comments

Comments
 (0)