Skip to content

Conversation

@thurstond
Copy link
Contributor

@thurstond thurstond commented Apr 29, 2025

GPU libraries may map a significant chunk of host memory that overlaps with the expected shadow mappings

@thurstond thurstond changed the title [tsan][NFCI] Add note that GPU libraries may cause shadow mapping inc… [tsan][NFCI] Add note that GPU libraries may cause shadow mapping incompatibility Apr 29, 2025
@llvmbot
Copy link
Member

llvmbot commented Apr 29, 2025

@llvm/pr-subscribers-compiler-rt-sanitizer

Author: Thurston Dang (thurstond)

Changes

…ompatibility


Full diff: https://github.com/llvm/llvm-project/pull/137835.diff

1 Files Affected:

  • (modified) compiler-rt/lib/tsan/rtl/tsan_platform_linux.cpp (+6-2)
diff --git a/compiler-rt/lib/tsan/rtl/tsan_platform_linux.cpp b/compiler-rt/lib/tsan/rtl/tsan_platform_linux.cpp
index 373acd3d95d01..8c046e244ea39 100644
--- a/compiler-rt/lib/tsan/rtl/tsan_platform_linux.cpp
+++ b/compiler-rt/lib/tsan/rtl/tsan_platform_linux.cpp
@@ -292,8 +292,12 @@ static void ReExecIfNeeded(bool ignore_heap) {
     } else {
       Printf(
           "FATAL: ThreadSanitizer: memory layout is incompatible, "
-          "even though ASLR is disabled.\n"
-          "Please file a bug.\n");
+          "even though ASLR is disabled.\n");
+      Printf(
+          "FATAL: This error may occur for programs that use GPU libraries.");
+      Printf(
+          "FATAL: If your program does not use GPU libraries, please file a "
+          "TSan bug.\n");
       DumpProcessMap();
       Die();
     }

"Please file a bug.\n");
"even though ASLR is disabled.\n");
Printf(
"FATAL: This error may occur for programs that use GPU libraries.");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why the extra FATAL?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Users may grep/filter for 'FATAL' in the log, and it would be misleading if they only saw a subset of the lines.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK

Copy link
Collaborator

@vitalybuka vitalybuka May 1, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need to mention GPU?

I can see this can help some users, but it also can misleads other users, who has issue, but no GPU libraries.
If we can detect The Library, then maybe there is a value in the message.

What is % of total issues do we expect caused by GPU libraries?
We need a confidence that it's high to blame them.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need to mention GPU?

I can see this can help some users, but it also can misleads other users, who has issue, but no GPU libraries.

It won't mislead them if they have no GPU libraries - the next line is "If your program does not use GPU libraries, please file a TSan bug."

If we can detect The Library, then maybe there is a value in the message.

What is % of total issues do we expect caused by GPU libraries? We need a confidence that it's high to blame them.

So far, 100% of issues (sample size = 1) :-).

Most non-GPU libraries aren't mapping large host memory buffers before the shadow is allocated.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are changing the lib which maps a lot of memory.

Lets just print generic link to:
https://clang.llvm.org/docs/ThreadSanitizer.html#limitations

and add entry about your discovery

@thurstond thurstond closed this May 3, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants