[clang] Apply internal buffering to clang diagnostics printing #113440
Conversation
To stabilize clang's output when clang is run in multiple build threads, the whole diagnostic message is first written to an internal buffer and then emitted in one piece to the output stream, which usually points to stderr. This avoids printing to stderr in small chunks and interleaving multiple diagnostic messages.
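For readers skimming the conversation, here is a minimal standalone sketch of the pattern the patch applies inside TextDiagnostic (the helper name `emitWholeMessage` is illustrative, not part of the patch): format the whole diagnostic into a local SmallString via raw_svector_ostream, then hand it to the real stream in a single write.

```cpp
#include "llvm/ADT/SmallString.h"
#include "llvm/Support/raw_ostream.h"

// Illustrative helper, not code from the patch: build the full diagnostic
// text in a local buffer, then emit it to the (usually unbuffered) output
// stream in one write so concurrent compiler invocations don't interleave it.
static void emitWholeMessage(llvm::raw_ostream &Out, llvm::StringRef Loc,
                             llvm::StringRef Msg) {
  llvm::SmallString<1024> Buffer;
  llvm::raw_svector_ostream BufOS(Buffer); // all writes land in Buffer
  BufOS << Loc << ": warning: " << Msg << '\n';
  Out << BufOS.str(); // one write per diagnostic
}

int main() {
  emitWholeMessage(llvm::errs(), "foo.c:3:5", "unused variable 'x'");
  return 0;
}
```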
@llvm/pr-subscribers-clang

Author: Mariya Podchishchaeva (Fznamznon)

Changes

To stabilize clang's output when clang is run in multiple build threads, the whole diagnostic message is first written to an internal buffer and then emitted in one piece to the output stream, which usually points to stderr and is unbuffered by default. This helps avoid printing to stderr in small chunks and interleaving multiple diagnostic messages.

This also fixes a slight regression that appeared somewhere around clang-17; I checked clang-14 and its output is slightly more stable.

[clang-multi-before screenshot](https://github.com/user-attachments/assets/7664b43b-70fb-4668-a6c6-083ec2a7f081)

gcc's output using the same makefile:

[gcc-output screenshot](https://github.com/user-attachments/assets/374be96b-884e-4ad1-a911-a2c55302f407)

So I decided to give it a try, since it could greatly improve the user experience in some cases. It turned out to be simple and gave a relatively stable result. clang's output after the change:

[clang-multi-after screenshot](https://github.com/user-attachments/assets/a2abfd17-f10d-4cef-a2e1-48fb1dda47fc)

I'm not sure how to test this properly, so the PR doesn't contain any test. Please let me know what you think; I'm open to any suggestions.

Full diff: https://github.com/llvm/llvm-project/pull/113440.diff

3 Files Affected:
diff --git a/clang/include/clang/Frontend/TextDiagnostic.h b/clang/include/clang/Frontend/TextDiagnostic.h
index a2fe8ae995423b..e47a111c1dcdc2 100644
--- a/clang/include/clang/Frontend/TextDiagnostic.h
+++ b/clang/include/clang/Frontend/TextDiagnostic.h
@@ -16,6 +16,7 @@
#define LLVM_CLANG_FRONTEND_TEXTDIAGNOSTIC_H
#include "clang/Frontend/DiagnosticRenderer.h"
+#include "llvm/ADT/SmallString.h"
#include "llvm/Support/raw_ostream.h"
namespace clang {
@@ -33,8 +34,15 @@ namespace clang {
/// DiagnosticClient is implemented through this class as is diagnostic
/// printing coming out of libclang.
class TextDiagnostic : public DiagnosticRenderer {
- raw_ostream &OS;
+ raw_ostream &Out;
const Preprocessor *PP;
+ // To stabilize output of clang when clang is run in multiple build threads
+ // the whole diagnostic message is written first to internal buffer and then
+ // the whole message is put to the output stream Out which usually points to
+ // stderr to avoid printing to stderr with small chunks and interleaving of
+ // multiple diagnostic messages.
+ SmallString<1024> InternalBuffer;
+ llvm::raw_svector_ostream OS;
public:
TextDiagnostic(raw_ostream &OS, const LangOptions &LangOpts,
@@ -104,6 +112,8 @@ class TextDiagnostic : public DiagnosticRenderer {
void emitBuildingModuleLocation(FullSourceLoc Loc, PresumedLoc PLoc,
StringRef ModuleName) override;
+ void endDiagnostic(DiagOrStoredDiag D,
+ DiagnosticsEngine::Level Level) override;
private:
void emitFilename(StringRef Filename, const SourceManager &SM);
diff --git a/clang/lib/Frontend/TextDiagnostic.cpp b/clang/lib/Frontend/TextDiagnostic.cpp
index 4119ce6048d45d..fddb846745aa83 100644
--- a/clang/lib/Frontend/TextDiagnostic.cpp
+++ b/clang/lib/Frontend/TextDiagnostic.cpp
@@ -656,7 +656,11 @@ static bool printWordWrapped(raw_ostream &OS, StringRef Str, unsigned Columns,
TextDiagnostic::TextDiagnostic(raw_ostream &OS, const LangOptions &LangOpts,
DiagnosticOptions *DiagOpts,
const Preprocessor *PP)
- : DiagnosticRenderer(LangOpts, DiagOpts), OS(OS), PP(PP) {}
+ : DiagnosticRenderer(LangOpts, DiagOpts), Out(OS), PP(PP),
+ OS(InternalBuffer) {
+ this->OS.buffer().clear();
+ this->OS.enable_colors(true);
+}
TextDiagnostic::~TextDiagnostic() {}
@@ -664,7 +668,7 @@ void TextDiagnostic::emitDiagnosticMessage(
FullSourceLoc Loc, PresumedLoc PLoc, DiagnosticsEngine::Level Level,
StringRef Message, ArrayRef<clang::CharSourceRange> Ranges,
DiagOrStoredDiag D) {
- uint64_t StartOfLocationInfo = OS.tell();
+ uint64_t StartOfLocationInfo = Out.tell();
// Emit the location of this particular diagnostic.
if (Loc.isValid())
@@ -677,7 +681,7 @@ void TextDiagnostic::emitDiagnosticMessage(
printDiagnosticLevel(OS, Level, DiagOpts->ShowColors);
printDiagnosticMessage(OS,
/*IsSupplemental*/ Level == DiagnosticsEngine::Note,
- Message, OS.tell() - StartOfLocationInfo,
+ Message, Out.tell() - StartOfLocationInfo,
DiagOpts->MessageLength, DiagOpts->ShowColors);
}
@@ -1545,3 +1549,9 @@ void TextDiagnostic::emitParseableFixits(ArrayRef<FixItHint> Hints,
OS << "\"\n";
}
}
+
+void TextDiagnostic::endDiagnostic(DiagOrStoredDiag D,
+ DiagnosticsEngine::Level Level) {
+ Out << OS.str();
+ OS.buffer().clear();
+}
diff --git a/clang/lib/Frontend/TextDiagnosticPrinter.cpp b/clang/lib/Frontend/TextDiagnosticPrinter.cpp
index dac5c44fe92566..6e24a19a1ad501 100644
--- a/clang/lib/Frontend/TextDiagnosticPrinter.cpp
+++ b/clang/lib/Frontend/TextDiagnosticPrinter.cpp
@@ -133,12 +133,16 @@ void TextDiagnosticPrinter::HandleDiagnostic(DiagnosticsEngine::Level Level,
// diagnostics in a context that lacks language options, a source manager, or
// other infrastructure necessary when emitting more rich diagnostics.
if (!Info.getLocation().isValid()) {
- TextDiagnostic::printDiagnosticLevel(OS, Level, DiagOpts->ShowColors);
+ SmallString<1000> OutText;
+ llvm::raw_svector_ostream OutputBuffer(OutText);
+ OutputBuffer.enable_colors(true);
+ TextDiagnostic::printDiagnosticLevel(OutputBuffer, Level,
+ DiagOpts->ShowColors);
TextDiagnostic::printDiagnosticMessage(
- OS, /*IsSupplemental=*/Level == DiagnosticsEngine::Note,
+ OutputBuffer, /*IsSupplemental=*/Level == DiagnosticsEngine::Note,
DiagMessageStream.str(), OS.tell() - StartOfLocationInfo,
DiagOpts->MessageLength, DiagOpts->ShowColors);
- OS.flush();
+ OS << OutputBuffer.str();
return;
}
What does "run in multiple build threads" mean? When does that happen?
What I meant is a build of some project using …
// This is important as if the location is missing, we may be emitting
// diagnostics in a context that lacks language options, a source manager, or
// other infrastructure necessary when emitting more rich diagnostics.
if (!Info.getLocation().isValid()) {
So the problem you're describing only exists for diagnostics without valid locations? 🤔
No, it exists for all of them. The handling is a bit different for diagnostics without valid locations, so I had to modify this place and TextDiagnostic class to handle the case with valid locations.
BTW, the test failure IS caused by the patch. It somehow caused a permanent swap of what clang-tidy prints in the clang-tidy-run-with-database.cpp test. Not sure why...
Are we calling …
Well, yeah, but I think we should? (llvm-project/llvm/lib/Support/raw_ostream.cpp, line 910 at 37832d5)
The thing is though, I've tried setting …
Ping? Does that make sense? If so, I'll try to figure out the clangd output swap that caused the lit failure (which seems harmless).
AaronBallman left a comment
This feels like we're hacking around a deeper issue with raw_ostream; I would have expected that SetBuffered() is honored. But it also worries me that we're manually enabling colors in all circumstances and no tests broke.
I'm also a bit worried we're updating TextDiagnostic but not SARIFDiagnostic; does emitting to SARIF also have interleaving issues?
: DiagnosticRenderer(LangOpts, DiagOpts), Out(OS), PP(PP),
  OS(InternalBuffer) {
  this->OS.buffer().clear();
  this->OS.enable_colors(true);
Shouldn't this be looking at DiagOpts->ShowColors?
Good point. I think we do have tests checking colors, but we probably don't have tests that check there are no colors.
TextDiagnostic::printDiagnosticLevel(OS, Level, DiagOpts->ShowColors);
SmallString<1000> OutText;
llvm::raw_svector_ostream OutputBuffer(OutText);
OutputBuffer.enable_colors(true);
Same here as above; shouldn't this be looking at ShowColors first?
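A hedged sketch of the gating both comments are asking for (an illustrative helper, not code from the patch): enable color emulation on the in-memory stream only when the user actually requested colored diagnostics.

```cpp
#include "llvm/Support/raw_ostream.h"

// Illustrative only: a raw_svector_ostream never talks to a terminal, so
// color emulation must be switched on explicitly; presumably it should follow
// the ShowColors diagnostic option rather than being enabled unconditionally.
static void configureBufferStream(llvm::raw_svector_ostream &BufOS,
                                  bool ShowColors) {
  BufOS.enable_colors(ShowColors);
}
```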
@AaronBallman Here we return 0 (llvm-project/llvm/lib/Support/raw_ostream.cpp, line 858 at e68a3e4), so the buffer is not set here (llvm-project/llvm/lib/Support/raw_ostream.cpp, line 105 at e68a3e4).
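To restate that interaction in code, here is a paraphrased sketch of the behavior being pointed at (illustrative, not the verbatim upstream source): a preferred buffer size of 0 for a character device means a request for buffering leaves the stream unbuffered.

```cpp
#include <cstddef>

// Paraphrase of the cited logic, assuming the two linked locations behave as
// described above; names and the 4096 fallback are illustrative.
static size_t preferredBufferSizeSketch(bool IsCharacterDevice) {
  return IsCharacterDevice ? 0      // terminals/char devices: "don't buffer"
                           : 4096;  // regular files: some block-sized buffer
}

static void setBufferedSketch(bool IsCharacterDevice) {
  size_t Size = preferredBufferSizeSketch(IsCharacterDevice);
  if (Size == 0)
    return; // a zero preferred size leaves the stream unbuffered
  // otherwise a Size-byte buffer would be installed here
  (void)Size;
}
```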
Good find! This was done in 1771022, and I think our needs have changed in the intervening 15 years so that line buffering is perhaps now worth the added complexity. Maybe we should try backing this out and fixing the underlying issue? CC @sunfishcode for opinions
Is it known which platforms are affected by this issue? I ask because I see some issues with how consoles/terminals are handled. On Windows, see llvm-project/llvm/lib/Support/raw_ostream.cpp, lines 639 to 641 at 1946d32.

That code only checks that FD is directed to a character device, not a console specifically. The correct way to check for a console is already implemented elsewhere: llvm-project/llvm/lib/Support/Windows/Process.inc, lines 289 to 292 at 1946d32.

Interestingly, the llvm::Process class is already used to detect a terminal for non-Windows systems via indirection through is_displayed(): llvm-project/llvm/lib/Support/raw_ostream.cpp, lines 866 to 868 at 1946d32.

is_displayed() is also used on Windows, so there is some potential for inconsistent and surprising behavior across character devices: llvm-project/llvm/lib/Support/raw_ostream.cpp, lines 506 to 520 at 1946d32.

The code for detecting a Windows console is specific to an actual Windows console device. It works for detecting a Windows console for processes running in a […]. For UNIX platforms, see llvm-project/llvm/lib/Support/Unix/Process.inc, lines 309 to 316 at 1946d32.

However, for reasons I don't understand, raw_fd_ostream::preferred_buffer_size() checks for both a character device and (indirectly) isatty() (this code used to call isatty() directly): llvm-project/llvm/lib/Support/raw_ostream.cpp, lines 851 to 862 at 1946d32.

I think it is worth cleaning this up to see if that 1) fixes the reported problem, and 2) causes any regressions (which would then prompt improving the comments to better explain the intent).
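A hedged sketch of the kind of consistency being suggested (illustrative, not a proposed patch): route the terminal check through llvm::sys::Process, which already knows how to ask the OS on both Windows and UNIX, instead of relying on a bare character-device check.

```cpp
#include "llvm/Support/Process.h"

// Illustrative helper: on UNIX this effectively boils down to isatty(FD); on
// Windows it consults the console APIs rather than just the file type.
static bool fdLooksLikeTerminal(int FD) {
  return llvm::sys::Process::FileDescriptorIsDisplayed(FD);
}
```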
@tahonermann thank you for your investigation, and sorry for the delay.
The issue is reported on Linux with a makefile as a reproducer. I don't know whether the same happens on Windows.
Well, simply commenting out the code that prevents buffering for a console and setting the incoming OS in TextDiagnostic to be buffered does help to make the output more stable. However, because llvm::errs() is easily accessible from any part of LLVM, I'm currently running into sporadic memory corruption in clang tooling tests; I suppose it comes from setting llvm::errs() to buffered/unbuffered from several threads. So either raw_ostream needs to become thread-safe somehow, or setting llvm::errs() to be buffered really does seem unsafe. I feel I'm stuck. WDYT?
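For contrast with toggling the buffering mode of the shared llvm::errs() stream, here is a hedged sketch of the message-local buffering approach the patch takes (illustrative names, not code from the patch): each diagnostic is formatted into storage local to the call and handed to stderr in a single write, so no shared raw_ostream state needs to be switched between buffered and unbuffered across threads.

```cpp
#include "llvm/ADT/SmallString.h"
#include "llvm/Support/raw_ostream.h"

// Illustrative only: the buffer lives on this call's stack, so concurrent
// callers never share or reconfigure a stream's buffer; errs() itself stays
// in its default unbuffered mode and receives one write per message.
static void reportFromAnyThread(llvm::StringRef Msg) {
  llvm::SmallString<256> Local;
  llvm::raw_svector_ostream OS(Local);
  OS << "warning: " << Msg << '\n';
  llvm::errs() << OS.str();
}
```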

To stabilize clang's output when clang is run in multiple build threads, the whole diagnostic message is first written to an internal buffer and then emitted in one piece to the output stream, which usually points to stderr and is unbuffered by default. This helps avoid printing to stderr in small chunks and interleaving multiple diagnostic messages.

This also fixes a slight regression that appeared somewhere around clang-17; I checked clang-14 and its output is slightly more stable.

In general, a tool that is run from several threads can't control output interleaving, so this change might be questionable; it is up to the build system/script/tool that runs clang to handle it. However, Make, for example, doesn't seem to prevent the compiler's output from interleaving when many threads are used, and gcc's output is much more stable. I was using a makefile to build 6 files with 6 threads, each file containing a warning; clang's output looked like this (just an illustration, it is pretty random):
gcc's output using the same makefile:
So I decided to give it a try, since it could greatly improve the user experience in some cases. It turned out to be simple and gave a relatively stable result.
clang's output after the change:
I'm not sure how to test this properly, so the PR doesn't contain any test. Please let me know what you think; I'm open to any suggestions.