Skip to content
Merged
Show file tree
Hide file tree
Changes from 11 commits
Commits
Show all changes
27 commits
Select commit Hold shift + click to select a range
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
119 changes: 94 additions & 25 deletions lldb/include/lldb/Core/Telemetry.h
Original file line number Diff line number Diff line change
Expand Up @@ -19,6 +19,7 @@
#include "llvm/ADT/StringRef.h"
#include "llvm/Support/JSON.h"
#include "llvm/Telemetry/Telemetry.h"
#include <atomic>
#include <chrono>
#include <ctime>
#include <memory>
Expand All @@ -28,6 +29,17 @@
namespace lldb_private {
namespace telemetry {

struct LLDBConfig : public ::llvm::telemetry::Config {
// If true, we will collect full details about a debug command (eg., args and
// original command). Note: This may contain PII, hence can only be enabled by
// the vendor while creating the Manager.
const bool m_detailed_command_telemetry;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the idea of the Configuration being the one deciding what events we emit. That's easy to audit. Can we make this a bit more generic and maybe have a set of telemetry::KindType that are enabled? We can have a default configuration that enables everything by default upstream (for testing) and then vendors have to override this downstream with the events they're interested in. Originally I imagined doing this in the emitter, but I think the config is a better place to do that. WDYT?

Copy link
Member Author

@oontvoo oontvoo Mar 4, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is for controlling how much data we collect for a given command - not for which type of events we collect.

But we can also add that if you want. Although you could already do taht by overriding the ::dispatch() to simply return/ignore the kind of entries you dont want to record. At that point, the data had been collected but it's not "leaving" LLDB yet - so still fine? (Or do you want something stricter? eg., not collect the data at all?)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ack, I think this is fine as is. Given that this is a struct, please drop the m_ prefix.

Suggested change
// If true, we will collect full details about a debug command (eg., args and
// original command). Note: This may contain PII, hence can only be enabled by
// the vendor while creating the Manager.
const bool m_detailed_command_telemetry;
// If true, we will collect full details about a debug command (eg., args and
// original command). Note: This may contain PII, hence can only be enabled by
// the vendor while creating the Manager.
const bool detailed_command_telemetry;

Copy link
Member Author

@oontvoo oontvoo Mar 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ack, I think this is fine as is. Given that this is a struct, please drop the m_ prefix.

done


explicit LLDBConfig(bool enable_telemetry, bool detailed_command_telemetry)
: ::llvm::telemetry::Config(enable_telemetry),
m_detailed_command_telemetry(detailed_command_telemetry) {}
};

// We expect each (direct) subclass of LLDBTelemetryInfo to
// have an LLDBEntryKind in the form 0b11xxxxxxxx
// Specifically:
Expand All @@ -37,6 +49,7 @@ namespace telemetry {
// must have their LLDBEntryKind in the similar form (ie., share common prefix)
struct LLDBEntryKind : public ::llvm::telemetry::EntryKind {
static const llvm::telemetry::KindType BaseInfo = 0b11000000;
static const llvm::telemetry::KindType CommandInfo = 0b11010000;
static const llvm::telemetry::KindType DebuggerInfo = 0b11000100;
};

Expand Down Expand Up @@ -66,6 +79,42 @@ struct LLDBBaseTelemetryInfo : public llvm::telemetry::TelemetryInfo {
void serialize(llvm::telemetry::Serializer &serializer) const override;
};

struct CommandInfo : public LLDBBaseTelemetryInfo {

// If the command is/can be associated with a target entry this field contains
// that target's UUID. <EMPTY> otherwise.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These should be Doxygen comments (///)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

std::string target_uuid;
// A unique ID for a command so the manager can match the start entry with
// its end entry. These values only need to be unique within the same session.
// Necessary because we'd send off an entry right before a command's execution
// and another right after. This is to avoid losing telemetry if the command
// does not execute successfully.
int command_id;
// Eg., "breakpoint set"
std::string command_name;
// !!NOTE!! These two fields are not collected by default due to PII risks.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Drop the !!NOTE!!, or use Note that.

Suggested change
// !!NOTE!! These two fields are not collected by default due to PII risks.
// These two fields are not collected by default due to PII risks.

When comments apply to multiple lines, you can use Doxygen groups:

/// Comment spanning multiple things
/// @{
thing one;
thing two;
/// @} 

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

// Vendor may allow them by setting the
// LLDBConfig::m_detailed_command_telemetry.
std::string original_command;
std::string args;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh I see, the m_detailed_command_telemetry boolean controls a subset of the command, which means my earlier suggestion doesn't work. I can't think of a way to make this more "generic" without breaking up the event, which you probably don't want to do?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right - that was the knob for controlling how much data is collected upstream.

(The previous design was to have those DispatchFoo() helper method which could be overriden by vendors to collect more or less data, as needed. There's pros and cons to each ...)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be worth wrapping these in std::optional to make it more obvious that these fields are optional?

(I thought I left this comment in an earlier comment, but I can't find it so maybe I forgot to submit it)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done!

// Return status of a command and any error description in case of error.
lldb::ReturnStatus ret_status;
std::string error_data;

CommandInfo() = default;

llvm::telemetry::KindType getKind() const override {
return LLDBEntryKind::CommandInfo;
}

static bool classof(const llvm::telemetry::TelemetryInfo *T) {
return (T->getKind() & LLDBEntryKind::CommandInfo) ==
LLDBEntryKind::CommandInfo;
}

void serialize(llvm::telemetry::Serializer &serializer) const override;
};

struct DebuggerInfo : public LLDBBaseTelemetryInfo {
std::string lldb_version;

Expand Down Expand Up @@ -93,64 +142,84 @@ class TelemetryManager : public llvm::telemetry::Manager {
public:
llvm::Error preDispatch(llvm::telemetry::TelemetryInfo *entry) override;

const llvm::telemetry::Config *GetConfig();
/// Returns the next unique ID to assign to a command entry.
int MakeNextCommandId();

const LLDBConfig *GetConfig() { return m_config.get(); }

virtual llvm::StringRef GetInstanceName() const = 0;

static TelemetryManager *GetInstance();

static TelemetryManager *GetInstanceIfEnabled();

protected:
TelemetryManager(std::unique_ptr<llvm::telemetry::Config> config);
TelemetryManager(std::unique_ptr<LLDBConfig> config);

static void SetInstance(std::unique_ptr<TelemetryManager> manger);

private:
std::unique_ptr<llvm::telemetry::Config> m_config;
std::unique_ptr<LLDBConfig> m_config;
// Each instance of a TelemetryManager is assigned a unique ID.
const std::string m_id;

// We assign each command (in the same session) a unique id so that their
// "start" and "end" entries can be matched up.
// These values don't need to be unique across runs (because they are
// secondary-key), hence a simple counter is sufficent.
std::atomic<int> m_command_id_seed = 0;
static std::unique_ptr<TelemetryManager> g_instance;
};

/// Helper RAII class for collecting telemetry.
template <typename Info> struct ScopedDispatcher {
// The debugger pointer is optional because we may not have a debugger yet.
// In that case, caller must set the debugger later.
ScopedDispatcher(llvm::unique_function<void(Info *info)> callback,
ScopedDispatcher(Debugger *debugger = nullptr) {
// Start the timer.
m_start_time = std::chrono::steady_clock::now();
this->debugger = debugger;
}
ScopedDispatcher(llvm::unique_function<void(Info *info)> final_callback,
Debugger *debugger = nullptr)
: m_callback(std::move(callback)) {
: m_final_callback(std::move(final_callback)) {
// Start the timer.
m_start_time = std::chrono::steady_clock::now();
m_info.debugger = debugger;
this->debugger = debugger;
}

void SetDebugger(Debugger *debugger) { m_info.debugger = debugger; }
void SetDebugger(Debugger *debugger) { this->debugger = debugger; }

~ScopedDispatcher() {
// If Telemetry is disabled (either at buildtime or runtime),
// then don't do anything.
TelemetryManager *manager = TelemetryManager::GetInstanceIfEnabled();
if (!manager)
return;
void DispatchOnExit(llvm::unique_function<void(Info *info)> final_callback) {
// We probably should not be overriding previously set cb.
assert(!m_final_callback);
m_final_callback = std::move(final_callback);
}

m_info.start_time = m_start_time;
// Populate common fields that we can only set now.
m_info.end_time = std::chrono::steady_clock::now();
// The callback will set the remaining fields.
m_callback(&m_info);
void DispatchNow(llvm::unique_function<void(Info *info)> populate_fields_cb) {
TelemetryManager *manager = TelemetryManager::GetInstance();
if (!manager->GetConfig()->EnableTelemetry)
return;
Info info;
// Populate the common fields we know aboutl
info.start_time = m_start_time;
info.end_time = std::chrono::steady_clock::now();
info.debugger = debugger;
// The callback will set the rest.
populate_fields_cb(&info);
// And then we dispatch.
if (llvm::Error er = manager->dispatch(&m_info)) {
if (llvm::Error er = manager->dispatch(&info)) {
LLDB_LOG_ERROR(GetLog(LLDBLog::Object), std::move(er),
"Failed to dispatch entry of type: {0}", m_info.getKind());
"Failed to dispatch entry of type: {0}", info.getKind());
}
}

~ScopedDispatcher() {
if (m_final_callback)
DispatchNow(std::move(m_final_callback));
}

private:
SteadyTimePoint m_start_time;
llvm::unique_function<void(Info *info)> m_callback;
Info m_info;
llvm::unique_function<void(Info *info)> m_final_callback;
Debugger *debugger;
};

} // namespace telemetry
Expand Down
61 changes: 47 additions & 14 deletions lldb/source/Core/Telemetry.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -59,6 +59,17 @@ void LLDBBaseTelemetryInfo::serialize(Serializer &serializer) const {
if (end_time.has_value())
serializer.write("end_time", ToNanosec(end_time.value()));
}
void CommandInfo::serialize(Serializer &serializer) const {
LLDBBaseTelemetryInfo::serialize(serializer);

serializer.write("target_uuid", target_uuid);
serializer.write("command_id", command_id);
serializer.write("command_name", command_name);
serializer.write("original_command", original_command);
serializer.write("args", args);
serializer.write("ret_status", ret_status);
serializer.write("error_data", error_data);
}

void DebuggerInfo::serialize(Serializer &serializer) const {
LLDBBaseTelemetryInfo::serialize(serializer);
Expand All @@ -67,7 +78,7 @@ void DebuggerInfo::serialize(Serializer &serializer) const {
serializer.write("is_exit_entry", is_exit_entry);
}

TelemetryManager::TelemetryManager(std::unique_ptr<Config> config)
TelemetryManager::TelemetryManager(std::unique_ptr<LLDBConfig> config)
: m_config(std::move(config)), m_id(MakeUUID()) {}

llvm::Error TelemetryManager::preDispatch(TelemetryInfo *entry) {
Expand All @@ -79,23 +90,45 @@ llvm::Error TelemetryManager::preDispatch(TelemetryInfo *entry) {
return llvm::Error::success();
}

const Config *TelemetryManager::GetConfig() { return m_config.get(); }

std::unique_ptr<TelemetryManager> TelemetryManager::g_instance = nullptr;
TelemetryManager *TelemetryManager::GetInstance() {
if (!Config::BuildTimeEnableTelemetry)
return nullptr;
return g_instance.get();
int TelemetryManager::MakeNextCommandId() {
return m_command_id_seed.fetch_add(1);
}

TelemetryManager *TelemetryManager::GetInstanceIfEnabled() {
// Telemetry may be enabled at build time but disabled at runtime.
if (TelemetryManager *ins = TelemetryManager::GetInstance()) {
if (ins->GetConfig()->EnableTelemetry)
return ins;
class NoOpTelemetryManager : public TelemetryManager {
public:
llvm::Error preDispatch(llvm::telemetry::TelemetryInfo *entry) override {
// Does nothing.
return llvm::Error::success();
}

return nullptr;
explicit NoOpTelemetryManager()
: TelemetryManager(std::make_unique<LLDBConfig>(
/*EnableTelemetry*/ false, /*DetailedCommand*/ false)) {}

virtual llvm::StringRef GetInstanceName() const override {
return "NoOpTelemetryManager";
}

llvm::Error dispatch(llvm::telemetry::TelemetryInfo *entry) override {
// Does nothing.
return llvm::Error::success();
}

static NoOpTelemetryManager *GetInstance() {
static std::unique_ptr<NoOpTelemetryManager> g_ins =
std::make_unique<NoOpTelemetryManager>();
return g_ins.get();
}
};

std::unique_ptr<TelemetryManager> TelemetryManager::g_instance = nullptr;
TelemetryManager *TelemetryManager::GetInstance() {
// If Telemetry is disabled or if there is no default instance, then use the
// NoOp manager. We use a dummy instance to avoid having to do nullchecks in
// various places.
if (!Config::BuildTimeEnableTelemetry || !g_instance)
return NoOpTelemetryManager::GetInstance();
return g_instance.get();
}

void TelemetryManager::SetInstance(std::unique_ptr<TelemetryManager> manager) {
Expand Down
60 changes: 51 additions & 9 deletions lldb/source/Interpreter/CommandInterpreter.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -46,7 +46,9 @@
#include "Commands/CommandObjectWatchpoint.h"

#include "lldb/Core/Debugger.h"
#include "lldb/Core/Module.h"
#include "lldb/Core/PluginManager.h"
#include "lldb/Core/Telemetry.h"
#include "lldb/Host/StreamFile.h"
#include "lldb/Utility/ErrorMessages.h"
#include "lldb/Utility/LLDBLog.h"
Expand Down Expand Up @@ -88,6 +90,7 @@
#include "llvm/Support/Path.h"
#include "llvm/Support/PrettyStackTrace.h"
#include "llvm/Support/ScopedPrinter.h"
#include "llvm/Telemetry/Telemetry.h"

#if defined(__APPLE__)
#include <TargetConditionals.h>
Expand Down Expand Up @@ -1883,9 +1886,49 @@ bool CommandInterpreter::HandleCommand(const char *command_line,
LazyBool lazy_add_to_history,
CommandReturnObject &result,
bool force_repeat_command) {
lldb_private::telemetry::ScopedDispatcher<
lldb_private::telemetry::CommandInfo>
helper(&m_debugger);
lldb_private::telemetry::TelemetryManager *ins =
lldb_private::telemetry::TelemetryManager::GetInstance();
const int command_id = ins->MakeNextCommandId();

std::string command_string(command_line);
std::string original_command_string(command_string);
std::string real_original_command_string(command_string);
std::string parsed_command_args;
CommandObject *cmd_obj = nullptr;
Comment on lines +1899 to +1900
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really dislike that the RAII object forces us to declare variables early like it's C99. Do we need the RAII helper here? Are there early returns we need to account for, or can we just fill in the object at the end of the function?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, unfortunately, there are a bunch of early returns in this function (which is also very long).

  • Line 1941-1945 : return due to some interruption
  • Line 1997 - 2020 : nested if statements to set the right return-status.


helper.DispatchNow([&](lldb_private::telemetry::CommandInfo *info) {
info->command_id = command_id;
if (Target *target = GetExecutionContext().GetTargetPtr()) {
// If we have a target attached to this command, then get the UUID.
info->target_uuid =
target->GetExecutableModule() != nullptr
? target->GetExecutableModule()->GetUUID().GetAsString()
: "";
}
if (ins->GetConfig()->m_detailed_command_telemetry)
info->original_command = original_command_string;
// The rest (eg., command_name, args, etc) hasn't been parsed yet;
// Those will be collected by the on-exit-callback.
});

helper.DispatchOnExit([&](lldb_private::telemetry::CommandInfo *info) {
// TODO: this is logging the time the command-handler finishes.
// But we may want a finer-grain durations too?
// (ie., the execute_time recorded below?)
info->command_id = command_id;
llvm::StringRef command_name =
cmd_obj ? cmd_obj->GetCommandName() : "<not found>";
info->command_name = command_name.str();
info->ret_status = result.GetStatus();
if (std::string error_str = result.GetErrorString(); !error_str.empty())
info->error_data = std::move(error_str);

if (ins->GetConfig()->m_detailed_command_telemetry)
info->args = parsed_command_args;
});

Log *log = GetLog(LLDBLog::Commands);
LLDB_LOGF(log, "Processing command: %s", command_line);
Expand Down Expand Up @@ -1991,7 +2034,7 @@ bool CommandInterpreter::HandleCommand(const char *command_line,
// From 1 above, we can determine whether the Execute function wants raw
// input or not.

CommandObject *cmd_obj = ResolveCommandImpl(command_string, result);
cmd_obj = ResolveCommandImpl(command_string, result);

// We have to preprocess the whole command string for Raw commands, since we
// don't know the structure of the command. For parsed commands, we only
Expand Down Expand Up @@ -2053,37 +2096,36 @@ bool CommandInterpreter::HandleCommand(const char *command_line,
if (add_to_history)
m_command_history.AppendString(original_command_string);

std::string remainder;
const std::size_t actual_cmd_name_len = cmd_obj->GetCommandName().size();
if (actual_cmd_name_len < command_string.length())
remainder = command_string.substr(actual_cmd_name_len);
parsed_command_args = command_string.substr(actual_cmd_name_len);

// Remove any initial spaces
size_t pos = remainder.find_first_not_of(k_white_space);
size_t pos = parsed_command_args.find_first_not_of(k_white_space);
if (pos != 0 && pos != std::string::npos)
remainder.erase(0, pos);
parsed_command_args.erase(0, pos);

LLDB_LOGF(
log, "HandleCommand, command line after removing command name(s): '%s'",
remainder.c_str());
parsed_command_args.c_str());

// To test whether or not transcript should be saved, `transcript_item` is
// used instead of `GetSaveTranscript()`. This is because the latter will
// fail when the command is "settings set interpreter.save-transcript true".
if (transcript_item) {
transcript_item->AddStringItem("commandName", cmd_obj->GetCommandName());
transcript_item->AddStringItem("commandArguments", remainder);
transcript_item->AddStringItem("commandArguments", parsed_command_args);
}

ElapsedTime elapsed(execute_time);
cmd_obj->SetOriginalCommandString(real_original_command_string);
// Set the indent to the position of the command in the command line.
pos = real_original_command_string.rfind(remainder);
pos = real_original_command_string.rfind(parsed_command_args);
std::optional<uint16_t> indent;
if (pos != std::string::npos)
indent = pos;
result.SetDiagnosticIndent(indent);
cmd_obj->Execute(remainder.c_str(), result);
cmd_obj->Execute(parsed_command_args.c_str(), result);
}

LLDB_LOGF(log, "HandleCommand, command %s",
Expand Down
Loading