Skip to content
Merged
Show file tree
Hide file tree
Changes from 20 commits
Commits
Show all changes
27 commits
Select commit Hold shift + click to select a range
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
123 changes: 98 additions & 25 deletions lldb/include/lldb/Core/Telemetry.h
Original file line number Diff line number Diff line change
Expand Up @@ -13,12 +13,14 @@
#include "lldb/Interpreter/CommandReturnObject.h"
#include "lldb/Utility/LLDBLog.h"
#include "lldb/Utility/StructuredData.h"
#include "lldb/Utility/UUID.h"
#include "lldb/lldb-forward.h"
#include "llvm/ADT/FunctionExtras.h"
#include "llvm/ADT/StringExtras.h"
#include "llvm/ADT/StringRef.h"
#include "llvm/Support/JSON.h"
#include "llvm/Telemetry/Telemetry.h"
#include <atomic>
#include <chrono>
#include <ctime>
#include <memory>
Expand All @@ -28,6 +30,17 @@
namespace lldb_private {
namespace telemetry {

struct LLDBConfig : public ::llvm::telemetry::Config {
// If true, we will collect full details about a debug command (eg., args and
// original command). Note: This may contain PII, hence can only be enabled by
// the vendor while creating the Manager.
const bool m_detailed_command_telemetry;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the idea of the Configuration being the one deciding what events we emit. That's easy to audit. Can we make this a bit more generic and maybe have a set of telemetry::KindType that are enabled? We can have a default configuration that enables everything by default upstream (for testing) and then vendors have to override this downstream with the events they're interested in. Originally I imagined doing this in the emitter, but I think the config is a better place to do that. WDYT?

Copy link
Member Author

@oontvoo oontvoo Mar 4, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is for controlling how much data we collect for a given command - not for which type of events we collect.

But we can also add that if you want. Although you could already do taht by overriding the ::dispatch() to simply return/ignore the kind of entries you dont want to record. At that point, the data had been collected but it's not "leaving" LLDB yet - so still fine? (Or do you want something stricter? eg., not collect the data at all?)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ack, I think this is fine as is. Given that this is a struct, please drop the m_ prefix.

Suggested change
// If true, we will collect full details about a debug command (eg., args and
// original command). Note: This may contain PII, hence can only be enabled by
// the vendor while creating the Manager.
const bool m_detailed_command_telemetry;
// If true, we will collect full details about a debug command (eg., args and
// original command). Note: This may contain PII, hence can only be enabled by
// the vendor while creating the Manager.
const bool detailed_command_telemetry;

Copy link
Member Author

@oontvoo oontvoo Mar 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ack, I think this is fine as is. Given that this is a struct, please drop the m_ prefix.

done


explicit LLDBConfig(bool enable_telemetry, bool detailed_command_telemetry)
: ::llvm::telemetry::Config(enable_telemetry),
m_detailed_command_telemetry(detailed_command_telemetry) {}
};

// We expect each (direct) subclass of LLDBTelemetryInfo to
// have an LLDBEntryKind in the form 0b11xxxxxxxx
// Specifically:
Expand All @@ -37,6 +50,7 @@ namespace telemetry {
// must have their LLDBEntryKind in the similar form (ie., share common prefix)
struct LLDBEntryKind : public ::llvm::telemetry::EntryKind {
static const llvm::telemetry::KindType BaseInfo = 0b11000000;
static const llvm::telemetry::KindType CommandInfo = 0b11010000;
static const llvm::telemetry::KindType DebuggerInfo = 0b11000100;
};

Expand Down Expand Up @@ -66,6 +80,50 @@ struct LLDBBaseTelemetryInfo : public llvm::telemetry::TelemetryInfo {
void serialize(llvm::telemetry::Serializer &serializer) const override;
};

struct CommandInfo : public LLDBBaseTelemetryInfo {
// If the command is/can be associated with a target entry this field contains
// that target's UUID. <EMPTY> otherwise.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These should be Doxygen comments (///)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

UUID target_uuid;
// A unique ID for a command so the manager can match the start entry with
// its end entry. These values only need to be unique within the same session.
// Necessary because we'd send off an entry right before a command's execution
// and another right after. This is to avoid losing telemetry if the command
// does not execute successfully.
uint64_t command_id;
// Eg., "breakpoint set"
std::string command_name;
// !!NOTE!! These two fields are not collected by default due to PII risks.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Drop the !!NOTE!!, or use Note that.

Suggested change
// !!NOTE!! These two fields are not collected by default due to PII risks.
// These two fields are not collected by default due to PII risks.

When comments apply to multiple lines, you can use Doxygen groups:

/// Comment spanning multiple things
/// @{
thing one;
thing two;
/// @} 

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

// Vendor may allow them by setting the
// LLDBConfig::m_detailed_command_telemetry.
std::string original_command;
std::string args;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh I see, the m_detailed_command_telemetry boolean controls a subset of the command, which means my earlier suggestion doesn't work. I can't think of a way to make this more "generic" without breaking up the event, which you probably don't want to do?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right - that was the knob for controlling how much data is collected upstream.

(The previous design was to have those DispatchFoo() helper method which could be overriden by vendors to collect more or less data, as needed. There's pros and cons to each ...)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be worth wrapping these in std::optional to make it more obvious that these fields are optional?

(I thought I left this comment in an earlier comment, but I can't find it so maybe I forgot to submit it)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done!

// Return status of a command and any error description in case of error.
std::optional<lldb::ReturnStatus> ret_status;
std::optional<std::string> error_data;

CommandInfo() = default;

llvm::telemetry::KindType getKind() const override {
return LLDBEntryKind::CommandInfo;
}

static bool classof(const llvm::telemetry::TelemetryInfo *T) {
return (T->getKind() & LLDBEntryKind::CommandInfo) ==
LLDBEntryKind::CommandInfo;
}

void serialize(llvm::telemetry::Serializer &serializer) const override;

static uint64_t GetNextId();

private:
// We assign each command (in the same session) a unique id so that their
// "start" and "end" entries can be matched up.
// These values don't need to be unique across runs (because they are
// secondary-key), hence a simple counter is sufficent.
static std::atomic<uint64_t> g_command_id_seed;
};

struct DebuggerInfo : public LLDBBaseTelemetryInfo {
std::string lldb_version;

Expand Down Expand Up @@ -93,64 +151,79 @@ class TelemetryManager : public llvm::telemetry::Manager {
public:
llvm::Error preDispatch(llvm::telemetry::TelemetryInfo *entry) override;

const llvm::telemetry::Config *GetConfig();
/// Returns the next unique ID to assign to a command entry.
int MakeNextCommandId();

const LLDBConfig *GetConfig() { return m_config.get(); }

virtual llvm::StringRef GetInstanceName() const = 0;

static TelemetryManager *GetInstance();

static TelemetryManager *GetInstanceIfEnabled();

protected:
TelemetryManager(std::unique_ptr<llvm::telemetry::Config> config);
TelemetryManager(std::unique_ptr<LLDBConfig> config);

static void SetInstance(std::unique_ptr<TelemetryManager> manger);

private:
std::unique_ptr<llvm::telemetry::Config> m_config;
std::unique_ptr<LLDBConfig> m_config;
// Each instance of a TelemetryManager is assigned a unique ID.
const std::string m_id;

static std::unique_ptr<TelemetryManager> g_instance;
};

/// Helper RAII class for collecting telemetry.
template <typename Info> struct ScopedDispatcher {
// The debugger pointer is optional because we may not have a debugger yet.
// In that case, caller must set the debugger later.
ScopedDispatcher(llvm::unique_function<void(Info *info)> callback,
ScopedDispatcher(Debugger *debugger = nullptr) {
// Start the timer.
m_start_time = std::chrono::steady_clock::now();
this->debugger = debugger;
}
ScopedDispatcher(llvm::unique_function<void(Info *info)> final_callback,
Debugger *debugger = nullptr)
: m_callback(std::move(callback)) {
: m_final_callback(std::move(final_callback)) {
// Start the timer.
m_start_time = std::chrono::steady_clock::now();
m_info.debugger = debugger;
this->debugger = debugger;
}

void SetDebugger(Debugger *debugger) { m_info.debugger = debugger; }
void SetDebugger(Debugger *debugger) { this->debugger = debugger; }

~ScopedDispatcher() {
// If Telemetry is disabled (either at buildtime or runtime),
// then don't do anything.
TelemetryManager *manager = TelemetryManager::GetInstanceIfEnabled();
if (!manager)
return;
void DispatchOnExit(llvm::unique_function<void(Info *info)> final_callback) {
// We probably should not be overriding previously set cb.
assert(!m_final_callback);
m_final_callback = std::move(final_callback);
}

m_info.start_time = m_start_time;
// Populate common fields that we can only set now.
m_info.end_time = std::chrono::steady_clock::now();
// The callback will set the remaining fields.
m_callback(&m_info);
void DispatchNow(llvm::unique_function<void(Info *info)> populate_fields_cb) {
TelemetryManager *manager = TelemetryManager::GetInstance();
if (!manager->GetConfig()->EnableTelemetry)
return;
Info info;
// Populate the common fields we know about.
info.start_time = m_start_time;
info.end_time = std::chrono::steady_clock::now();
info.debugger = debugger;
// The callback will set the rest.
populate_fields_cb(&info);
// And then we dispatch.
if (llvm::Error er = manager->dispatch(&m_info)) {
if (llvm::Error er = manager->dispatch(&info)) {
LLDB_LOG_ERROR(GetLog(LLDBLog::Object), std::move(er),
"Failed to dispatch entry of type: {0}", m_info.getKind());
"Failed to dispatch entry of type: {0}", info.getKind());
}
}

~ScopedDispatcher() {
if (m_final_callback)
DispatchNow(std::move(m_final_callback));
}

private:
SteadyTimePoint m_start_time;
llvm::unique_function<void(Info *info)> m_callback;
Info m_info;
llvm::unique_function<void(Info *info)> m_final_callback;
Debugger *debugger;
};

} // namespace telemetry
Expand Down
63 changes: 49 additions & 14 deletions lldb/source/Core/Telemetry.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -60,14 +60,31 @@ void LLDBBaseTelemetryInfo::serialize(Serializer &serializer) const {
serializer.write("end_time", ToNanosec(end_time.value()));
}

void CommandInfo::serialize(Serializer &serializer) const {
LLDBBaseTelemetryInfo::serialize(serializer);

serializer.write("target_uuid", target_uuid.GetAsString());
serializer.write("command_id", command_id);
serializer.write("command_name", command_name);
serializer.write("original_command", original_command);
serializer.write("args", args);
if (ret_status.has_value())
serializer.write("ret_status", ret_status.value());
if (error_data.has_value())
serializer.write("error_data", error_data.value());
}

void DebuggerInfo::serialize(Serializer &serializer) const {
LLDBBaseTelemetryInfo::serialize(serializer);

serializer.write("lldb_version", lldb_version);
serializer.write("is_exit_entry", is_exit_entry);
}

TelemetryManager::TelemetryManager(std::unique_ptr<Config> config)
std::atomic<uint64_t> CommandInfo::g_command_id_seed = 0;
uint64_t CommandInfo::GetNextId() { return g_command_id_seed.fetch_add(1); }

TelemetryManager::TelemetryManager(std::unique_ptr<LLDBConfig> config)
: m_config(std::move(config)), m_id(MakeUUID()) {}

llvm::Error TelemetryManager::preDispatch(TelemetryInfo *entry) {
Expand All @@ -79,23 +96,41 @@ llvm::Error TelemetryManager::preDispatch(TelemetryInfo *entry) {
return llvm::Error::success();
}

const Config *TelemetryManager::GetConfig() { return m_config.get(); }
class NoOpTelemetryManager : public TelemetryManager {
public:
llvm::Error preDispatch(llvm::telemetry::TelemetryInfo *entry) override {
// Does nothing.
return llvm::Error::success();
}

std::unique_ptr<TelemetryManager> TelemetryManager::g_instance = nullptr;
TelemetryManager *TelemetryManager::GetInstance() {
if (!Config::BuildTimeEnableTelemetry)
return nullptr;
return g_instance.get();
}
explicit NoOpTelemetryManager()
: TelemetryManager(std::make_unique<LLDBConfig>(
/*EnableTelemetry*/ false, /*DetailedCommand*/ false)) {}

virtual llvm::StringRef GetInstanceName() const override {
return "NoOpTelemetryManager";
}

llvm::Error dispatch(llvm::telemetry::TelemetryInfo *entry) override {
// Does nothing.
return llvm::Error::success();
}

TelemetryManager *TelemetryManager::GetInstanceIfEnabled() {
// Telemetry may be enabled at build time but disabled at runtime.
if (TelemetryManager *ins = TelemetryManager::GetInstance()) {
if (ins->GetConfig()->EnableTelemetry)
return ins;
static NoOpTelemetryManager *GetInstance() {
static std::unique_ptr<NoOpTelemetryManager> g_ins =
std::make_unique<NoOpTelemetryManager>();
return g_ins.get();
}
};

return nullptr;
std::unique_ptr<TelemetryManager> TelemetryManager::g_instance = nullptr;
TelemetryManager *TelemetryManager::GetInstance() {
// If Telemetry is disabled or if there is no default instance, then use the
// NoOp manager. We use a dummy instance to avoid having to do nullchecks in
// various places.
if (!Config::BuildTimeEnableTelemetry || !g_instance)
return NoOpTelemetryManager::GetInstance();
return g_instance.get();
}

void TelemetryManager::SetInstance(std::unique_ptr<TelemetryManager> manager) {
Expand Down
Loading