[Telemetry] Set up a base skeleton framework #5319

Draft
kadupoornima wants to merge 10 commits into GoogleCloudPlatform:develop from kadupoornima:telemetry-1

kadupoornima (Contributor) commented Mar 6, 2026

Cluster Toolkit Telemetry

The objective of this effort is to design a robust telemetry system for Cluster Toolkit that captures anonymized usage data. This proposed enhancement would help the team understand how modules and blueprints are used across different environments. When enabled, the system will automatically collect non-sensitive metrics (no PII) and deployment outcomes, so that product decisions and roadmap planning can draw on real usage data. Users will be able to choose whether to opt in.

This PR - Setting up a base skeleton framework

Data is collected from each CLI run and then sent to Clearcut as an HTTP POST request.

Introduced a new pkg/telemetry package:

  • collector.go: The required metrics are collected here. Support for COMMAND_NAME, IS_TEST_DATA, RUNTIME_MS, EXIT_CODE has been added in this PR.
  • telemetry.go: Contains methods to construct the payload and handle the complete telemetry flow.
  • uploader.go: Includes a Flush() method to send the event payload to the internal server for future analysis.
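
As a rough illustration of the collection step described above, here is a minimal sketch of how the four metrics might be recorded before and after a command runs. The metric keys come from this PR; the `Collector` type, its method signatures, and the string encoding of values are assumptions made for the example and are not taken from the actual diff.

```go
package main

import (
	"strconv"
	"time"
)

// Metric keys named in the PR description.
const (
	CommandName = "COMMAND_NAME"
	IsTestData  = "IS_TEST_DATA"
	RuntimeMs   = "RUNTIME_MS"
	ExitCode    = "EXIT_CODE"
)

// Collector holds the metrics for a single CLI run.
// Hypothetical type: the PR's collector.go may be organized differently.
type Collector struct {
	metrics map[string]string
}

// NewCollector returns an empty collector for one run.
func NewCollector() *Collector {
	return &Collector{metrics: make(map[string]string)}
}

// CollectPreMetrics records what is known before the command executes.
func (c *Collector) CollectPreMetrics(command string, isTest bool) {
	c.metrics[CommandName] = command
	c.metrics[IsTestData] = strconv.FormatBool(isTest)
}

// CollectPostMetrics records what is known only after the command finishes.
// The caller measures the runtime, for example with time.Since(start).
func (c *Collector) CollectPostMetrics(runtime time.Duration, exitCode int) {
	c.metrics[RuntimeMs] = strconv.FormatInt(runtime.Milliseconds(), 10)
	c.metrics[ExitCode] = strconv.Itoa(exitCode)
}

// Metrics returns a copy so callers cannot mutate the collector's state.
func (c *Collector) Metrics() map[string]string {
	out := make(map[string]string, len(c.metrics))
	for k, v := range c.metrics {
		out[k] = v
	}
	return out
}
```

In this sketch a caller would invoke `CollectPreMetrics` when the command starts and `CollectPostMetrics` once it returns, then hand `Metrics()` to whatever builds the payload.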

Added unit tests for the new code and performed local testing.
This PR has no effect on production: DATA IS NOT BEING COLLECTED NOW.

kadupoornima added the release-chore (To not include into release notes) label Mar 6, 2026
gemini-code-assist (Contributor) commented

Summary of Changes

Hello, I'm Gemini Code Assist[1]! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces a foundational telemetry framework for the Cluster Toolkit. Its primary purpose is to collect usage data from each CLI command execution, including command details, execution time, and exit status. This data is then formatted into a structured payload and transmitted to an internal server for analysis, enabling insights into tool usage and performance without affecting production environments.

Highlights

  • Telemetry Framework Introduction: A new pkg/telemetry package has been added to establish a base skeleton for collecting telemetry data for the Cluster Toolkit.
  • Metric Collection: collector.go is introduced to gather essential metrics such as command name, test data status, runtime duration, and exit code for each CLI run.
  • Telemetry Flow Management: telemetry.go provides methods to construct the telemetry payload and manage the overall data collection and submission process.
  • Data Upload Mechanism: uploader.go includes a Flush() method responsible for sending the collected event payload via an HTTP POST request to the Clearcut internal server.
  • Toolkit Versioning: The toolkit's version retrieval mechanism has been updated to use a new GetToolkitVersion() function from the pkg/config package.
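
To make the payload-construction step concrete, the following is a minimal sketch of turning a metrics map into key-value pairs and wrapping them in a request structure. The names `ConstructPayload`, `getEventMetadataKVPairs`, `ClientInfo`, `LogEvent`, `LogRequest`, and the `CLUSTER_TOOLKIT` client type appear in this PR; every field name, JSON tag, and function signature below is an assumption, and the real Clearcut wire format is not shown on this page.

```go
package main

import (
	"encoding/json"
	"sort"
)

// KV is one metadata entry. Hypothetical shape.
type KV struct {
	Key   string `json:"key"`
	Value string `json:"value"`
}

// ClientInfo, LogEvent, and LogRequest mirror the struct names in the PR,
// but the fields here are illustrative assumptions.
type ClientInfo struct {
	ClientType string `json:"client_type"`
}

type LogEvent struct {
	EventTimeMs int64 `json:"event_time_ms"`
	Metadata    []KV  `json:"metadata"`
}

type LogRequest struct {
	ClientInfo ClientInfo `json:"client_info"`
	LogEvents  []LogEvent `json:"log_events"`
}

// getEventMetadataKVPairs converts the metrics map into a slice of pairs.
// Keys are sorted so the output is deterministic, since Go map iteration
// order is not.
func getEventMetadataKVPairs(metrics map[string]string) []KV {
	keys := make([]string, 0, len(metrics))
	for k := range metrics {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	pairs := make([]KV, 0, len(keys))
	for _, k := range keys {
		pairs = append(pairs, KV{Key: k, Value: metrics[k]})
	}
	return pairs
}

// ConstructPayload wraps one event and the client information in a request.
func ConstructPayload(metrics map[string]string, eventTimeMs int64) LogRequest {
	return LogRequest{
		ClientInfo: ClientInfo{ClientType: "CLUSTER_TOOLKIT"},
		LogEvents: []LogEvent{{
			EventTimeMs: eventTimeMs,
			Metadata:    getEventMetadataKVPairs(metrics),
		}},
	}
}

// MarshalPayload renders the request as a JSON string.
func MarshalPayload(req LogRequest) (string, error) {
	b, err := json.Marshal(req)
	return string(b), err
}
```

Sorting the keys is a design choice for this sketch only: it keeps tests stable and makes payloads easy to compare, at negligible cost for a handful of metrics.
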
Changelog
  • cmd/root.go
    • Updated the Version field to dynamically retrieve the toolkit version using config.GetToolkitVersion().
  • pkg/config/config.go
    • Added a new constant latestToolkitVersion to define the current toolkit version.
    • Introduced GetToolkitVersion() function to provide the toolkit's version.
  • pkg/telemetry/collector.go
    • Added CollectPreMetrics to record command name and test data status at the start of execution.
    • Added CollectPostMetrics to record runtime duration and exit code at the end of execution.
    • Implemented helper functions getCommandName, getIsTestData, and getRuntime.
  • pkg/telemetry/collector_test.go
    • Added comprehensive unit tests for CollectPreMetrics, CollectPostMetrics, getCommandName, getIsTestData, and getRuntime functions.
  • pkg/telemetry/telemetry.go
    • Added Initialize and Finalize functions to orchestrate metric collection and payload flushing.
    • Implemented ConstructPayload to build the LogRequest structure, including event metadata and client information.
    • Added getEventMetadataKVPairs to convert collected metadata into key-value pairs for the payload.
  • pkg/telemetry/types.go
    • Defined constants for Clearcut production URL, HTTP server timeout, client types (CLUSTER_TOOLKIT, CONCORD), and specific CTK metrics (COMMAND_NAME, IS_TEST_DATA, RUNTIME_MS, EXIT_CODE).
    • Declared Go structs ClientInfo, LogEvent, and LogRequest to model the telemetry data structure.
  • pkg/telemetry/uploader.go
    • Added Flush function to marshal the LogRequest payload into JSON and send it via an HTTP POST request to the configured Clearcut URL.
    • Configured HTTP client with a timeout and set appropriate User-Agent and Content-Type headers.
  • pkg/telemetry/uploader_test.go
    • Added unit tests for the Flush function, including tests for successful requests, network timeouts, and invalid URL handling, using a mock HTTP server.
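
The upload step listed above (marshal to JSON, POST with a timeout, set User-Agent and Content-Type) could look roughly like the sketch below. `Flush` and `LogRequest` are names from this PR; the function signatures, the `buildRequest` helper, the stand-in payload fields, and the User-Agent string are assumptions for illustration only.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"time"
)

// LogRequest is a stand-in for the PR's payload struct; the real fields
// are not shown on this page.
type LogRequest struct {
	Events []map[string]string `json:"events"`
}

// buildRequest marshals the payload and prepares the POST request.
// Hypothetical helper, split out so the request can be inspected in tests.
func buildRequest(url string, payload LogRequest) (*http.Request, error) {
	body, err := json.Marshal(payload)
	if err != nil {
		return nil, fmt.Errorf("marshal telemetry payload: %w", err)
	}
	req, err := http.NewRequest(http.MethodPost, url, bytes.NewReader(body))
	if err != nil {
		return nil, fmt.Errorf("build telemetry request: %w", err)
	}
	req.Header.Set("Content-Type", "application/json")
	req.Header.Set("User-Agent", "cluster-toolkit-telemetry-sketch") // assumed value
	return req, nil
}

// Flush sends the payload and gives up after the timeout, so a slow or
// unreachable endpoint cannot stall the CLI indefinitely.
func Flush(url string, timeout time.Duration, payload LogRequest) error {
	req, err := buildRequest(url, payload)
	if err != nil {
		return err
	}
	client := &http.Client{Timeout: timeout}
	resp, err := client.Do(req)
	if err != nil {
		return fmt.Errorf("send telemetry: %w", err)
	}
	defer resp.Body.Close()
	if resp.StatusCode < 200 || resp.StatusCode > 299 {
		return fmt.Errorf("telemetry endpoint returned %s", resp.Status)
	}
	return nil
}
```

Returning wrapped errors rather than logging inside `Flush` leaves the caller free to decide whether a telemetry failure should be ignored silently, which is usually what a CLI wants.
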
Activity
  • No human activity (comments, reviews, approvals) has been recorded on this pull request yet.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

kadupoornima marked this pull request as ready for review March 6, 2026 09:04
kadupoornima requested review from a team and samskillman as code owners March 6, 2026 09:04
gemini-code-assist bot (Contributor) left a comment

Code Review

This pull request introduces a foundational framework for telemetry collection, aiming to gather usage insights. However, there are significant security concerns regarding the management of global state. Specifically, the use of global variables for storing telemetry data without proper synchronization or cleanup leads to memory leaks, incorrect data reporting, and potential runtime panics due to race conditions, especially if the package is used in concurrent contexts. Additionally, the review highlights the need for refactoring to avoid package-level global state for improved testability, fixing a bug in event metadata aggregation, improving error handling in data upload, and making configuration more flexible.
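
One common way to address the global-state concern raised in this review is to keep telemetry data in an instance guarded by a mutex and to reset it when it is read. The sketch below shows that general pattern; the `Recorder` type and its methods are hypothetical and are not the PR's code or the reviewer's specific suggestion.

```go
package main

import "sync"

// Recorder is a hypothetical replacement for package-level telemetry
// variables. Each CLI run (or test) creates its own instance.
type Recorder struct {
	mu      sync.Mutex
	metrics map[string]string
}

// NewRecorder returns an empty recorder.
func NewRecorder() *Recorder {
	return &Recorder{metrics: make(map[string]string)}
}

// Set stores one metric; the mutex makes concurrent calls safe.
func (r *Recorder) Set(key, value string) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.metrics[key] = value
}

// Drain returns the collected metrics and resets the recorder, so data
// from one event cannot leak into the next.
func (r *Recorder) Drain() map[string]string {
	r.mu.Lock()
	defer r.mu.Unlock()
	out := r.metrics
	r.metrics = make(map[string]string)
	return out
}

// recordConcurrently writes each key from its own goroutine and waits for
// all of them, to exercise the locking.
func recordConcurrently(r *Recorder, keys []string) {
	var wg sync.WaitGroup
	for _, k := range keys {
		wg.Add(1)
		go func(key string) {
			defer wg.Done()
			r.Set(key, "1")
		}(k)
	}
	wg.Wait()
}
```

Because the state lives in a value rather than in package variables, tests can create a fresh `Recorder` per case instead of resetting shared globals between runs.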

kadupoornima changed the title from "[CTK Telemetry] Set up a base framework" to "[Telemetry] Set up a base skeleton framework" Mar 6, 2026
kadupoornima marked this pull request as draft March 10, 2026 09:48