Skip to content

System metrics setup#1041

Merged
panavenue merged 11 commits intofeature/client-side-metricsfrom
system_metrics_setup
Dec 5, 2025
Merged

System metrics setup#1041
panavenue merged 11 commits intofeature/client-side-metricsfrom
system_metrics_setup

Conversation

@panavenue
Copy link
Collaborator

@panavenue panavenue commented Nov 4, 2025

Description - This pull request introduces a comprehensive system for collecting and exporting client-side metrics from the Cloud SQL Go Connector, providing valuable insights into the connector's performance and behavior. The key changes include the integration of OpenTelemetry for metrics collection and the introduction of a dedicated telemetry package.

  1. Setup Metric meter and records in tel with new metrics -
  • connect_latencies: A histogram that records the duration of dial operations, providing insight into connection latency.
  • open_connections: An up-down counter that tracks the number of currently open connections to a Cloud SQL instance.
  • closed_connection_count: A counter that records the total number of closed connections, which helps in understanding connection churn.
  1. Dialer Integration
  • A MetricRecorder is now lazily initialized for each Cloud SQL instance when a connection is first established
  • The Dial method has been enhanced to record metrics at various stages of a connection's lifecycle. For example, it records the connection latency upon a successful dial and increments the open connections counter
  • When a connection is closed, the open connections counter is decremented, and the closed connection count is incremented

@panavenue panavenue requested a review from a team as a code owner November 4, 2025 23:18
@panavenue panavenue force-pushed the system_metrics_setup branch from 8747506 to 1f835d8 Compare November 4, 2025 23:46
@panavenue panavenue force-pushed the system_metrics_setup branch from 1f835d8 to ad4a6e6 Compare November 4, 2025 23:55
@panavenue panavenue requested a review from enocom November 14, 2025 19:02
Copy link
Member

@enocom enocom left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Glad to see the AlloyDB implementation was helpful. Left a couple of comments.


// Resolve the instance connection name to a ConnName struct.
// Note: icn may be a domain name that resolves to an instance connection name.
cn, err := d.resolver.Resolve(ctx, icn)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should count DNS lookup successes and errors. Does this make more sense as a label on our connect count metric, or is DNS resolution + status its own separate counter.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh good call! I don't think I have any DNS lookup/resolve metric in my design doc yet. I will add this one to the doc.

I think having a separate counter would be better. WDYT?

Copy link
Member

@enocom enocom left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM -- except you're missing the option to disable metrics.

@panavenue panavenue requested a review from kgala2 December 3, 2025 18:17
@panavenue panavenue merged commit beb63d8 into feature/client-side-metrics Dec 5, 2025
13 checks passed
@panavenue panavenue deleted the system_metrics_setup branch December 5, 2025 00:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants