feat(csharp): add telemetry tags to driver activities (WI-6.2) #177
Closed
jadewang-db wants to merge 18 commits into main
This was referenced Jan 22, 2026
d6e909c to 1813066
1813066 to 1f7cde0
jadewang-db added a commit that referenced this pull request Jan 23, 2026
## 🥞 Stacked PR
Use this [link](https://github.com/adbc-drivers/databricks/pull/161/files) to review incremental changes.
- [**stack/wi-1.2-tag-definition-system**](#161) [[Files changed](https://github.com/adbc-drivers/databricks/pull/161/files)]
- [stack/wi-2.1-telemetry-data-models](#162) [[Files changed](https://github.com/adbc-drivers/databricks/pull/162/files/ab7fa964ff62f3fc9884034e17a7e57630fa8037..a566292aec78d19717c92e28f135535b09f25c80)]
- [stack/wi-2.1-exception-classifier](#163) [[Files changed](https://github.com/adbc-drivers/databricks/pull/163/files/a566292aec78d19717c92e28f135535b09f25c80..baa7a2ae32662fddc65272e0264e8bb7d1644716)]
- [stack/wi-3.1-circuit-breaker](#164) [[Files changed](https://github.com/adbc-drivers/databricks/pull/164/files/baa7a2ae32662fddc65272e0264e8bb7d1644716..03f7027e6731efe032c15555afe517ba49de3651)]
- [stack/wi-3.1-feature-flag-cache](#165) [[Files changed](https://github.com/adbc-drivers/databricks/pull/165/files/03f7027e6731efe032c15555afe517ba49de3651..1d6e3d5b1c4c31ec91361337e574e6e5411fbbb6)]
- [stack/wi-3.4-databricks-telemetry-exporter](#166) [[Files changed](https://github.com/adbc-drivers/databricks/pull/166/files/1d6e3d5b1c4c31ec91361337e574e6e5411fbbb6..eb382cb291c120a5f3cc3a1c38e0975b99c1369f)]
- [stack/wi-3.5-metrics-aggregator](#167) [[Files changed](https://github.com/adbc-drivers/databricks/pull/167/files/eb382cb291c120a5f3cc3a1c38e0975b99c1369f..67723fabe6f62d7ed16591c3e88e96aa269daddd)]
- [stack/wi-3.5-circuit-breaker-manager](#168) [[Files changed](https://github.com/adbc-drivers/databricks/pull/168/files/67723fabe6f62d7ed16591c3e88e96aa269daddd..6b66d37e9d97ca621d88c48a58ac60b2487425ea)]
- [stack/e2e-feature-flag-cache-tests](#169) [[Files changed](https://github.com/adbc-drivers/databricks/pull/169/files/6b66d37e9d97ca621d88c48a58ac60b2487425ea..2a6fff2b9b91c7fd6cff7558d1d3b3596c0fa3c2)]
- [stack/databricks-activity-listener](#170) [[Files changed](https://github.com/adbc-drivers/databricks/pull/170/files/2a6fff2b9b91c7fd6cff7558d1d3b3596c0fa3c2..39f6aed55278a533390e9aadf655f80dc11159c2)]
- [stack/circuit-breaker-telemetry-exporter](#171) [[Files changed](https://github.com/adbc-drivers/databricks/pull/171/files/39f6aed55278a533390e9aadf655f80dc11159c2..4473de5ca3cfca8579818e6d58f8a2b12e869a47)]
- [stack/telemetry-client-manager-wi-3.2](#172) [[Files changed](https://github.com/adbc-drivers/databricks/pull/172/files/4473de5ca3cfca8579818e6d58f8a2b12e869a47..94b678636d76a6d41a6612f76d00b4caccdab48a)]
- [stack/telemetry-client-wi-5.5](#173) [[Files changed](https://github.com/adbc-drivers/databricks/pull/173/files/94b678636d76a6d41a6612f76d00b4caccdab48a..ce00998cbd0372d94303ad1d69e9711e4489fe96)]
- [stack/telemetry-client-manager-e2e-wi-7](#174) [[Files changed](https://github.com/adbc-drivers/databricks/pull/174/files/ce00998cbd0372d94303ad1d69e9711e4489fe96..2646e86223ff1e7706b20d5970e556ec2f17867b)]
- [stack/telemetry-client-e2e-tests-wi-7-standalone](#175) [[Files changed](https://github.com/adbc-drivers/databricks/pull/175/files/2646e86223ff1e7706b20d5970e556ec2f17867b..0b9ebd3867250d92d0d8007cb17d6ce471d5560a)]
- [stack/wi-6.1-databricks-connection-telemetry-integration](#176) [[Files changed](https://github.com/adbc-drivers/databricks/pull/176/files/0b9ebd3867250d92d0d8007cb17d6ce471d5560a..4f553284c30eb7efcf67369c58dddd56675cd0be)]
- [stack/wi-6.2-telemetry-tags-driver-activities](#177) [[Files changed](https://github.com/adbc-drivers/databricks/pull/177/files/4f553284c30eb7efcf67369c58dddd56675cd0be..1f7cde0c5642072b06588665b16ee3a30a90d256)]
- [stack/wi-9-full-integration-e2e-tests](#178) [[Files changed](https://github.com/adbc-drivers/databricks/pull/178/files/1f7cde0c5642072b06588665b16ee3a30a90d256..c65e9fea7c65fa456f0114e95c867ee15f21bd87)]

---------

Co-authored-by: Jade Wang <jade.wang+data@databricks.com>
Co-authored-by: Claude <noreply@anthropic.com>
1f7cde0 to 2364122
jadewang-db added a commit that referenced this pull request Jan 23, 2026
## 🥞 Stacked PR
Use this [link](https://github.com/adbc-drivers/databricks/pull/162/files) to review incremental changes.
- [**stack/wi-2.1-telemetry-data-models**](#162) [[Files changed](https://github.com/adbc-drivers/databricks/pull/162/files)]
- [stack/wi-2.1-exception-classifier](#163) [[Files changed](https://github.com/adbc-drivers/databricks/pull/163/files/1e58d3c3785fa7ec1b83da01f80ddea1f6167851..0dac01831e7d9d313c67dc31e4aacceb17e74298)]
- [stack/wi-3.1-circuit-breaker](#164) [[Files changed](https://github.com/adbc-drivers/databricks/pull/164/files/0dac01831e7d9d313c67dc31e4aacceb17e74298..59b0221cb4c9262d80a35041a2f1098376f6e19e)]
- [stack/wi-3.1-feature-flag-cache](#165) [[Files changed](https://github.com/adbc-drivers/databricks/pull/165/files/59b0221cb4c9262d80a35041a2f1098376f6e19e..8c30fc0649b09bc38e09cfd4d6875d66963ff6c0)]
- [stack/wi-3.4-databricks-telemetry-exporter](#166) [[Files changed](https://github.com/adbc-drivers/databricks/pull/166/files/8c30fc0649b09bc38e09cfd4d6875d66963ff6c0..a6e926c8017e9a3b3b6de31bbbafb367adaba884)]
- [stack/wi-3.5-metrics-aggregator](#167) [[Files changed](https://github.com/adbc-drivers/databricks/pull/167/files/a6e926c8017e9a3b3b6de31bbbafb367adaba884..c53df5d3c0124c490b920e1e1a611dd9c24e02a4)]
- [stack/wi-3.5-circuit-breaker-manager](#168) [[Files changed](https://github.com/adbc-drivers/databricks/pull/168/files/c53df5d3c0124c490b920e1e1a611dd9c24e02a4..de8757a697dd023628011d1aff9961896560bc95)]
- [stack/e2e-feature-flag-cache-tests](#169) [[Files changed](https://github.com/adbc-drivers/databricks/pull/169/files/de8757a697dd023628011d1aff9961896560bc95..0b77f8373958342da429c20f7e30c02105402331)]
- [stack/databricks-activity-listener](#170) [[Files changed](https://github.com/adbc-drivers/databricks/pull/170/files/0b77f8373958342da429c20f7e30c02105402331..9090bdefba63d6c7fbff45bf60c2c63668f3884e)]
- [stack/circuit-breaker-telemetry-exporter](#171) [[Files changed](https://github.com/adbc-drivers/databricks/pull/171/files/9090bdefba63d6c7fbff45bf60c2c63668f3884e..0a0159524a429726078bd7340057672d6927d1cd)]
- [stack/telemetry-client-manager-wi-3.2](#172) [[Files changed](https://github.com/adbc-drivers/databricks/pull/172/files/0a0159524a429726078bd7340057672d6927d1cd..75039c6574c2dc437f5d670e71b938b98719c06f)]
- [stack/telemetry-client-wi-5.5](#173) [[Files changed](https://github.com/adbc-drivers/databricks/pull/173/files/75039c6574c2dc437f5d670e71b938b98719c06f..254cdc75487f3e9344d3df6fb9b9cbf49fd03228)]
- [stack/telemetry-client-manager-e2e-wi-7](#174) [[Files changed](https://github.com/adbc-drivers/databricks/pull/174/files/254cdc75487f3e9344d3df6fb9b9cbf49fd03228..7371da59309d109e8d457f4c27edd13adfa38a2c)]
- [stack/telemetry-client-e2e-tests-wi-7-standalone](#175) [[Files changed](https://github.com/adbc-drivers/databricks/pull/175/files/7371da59309d109e8d457f4c27edd13adfa38a2c..5ff7e96827faa69e8bae1d5b5da06a9f95b91a8c)]
- [stack/wi-6.1-databricks-connection-telemetry-integration](#176) [[Files changed](https://github.com/adbc-drivers/databricks/pull/176/files/5ff7e96827faa69e8bae1d5b5da06a9f95b91a8c..7757345889dbfd0b1dcb22556e2e6c746d7fa0f0)]
- [stack/wi-6.2-telemetry-tags-driver-activities](#177) [[Files changed](https://github.com/adbc-drivers/databricks/pull/177/files/7757345889dbfd0b1dcb22556e2e6c746d7fa0f0..2364122ad5402c9205008f39acaec6a400a4db98)]
- [stack/wi-9-full-integration-e2e-tests](#178) [[Files changed](https://github.com/adbc-drivers/databricks/pull/178/files/2364122ad5402c9205008f39acaec6a400a4db98..698f3ea13f65a17b62385be8e8e4032497f88993)]

---------

Co-authored-by: Jade Wang <jade.wang+data@databricks.com>
Co-authored-by: Claude <noreply@anthropic.com>
2364122 to d07c210
d07c210 to 47c7ee7
b647e8c to 3f776ce
github-merge-queue bot pushed a commit that referenced this pull request Feb 10, 2026
## 🥞 Stacked PR
Use this [link](https://github.com/adbc-drivers/databricks/pull/166/files) to review incremental changes.
- [**stack/wi-3.4-databricks-telemetry-exporter**](#166) [[Files changed](https://github.com/adbc-drivers/databricks/pull/166/files)]
- [stack/wi-3.5-metrics-aggregator](#167) [[Files changed](https://github.com/adbc-drivers/databricks/pull/167/files/90b20fcc9c10e7e7168957568dd2bb2b9bc233e5..33406c1ba7e85bc6e1840b42de1a1d4e57c4b84a)]
- [stack/wi-3.5-circuit-breaker-manager](#168) [[Files changed](https://github.com/adbc-drivers/databricks/pull/168/files/33406c1ba7e85bc6e1840b42de1a1d4e57c4b84a..09a3ee956ad1423fdc8f181da667b305d9e3483f)]
- [stack/e2e-feature-flag-cache-tests](#169) [[Files changed](https://github.com/adbc-drivers/databricks/pull/169/files/09a3ee956ad1423fdc8f181da667b305d9e3483f..4fe0016513d084b72c82a5195b0bf6f1e50f2f80)]
- [stack/databricks-activity-listener](#170) [[Files changed](https://github.com/adbc-drivers/databricks/pull/170/files/4fe0016513d084b72c82a5195b0bf6f1e50f2f80..f8bb1099aa0e7e086e52eb8306f9edd18f2b8578)]
- [stack/circuit-breaker-telemetry-exporter](#171) [[Files changed](https://github.com/adbc-drivers/databricks/pull/171/files/f8bb1099aa0e7e086e52eb8306f9edd18f2b8578..383d3efd09800816b7610563dd91d6686bcb1484)]
- [stack/telemetry-client-manager-wi-3.2](#172) [[Files changed](https://github.com/adbc-drivers/databricks/pull/172/files/383d3efd09800816b7610563dd91d6686bcb1484..cda26685fbce49c0c96c365f9dd26c70bcc65c13)]
- [stack/telemetry-client-wi-5.5](#173) [[Files changed](https://github.com/adbc-drivers/databricks/pull/173/files/cda26685fbce49c0c96c365f9dd26c70bcc65c13..695232b24ea01751613750f9fd5c9ad6717616eb)]
- [stack/telemetry-client-manager-e2e-wi-7](#174) [[Files changed](https://github.com/adbc-drivers/databricks/pull/174/files/695232b24ea01751613750f9fd5c9ad6717616eb..017930e03ae1fa36b7b36a0ca2e9c81fcecd7fa6)]
- [stack/telemetry-client-e2e-tests-wi-7-standalone](#175) [[Files changed](https://github.com/adbc-drivers/databricks/pull/175/files/017930e03ae1fa36b7b36a0ca2e9c81fcecd7fa6..abfc895fb1566a5f38b144a1bbbd506a90374560)]
- [stack/wi-6.1-databricks-connection-telemetry-integration](#176) [[Files changed](https://github.com/adbc-drivers/databricks/pull/176/files/abfc895fb1566a5f38b144a1bbbd506a90374560..9ae97e732ed4bbd9d0667be7d9a724dee7380d6b)]
- [stack/wi-6.2-telemetry-tags-driver-activities](#177) [[Files changed](https://github.com/adbc-drivers/databricks/pull/177/files/9ae97e732ed4bbd9d0667be7d9a724dee7380d6b..3f776cecad4c8deded27d877aed857d6da826baf)]
- [stack/wi-9-full-integration-e2e-tests](#178) [[Files changed](https://github.com/adbc-drivers/databricks/pull/178/files/3f776cecad4c8deded27d877aed857d6da826baf..197a9ee8a9d746b68848ea8006f9669ecee875e3)]

---------

Co-authored-by: Jade Wang <jade.wang+data@databricks.com>
Co-authored-by: Claude <noreply@anthropic.com>
3f776ce to 7fef380
7fef380 to 20145ab
Implement MetricsAggregator that aggregates Activity data by statement_id and handles exception buffering with terminal vs retryable classification.

Key features:
- ProcessActivity extracts tags and aggregates by statement_id using ConcurrentDictionary<string, StatementTelemetryContext>
- CompleteStatement emits aggregated TelemetryEvent
- RecordException flushes terminal exceptions immediately
- RecordException buffers retryable exceptions until CompleteStatement
- FlushAsync exports when batch size or time interval reached
- Uses TelemetryTagRegistry to filter tags
- Creates TelemetryFrontendLog wrapper with workspace_id
- All exceptions swallowed and logged at TRACE level

Implementation details:
- Connection events emit immediately (no aggregation needed)
- Statement events aggregate until CompleteStatement is called
- Timer-based periodic flush using System.Threading.Timer
- Thread-safe aggregation using ConcurrentDictionary
- Nested StatementTelemetryContext holds aggregated metrics and buffered exceptions per statement

Test coverage:
- 29 unit tests covering all exit criteria
- Tests for exception handling, tag filtering, frontend log wrapping
- End-to-end statement lifecycle tests

Co-Authored-By: Claude <noreply@anthropic.com>
20145ab to c16537e
Implement CircuitBreakerManager as a singleton that manages circuit breakers per host. Each host gets its own circuit breaker instance for isolation, preventing one failing endpoint from affecting others.

Key features:
- Singleton pattern with GetInstance() method
- Per-host circuit breaker isolation using ConcurrentDictionary
- Thread-safe concurrent access
- Case-insensitive host matching
- Support for both default and custom configurations

This follows the JDBC driver pattern in CircuitBreakerManager.java.

Co-Authored-By: Claude <noreply@anthropic.com>
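The per-host isolation described in this commit can be sketched roughly as follows. This is only an illustration and not the code from the PR: `SketchCircuitBreaker` and `SketchCircuitBreakerManager` are hypothetical names, and the breaker itself is a stub that records nothing but its host. The one detail taken from the commit text is that lookups go through a `ConcurrentDictionary` with case-insensitive host matching.

```csharp
using System;
using System.Collections.Concurrent;

// Hypothetical stand-in for the driver's circuit breaker; only identity matters here.
public sealed class SketchCircuitBreaker
{
    public string Host { get; }
    public SketchCircuitBreaker(string host) => Host = host;
}

// Hypothetical name; mirrors the singleton plus per-host isolation described above.
public sealed class SketchCircuitBreakerManager
{
    private static readonly SketchCircuitBreakerManager s_instance =
        new SketchCircuitBreakerManager();

    // OrdinalIgnoreCase makes "Host.Example" and "host.example" share one breaker.
    private readonly ConcurrentDictionary<string, SketchCircuitBreaker> _breakers =
        new ConcurrentDictionary<string, SketchCircuitBreaker>(StringComparer.OrdinalIgnoreCase);

    private SketchCircuitBreakerManager() { }

    public static SketchCircuitBreakerManager GetInstance() => s_instance;

    public SketchCircuitBreaker GetCircuitBreaker(string host)
    {
        if (string.IsNullOrWhiteSpace(host))
        {
            throw new ArgumentException("host must be non-empty", nameof(host));
        }

        // GetOrAdd may run the factory more than once under contention,
        // but every caller receives the single instance that was stored.
        return _breakers.GetOrAdd(host, h => new SketchCircuitBreaker(h));
    }
}
```

Because each host maps to its own instance, a breaker that opens for one endpoint leaves the breakers for other hosts untouched.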
Update all CircuitBreaker tests to use the new constructor that takes a CircuitBreakerConfig object instead of individual parameters.

Changes:
- Add helper methods CreateDefaultConfig() and CreateConfig() for cleaner test setup
- Replace `new CircuitBreaker()` with `new CircuitBreaker(CreateDefaultConfig())`
- Replace `new CircuitBreaker(failureThreshold: N)` with `new CircuitBreaker(CreateConfig(failureThreshold: N))`
- Replace `new CircuitBreaker(failureThreshold: N, timeout: T)` with `new CircuitBreaker(CreateConfig(failureThreshold: N, timeout: T))`
- Add new tests for constructor with null config and config storage

Co-Authored-By: Claude <noreply@anthropic.com>
Add comprehensive E2E tests for feature flag fetching from real Databricks endpoints and validate caching and reference counting behavior:
- FeatureFlagCache_FetchFromRealEndpoint_ReturnsBoolean: Tests real endpoint
- FeatureFlagCache_CachesValue_DoesNotRefetchWithinTTL: Validates caching
- FeatureFlagCache_InvalidHost_ReturnsDefaultFalse: Tests error handling
- FeatureFlagCache_RefCountingWorks_CleanupAfterRelease: Tests ref counting

Additional tests cover:
- Cache expiry and refetch behavior
- Null/empty host handling
- Unknown host behavior
- Multiple hosts with independent ref counts
- Concurrent reference counting thread safety
- False value caching
- Cancellation propagation

Co-Authored-By: Claude <noreply@anthropic.com>
Rewrite E2E tests for FeatureFlagCache to use the actual production API:
- Use MergePropertiesWithFeatureFlagsAsync() instead of non-existent IsTelemetryEnabledAsync()
- Use HasContext(), TryGetContext(), RemoveContext(), Clear() for cache management testing
- Remove tests for non-existent reference counting (RefCount property)
- Remove tests for non-existent TelemetryEnabled, LastFetched, IsExpired
- Fix namespace import from Telemetry to correct namespace
- Add tests for TryGetHost protocol stripping and Uri extraction

The previous tests were testing an API that didn't exist and would not compile.

Co-Authored-By: Claude <noreply@anthropic.com>
Add DatabricksActivityListener that listens to 'Databricks.Adbc.Driver' ActivitySource, extracts metrics from activities, and delegates to MetricsAggregator. This implements Phase 5 of the telemetry design.

Key features:
- ShouldListenTo returns true for 'Databricks.Adbc.Driver' source
- Sample callback respects feature flag (AllDataAndRecorded when enabled, None when disabled)
- ActivityStopped callback delegates to MetricsAggregator.ProcessActivity
- All callbacks wrapped in try-catch with TRACE logging
- StopAsync flushes pending metrics via MetricsAggregator.FlushAsync
- Supports dynamic feature flag checking via optional Func<bool>

Co-Authored-By: Claude <noreply@anthropic.com>
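For readers unfamiliar with `System.Diagnostics.ActivityListener`, the callbacks listed in this commit could be wired up along these lines. This is a minimal sketch, not the PR's `DatabricksActivityListener`: `ListenerSketch`, `Create`, and the two delegate parameters are hypothetical, and `onStopped` merely stands in for the hand-off to `MetricsAggregator.ProcessActivity`. It assumes a runtime or `System.Diagnostics.DiagnosticSource` package version that includes `ActivityListener`.

```csharp
using System;
using System.Diagnostics;

public static class ListenerSketch
{
    // Source name taken from the commit message.
    public const string SourceName = "Databricks.Adbc.Driver";

    // isEnabled plays the role of the optional Func<bool> feature-flag check;
    // onStopped is a placeholder for the aggregator hand-off.
    public static ActivityListener Create(Func<bool> isEnabled, Action<Activity> onStopped)
    {
        var listener = new ActivityListener
        {
            ShouldListenTo = source => source.Name == SourceName,
            Sample = (ref ActivityCreationOptions<ActivityContext> options) =>
            {
                try
                {
                    return isEnabled()
                        ? ActivitySamplingResult.AllDataAndRecorded
                        : ActivitySamplingResult.None;
                }
                catch (Exception ex)
                {
                    // Telemetry must never break the driver: log and skip sampling.
                    Trace.WriteLine("telemetry sample callback failed: " + ex.Message);
                    return ActivitySamplingResult.None;
                }
            },
            ActivityStopped = activity =>
            {
                try
                {
                    onStopped(activity);
                }
                catch (Exception ex)
                {
                    Trace.WriteLine("telemetry stop callback failed: " + ex.Message);
                }
            }
        };

        // Registration is what makes StartActivity on the source return non-null.
        ActivitySource.AddActivityListener(listener);
        return listener;
    }
}
```

Returning `ActivitySamplingResult.None` when the flag is off means `StartActivity` yields `null` (absent other listeners), so disabled telemetry costs no `Activity` allocation.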
Implement wrapper exporter that protects inner telemetry exporter with circuit breaker pattern.

Key features:
- Wraps ITelemetryExporter with circuit breaker protection
- Uses CircuitBreakerManager.GetCircuitBreaker(host) for per-host isolation
- Exports events when circuit is closed
- Drops events silently when circuit is open (logged at DEBUG level)
- Circuit breaker tracks failures BEFORE exceptions are swallowed

This follows the design in Section 3.3 of the telemetry design document.

Co-Authored-By: Claude <noreply@anthropic.com>
Implement per-host telemetry client management with reference counting to prevent rate limiting from concurrent connections.
- ITelemetryClient: Interface for telemetry clients with ExportAsync and CloseAsync methods
- TelemetryClientHolder: Holds client and reference count with atomic operations using Interlocked
- TelemetryClientManager: Singleton factory managing one client per host using ConcurrentDictionary for thread-safety
- TelemetryClientAdapter: Adapter bridging ITelemetryExporter to ITelemetryClient interface

Key features:
- GetInstance() returns singleton
- GetOrCreateClient() creates/returns client and increments RefCount
- ReleaseClientAsync() decrements RefCount, closes client when zero
- Same host returns same client instance (case-insensitive)
- Thread-safe with ConcurrentDictionary and atomic ref counting
- All exceptions swallowed per telemetry design requirement

Co-Authored-By: Claude <noreply@anthropic.com>
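The acquire/release contract described here (one shared client per host, closed when the last reference is released) might look like the sketch below. Note the substitution: the commit uses `ConcurrentDictionary` with `Interlocked` counters, whereas this illustration uses a plain `Dictionary` behind a single lock, which is easier to reason about in a short example. `RefCountedClients<TClient>` and the synchronous `ReleaseClient` are hypothetical; the PR's `ReleaseClientAsync` is asynchronous.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper illustrating per-host reference counting.
public sealed class RefCountedClients<TClient> where TClient : class
{
    private sealed class Holder
    {
        public readonly TClient Client;
        public int RefCount;
        public Holder(TClient client) { Client = client; }
    }

    private readonly object _gate = new object();

    // Case-insensitive keys so the same host always maps to the same client.
    private readonly Dictionary<string, Holder> _holders =
        new Dictionary<string, Holder>(StringComparer.OrdinalIgnoreCase);

    public TClient GetOrCreateClient(string host, Func<string, TClient> factory)
    {
        lock (_gate)
        {
            if (!_holders.TryGetValue(host, out Holder holder))
            {
                holder = new Holder(factory(host));
                _holders[host] = holder;
            }
            holder.RefCount++;
            return holder.Client;
        }
    }

    // Returns true only when this call dropped the last reference and closed the client.
    public bool ReleaseClient(string host, Action<TClient> close)
    {
        TClient toClose;
        lock (_gate)
        {
            if (!_holders.TryGetValue(host, out Holder holder)) return false; // unknown host
            if (--holder.RefCount > 0) return false;                           // still in use
            _holders.Remove(host);
            toClose = holder.Client;
        }

        try
        {
            close(toClose); // run outside the lock so a slow close does not block other hosts
        }
        catch (Exception)
        {
            // Swallowed, matching the rule that telemetry failures never surface.
        }
        return true;
    }
}
```

Removing the entry and decrementing the count under the same lock avoids the window in which one connection releases the last reference while another acquires the client that is about to be closed.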
Implement TelemetryClient that coordinates listener, aggregator, and exporter. Manages background flush task and graceful shutdown.

Changes:
- Add TelemetryClient.cs implementing ITelemetryClient interface
- Constructor initializes full telemetry pipeline: DatabricksTelemetryExporter → CircuitBreakerTelemetryExporter → MetricsAggregator → DatabricksActivityListener
- ExportAsync delegates to the circuit breaker protected exporter
- CloseAsync implements graceful shutdown per design doc Section 9.3:
  - Cancels background flush task
  - Stops listener (which flushes pending metrics)
  - Waits for background task with 5s timeout
  - Disposes all resources
  - All exceptions swallowed and logged at TRACE level
- Background flush task periodically exports pending metrics
- Update TelemetryClientManager.CreateClient() to use TelemetryClient
- Add comprehensive unit tests (21 tests) covering:
  - Constructor initialization
  - Export delegation
  - Close with flush and cancellation
  - Exception swallowing during close
  - Background flush behavior
  - Thread safety

Test file: csharp/test/Unit/Telemetry/TelemetryClientTests.cs

Co-Authored-By: Claude <noreply@anthropic.com>
…ATE)

Add end-to-end tests for TelemetryClientManager per-host client management:
- TelemetryClientManager_SameHost_ReturnsSameClient: Validates same host returns same client
- TelemetryClientManager_DifferentHosts_ReturnsDifferentClients: Validates different hosts get different clients
- TelemetryClientManager_ConcurrentAccess_ThreadSafe: Validates thread-safety under concurrent access
- TelemetryClientManager_LastRelease_ClosesClient: Validates reference counting and cleanup

Additional tests for comprehensive coverage:
- Case-insensitive host comparison
- Mixed hosts handling
- Concurrent release behavior
- Concurrent get and release operations
- Concurrent access to multiple hosts
- Unknown host release handling
- Exception swallowing during close

Co-Authored-By: Claude <noreply@anthropic.com>
…2E GATE)

Add comprehensive E2E tests for TelemetryClient in isolation against real Databricks endpoint.

Tests validate:
- Single event export to real endpoint
- Batch event export to real endpoint
- Authenticated endpoint usage (/telemetry-ext)
- Circuit breaker opens after consecutive failures (default threshold 5)
- Circuit breaker recovers after timeout period (half-open -> closed)
- Graceful close flushes all pending events
- Idempotent close behavior
- Export after close doesn't throw

The tests use mock exporters for circuit breaker behavior testing since the DatabricksTelemetryExporter swallows exceptions internally. This allows the CircuitBreakerTelemetryExporter wrapper to properly track failures.

Co-Authored-By: Claude <noreply@anthropic.com>
Integrate telemetry components into DatabricksConnection lifecycle:
- Initialize telemetry in HandleOpenSessionResponse after session is established
- Check FeatureFlagCache.IsTelemetryEnabledAsync for server-side feature flag
- Create TelemetryClient via TelemetryClientManager.GetOrCreateClient when enabled
- Release telemetry resources in Dispose with proper reference counting
- All telemetry exceptions are swallowed per requirement (Section 8.1)

Implementation follows design doc sections 6.2 (Initialization) and 9.2 (Connection Close).

Co-Authored-By: Claude <noreply@anthropic.com>
Add telemetry-specific tags to existing driver activities:

Connection Activities:
- driver.version, driver.os, driver.runtime
- feature.cloudfetch, feature.lz4

Statement Activities:
- result.format (inline/cloudfetch) in DatabricksCompositeReader
- result.chunk_count, result.bytes_downloaded in CloudFetchDownloader
- poll.count, poll.latency_ms in StatementExecutionStatement

This implements Section 4.2 of telemetry-design.md for Activity Tags by Event Type, enabling Databricks telemetry service to track driver usage patterns and performance metrics.

Co-Authored-By: Claude <noreply@anthropic.com>
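As a rough illustration of what attaching these tags involves, the sketch below sets the connection and polling tags with `Activity.SetTag`. The tag names come from the commit message; everything else is an assumption made for the example: the `TelemetryTagSketch` helper, its parameters, the value types, and the use of `RuntimeInformation` for the OS and runtime strings. The PR may source these values differently.

```csharp
using System;
using System.Diagnostics;
using System.Runtime.InteropServices;

// Hypothetical helper; tag names follow the commit message, value sources are assumed.
public static class TelemetryTagSketch
{
    public static void AddConnectionTags(Activity activity, string driverVersion, bool cloudFetch, bool lz4)
    {
        // StartActivity returns null when no listener samples the source.
        if (activity == null) return;

        activity.SetTag("driver.version", driverVersion);
        activity.SetTag("driver.os", RuntimeInformation.OSDescription);
        activity.SetTag("driver.runtime", RuntimeInformation.FrameworkDescription);
        activity.SetTag("feature.cloudfetch", cloudFetch);
        activity.SetTag("feature.lz4", lz4);
    }

    public static void AddPollTags(Activity activity, int pollCount, long pollLatencyMs)
    {
        if (activity == null) return;

        activity.SetTag("poll.count", pollCount);
        activity.SetTag("poll.latency_ms", pollLatencyMs);
    }
}
```

The null checks matter because an unsampled activity is simply absent, so tagging code on the hot path has to tolerate `null` without throwing.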
c16537e to 35a4a01
🥞 Stacked PR
Use this link to review incremental changes.