Skip to content

Add Distributed Tracing support #177

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 10 commits into
base: main
Choose a base branch
from

Conversation

slashmo
Copy link

@slashmo slashmo commented Aug 5, 2025

Motivation

To make the most of Distributed Tracing, users may want to view operations performed by their database clients as part of a larger request to gain more insight into their distributed systems.

Modifications

  • Add new DistributedTracingSupport trait, enabled by default
  • Create client span in single and pipelined command execution
  • Updated the command builder to generate a command name property, used in the db.operation.name span attribute
  • Add open-telemetry example to showcase Distributed Tracing support

When implementing the span and span attributes I closely followed the OpenTelemetry semantic conventions for Redis and general database client spans:

I believe that this provides the highest chance of the span being recognized as a database client operation by various observability backends.

Result

Users can now use Distributed Tracing spans automatically created by the library to more deeply observe their systems. They can also see an example set up of a Hummingbird server, the Valkey client, and Swift OTel to guide their Distributed Tracing usage.

Closes #176

@slashmo slashmo force-pushed the feature/distributed-tracing branch from 6c6fb74 to 5606d42 Compare August 5, 2025 21:08
Copy link

@czechboy0 czechboy0 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excited for distributed tracing 👏

@@ -161,16 +176,48 @@ public final actor ValkeyConnection: ValkeyClientProtocol, Sendable {

@inlinable
func _execute<Command: ValkeyCommand>(command: Command) async throws -> RESPToken {
#if DistributedTracingSupport
let span = startSpan(Command.name, ofKind: .client)
defer { span.end() }

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you wanted to get rid of the unstructured span, could also do something like:

func _execute<Command: ValkeyCommand>(command: Command) async throws -> RESPToken {
    #if DistributedTracingSupport
    try await _executeWithTracing(command: command)
    #else
    try await _execute0(command: command)
    #endif
}

func _execute0<Command: ValkeyCommand>(command: Command) async throws -> RESPToken {
    // existing execute logic
}

#if DistributedTracingSupport
func _executeWithTracing<Command: ValkeyCommand>(command: Command) async throws -> RESPToken {
    try await withSpan { span in
        // update the span
        do {
            try await _execute0(command: command)
        } catch {
            // extract info from the error into the span
        }
    }
#endif

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the suggestion. This is similar to what I originally went with. I eventually switched to the more imperative API because I thought it was easier to read. I'm happy to explore withSpan again though.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll let Adam/Fabian steer you here, I'm also happy with either.

@codecov-commenter
Copy link

codecov-commenter commented Aug 6, 2025

Codecov Report

❌ Patch coverage is 26.06838% with 346 lines in your changes missing coverage. Please review.
⚠️ Please upload report for BASE (main@48287fc). Learn more about missing BASE report.

Files with missing lines Patch % Lines
Sources/Valkey/Commands/ServerCommands.swift 7.04% 66 Missing ⚠️
Sources/Valkey/Commands/ClusterCommands.swift 0.00% 32 Missing ⚠️
Sources/Valkey/Commands/GenericCommands.swift 6.06% 31 Missing ⚠️
Sources/Valkey/Commands/SortedSetCommands.swift 11.42% 31 Missing ⚠️
Sources/Valkey/Commands/ConnectionCommands.swift 7.14% 26 Missing ⚠️
Sources/Valkey/Commands/SentinelCommands.swift 0.00% 25 Missing ⚠️
Sources/Valkey/Commands/ScriptingCommands.swift 0.00% 22 Missing ⚠️
Sources/Valkey/Commands/StreamCommands.swift 21.73% 18 Missing ⚠️
Sources/Valkey/Commands/StringCommands.swift 22.72% 17 Missing ⚠️
Sources/Valkey/Commands/HashCommands.swift 0.00% 16 Missing ⚠️
... and 9 more
Additional details and impacted files
@@           Coverage Diff           @@
##             main     #177   +/-   ##
=======================================
  Coverage        ?   41.42%           
=======================================
  Files           ?       83           
  Lines           ?    13899           
  Branches        ?        0           
=======================================
  Hits            ?     5757           
  Misses          ?     8142           
  Partials        ?        0           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@slashmo
Copy link
Author

slashmo commented Aug 6, 2025

The API breakage check rightfully complains about the added protocol requirement ValkeyCommand.name. Is this acceptable given we're pre-1.0 or would you like me to add a default in an extension, for example like this:

extension ValkeyCommand {
    public static var name: String { String(describing: Self.self) }
}

@slashmo slashmo force-pushed the feature/distributed-tracing branch from a67131f to c57a3b5 Compare August 6, 2025 08:29
@adam-fowler
Copy link
Collaborator

The API breakage check rightfully complains about the added protocol requirement ValkeyCommand.name. Is this acceptable given we're pre-1.0 or would you like me to add a default in an extension, for example like this:

extension ValkeyCommand {

    public static var name: String { String(describing: Self.self) }

}

API breakages are fine

Copy link
Collaborator

@adam-fowler adam-fowler left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A couple of initial comments.

It's a shame I haven't got the benchmarks running for non-members of the repo. It fails on add the comment to the PR. I imagine this will add allocations. Can you run the benchmarks locally (run main and compare this PR with main) and post the results as a comment, cheers.

@@ -40,6 +44,8 @@ public final actor ValkeyConnection: ValkeyClientProtocol, Sendable {
@usableFromInline
let channelHandler: ValkeyChannelHandler
let configuration: ValkeyConnectionConfiguration
@usableFromInline
let address: (hostOrSocketPath: String, port: Int?)?
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If port isn't set you can default to 6379

@@ -189,8 +236,42 @@ public final actor ValkeyConnection: ValkeyClientProtocol, Sendable {
public func execute<each Command: ValkeyCommand>(
_ commands: repeat each Command
) async -> sending (repeat Result<(each Command).Response, Error>) {
#if DistributedTracingSupport
let span = startSpan("MULTI", ofKind: .client)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why MULTI? This implies you are running the Valkey command MULTI which starts a transaction.
Is it worthwhile building a string out of all of the commands eg SET,LPUSH,INCR

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was under the impression that we're running a MULTI followed by the commands here. If that's not the case I believe the correct thing to do is to create a span for each command, instead of one that groups them.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wrote a TestTracer that allowed you to run tests in parallel in Hummingbird. https://github.com/hummingbird-project/hummingbird/blob/main/Tests/HummingbirdTests/TracingTests.swift. It used Task Local variables to separate tracers for each test.

You could probably use a Testing trait to set it up for each test.

@slashmo
Copy link
Author

slashmo commented Aug 6, 2025

Here's the benchmark diff between main and the pr. I ran them with Swift 6.2, let me know if you'd prefer 6.1 and I can install it:

swift-driver version: 1.127.10 Apple Swift version 6.2 (swiftlang-6.2.0.14.8 clang-1700.3.14.6)
Target: arm64-apple-macosx15.0
===============================================================
Threshold deviations for ValkeyBenchmarks:Client: GET benchmark
===============================================================
╒══════════════════════════════════════════╤═════════════════╤═════════════════╤═════════════════╤═════════════════╕
│ Malloc (total) (K, %)                    │            main │              pr │    Difference % │     Threshold % │
╞══════════════════════════════════════════╪═════════════════╪═════════════════╪═════════════════╪═════════════════╡
│ p25                                      │              74 │              81 │              10 │               5 │
├──────────────────────────────────────────┼─────────────────┼─────────────────┼─────────────────┼─────────────────┤
│ p50                                      │              77 │              83 │               8 │               5 │
├──────────────────────────────────────────┼─────────────────┼─────────────────┼─────────────────┼─────────────────┤
│ p75                                      │              78 │              85 │               8 │               5 │
╘══════════════════════════════════════════╧═════════════════╧═════════════════╧═════════════════╧═════════════════╛

=========================================================================================================
Threshold deviations for ValkeyBenchmarks:Client: GET benchmark | parallel 20 | 20 concurrent connections
=========================================================================================================
╒══════════════════════════════════════════╤═════════════════╤═════════════════╤═════════════════╤═════════════════╕
│ Malloc (total) (K, %)                    │            main │              pr │    Difference % │     Threshold % │
╞══════════════════════════════════════════╪═════════════════╪═════════════════╪═════════════════╪═════════════════╡
│ p75                                      │             117 │             125 │               6 │               5 │
╘══════════════════════════════════════════╧═════════════════╧═════════════════╧═════════════════╧═════════════════╛

=========================================================================================================
Threshold deviations for ValkeyBenchmarks:Client: GET benchmark | parallel 50 | 20 concurrent connections
=========================================================================================================
╒══════════════════════════════════════════╤═════════════════╤═════════════════╤═════════════════╤═════════════════╕
│ Malloc (total) (K, %)                    │            main │              pr │    Difference % │     Threshold % │
╞══════════════════════════════════════════╪═════════════════╪═════════════════╪═════════════════╪═════════════════╡
│ p25                                      │               5 │               7 │              30 │               5 │
├──────────────────────────────────────────┼─────────────────┼─────────────────┼─────────────────┼─────────────────┤
│ p50                                      │               8 │              11 │              45 │               5 │
├──────────────────────────────────────────┼─────────────────┼─────────────────┼─────────────────┼─────────────────┤
│ p75                                      │              11 │              18 │              59 │               5 │
╘══════════════════════════════════════════╧═════════════════╧═════════════════╧═════════════════╧═════════════════╛

╒══════════════════════════════════════════╤═════════════════╤═════════════════╤═════════════════╤═════════════════╕
│ Instructions (M, %)                      │            main │              pr │    Difference % │     Threshold % │
╞══════════════════════════════════════════╪═════════════════╪═════════════════╪═════════════════╪═════════════════╡
│ p25                                      │             247 │             269 │               8 │               5 │
├──────────────────────────────────────────┼─────────────────┼─────────────────┼─────────────────┼─────────────────┤
│ p50                                      │             249 │             271 │               9 │               5 │
├──────────────────────────────────────────┼─────────────────┼─────────────────┼─────────────────┼─────────────────┤
│ p75                                      │             250 │             272 │               9 │               5 │
╘══════════════════════════════════════════╧═════════════════╧═════════════════╧═════════════════╧═════════════════╛

===================================================================
Threshold deviations for ValkeyBenchmarks:Connection: GET benchmark
===================================================================
╒══════════════════════════════════════════╤═════════════════╤═════════════════╤═════════════════╤═════════════════╕
│ Throughput (# / s) (#, %)                │            main │              pr │    Difference % │     Threshold % │
╞══════════════════════════════════════════╪═════════════════╪═════════════════╪═════════════════╪═════════════════╡
│ p25                                      │              29 │              27 │               6 │               5 │
├──────────────────────────────────────────┼─────────────────┼─────────────────┼─────────────────┼─────────────────┤
│ p50                                      │              29 │              27 │               6 │               5 │
├──────────────────────────────────────────┼─────────────────┼─────────────────┼─────────────────┼─────────────────┤
│ p75                                      │              29 │              27 │               6 │               5 │
╘══════════════════════════════════════════╧═════════════════╧═════════════════╧═════════════════╧═════════════════╛

╒══════════════════════════════════════════╤═════════════════╤═════════════════╤═════════════════╤═════════════════╕
│ Time (wall clock) (ms, %)                │            main │              pr │    Difference % │     Threshold % │
╞══════════════════════════════════════════╪═════════════════╪═════════════════╪═════════════════╪═════════════════╡
│ p25                                      │              34 │              37 │               6 │               5 │
├──────────────────────────────────────────┼─────────────────┼─────────────────┼─────────────────┼─────────────────┤
│ p50                                      │              35 │              37 │               6 │               5 │
├──────────────────────────────────────────┼─────────────────┼─────────────────┼─────────────────┼─────────────────┤
│ p75                                      │              35 │              38 │               8 │               5 │
╘══════════════════════════════════════════╧═════════════════╧═════════════════╧═════════════════╧═════════════════╛

╒══════════════════════════════════════════╤═════════════════╤═════════════════╤═════════════════╤═════════════════╕
│ Time (total CPU) (ms, %)                 │            main │              pr │    Difference % │     Threshold % │
╞══════════════════════════════════════════╪═════════════════╪═════════════════╪═════════════════╪═════════════════╡
│ p25                                      │              29 │              31 │               6 │               5 │
├──────────────────────────────────────────┼─────────────────┼─────────────────┼─────────────────┼─────────────────┤
│ p50                                      │              29 │              31 │               6 │               5 │
├──────────────────────────────────────────┼─────────────────┼─────────────────┼─────────────────┼─────────────────┤
│ p75                                      │              30 │              32 │               9 │               5 │
╘══════════════════════════════════════════╧═════════════════╧═════════════════╧═════════════════╧═════════════════╛

╒══════════════════════════════════════════╤═════════════════╤═════════════════╤═════════════════╤═════════════════╕
│ Instructions (M, %)                      │            main │              pr │    Difference % │     Threshold % │
╞══════════════════════════════════════════╪═════════════════╪═════════════════╪═════════════════╪═════════════════╡
│ p25                                      │             225 │             247 │               9 │               5 │
├──────────────────────────────────────────┼─────────────────┼─────────────────┼─────────────────┼─────────────────┤
│ p50                                      │             225 │             247 │               9 │               5 │
├──────────────────────────────────────────┼─────────────────┼─────────────────┼─────────────────┼─────────────────┤
│ p75                                      │             225 │             247 │               9 │               5 │
╘══════════════════════════════════════════╧═════════════════╧═════════════════╧═════════════════╧═════════════════╛

========================================================================
Threshold deviations for ValkeyBenchmarks:Connection: Pipeline benchmark
========================================================================
╒══════════════════════════════════════════╤═════════════════╤═════════════════╤═════════════════╤═════════════════╕
│ Throughput (# / s) (#, %)                │            main │              pr │    Difference % │     Threshold % │
╞══════════════════════════════════════════╪═════════════════╪═════════════════╪═════════════════╪═════════════════╡
│ p25                                      │              17 │              16 │               5 │               5 │
├──────────────────────────────────────────┼─────────────────┼─────────────────┼─────────────────┼─────────────────┤
│ p50                                      │              17 │              16 │               5 │               5 │
╘══════════════════════════════════════════╧═════════════════╧═════════════════╧═════════════════╧═════════════════╛

╒══════════════════════════════════════════╤═════════════════╤═════════════════╤═════════════════╤═════════════════╕
│ Malloc (total) (K, %)                    │            main │              pr │    Difference % │     Threshold % │
╞══════════════════════════════════════════╪═════════════════╪═════════════════╪═════════════════╪═════════════════╡
│ p25                                      │              14 │              26 │              83 │               5 │
╘══════════════════════════════════════════╧═════════════════╧═════════════════╧═════════════════╧═════════════════╛

====================================================================
Threshold deviations for ValkeyBenchmarks:HashSlot – {user}.whatever
====================================================================
╒══════════════════════════════════════════╤═════════════════╤═════════════════╤═════════════════╤═════════════════╕
│ Throughput (# / s) (#, %)                │            main │              pr │    Difference % │     Threshold % │
╞══════════════════════════════════════════╪═════════════════╪═════════════════╪═════════════════╪═════════════════╡
│ p75                                      │              57 │              53 │               7 │               5 │
╘══════════════════════════════════════════╧═════════════════╧═════════════════╧═════════════════╧═════════════════╛

╒══════════════════════════════════════════╤═════════════════╤═════════════════╤═════════════════╤═════════════════╕
│ Time (wall clock) (ms, %)                │            main │              pr │    Difference % │     Threshold % │
╞══════════════════════════════════════════╪═════════════════╪═════════════════╪═════════════════╪═════════════════╡
│ p75                                      │              18 │              19 │               7 │               5 │
╘══════════════════════════════════════════╧═════════════════╧═════════════════╧═════════════════╧═════════════════╛

=====================================================================================
Threshold deviations for ValkeyBenchmarks:ValkeyCommandEncoder – Command with 7 words
=====================================================================================
╒══════════════════════════════════════════╤═════════════════╤═════════════════╤═════════════════╤═════════════════╕
│ Time (wall clock) (μs, %)                │            main │              pr │    Difference % │     Threshold % │
╞══════════════════════════════════════════╪═════════════════╪═════════════════╪═════════════════╪═════════════════╡
│ p75                                      │             420 │             441 │               5 │               5 │
╘══════════════════════════════════════════╧═════════════════╧═════════════════╧═════════════════╧═════════════════╛

╒══════════════════════════════════════════╤═════════════════╤═════════════════╤═════════════════╤═════════════════╕
│ Time (total CPU) (μs, %)                 │            main │              pr │    Difference % │     Threshold % │
╞══════════════════════════════════════════╪═════════════════╪═════════════════╪═════════════════╪═════════════════╡
│ p75                                      │             421 │             443 │               5 │               5 │
╘══════════════════════════════════════════╧═════════════════╧═════════════════╧═════════════════╧═════════════════╛

New baseline 'pr' is WORSE than the 'main' baseline thresholds.

error: benchmarkThresholdRegression

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Distributed Tracing Support
4 participants