Feat: Implementation of OpenTelemetry Tracing #868

@hortison

Current Behavior

We want each Go component across the repos in the Meshery ecosystem to be instrumented with OpenTelemetry tracing. MeshKit (https://github.com/meshery/meshkit) is the central library, designed to support a reusable, common approach to this instrumentation.

Desired Behavior

  1. Implement the OTel tracing plumbing (initialization, configuration, and middleware) in MeshKit.
  2. Standardization: A common set of tags (resource attributes) such as service.name, service.version, and environment should be present on every span and trace, so that software components sourced from different repos report consistently. MeshKit is to ensure every service identifies itself the same way.
  3. Context Propagation: Ensure the Trace ID passes between Meshery Server (GraphQL/Handler/other), Remote Providers (HTTP requests), and Meshery Extensions (such as the Kanvas plugin), in both directions. MeshKit is to provide the HTTP middleware that injects and extracts W3C Trace Context headers automatically.
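
To make the context propagation item concrete, here is a minimal, standard-library-only sketch of what travels on the wire: the W3C `traceparent` header, which carries the trace ID from one service to the next. In MeshKit itself this work would presumably be delegated to OTel's `propagation.TraceContext`; the `TraceParent` type and the `ParseTraceParent` and `Child` helpers below are hypothetical names used only for illustration, and the sketch accepts version `00` headers only.

```go
package main

import (
	"encoding/hex"
	"errors"
	"fmt"
	"strings"
)

// TraceParent mirrors the four dash-separated fields of a version 00
// W3C traceparent header: version-traceid-parentid-flags.
// (Hypothetical type for illustration; not part of MeshKit or the OTel SDK.)
type TraceParent struct {
	Version string // "00"
	TraceID string // 32 lowercase hex characters (16 bytes)
	SpanID  string // 16 lowercase hex characters (8 bytes)
	Flags   string // 2 hex characters; "01" means sampled
}

// String renders the header value in wire format.
func (tp TraceParent) String() string {
	return strings.Join([]string{tp.Version, tp.TraceID, tp.SpanID, tp.Flags}, "-")
}

// Child keeps the trace ID and swaps in the caller's own span ID, which is
// what a service does before forwarding the header downstream.
func (tp TraceParent) Child(spanID string) TraceParent {
	tp.SpanID = spanID
	return tp
}

// isLowerHex reports whether s is exactly n lowercase hex characters.
func isLowerHex(s string, n int) bool {
	if len(s) != n || s != strings.ToLower(s) {
		return false
	}
	_, err := hex.DecodeString(s)
	return err == nil
}

// ParseTraceParent validates and splits an incoming header value.
func ParseTraceParent(header string) (TraceParent, error) {
	parts := strings.Split(header, "-")
	if len(parts) != 4 {
		return TraceParent{}, errors.New("traceparent: expected 4 dash-separated fields")
	}
	tp := TraceParent{Version: parts[0], TraceID: parts[1], SpanID: parts[2], Flags: parts[3]}
	if tp.Version != "00" {
		return TraceParent{}, errors.New("traceparent: this sketch only handles version 00")
	}
	if !isLowerHex(tp.TraceID, 32) || !isLowerHex(tp.SpanID, 16) || !isLowerHex(tp.Flags, 2) {
		return TraceParent{}, errors.New("traceparent: malformed field")
	}
	// All-zero IDs are invalid per the specification.
	if strings.Trim(tp.TraceID, "0") == "" || strings.Trim(tp.SpanID, "0") == "" {
		return TraceParent{}, errors.New("traceparent: all-zero trace or span ID")
	}
	return tp, nil
}

func main() {
	incoming := "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01"
	tp, err := ParseTraceParent(incoming)
	if err != nil {
		fmt.Println("invalid header:", err)
		return
	}
	// The outgoing header shares the trace ID but names a new parent span.
	fmt.Println("outgoing:", tp.Child("b7ad6b7169203331"))
}
```

The trace ID is the only field that must survive every hop unchanged; each service replaces the span ID with its own before calling the next one.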

Implementation

MeshKit should provide a generic tracing package.

Required Functionality:

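  • NewTracerProvider (named InitTracer in the conceptual implementation): A function that sets up the OTel SDK and registers the global tracer provider and propagator.
  • Middleware: A generic HTTP handler wrapper that starts a span for every incoming request.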
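
As a rough illustration of the Middleware item, the sketch below shows the wrapping pattern such a handler typically relies on: record the start time, capture the status code through a wrapped `http.ResponseWriter`, and report both when the request finishes. It uses only the standard library. The `Middleware`, `RequestInfo`, and `report` names are hypothetical, and the `report` callback stands in for the point where a real implementation would end the span and set its attributes (the `otelhttp.NewHandler` wrapper from the OpenTelemetry contrib module covers this in practice).

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"time"
)

// statusRecorder wraps http.ResponseWriter so the middleware can see the
// status code that the inner handler wrote.
type statusRecorder struct {
	http.ResponseWriter
	status int
}

func (r *statusRecorder) WriteHeader(code int) {
	r.status = code
	r.ResponseWriter.WriteHeader(code)
}

// RequestInfo is the data a tracing middleware would attach to its span.
// (Hypothetical type for illustration.)
type RequestInfo struct {
	Method   string
	Path     string
	Status   int
	Duration time.Duration
}

// Middleware wraps next and calls report exactly once per request.
// In a real implementation, a span would start before next.ServeHTTP and
// end where report is called.
func Middleware(next http.Handler, report func(RequestInfo)) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, req *http.Request) {
		start := time.Now()
		// Default to 200: handlers that never call WriteHeader imply OK.
		rec := &statusRecorder{ResponseWriter: w, status: http.StatusOK}
		next.ServeHTTP(rec, req)
		report(RequestInfo{
			Method:   req.Method,
			Path:     req.URL.Path,
			Status:   rec.status,
			Duration: time.Since(start),
		})
	})
}

// demo sends one in-memory request through a wrapped ServeMux and returns
// what the middleware observed. The route is made up for the example.
func demo() RequestInfo {
	mux := http.NewServeMux()
	mux.HandleFunc("/api/missing", func(w http.ResponseWriter, _ *http.Request) {
		w.WriteHeader(http.StatusNotFound)
	})

	var got RequestInfo
	handler := Middleware(mux, func(info RequestInfo) { got = info })
	handler.ServeHTTP(httptest.NewRecorder(), httptest.NewRequest(http.MethodGet, "/api/missing", nil))
	return got
}

func main() {
	info := demo()
	fmt.Printf("%s %s -> %d\n", info.Method, info.Path, info.Status)
}
```

Because the wrapper takes and returns a plain `http.Handler`, it can sit around an `http.ServeMux` or any other router that implements that interface.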

Conceptual Implementation

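// In meshery/meshkit/telemetry/tracing
package tracing

import (
    "context"

    "go.opentelemetry.io/otel"
    "go.opentelemetry.io/otel/attribute"
    "go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracegrpc"
    "go.opentelemetry.io/otel/propagation"
    "go.opentelemetry.io/otel/sdk/resource"
    sdktrace "go.opentelemetry.io/otel/sdk/trace"
)

// Config holds setup parameters
type Config struct {
    ServiceName    string
    ServiceVersion string
    Endpoint       string // e.g., "localhost:4317"
}

// newResource builds the attributes that identify this service on every span.
// Illustrative helper (an assumption); the keys follow the OTel semantic conventions.
func newResource(cfg Config) *resource.Resource {
    return resource.NewSchemaless(
        attribute.String("service.name", cfg.ServiceName),
        attribute.String("service.version", cfg.ServiceVersion),
    )
}

// InitTracer configures the global trace provider.
// Callers should invoke tp.Shutdown(ctx) on exit so batched spans are flushed.
func InitTracer(ctx context.Context, cfg Config) (*sdktrace.TracerProvider, error) {
    exporter, err := otlptracegrpc.New(ctx,
        otlptracegrpc.WithInsecure(), // plaintext gRPC; suitable for a local collector only
        otlptracegrpc.WithEndpoint(cfg.Endpoint),
    )
    if err != nil {
        return nil, err
    }

    tp := sdktrace.NewTracerProvider(
        sdktrace.WithBatcher(exporter),
        sdktrace.WithResource(newResource(cfg)),
    )

    // Set globals so otel.Tracer("") works everywhere
    otel.SetTracerProvider(tp)

    // CRITICAL: Set the propagator for distributed tracing (W3C)
    otel.SetTextMapPropagator(propagation.NewCompositeTextMapPropagator(
        propagation.TraceContext{},
        propagation.Baggage{},
    ))

    return tp, nil
}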
  • Prepare for use of this plumbing in Meshery Server (and other components). To get data flowing immediately without rewriting internal logic, focus on the boundaries (entry and exit points).

    • Where: server/main.go and the Router setup.
    • Action:
      • Call tracing.InitTracer at startup.
      • Wrap the main http.ServeMux router with MeshKit's tracing middleware.
    • Result: You immediately see latency and status codes for every API call hitting Meshery.
  • Prepare for use of this plumbing in Remote Providers.

    • Where: The main generic server entry point.
    • Action: Initialize the tracer with ServiceName: "meshery-cloud".
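    • Crucial Step: Since Cloud calls Meshery Server (or vice versa), ensure the http.Client used for these calls has its Transport wrapped with otelhttp.NewTransport. Cloud uses Echo as its web framework, so incoming requests there would be instrumented through Echo's middleware chain.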
    • Result: You will see a trace start in Cloud and, if propagation works, continue seamlessly into Meshery Server.
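
The exit-point half of the steps above can be sketched with the standard library alone. A wrapping `http.RoundTripper` copies the current trace context into the `traceparent` header of every outgoing request, which is the job `otelhttp.NewTransport` would do in the real setup. Everything named here (`TracingTransport`, `WithTraceParent`, the stub transport, and the example URL) is hypothetical and exists only to keep the example self-contained and free of network calls.

```go
package main

import (
	"context"
	"fmt"
	"net/http"
)

// traceParentKey is the context key under which the header value is stored.
type traceParentKey struct{}

// WithTraceParent stores a traceparent header value in the context.
// (In a real setup the active span lives in the context instead.)
func WithTraceParent(ctx context.Context, header string) context.Context {
	return context.WithValue(ctx, traceParentKey{}, header)
}

// TracingTransport injects the traceparent header into outgoing requests.
type TracingTransport struct {
	Base http.RoundTripper
}

func (t TracingTransport) RoundTrip(req *http.Request) (*http.Response, error) {
	header, ok := req.Context().Value(traceParentKey{}).(string)
	if !ok {
		return t.Base.RoundTrip(req) // nothing to propagate
	}
	// A RoundTripper must not modify the caller's request, so clone it first.
	out := req.Clone(req.Context())
	out.Header.Set("traceparent", header)
	return t.Base.RoundTrip(out)
}

// roundTripperFunc adapts a function to http.RoundTripper (test stub only).
type roundTripperFunc func(*http.Request) (*http.Response, error)

func (f roundTripperFunc) RoundTrip(r *http.Request) (*http.Response, error) { return f(r) }

// demo sends one request through the wrapped client and returns the header
// the "remote" side received plus the header left on the caller's request.
func demo() (string, string, error) {
	var sent string
	stub := roundTripperFunc(func(r *http.Request) (*http.Response, error) {
		sent = r.Header.Get("traceparent")
		return &http.Response{
			StatusCode: http.StatusOK,
			Header:     make(http.Header),
			Body:       http.NoBody,
			Request:    r,
		}, nil
	})

	client := &http.Client{Transport: TracingTransport{Base: stub}}
	ctx := WithTraceParent(context.Background(),
		"00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01")

	// The URL is a placeholder; the stub transport never opens a connection.
	req, err := http.NewRequestWithContext(ctx, http.MethodGet, "http://meshery.example/api", nil)
	if err != nil {
		return "", "", err
	}
	resp, err := client.Do(req)
	if err != nil {
		return "", "", err
	}
	resp.Body.Close()
	return sent, req.Header.Get("traceparent"), nil
}

func main() {
	sent, _, err := demo()
	if err != nil {
		fmt.Println("request failed:", err)
		return
	}
	fmt.Println("downstream received traceparent:", sent)
}
```

Wrapping the transport rather than each call site means every request made through that client carries the header, provided callers pass the request context along.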
