Recording details as logs, span events, or span attributes #4414

@lmolkova

Extracting discussion points from #4333 and #4393 around how to record occurrences of things that are not spans.

Examples of controversial things being discussed:

  • HTTP request and HTTP response bodies
  • DB query inputs and results
  • LLM prompts and completions
  • Exceptions

1. Record them as logs

Pros:

  • logs can be reported with or without tracing instrumentation and sampled along with spans or not sampled at all
  • logs provide verbosity control out of the box
  • logs are familiar to and used by everyone. A lot of libs/infra out there report logs/events
  • structured logs are very flexible

Minor inconveniences:

  • Need to configure Logs SDK

Cons:

  • logs are exported separately from spans. Backends that want to store things like an HTTP request body along with the corresponding span need to make an extra effort.
  • [UPDATE] Backends that don't support logs can't ingest them

Mitigation:

  • A log -> span event processor in the SDK could attach the log to the span (if there is a current one). We may explore options on how to make it simple for end users.
  • Ingestion on the backend side may append logs to span records in the internal storage, if the storage allows such mutations
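
To make the log -> span event mitigation concrete, here is a minimal sketch of the idea using only the Python standard library. `ToySpan`, `current_span`, and `LogToSpanEventHandler` are hypothetical names invented for this illustration; they are not the OpenTelemetry API, and a real SDK processor would look different.

```python
import contextvars
import logging
from dataclasses import dataclass, field


# Hypothetical stand-in for a tracing SDK span; not the OpenTelemetry API.
@dataclass
class ToySpan:
    name: str
    events: list = field(default_factory=list)

    def add_event(self, name, attributes, timestamp):
        self.events.append(
            {"name": name, "attributes": attributes, "timestamp": timestamp}
        )


# Holds the "current span" for the running context (None when there is none).
current_span = contextvars.ContextVar("current_span", default=None)


class LogToSpanEventHandler(logging.Handler):
    """Copies each log record onto the current span as an event, if one exists."""

    def emit(self, record):
        span = current_span.get()
        if span is None:
            return  # no current span: nothing to attach the record to
        span.add_event(
            name=record.name,
            attributes={"severity": record.levelname, "body": record.getMessage()},
            timestamp=record.created,
        )


def demo():
    logger = logging.getLogger("http.client.request.body")
    logger.setLevel(logging.DEBUG)
    logger.propagate = False
    handler = LogToSpanEventHandler()
    logger.addHandler(handler)
    try:
        logger.info("emitted outside of any span")  # not attached anywhere
        span = ToySpan("GET /users")
        token = current_span.set(span)
        try:
            logger.info('{"user": "alice"}')  # attached to the span as an event
        finally:
            current_span.reset(token)
        return span
    finally:
        logger.removeHandler(handler)
```

In this sketch the record emitted outside a span is simply not attached; in a real setup it would still flow through the regular logs pipeline, which is the point of option 1.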

2. Record them as span events

Pros:

  • exported along with the spans, it's easy to decide on the ingestion side how to store them

Cons:

  • impossible to record if tracer is disabled or span is sampled out
  • creates a separate stream of events (in addition to log-based events) - not clear when to use which
  • no verbosity control
  • can't export them before span ends

Mitigations:

  • span events -> logs pipeline could export events reported on unsampled spans
  • we can add severity to span events
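
The two mitigations above can be sketched together as a toy pipeline. Everything here (`ToyEvent`, `ToySpan`, `ToyPipeline`, and the severity numbers) is a hypothetical illustration, not an existing SDK API: events carry a severity that is checked against a threshold, and events reported on unsampled spans are routed to a log export list instead of being lost.

```python
import time
from dataclasses import dataclass, field

# Hypothetical severity levels chosen for this sketch only.
DEBUG, INFO, WARN, ERROR = 5, 9, 13, 17


@dataclass
class ToyEvent:
    name: str
    severity: int
    attributes: dict
    timestamp: float


@dataclass
class ToySpan:
    name: str
    sampled: bool
    events: list = field(default_factory=list)


class ToyPipeline:
    def __init__(self, min_severity):
        self.min_severity = min_severity
        self.exported_spans = []
        self.exported_logs = []

    def add_event(self, span, name, severity, attributes):
        # Verbosity control: drop events below the configured severity.
        if severity < self.min_severity:
            return
        event = ToyEvent(name, severity, attributes, time.time())
        if span.sampled:
            span.events.append(event)  # exported later, together with the span
        else:
            self.exported_logs.append(event)  # unsampled span: export as a log

    def end_span(self, span):
        # Only sampled spans (with their events) are exported.
        if span.sampled:
            self.exported_spans.append(span)


pipeline = ToyPipeline(min_severity=INFO)
sampled = ToySpan("GET /a", sampled=True)
unsampled = ToySpan("GET /b", sampled=False)
pipeline.add_event(sampled, "exception", ERROR, {"exception.type": "ValueError"})
pipeline.add_event(sampled, "http.request.body", DEBUG, {"body": "..."})
pipeline.add_event(unsampled, "exception", ERROR, {"exception.type": "KeyError"})
pipeline.end_span(sampled)
pipeline.end_span(unsampled)
```

Note that the sketch still leaves two routes for the same kind of data (span events and logs), which is the fundamental problem listed for this option.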

Fundamental problems:

  • we are left with two very similar but independent signals (span events and logs), without a clear understanding of when to use which

3. Record them as complex span attributes

E.g. an rpc.message span event with attributes rpc.message.id = 1 and rpc.message.type = SENT could be represented as a complex attribute:

"rpc.message": [{"rpc.message.id": 1, "rpc.message.type": "SENT", "timestamp": ...}]

Alternative is flattening (including indexes):

  • rpc.message[0].timestamp = ...
  • rpc.message[0].message.id = 1
  • rpc.message[0].type = SENT
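
As a rough illustration of the flattening alternative, the sketch below turns a nested attribute value into indexed, dotted keys. The `flatten` helper is hypothetical, the nested keys are shortened to `id` and `type` for readability (an assumption, not the naming used above), and the timestamp value is a made-up placeholder.

```python
def flatten(prefix, value):
    """Flatten nested dicts and lists into {indexed dotted key: scalar} pairs."""
    if isinstance(value, dict):
        flat = {}
        for key, item in value.items():
            flat.update(flatten(f"{prefix}.{key}", item))
        return flat
    if isinstance(value, (list, tuple)):
        flat = {}
        for index, item in enumerate(value):
            flat.update(flatten(f"{prefix}[{index}]", item))
        return flat
    return {prefix: value}  # scalar: the prefix is the final attribute key


# Placeholder data for the sketch; the timestamp is an arbitrary example value.
messages = [{"id": 1, "type": "SENT", "timestamp": 1700000000.0}]
flat = flatten("rpc.message", messages)
```

Here `flat` contains the keys `rpc.message[0].id`, `rpc.message[0].type`, and `rpc.message[0].timestamp`. The timestamp ends up as an ordinary attribute, which is one of the cons listed for this option.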

Pros - same as span events.

Cons - same as span events and more:

  • much harder to add severity to attributes
  • timestamp is an attribute
  • query language needs to support complex array attributes on spans

4. Record logs as zero duration spans

Listing it for completeness. I don't understand what it solves, so maybe @adriangb, who brought it up, can provide some details.

Labels: triage:deciding:needs-info (Not enough information. Left open to provide the author with time to add more details)
Status: Done