Spans for evaluation results in addition to events #2626

@singankit

Description

Area(s)

area:gen-ai

What's missing?

As discussed in #2563 comment, a decision needs to be made on whether a span should be created for an evaluation (to capture how scores were calculated) in addition to the events representing the scores. There is consensus that such a span is helpful; however, the following details need to be ironed out:

  • Would the span be optional?
  • How would the span and events be associated?
  • Would evaluation scores be present on both the event and the span?
  • How would the case where events are not children of the span be handled? See this comment.
  • Capturing input/output for the evaluation, as discussed in this comment.
  • Usage data, as discussed in this comment.

Describe the solution you'd like

The evaluation span could be optional: it helps debug how an evaluation score was calculated, but it is not required. Events would always be emitted on the span/trace being evaluated. The evaluation span can be linked to the span/trace being evaluated via span links.

Another option, as suggested by @alexmojaki in #2563 comment, is to have an attribute on the event that points to the eval span.
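
To make the two association options easier to compare, here is a minimal, library-independent sketch of the data shapes involved. It deliberately does not use the OpenTelemetry API; `Span`, `Event`, `SpanContext`, `start_span`, `record_evaluation`, and every attribute and event name in it (`evaluation.score`, `evaluation.span_id`, `evaluation.result`, and so on) are hypothetical placeholders chosen for illustration, not proposed convention names.

```python
import secrets
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class SpanContext:
    """Identifies a span; stands in for a real tracing span context."""
    trace_id: str
    span_id: str


@dataclass
class Event:
    name: str
    attributes: dict


@dataclass
class Span:
    name: str
    context: SpanContext
    links: list = field(default_factory=list)   # contexts of linked spans
    events: list = field(default_factory=list)  # events recorded on this span


def start_span(name: str, links=()) -> Span:
    """Create a span with random hex IDs (illustrative only)."""
    context = SpanContext(secrets.token_hex(16), secrets.token_hex(8))
    return Span(name, context, links=list(links))


def record_evaluation(
    evaluated: Span,
    name: str,
    score: float,
    *,
    debug_span: bool = True,
    reference_on_event: bool = False,
) -> Optional[Span]:
    """Emit the score event on the evaluated span; optionally add an eval span."""
    attributes = {"evaluation.name": name, "evaluation.score": score}
    eval_span = None
    if debug_span:
        # Option 1: the optional evaluation span links to the evaluated span.
        eval_span = start_span(f"evaluate {name}", links=[evaluated.context])
        if reference_on_event:
            # Option 2: the event carries attributes pointing at the eval span.
            attributes["evaluation.trace_id"] = eval_span.context.trace_id
            attributes["evaluation.span_id"] = eval_span.context.span_id
    # The event is always emitted on the span being evaluated.
    evaluated.events.append(Event("evaluation.result", attributes))
    return eval_span


chat_span = start_span("chat")
eval_span = record_evaluation(chat_span, "relevance", 0.9, reference_on_event=True)
```

In this sketch the score event exists whether or not the evaluation span is created, which matches the idea that the span is a debugging aid only. The two options are not mutually exclusive: a span link lets a reader navigate from the evaluation span to what it evaluated, while the event attribute allows navigation in the opposite direction.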
