
Missing traces on Hypertrace (RawSpansGrouper) #315

@13shivam

Description


Observation:

Traces are being dropped for one of our internal systems running on a PHP backend. This happens when the number of spans in a trace (including nested spans) exceeds rawSpansGrouperConfig.maxSpanCount, which is used internally here: refer

We tried to isolate the internal method line at which spans were being dropped. Each route generates around 1100 spans, including nested spans.

Progress so far:

  • We looked at the configs of jaeger-agent and otel-collector; no significant spikes were seen in pod memory or CPU.

  • Next we checked the MTU limit in the pod, which uses jumbo frames at 9001 (ref: Network maximum transmission unit (MTU) for your EC2 instance). This did not look like a bottleneck either.

  • Next we captured a dump of the network traffic on our pod using tshark and were able to see our spans, which confirmed that spans on the app side are started and closed properly.

  • We also ran our service locally for some routes, pointing it at jaeger:all-in-one, and were able to see 400+ spans there.

Finally:

  • We increased max.span.count on RSG (RawSpansGrouper) from 250 to 2000 and tested with a sample of ~1070 spans (1.3 MB file size). This time we were able to see the spans in our env.

The issue is that once the max count is reached, RSG should truncate the spans, but in this case the entire trace payload is being dropped.
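To make the difference concrete, here is a minimal Python sketch contrasting the two behaviors. It is not the RawSpansGrouper implementation; the function names `group_with_truncation` and `group_with_drop` are hypothetical and only model the expected versus observed outcome for a limit of 250 and a trace of 1070 spans.

```python
from typing import Iterable, List, Tuple


def group_with_truncation(span_ids: Iterable[int], max_span_count: int) -> Tuple[List[int], int]:
    """Expected behavior (hypothetical model): keep spans up to the limit, drop the rest."""
    kept: List[int] = []
    dropped = 0
    for span_id in span_ids:
        if len(kept) < max_span_count:
            kept.append(span_id)
        else:
            dropped += 1  # only the overflow is lost
    return kept, dropped


def group_with_drop(span_ids: Iterable[int], max_span_count: int) -> Tuple[List[int], int]:
    """Observed behavior (hypothetical model): the whole trace is lost once the limit is exceeded."""
    spans = list(span_ids)
    if len(spans) > max_span_count:
        return [], len(spans)  # nothing from this trace is emitted
    return spans, 0


if __name__ == "__main__":
    trace = range(1070)  # roughly the span count from the sample above
    for limit in (250, 2000):
        kept_t, dropped_t = group_with_truncation(trace, limit)
        kept_d, dropped_d = group_with_drop(trace, limit)
        print(f"limit={limit}: truncation keeps {len(kept_t)}, drop-all keeps {len(kept_d)}")
```

With the limit at 250, truncation would still emit a partial trace of 250 spans, whereas the drop-all model emits nothing, which matches what we see. With the limit at 2000 both models emit all 1070 spans.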
