[Bug]: Jaeger v2 + Opensearch fails showing trace data older than about 7 days #8331

@feldentm-SAP

Description

What happened?

Note: The OpenSearch-backed demo instance (https://jaeger.demo.jaegertracing.io) is also affected (screenshots attached).

Symptom: If you look at traces older than ~7 days, search shows traces correctly, but opening them fails with 404 (see screenshot).

A related issue: after ~30 days, search in Jaeger seems unable to find traces correctly, even if they are still present in the backing OpenSearch instance.

Affected setups:
- https://jaeger.demo.jaegertracing.io, with whatever is configured there (I couldn't find the configs, but would be interested to see them or have them linked in the docs)
- Grafana + Jaeger (query) + OpenSearch 2.x (multiple landscapes)

Steps to reproduce

  1. identify a service with activity some days ago (6 or 7 worked fine for me)
  2. select search timeframe accordingly
  3. select a trace (issue seems to occur unconditionally)
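
The steps above can also be sketched as a script against the query service, which takes the UI out of the picture. This is a minimal sketch, assuming the JSON endpoints the Jaeger UI itself calls (`/api/traces` for search with microsecond `start`/`end`, and `/api/traces/{traceID}` for lookup by ID); these are internal, unversioned endpoints, and the base URL and service name are placeholders.

```python
import json
import time
import urllib.error
import urllib.parse
import urllib.request

DAY_US = 24 * 60 * 60 * 1_000_000  # one day in microseconds


def search_url(base, service, days_ago, window_hours=24, limit=20, now_us=None):
    """Build a search URL for a window that ends `days_ago` days before now."""
    if now_us is None:
        now_us = int(time.time() * 1_000_000)
    end_us = now_us - days_ago * DAY_US
    start_us = end_us - window_hours * 60 * 60 * 1_000_000
    query = urllib.parse.urlencode(
        {"service": service, "start": start_us, "end": end_us, "limit": limit}
    )
    return f"{base.rstrip('/')}/api/traces?{query}"


def trace_url(base, trace_id):
    """Build the lookup-by-ID URL for a single trace."""
    return f"{base.rstrip('/')}/api/traces/{urllib.parse.quote(trace_id)}"


def fetch(url):
    """Return (HTTP status, parsed JSON or None) without raising on 4xx/5xx."""
    try:
        with urllib.request.urlopen(url, timeout=30) as resp:
            return resp.status, json.load(resp)
    except urllib.error.HTTPError as err:
        return err.code, None


def reproduce(base, service, days_ago):
    """Search an old time window, then open the first hit by ID."""
    status, body = fetch(search_url(base, service, days_ago))
    traces = (body or {}).get("data") or []
    if not traces:
        return {"search_status": status, "lookup_status": None}
    lookup_status, _ = fetch(trace_url(base, traces[0]["traceID"]))
    return {"search_status": status, "lookup_status": lookup_status}
```

Calling `reproduce("http://localhost:16686", "my-service", 7)` (both arguments hypothetical) would be expected to return a successful search status together with a 404 lookup status on an affected system.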

Expected behavior

Searching for traces and displaying traces should both work if the trace data is present in OpenSearch. In particular, opening a trace by just its traceID should work for at least 30 days by default. If a configuration parameter is required here, it should be documented for the v2 API (=config.yaml).
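
To check the "data is present in OpenSearch" part independently of Jaeger, the span indices can be queried directly. This is a rough sketch, assuming the default Jaeger index naming (`jaeger-span-*`, no custom index prefix) and that `traceID` is mapped as a keyword field; the OpenSearch URL is a placeholder and authentication is left out.

```python
import json
import urllib.request


def trace_presence_query(trace_id):
    """Count spans for one trace and show which daily indices hold them."""
    return {
        "size": 0,  # only counts are needed, not the span documents
        "track_total_hits": True,
        "query": {"term": {"traceID": trace_id}},
        "aggs": {"by_index": {"terms": {"field": "_index", "size": 100}}},
    }


def presence_request(opensearch_url, trace_id, index_pattern="jaeger-span-*"):
    """Build (but do not send) the _search request for the query above."""
    body = json.dumps(trace_presence_query(trace_id)).encode("utf-8")
    return urllib.request.Request(
        f"{opensearch_url.rstrip('/')}/{index_pattern}/_search",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Sending the request with `urllib.request.urlopen` (plus whatever credentials the cluster needs) and reading `aggregations.by_index.buckets` from the response shows which indices contain the spans. If spans are found there while the by-ID lookup returns 404, the data exists and the problem is on the query side.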

Relevant log output

I failed to configure logs or self-monitoring traces to yield any relevant data. The traces are especially frustrating since the by-ID query seems to create no trace at all and the search is always successful.

Screenshot

jaeger demo:

[2 screenshots]

local test system (where I know the backing OpenSearch instance and can log into it):

[6 screenshots]

Additional context

Grafana seems to be irrelevant in our setup; I mention it for the sake of completeness. Reproducing the 30-day search issue is a lot harder, since I do not see how I could share details here. I'd hope that whoever maintains the Jaeger demo instance can check this aspect easily.

Jaeger backend version

v2.x (multiple recent versions)

SDK

separate v2 collector which works as expected and is forced to the same version as the queriers

Pipeline

ours:
multiple -> otel collector -> otel collector with jaeger collector -> opensearch -> otel collector with jaeger-query -> (jaeger ui; grafana (both affected))

Storage backend

OpenSearch 2.x

Operating system

Linux

Deployment model

Kubernetes

Deployment configs

couldn't find any command line options that would work with the v2 setup; cannot tell how jaeger demo works
