Conversation

@westarle
Contributor

No description provided.

@codecov

codecov bot commented Nov 17, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 95.92%. Comparing base (5f97cb1) to head (566fd1d).

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #3822   +/-   ##
=======================================
  Coverage   95.92%   95.92%           
=======================================
  Files         156      156           
  Lines        5991     5991           
=======================================
  Hits         5747     5747           
  Misses        244      244           

@westarle marked this pull request as ready for review on November 17, 2025 17:01
@westarle requested a review from a team as a code owner on November 17, 2025 17:01
@westarle force-pushed the feat/telemetry-e2e-test branch 3 times, most recently from 5bc4d2d to 8dbfcbd on November 19, 2025 20:27
@westarle requested a review from dbolduc on November 21, 2025 16:53
@westarle
Contributor Author

westarle commented Nov 21, 2025

This passes now (it needed retries on 500s and maybe a less wonky retry policy).

Member

@dbolduc dbolduc left a comment

Looks mostly good, but some things are strange... mainly the initial sleep and the multiple flushes.

let root_span = tracing::info_span!("e2e_root", "otel.name" = span_name);
let trace_id = {
    let _enter = root_span.enter();
    let otel_ctx = root_span.context();
Member

TIL: opentelemetry::trace::TraceContextExt
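For context, a hedged sketch of how the truncated block above likely continues (assuming the usual tracing-opentelemetry bridge is installed; this is not the PR's exact code): OpenTelemetrySpanExt::context() yields an opentelemetry::Context, and TraceContextExt::span() exposes the span whose SpanContext carries the trace id that the test later polls Cloud Trace for.

use opentelemetry::trace::TraceContextExt;
use tracing_opentelemetry::OpenTelemetrySpanExt;

let trace_id = {
    let _enter = root_span.enter();
    let otel_ctx = root_span.context();
    // TraceContextExt::span() returns the span stored in the context; its
    // SpanContext holds the trace id used to query Cloud Trace afterwards.
    otel_ctx.span().span_context().trace_id().to_string()
};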

let mut trace = None;

for delay in backoff_delays {
    println!("Waiting {}s before polling...", delay);
Member

nit: add a comment saying why we are doing this. There is actually a good reason.

// Because we are limited by quota, start with a backoff.
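For reference, a hedged sketch of the loop that comment would sit in (backoff_delays, get_trace, project_id, and trace_id are assumptions inferred from the snippets above, not the PR's actual helpers, and this must run inside an async test): the Cloud Trace read API is quota-limited, so the test waits before every poll instead of querying immediately.

// Because we are limited by quota, start with a backoff. The trace also
// needs a little time to be ingested before it becomes queryable.
let backoff_delays = [5u64, 10, 20, 40];
let mut trace = None;

for delay in backoff_delays {
    println!("Waiting {}s before polling...", delay);
    tokio::time::sleep(std::time::Duration::from_secs(delay)).await;

    // `get_trace` is a hypothetical helper wrapping the Cloud Trace client.
    match get_trace(&project_id, &trace_id).await {
        Ok(t) => {
            trace = Some(t);
            break;
        }
        Err(e) => println!("Trace not available yet: {e}"),
    }
}
assert!(trace.is_some(), "trace {trace_id} never became visible");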

@westarle force-pushed the feat/telemetry-e2e-test branch from 207fce3 to 79bc61a on November 21, 2025 21:41
Member

@dbolduc dbolduc left a comment

I think the initial headers for the interceptor need more thought, but maybe not for this PR.

// Wait for the first refresh to complete.
// We ignore the result because if the sender is dropped (unlikely),
// the interceptor will just fail requests, which is the correct behavior.
let _ = rx.changed().await;
Member

We only send headers on success, so this hangs on bad credentials, right?

Err(e) => {
    tracing::warn!("Failed to refresh GCP credentials: {e:?}");
    sleep(ERROR_RETRY_DELAY).await;
}

So should we:

match headers {
    // omitted...
    Err(e) if e.is_transient() => {
        tracing::warn!("Failed to refresh GCP credentials: {e:?}");
        sleep(ERROR_RETRY_DELAY).await;
    }
    Err(e) => break,
}
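One hedged way to address that, sketched below with placeholder types (fetch_headers and is_transient are made-up stand-ins, the real interceptor carries gRPC metadata rather than a String, and this assumes tokio with the sync, time, and macros features): publish the refresh result through the watch channel, so the waiter is released on permanent failure as well as on success.

use std::time::Duration;
use tokio::sync::watch;

// Hypothetical stand-ins for the credential refresh; both are assumptions
// for this sketch, not the PR's real helpers.
async fn fetch_headers() -> Result<String, String> {
    Err("invalid credentials".to_string())
}
fn is_transient(_e: &str) -> bool {
    false
}

#[tokio::main]
async fn main() {
    let (tx, mut rx) = watch::channel::<Option<Result<String, String>>>(None);

    tokio::spawn(async move {
        loop {
            match fetch_headers().await {
                Ok(h) => {
                    let _ = tx.send(Some(Ok(h)));
                }
                Err(e) if is_transient(&e) => {
                    // Transient failure: warn, back off, and try again.
                    eprintln!("Failed to refresh GCP credentials: {e:?}");
                    tokio::time::sleep(Duration::from_secs(5)).await;
                    continue;
                }
                Err(e) => {
                    // Permanent failure: notify waiters instead of going silent,
                    // so the `rx.changed()` below cannot hang forever.
                    let _ = tx.send(Some(Err(e)));
                    break;
                }
            }
            tokio::time::sleep(Duration::from_secs(300)).await;
        }
    });

    // Released on success *or* permanent failure, not only on success.
    let _ = rx.changed().await;
    println!("first refresh result: {:?}", *rx.borrow());
}

With bad credentials, is_transient returns false, the refresh task publishes Err and exits, and rx.changed() resolves immediately instead of hanging.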

Member

Also, seems like this can't get hit anymore:

// If the first refresh hasn't completed yet, fail the request.
// The OTLP exporter is expected to handle this transient failure
// with its built-in retry mechanism.
Status::unauthenticated("GCP credentials not yet available")

Member

This is getting complicated enough that we should probably defer it to a separate PR.

);

// 4. Force flush to ensure spans are sent.
let _ = provider.force_flush();
Member

Q: can we use the ? operator? Or are you trying to work around Timeout errors? https://docs.rs/opentelemetry_sdk/latest/opentelemetry_sdk/error/enum.OTelSdkError.html#variant.Timeout
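For comparison, a hedged sketch of what propagating the error could look like, assuming a recent opentelemetry_sdk where force_flush returns an OTelSdkResult and the Timeout variant carries the elapsed duration; an illustration of the trade-off, not the PR's code.

use opentelemetry_sdk::error::OTelSdkError;
use opentelemetry_sdk::trace::SdkTracerProvider;

fn flush_spans(provider: &SdkTracerProvider) -> Result<(), OTelSdkError> {
    match provider.force_flush() {
        Ok(()) => Ok(()),
        // Tolerate timeouts: the exporter may still deliver the spans before
        // the test polls for them, so log and continue instead of failing.
        Err(OTelSdkError::Timeout(elapsed)) => {
            eprintln!("force_flush timed out after {elapsed:?}; continuing");
            Ok(())
        }
        // Any other error is surfaced so the caller can use ? on it.
        Err(e) => Err(e),
    }
}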

@westarle force-pushed the feat/telemetry-e2e-test branch from 79bc61a to 566fd1d on November 24, 2025 18:54
@westarle enabled auto-merge (squash) on November 24, 2025 18:59