Failsafe RetryPolicy instrumentation added #15255

onurkybsi · 2025-11-10T06:01:57Z

Library instrumentation added for Failsafe's RetryPolicy.

laurit · 2025-11-14T16:33:39Z

.../library/src/main/java/io/opentelemetry/instrumentation/failsafe/v3_0/FailsafeTelemetry.java

+            .build();
+    LongHistogram attemptsHistogram =
+        meter
+            .histogramBuilder("failsafe.retry_policy.attempts")


I'm not sure using a histogram for this is justified. @trask, could you provide guidance on this?

trask · 2025-11-19T04:17:11Z

.../library/src/main/java/io/opentelemetry/instrumentation/failsafe/v3_0/FailsafeTelemetry.java

+            .setDescription("Histogram of number of attempts for each execution.")
+            .ofLongs()
+            .setExplicitBucketBoundariesAdvice(
+                LongStream.range(1, userConfig.getMaxAttempts() + 1)


@onurkybsi what's a typical value for userConfig.getMaxAttempts()?

could you come up with a smallish static set, e.g. 1, 2, 5, 10, 20, 50?

also worth reading open-telemetry/semantic-conventions#316 (comment)

onurkybsi · 2025-11-19

Hey @trask, userConfig.getMaxAttempts() returns the user-configured maximum number of attempts allowed for a retry policy execution. So if this value is 3, the possible outcomes are [1 (execution succeeded without a retry), 2 (first retry), 3 (last attempt, as configured)]. The current implementation relies on this: it adds one bucket boundary for each value from 1 to the maximum number of attempts.

I may not have accounted for very large values. Do you think we should? If so, I can refactor this part to build a list that distributes the range (1 to maxAttempts) evenly, capped at a maximum number of buckets such as 10. Maybe something like this:

    private static List<Long> buildBoundaries(int maxNumOfBuckets, long maxNumOfAttempts) {
      List<Long> boundaries = new ArrayList<>(maxNumOfBuckets);
      boundaries.add(1L);
      double step = (double) (maxNumOfAttempts - 1) / (maxNumOfBuckets - 1);
      for (int i = 1; i < maxNumOfBuckets; i++) {
        long boundary = Math.min(Math.round(1 + step * i), maxNumOfAttempts);
        boundaries.add(boundary);
      }
      return boundaries.stream()
          .distinct()
          .sorted()
          .toList();
    }

What do you think?
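To make the proposal above easier to evaluate, here is a minimal runnable sketch of the same helper with two sample inputs. The class name `BucketBoundaries`, the argument check, and the `main` method are assumptions added for illustration and are not part of the PR. `Collectors.toList()` stands in for `.toList()` only so the sketch also compiles on Java versions older than 16.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical wrapper class, used only to make the proposed helper runnable.
class BucketBoundaries {

  // Spreads at most maxNumOfBuckets boundaries evenly over [1, maxNumOfAttempts].
  static List<Long> buildBoundaries(int maxNumOfBuckets, long maxNumOfAttempts) {
    // Assumed guard: the step calculation below divides by (maxNumOfBuckets - 1).
    if (maxNumOfBuckets < 2 || maxNumOfAttempts < 1) {
      throw new IllegalArgumentException("need at least 2 buckets and 1 attempt");
    }
    List<Long> boundaries = new ArrayList<>(maxNumOfBuckets);
    boundaries.add(1L);
    double step = (double) (maxNumOfAttempts - 1) / (maxNumOfBuckets - 1);
    for (int i = 1; i < maxNumOfBuckets; i++) {
      boundaries.add(Math.min(Math.round(1 + step * i), maxNumOfAttempts));
    }
    // Rounding can produce repeated values for small ranges, so drop duplicates.
    return boundaries.stream().distinct().sorted().collect(Collectors.toList());
  }

  public static void main(String[] args) {
    System.out.println(buildBoundaries(10, 3));   // prints [1, 2, 3]
    System.out.println(buildBoundaries(10, 100)); // prints [1, 12, 23, 34, 45, 56, 67, 78, 89, 100]
  }
}
```

For small values of maxAttempts the result collapses to one boundary per attempt, so the cap only matters when the configured maximum exceeds the bucket limit.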

trask · 2025-11-20

buckets are costly, so I'd try to keep the number small if possible, e.g. with gc duration metrics, we went with just 5 buckets: https://github.com/open-telemetry/semantic-conventions/blob/main/docs/runtime/jvm-metrics.md#metric-jvmgcduration

do you have any idea what typical values for userConfig.getMaxAttempts() are?

onurkybsi · 2025-11-20

The default is 3 in Failsafe, and the same in resilience4j. I don't think a value above 5 makes sense in most cases, so maybe just [1, 2, 3, 5]. What do you think?

trask · 2025-11-20

Sounds good
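As a rough illustration of what the agreed boundaries [1, 2, 3, 5] mean in practice, the sketch below maps an attempt count to a bucket index. It assumes upper-inclusive bucket bounds, as in OpenTelemetry's explicit bucket histograms, which gives five buckets: (-inf, 1], (1, 2], (2, 3], (3, 5], (5, +inf). The class `AttemptBuckets` and the method `bucketIndex` are hypothetical names used only here. In the PR itself the list would presumably be passed to setExplicitBucketBoundariesAdvice, as in the diff above.

```java
import java.util.Arrays;

// Hypothetical helper, used only to illustrate the bucket layout.
class AttemptBuckets {

  // Static boundaries agreed on in this thread.
  static final long[] BOUNDARIES = {1, 2, 3, 5};

  // Returns the index of the bucket an attempt count falls into,
  // assuming each boundary is the inclusive upper bound of its bucket.
  static int bucketIndex(long attempts) {
    int pos = Arrays.binarySearch(BOUNDARIES, attempts);
    // A negative result encodes the insertion point as -(insertionPoint) - 1.
    return pos >= 0 ? pos : -(pos + 1);
  }

  public static void main(String[] args) {
    for (long attempts = 1; attempts <= 6; attempts++) {
      System.out.println(attempts + " attempt(s) -> bucket " + bucketIndex(attempts));
    }
    // Attempt counts 1, 2, 3 each get their own bucket (0, 1, 2),
    // 4 and 5 share bucket 3, and anything above 5 lands in bucket 4.
  }
}
```

With the Failsafe default of 3 attempts, every possible outcome gets its own bucket, and larger configurations only lose resolution between 4 and 5 and above 5.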

onurkybsi force-pushed the retry-policy branch from 17f361b to ee4ab3d · November 11, 2025 05:33

onurkybsi marked this pull request as ready for review November 11, 2025 05:33

onurkybsi requested a review from a team as a code owner November 11, 2025 05:33

laurit reviewed Nov 14, 2025

Failsafe RetryPolicy instrumentation added

e0715f9

onurkybsi force-pushed the retry-policy branch from ee4ab3d to e0715f9 · November 18, 2025 05:45

trask reviewed Nov 19, 2025
