Try to reduce prometheus port conflict test flake #7086

Closed
jack-berg wants to merge 2 commits into open-telemetry:main from jack-berg:fix-prometheus-port-conflict

@jack-berg
Member

Example flake here: https://github.com/open-telemetry/opentelemetry-java/actions/runs/13184973874/job/36804837095?pr=7055#step:6:1659

I believe I've only noticed these on Windows, and I'm not sure why. Maybe the way Windows assigns a random available port is more deterministic than on other OSes, and therefore more prone to race conditions. That is, between the time we ask for an available port and the time we initialize a Prometheus server with that port, another test may initialize a different Prometheus server with the same port.

This PR tries to mitigate that by adjusting the tests to favor setting the port to 0, which causes the implementation to assign an available port, and then adjusting the test code flow to make assertions on the assigned port.
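
To illustrate the race described above, here is a minimal sketch using only the JDK's `ServerSocket`. It is not the Prometheus exporter code from this PR; the class and method names (`EphemeralPortSketch`, `findFreePortRacy`, `bindToEphemeralPort`) are hypothetical and exist only for this example. The first method shows the "ask for a free port, then bind later" pattern, and the second binds to port 0 and reads the assigned port back from the socket that still owns it.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;

public class EphemeralPortSketch {

  // Racy pattern: probe for a free port, release it, and return only the number.
  // Between close() and a later bind by the caller, another test or process
  // may claim the same port.
  static int findFreePortRacy() throws IOException {
    try (ServerSocket probe = new ServerSocket(0)) {
      return probe.getLocalPort();
    }
  }

  // Safer pattern: bind to port 0 and keep the socket open. The OS picks the
  // port at bind time, so there is no window in which the port is unowned.
  static ServerSocket bindToEphemeralPort() throws IOException {
    ServerSocket server = new ServerSocket();
    server.bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), 0));
    return server;
  }

  public static void main(String[] args) throws IOException {
    try (ServerSocket server = bindToEphemeralPort()) {
      // Ask the bound socket which port it received, then assert on that value.
      int port = server.getLocalPort();
      System.out.println("Bound to port " + port);
    }
  }
}
```

With the second pattern a test can no longer assert a fixed port number; it can only check properties of the assigned port, for example that it is positive.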

@jack-berg jack-berg requested a review from a team as a code owner February 7, 2025 20:08
@codecov

codecov bot commented Feb 9, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 89.84%. Comparing base (19650df) to head (ce77eb4).
Report is 19 commits behind head on main.

Additional details and impacted files
@@             Coverage Diff              @@
##               main    #7086      +/-   ##
============================================
- Coverage     89.86%   89.84%   -0.02%     
+ Complexity     6613     6612       -1     
============================================
  Files           740      740              
  Lines         19991    19991              
  Branches       1966     1966              
============================================
- Hits          17964    17961       -3     
- Misses         1437     1439       +2     
- Partials        590      591       +1     

@trask
Member

trask commented Feb 9, 2025

heads up, I hit rerun, but it looks like it flaked beforehand: https://github.com/open-telemetry/opentelemetry-java/actions/runs/13207505490/job/36923855747?pr=7086

server -> {
assertThat(server.getAddress().getHostName()).isEqualTo("0:0:0:0:0:0:0:0");
assertThat(server.getAddress().getPort()).isEqualTo(9464);
assertThat(server.getAddress().getPort()).isPositive();
Contributor

If this fails, I'm going to have questions.. ;)

@jack-berg
Member Author

> heads up, I hit rerun, but it looks like it flaked beforehand

@trask what strategies does the instrumentation repo use for flaky tests? Ideally, I'd like to just be able to retry select tests a second time. The probability of failing twice in a row should be low enough to be tolerable.
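
For reference, a minimal sketch of the "retry a second time" idea in plain Java. This is not what either repo does; `RetrySketch`, `ThrowingRunnable`, and `retry` are hypothetical names for illustration. If a single run fails with probability p and the attempts are independent (an assumption that may not hold for port conflicts), two attempts both fail with probability p².

```java
import java.util.concurrent.atomic.AtomicInteger;

public class RetrySketch {

  @FunctionalInterface
  interface ThrowingRunnable {
    void run() throws Throwable;
  }

  // Runs the body up to maxAttempts times. Returns on the first success and
  // rethrows the last failure if every attempt fails.
  static void retry(int maxAttempts, ThrowingRunnable body) throws Throwable {
    if (maxAttempts < 1) {
      throw new IllegalArgumentException("maxAttempts must be at least 1");
    }
    Throwable last = null;
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
      try {
        body.run();
        return;
      } catch (Throwable t) {
        // Catch Throwable so that AssertionError from test assertions is retried too.
        last = t;
      }
    }
    throw last;
  }

  public static void main(String[] args) throws Throwable {
    AtomicInteger calls = new AtomicInteger();
    // Simulated flake: the first attempt fails, the second succeeds.
    retry(2, () -> {
      if (calls.incrementAndGet() == 1) {
        throw new IllegalStateException("simulated port conflict");
      }
    });
    System.out.println("Passed after " + calls.get() + " attempts");
  }
}
```

A helper like this hides real failures as well as flakes, so it only makes sense for the specific tests known to be flaky.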

@trask
Member

trask commented Feb 10, 2025

> @trask what strategies does the instrumentation repo use for flaky tests? Ideally, I'd like to just be able to retry select tests a second time. The probability of failing twice in a row should be low enough to be tolerable.

https://github.com/open-telemetry/opentelemetry-java-instrumentation/blob/4395859daf0802344bb94e2b99dea67a6b69c01c/conventions/src/main/kotlin/otel.java-conventions.gradle.kts#L372-L375

@jack-berg jack-berg mentioned this pull request Feb 13, 2025
@jack-berg
Member Author

Closing in favor of #7106

@jack-berg jack-berg closed this Feb 13, 2025
3 participants