
Conversation

@samxbr (Contributor) commented Dec 20, 2024

The entire test suite was muted due to a bunch of failures of the form java.net.SocketTimeoutException: 60,000 milliseconds timeout on connection http-outgoing-1 [ACTIVE], possibly caused by a transient network issue. Unmuting seems fine.

Closes #118215
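
If the mute was done via the repository's muted-tests.yml mechanism, the unmute amounts to deleting an entry along these lines. This is only a sketch: the package name and exact wording are illustrative, not copied from the actual diff.

```yaml
# Hypothetical muted-tests.yml entry; unmuting the suite means removing it.
tests:
- class: org.elasticsearch.ingest.common.IngestCommonClientYamlTestSuiteIT  # package assumed
  issue: https://github.com/elastic/elasticsearch/issues/118215
```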

@samxbr added the >test (Issues or PRs that are addressing/adding tests) and :Data Management/Ingest Node (Execution or management of Ingest Pipelines including GeoIP) labels Dec 23, 2024
@mattc58 (Contributor) left a comment


+1 LGTM

@dakrone (Member) left a comment


LGTM assuming CI is happy

@samxbr marked this pull request as ready for review December 23, 2024 16:04
@elasticsearchmachine added the Team:Data Management (Meta label for data/management team) label Dec 23, 2024
@elasticsearchmachine (Collaborator) commented

Pinging @elastic/es-data-management (Team:Data Management)

@samxbr merged commit d92233c into elastic:main Dec 23, 2024
16 checks passed
@nielsbauman (Contributor) commented

FWIW, timeouts in tests are often caused by test clusters crashing, so they're more often than not caused by actual issues rather than infrastructure blips. For instance, in the linked test issue, the first build failed due to "Failure running machine learning native code", which we've seen before. Infra blips definitely happen from time to time, but it's usually worth double-checking whether it was a blip or whether something else happened.

The other builds are from PRs, which can be tricky because a PR might have introduced breaking code (i.e. it was still a work in progress). For instance, the last and second-to-last builds contain a bunch of HTTP timeout exceptions, but those are builds of #117787, which changes a bunch of networking code - and I am going to guess that that is no coincidence.

The second and third builds are even trickier because they actually contain an ingest test failure: Failure at [ingest/310_reroute_processor:705]: field [hits.hits] doesn't have length [2]. Both builds originate from the same PR (#118143), which makes it more likely that both of them were caused by the PR itself, but it could (theoretically) just be a coincidence, so it's not a guarantee. If we look at all the occurrences in the last 30 days, we see one more failed build, but that turns out to be from the same PR too - it's the failed serverless check for that PR. Seeing as there are no other occurrences, we can say with reasonable confidence that these ingest failures were caused by the PR.
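
Side note for readers unfamiliar with the YAML REST test format: that message comes from a length assertion in the test file. A purely illustrative sketch - not the actual 310_reroute_processor test - of the kind of step that produces it:

```yaml
# Illustrative YAML REST test snippet (test name and index are hypothetical).
---
"reroute search returns both documents":
  - do:
      search:
        index: logs-illustrative-default
  # If the search returns anything other than 2 hits, the runner fails with
  # "field [hits.hits] doesn't have length [2]".
  - length: { hits.hits: 2 }
```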

All that to say: I agree with unmuting the test suite 😄. I hope this was still somewhat valuable - and if not, I at least had some fun investigating it.


Labels

:Data Management/Ingest Node (Execution or management of Ingest Pipelines including GeoIP)
Team:Data Management (Meta label for data/management team)
>test (Issues or PRs that are addressing/adding tests)
v9.0.0


Development

Successfully merging this pull request may close these issues.

[CI] IngestCommonClientYamlTestSuiteIT class failing

5 participants