@jan-elastic jan-elastic commented Jan 16, 2025

Some integration tests attempt to download ML models of >100MB from ml-models.elastic.co. This may fail for various reasons, leading to these tests being muted.

In order to fix this, this PR spins up a simple HTTP server on localhost, which serves tiny versions of these models, and uses that server in the integration tests.
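
As a rough illustration of the idea above (not the MlModelServer class added in this PR), a loopback-only HTTP server that serves small in-memory files could be sketched with the JDK's built-in com.sun.net.httpserver package. The class name TinyModelServer, the url() helper and the in-memory file map are assumptions made for this sketch only.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.util.Map;

/** Hypothetical stand-in for a test model server; not the PR's MlModelServer. */
final class TinyModelServer implements AutoCloseable {
    private final HttpServer server;

    /** @param files request path (e.g. "/tiny_model.pt") mapped to the bytes to serve */
    TinyModelServer(Map<String, byte[]> files) throws IOException {
        // Bind to the loopback address only; port 0 lets the OS pick a free port.
        InetAddress loopback = InetAddress.getByName("127.0.0.1");
        server = HttpServer.create(new InetSocketAddress(loopback, 0), 0);
        server.createContext("/", exchange -> {
            try {
                byte[] body = files.get(exchange.getRequestURI().getPath());
                if (body == null) {
                    // A length of -1 means "no response body".
                    exchange.sendResponseHeaders(404, -1);
                } else if (body.length == 0) {
                    exchange.sendResponseHeaders(200, -1);
                } else {
                    exchange.sendResponseHeaders(200, body.length);
                    try (OutputStream out = exchange.getResponseBody()) {
                        out.write(body);
                    }
                }
            } finally {
                exchange.close();
            }
        });
        server.start();
    }

    /** Base URL that a test could use as the model repository location. */
    String url() {
        return "http://127.0.0.1:" + server.getAddress().getPort();
    }

    @Override
    public void close() {
        server.stop(0); // stop immediately, without waiting for open exchanges
    }
}
```

A test would create the server before the cluster starts, use url() as the model repository, and close the server afterwards. Binding to port 0 is one way to avoid port clashes between parallel test runs.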

In the process, two bugs are also fixed:

  • downloading models that are smaller than a few MB
  • dynamically changing the model repo URL

Closes: #113950 #113983 #114023 #114239 #114913 #115361 #116140 #116142

@jan-elastic jan-elastic added >test-failure Triaged test failures from CI :ml Machine learning Team:ML Meta label for the ML team v9.0.0 v8.18.0 labels Jan 16, 2025
@jan-elastic jan-elastic requested a review from davidkyle January 16, 2025 11:05
@jan-elastic jan-elastic marked this pull request as draft January 16, 2025 11:05
@jan-elastic jan-elastic force-pushed the test-ml-model-server branch 5 times, most recently from af8f5d2 to 6f54f4f Compare January 20, 2025 11:05
@jan-elastic jan-elastic marked this pull request as ready for review January 20, 2025 11:08
@elasticsearchmachine elasticsearchmachine added the needs:risk Requires assignment of a risk label (low, medium, blocker) label Jan 20, 2025
@elasticsearchmachine

Pinging @elastic/ml-core (Team:ML)

@jan-elastic jan-elastic requested a review from wwang500 January 20, 2025 14:23
apply plugin: 'elasticsearch.internal-java-rest-test'

dependencies {
javaRestTestImplementation project(path: xpackModule('core'))

Need to get XPackSettings.ML_NATIVE_CODE_PLATFORMS into the model server

@wwang500 wwang500 left a comment

Code looks good, but I will let Dave give the final LGTM, as my knowledge of the integration tests is limited.

Just one question:

  • These unmuted integration tests only run locally, right? If they were run against a remote cluster, as in MKI or ECH, I guess they would fail.

@jan-elastic

These unmuted integration tests only run locally, right? If they were run against a remote cluster, as in MKI or ECH, I guess they would fail.

Yes, these tests run locally or on Buildkite, not against remote clusters.

@davidkyle davidkyle added >test Issues or PRs that are addressing/adding tests and removed needs:risk Requires assignment of a risk label (low, medium, blocker) >test-failure Triaged test failures from CI labels Jan 21, 2025

@davidkyle davidkyle left a comment

LGTM

@jan-elastic jan-elastic added auto-backport Automatically create backport pull requests when merged v8.16.3 v8.17.2 labels Jan 22, 2025
@jan-elastic jan-elastic merged commit 6fd99c6 into main Jan 22, 2025
17 checks passed
@jan-elastic jan-elastic deleted the test-ml-model-server branch January 22, 2025 08:55
@elasticsearchmachine

💔 Backport failed

Branch   Result
8.x      Commit could not be cherrypicked due to conflicts
8.16     Commit could not be cherrypicked due to conflicts
8.17     Commit could not be cherrypicked due to conflicts

You can use sqren/backport to manually backport by running backport --upstream elastic/elasticsearch --pr 120270

jan-elastic added a commit that referenced this pull request Jan 22, 2025
* Fix model downloading for very small models.

* Test MlModelServer

* Tiny ELSER

* unmute TextEmbeddingCrudIT and DefaultEndPointsIT

* update ELSER

* Improve MlModelServer

* tiny E5

* more logging

* improved E5 model

* tiny reranker

* scan for ports

* [CI] Auto commit changes from spotless

* Serve default models when optimized model is requested

* @ClassRule

* polish code

* Respect dynamic setting ML model repo

* fix metadata for optimized models

* improve logging

---------

Co-authored-by: elasticsearchmachine <[email protected]>
elasticsearchmachine pushed a commit that referenced this pull request Jan 22, 2025
* Test ML model server (#120270)

* Fix model downloading for very small models.

* Test MlModelServer

* Tiny ELSER

* unmute TextEmbeddingCrudIT and DefaultEndPointsIT

* update ELSER

* Improve MlModelServer

* tiny E5

* more logging

* improved E5 model

* tiny reranker

* scan for ports

* [CI] Auto commit changes from spotless

* Serve default models when optimized model is requested

* @ClassRule

* polish code

* Respect dynamic setting ML model repo

* fix metadata for optimized models

* improve logging

---------

Co-authored-by: elasticsearchmachine <[email protected]>

* backport HttpHeaderParser

---------

Co-authored-by: elasticsearchmachine <[email protected]>
elasticsearchmachine pushed a commit that referenced this pull request Jan 22, 2025
* Test ML model server (#120270)

* Fix model downloading for very small models.

* Test MlModelServer

* Tiny ELSER

* unmute TextEmbeddingCrudIT and DefaultEndPointsIT

* update ELSER

* Improve MlModelServer

* tiny E5

* more logging

* improved E5 model

* tiny reranker

* scan for ports

* [CI] Auto commit changes from spotless

* Serve default models when optimized model is requested

* @ClassRule

* polish code

* Respect dynamic setting ML model repo

* fix metadata for optimized models

* improve logging

---------

Co-authored-by: elasticsearchmachine <[email protected]>

* backport HttpHeaderParser

* Fix stripping platform

---------

Co-authored-by: elasticsearchmachine <[email protected]>
Labels

auto-backport Automatically create backport pull requests when merged backport pending :ml Machine learning Team:ML Meta label for the ML team >test Issues or PRs that are addressing/adding tests v8.16.3 v8.17.2 v8.18.0 v9.0.0

Development

Successfully merging this pull request may close these issues.

[CI] TextEmbeddingCrudIT testPutE5Small_withPlatformSpecificVariant failing

5 participants