Conversation

@kderusso
Member

Chunking settings previously capped the max chunk size at 300 for every model, which is too small for many models with large window sizes. This PR removes that global cap and instead enforces a cap for ELSER models, so that chunks sent to them are not truncated.
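
To illustrate the shape of the change, here is a minimal, self-contained sketch of a model-level cap checked at construction time. It is not the code from this PR: `ChunkSizeCapSketch`, `UncappedModel`, `CappedModel`, the stand-in `ChunkingSettings` class, and the constant `HYPOTHETICAL_MODEL_MAX_CHUNK_SIZE` are all hypothetical names, and the value 300 is only a placeholder borrowed from the old global limit.

```java
public class ChunkSizeCapSketch {

    // Assumption: placeholder value, not necessarily the limit used in this PR.
    static final int HYPOTHETICAL_MODEL_MAX_CHUNK_SIZE = 300;

    /** Minimal stand-in for chunking settings; maxChunkSize() is null when unset. */
    static final class ChunkingSettings {
        private final Integer maxChunkSize;

        ChunkingSettings(Integer maxChunkSize) {
            this.maxChunkSize = maxChunkSize;
        }

        Integer maxChunkSize() {
            return maxChunkSize;
        }
    }

    /** Stand-in for a model with no model-specific cap. */
    static class UncappedModel {
        final ChunkingSettings chunkingSettings;

        UncappedModel(ChunkingSettings chunkingSettings) {
            this.chunkingSettings = chunkingSettings;
        }
    }

    /** Stand-in for a small-window model that rejects oversized chunks at construction. */
    static class CappedModel extends UncappedModel {
        CappedModel(ChunkingSettings chunkingSettings) {
            super(chunkingSettings);
            // Only validate when the user actually supplied a max chunk size.
            if (chunkingSettings != null && chunkingSettings.maxChunkSize() != null) {
                int requested = chunkingSettings.maxChunkSize();
                if (requested > HYPOTHETICAL_MODEL_MAX_CHUNK_SIZE) {
                    throw new IllegalArgumentException(
                        "max_chunk_size [" + requested + "] exceeds the model limit ["
                            + HYPOTHETICAL_MODEL_MAX_CHUNK_SIZE + "]"
                    );
                }
            }
        }
    }

    public static void main(String[] args) {
        new UncappedModel(new ChunkingSettings(2000)); // no global cap any more
        new CappedModel(new ChunkingSettings(300));    // at the limit: accepted
        new CappedModel(null);                         // no settings: nothing to check
        try {
            new CappedModel(new ChunkingSettings(2000));
            System.out.println("unexpectedly accepted");
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Putting the check in the model constructor keeps the limit next to the model that needs it, so other services are free to accept larger chunk sizes.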

@elasticsearchmachine added the Team:ML (Meta label for the ML team) label Aug 28, 2025
@elasticsearchmachine
Collaborator

Hi @kderusso, I've created a changelog YAML for you.

@elasticsearchmachine
Collaborator

Pinging @elastic/ml-core (Team:ML)

@kderusso requested review from davidkyle and jimczi August 28, 2025 14:09
        ChunkingSettings chunkingSettings
    ) {
        super(inferenceEntityId, taskType, service, serviceSettings, taskSettings, chunkingSettings);
        if (chunkingSettings != null && chunkingSettings.maxChunkSize() != null) {
Member

Please add the same check to MultilingualE5SmallModel so multilingual-e5-small will also pick up the restriction.

Member Author

Added in f19f2ff

        return field;
    }

    public static Integer extractRequiredPositiveIntegerGreaterThanOrEqualToMin(
Contributor

Would it be possible to add some tests for this new method?

Member Author

Added in f104004
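
For readers without the diff at hand, the kind of coverage requested above might look roughly like the sketch below. It is not the code from f104004. The parameter list is an assumption, since the diff excerpt shows only the method name, and errors are collected in a plain `List<String>` as a hypothetical stand-in for whatever validation type the real helper uses.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class MinIntegerExtractionSketch {

    /**
     * Hypothetical stand-in for the helper named in the diff. Removes settingName from the map
     * and returns it when it is a positive integer >= minValue; otherwise records an error
     * message and returns null.
     */
    static Integer extractRequiredPositiveIntegerGreaterThanOrEqualToMin(
        Map<String, Object> map,
        String settingName,
        int minValue,
        List<String> errors
    ) {
        Object raw = map.remove(settingName);
        if (raw == null) {
            errors.add("[" + settingName + "] is required");
            return null;
        }
        if (!(raw instanceof Integer)) {
            errors.add("[" + settingName + "] must be an integer");
            return null;
        }
        int value = (Integer) raw;
        if (value <= 0) {
            errors.add("[" + settingName + "] must be a positive integer");
            return null;
        }
        if (value < minValue) {
            errors.add("[" + settingName + "] must be greater than or equal to [" + minValue + "]");
            return null;
        }
        return value;
    }

    /** Small check helper so the sketch runs without a test framework. */
    static void check(boolean condition, String message) {
        if (!condition) {
            throw new AssertionError(message);
        }
    }

    public static void main(String[] args) {
        List<String> errors = new ArrayList<>();

        // Value at the minimum is accepted and consumed from the map.
        Map<String, Object> atMin = new HashMap<>(Map.of("max_chunk_size", 10));
        check(Integer.valueOf(10).equals(
            extractRequiredPositiveIntegerGreaterThanOrEqualToMin(atMin, "max_chunk_size", 10, errors)),
            "value at the minimum should be returned");
        check(atMin.isEmpty() && errors.isEmpty(), "setting should be removed without errors");

        // Value below the minimum is rejected with a single error.
        Map<String, Object> belowMin = new HashMap<>(Map.of("max_chunk_size", 5));
        check(extractRequiredPositiveIntegerGreaterThanOrEqualToMin(belowMin, "max_chunk_size", 10, errors) == null,
            "value below the minimum should be rejected");
        check(errors.size() == 1, "exactly one error expected");

        // Missing setting is rejected as required.
        check(extractRequiredPositiveIntegerGreaterThanOrEqualToMin(new HashMap<>(), "max_chunk_size", 10, errors) == null,
            "missing setting should be rejected");
        check(errors.size() == 2, "a second error expected");

        System.out.println("all checks passed");
    }
}
```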

Member

@davidkyle left a comment

LGTM

@kderusso merged commit eb294f4 into elastic:main Aug 29, 2025
33 checks passed
JeremyDahlgren pushed a commit to JeremyDahlgren/elasticsearch that referenced this pull request Aug 29, 2025
Labels

>enhancement, :ml (Machine learning), Team:ML (Meta label for the ML team), v9.2.0

4 participants