
Guard embedder inputs against maxPositionEmbeddings#163

Open
alankessler wants to merge 1 commit into ml-explore:main from alankessler:fix-embedder-truncation

Conversation

@alankessler
Contributor

Fixes #62.

BERT models crash when the input exceeds maxPositionEmbeddings because the position embedding table is fixed-size. Per the discussion between @davidkoski and @anishbasu on that issue, this PR truncates the input with a warning rather than crashing.

Also exposes maxPositionEmbeddings on the EmbeddingModel protocol (defaulting to nil, so the change is non-breaking) so callers who want to chunk or split long inputs can check the limit themselves.

Qwen3 and Gemma3 use RoPE, which has no fixed position table, so no change is needed for them.
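For reference, here is a minimal sketch of the idea, not the actual MLXEmbedders code: the protocol names and the `clamped` helper are assumptions for illustration, but the shape (an optional `maxPositionEmbeddings` with a nil default, plus a truncate-with-warning guard) matches what the PR describes.

```swift
import Foundation

// Hypothetical sketch of the PR's approach; names below are illustrative,
// not the real MLXEmbedders API.
protocol EmbeddingModel {
    /// Largest sequence length the position embedding table supports,
    /// or nil when the model has no fixed limit (e.g. RoPE-based models).
    var maxPositionEmbeddings: Int? { get }
}

extension EmbeddingModel {
    // Default of nil keeps existing conformers compiling unchanged.
    var maxPositionEmbeddings: Int? { nil }

    /// Truncate `tokens` to the model's limit, warning instead of crashing.
    func clamped(_ tokens: [Int]) -> [Int] {
        guard let limit = maxPositionEmbeddings, tokens.count > limit else {
            return tokens
        }
        print("warning: input length \(tokens.count) exceeds "
            + "maxPositionEmbeddings (\(limit)); truncating")
        return Array(tokens.prefix(limit))
    }
}

// A BERT-style model with a fixed 512-position table.
struct BertLikeModel: EmbeddingModel {
    let maxPositionEmbeddings: Int? = 512
}

// A RoPE-style model: the nil default applies, so input passes through.
struct RoPEModel: EmbeddingModel {}

let long = Array(0..<600)
print(BertLikeModel().clamped(long).count)  // truncated to 512
print(RoPEModel().clamped(long).count)      // unchanged, 600
```

Callers who would rather chunk than truncate can check `maxPositionEmbeddings` themselves before embedding, which is the point of surfacing it on the protocol.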



Development

Successfully merging this pull request may close these issues.

[BUG] MLXEmbedders need to respect max token counts