gabe-l-hart
Collaborator

Support setting token_type_count from "type_vocab_size"

This matches the key used by common BERT-based embedding models, where its value may be something other than 1.
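As a rough illustration of the behavior described above, the lookup could be sketched as follows. This is not the code from the PR: the function name `resolve_token_type_count`, the constant `DEFAULT_TOKEN_TYPE_COUNT`, and the fallback of 1 when the key is absent are assumptions made for the example; only the `"type_vocab_size"` key and the `token_type_count` name come from the description.

```python
from typing import Any, Mapping

# Assumed fallback when the model config has no "type_vocab_size" entry.
DEFAULT_TOKEN_TYPE_COUNT = 1


def resolve_token_type_count(hparams: Mapping[str, Any]) -> int:
    """Return the token type count for a model config (hypothetical helper).

    Reads "type_vocab_size" if present, otherwise falls back to the
    assumed default of 1.
    """
    return int(hparams.get("type_vocab_size", DEFAULT_TOKEN_TYPE_COUNT))


if __name__ == "__main__":
    # A config that declares two token types.
    print(resolve_token_type_count({"type_vocab_size": 2}))
    # A config without the key uses the assumed default.
    print(resolve_token_type_count({}))
```

Falling back to a default keeps configs without the key working, while configs that declare a different value are no longer silently overridden.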

Branch: XLMRobertaTypeVocabSize

Signed-off-by: Gabe Goodhart <[email protected]>
@github-actions github-actions bot added the python python script changes label Nov 22, 2024
@ggerganov ggerganov merged commit 9336db4 into ggml-org:master Nov 24, 2024
9 checks passed
@gabe-l-hart gabe-l-hart deleted the XLMRobertaTypeVocabSize branch November 25, 2024 18:58
arthw pushed a commit to arthw/llama.cpp that referenced this pull request Dec 20, 2024