Conversation

tdakhran
Contributor

@tdakhran tdakhran commented Aug 21, 2025

Batched minor improvements to the LFM2 family into a single PR:

  • Add LFM2-VL to the README
  • Increase the maximum number of image tokens for LFM2-VL to 1024 to improve OCR
  • Support untied embeddings for LFM2 dense, to unlock conversion of PTQ-quantized checkpoints with untied embeddings

@github-actions github-actions bot added the examples and python (python script changes) labels Aug 21, 2025
```cpp
output = create_tensor(tn(LLM_TENSOR_OUTPUT, "weight"), {n_embd, n_vocab}, TENSOR_NOT_REQUIRED);

if (output == NULL) {
    output = create_tensor(tn(LLM_TENSOR_TOKEN_EMBD, "weight"), {n_embd, n_vocab}, TENSOR_DUPLICATED);
}
```
Collaborator
You forgot to update llm_build_lfm2, it's still using model.tok_embd as output.

Contributor Author

Very nice catch, it probably got lost while preparing the upstream version; fixed in 9a8714b, thank you!

@CISC CISC merged commit e288693 into ggml-org:master Aug 22, 2025
39 of 50 checks passed
@tdakhran tdakhran deleted the tarek/lfm2_improvements branch August 22, 2025 07:42
qnixsynapse pushed a commit to menloresearch/llama.cpp that referenced this pull request Aug 25, 2025
* Support untied embeddings

* Increase number of image tokens to 1024

* Add LFM2-VL to readme

* Actually use untied embeddings

Labels

examples, python (python script changes)


2 participants