
Conversation

DamonFool
Contributor

Thanks to @fairydreaming, we can run T5 models with llama.cpp.
However, the current implementation seems to support only models with equal numbers of encoder and decoder layers.
We've seen T5 models with different encoder and decoder layer counts that fail to run with llama.cpp.

Since HF transformers runs T5 models with different encoder and decoder layer counts just fine, it would be good to support them in llama.cpp as well.

Thanks.
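For context, the asymmetric configuration can be sketched as follows. This is a hedged illustration: the dataclass below is a hypothetical stand-in, not HF code, but it mirrors T5Config's documented behavior where num_decoder_layers falls back to num_layers when unset.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class T5LikeConfig:
    """Toy stand-in for HF T5Config's layer-count fields (names illustrative)."""
    num_layers: int = 6                        # encoder layer count
    num_decoder_layers: Optional[int] = None   # decoder layer count; optional

    def __post_init__(self) -> None:
        # Mirror HF's documented default: use num_layers if not set.
        if self.num_decoder_layers is None:
            self.num_decoder_layers = self.num_layers

symmetric = T5LikeConfig(num_layers=4)
asymmetric = T5LikeConfig(num_layers=4, num_decoder_layers=2)
print(symmetric.num_decoder_layers, asymmetric.num_decoder_layers)  # 4 2
```

The asymmetric case (e.g. 4 encoder layers, 2 decoder layers) is exactly what the current llama.cpp T5 path cannot load.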

@github-actions github-actions bot added the python label (python script changes) Sep 10, 2025
@DamonFool
Contributor Author

This PR is an extension of #8141, adding support for T5 and FLAN-T5 model families with different numbers of encoder and decoder layers.

@CISC
Collaborator

CISC commented Sep 10, 2025

Sooo, I started reviewing this, but when checking HF for encoder-only T5 models it turns out that all of them (the ones I could find, at least) have num_decoder_layers set to the same value as num_layers, and even is_encoder_decoder set to true; however, they do not have any decoder layers!

I think the proper solution here is to just make the decoder layers not required and then simply check that they are set before doing the decoder part.
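That idea can be sketched in a few lines (a toy sketch with hypothetical names like `dec_n_layer`, not the actual llama.cpp code): treat the decoder layer count as optional metadata and skip the decoder path entirely when it is absent.

```python
def run_t5(hparams: dict) -> list:
    """Toy pipeline: always run the encoder; run the decoder only if present."""
    n_layer = hparams["n_layer"]
    dec_n_layer = hparams.get("dec_n_layer")  # optional metadata; may be absent

    out = [f"enc{i}" for i in range(n_layer)]           # encoder stack stand-in
    if dec_n_layer:                                     # skip decoder when unset
        out += [f"dec{i}" for i in range(dec_n_layer)]  # decoder stack stand-in
    return out

print(run_t5({"n_layer": 2}))                    # ['enc0', 'enc1']
print(run_t5({"n_layer": 2, "dec_n_layer": 1}))  # ['enc0', 'enc1', 'dec0']
```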

@DamonFool
Contributor Author

Sooo, I started reviewing this, but when checking HF for encoder-only T5 models it turns out that all of them (the ones I could find, at least) have num_decoder_layers set to the same value as num_layers, and even is_encoder_decoder set to true; however, they do not have any decoder layers!

I think the proper solution here is to just make the decoder layers not required and then simply check that they are set before doing the decoder part.

Thanks @CISC for your review.

We have LLM_ARCH_T5ENCODER for encoder-only T5 models and LLM_ARCH_T5 for encoder-decoder T5 models.
This PR aims at extending LLM_ARCH_T5 to encoder-decoder T5 models with different numbers of encoder and decoder layers.
It shouldn't change the runtime behavior of encoder-only T5 models (except that num_decoder_layers may now be written to the GGUF file).
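The conversion side of that backward-compatibility claim can be sketched as follows (a hedged sketch; the GGUF key names here are illustrative, patterned after the PR's description rather than the exact convert_hf_to_gguf.py code): emit the decoder layer count, falling back to the encoder count so existing symmetric models convert identically.

```python
def t5_gguf_metadata(hf_config: dict) -> dict:
    """Map HF-style layer counts to GGUF-style keys (key names illustrative)."""
    n_layer = hf_config["num_layers"]
    # HF defaults num_decoder_layers to num_layers when it is absent,
    # so symmetric models are unaffected by writing the extra key.
    dec_n_layer = hf_config.get("num_decoder_layers", n_layer)
    return {"t5.block_count": n_layer, "t5.decoder_block_count": dec_n_layer}

print(t5_gguf_metadata({"num_layers": 4}))                          # symmetric
print(t5_gguf_metadata({"num_layers": 4, "num_decoder_layers": 2})) # asymmetric
```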

The issue you mentioned above concerns LLM_ARCH_T5ENCODER, i.e. encoder-only T5 models.
I would suggest fixing it in a separate PR if it really matters.

What do you think?
Thanks.

@CISC
Collaborator

CISC commented Sep 10, 2025

The issue you mentioned above concerns LLM_ARCH_T5ENCODER, i.e. encoder-only T5 models. I would suggest fixing it in a separate PR if it really matters.

I see, I'll continue the review then. Do you have links to some example models?

@DamonFool
Contributor Author

I see, I'll continue the review then. Do you have links to some example models?

Sorry, I couldn't find an open-source T5 model with different numbers of encoder and decoder layers.

DamonFool and others added 22 commits September 10, 2025 18:46
Co-authored-by: Sigbjørn Skjæret <[email protected]>
@CISC
Collaborator

CISC commented Sep 10, 2025

🤣 Poor CI (it's faster/easier to add changes to a batch, then commit).

@DamonFool
Contributor Author

Thanks @CISC for your excellent review!
All your improvements have been applied.

And it seems much better now.

I will run some checks and report back here.

@CISC
Collaborator

CISC commented Sep 10, 2025

I will run some checks and report back here.

Thank you.

@DamonFool
Contributor Author

I will run some checks and report back here.

I tested t5-small, flan-t5-small, and a local T5 model (4 encoder layers, 2 decoder layers).
All passed.

@CISC CISC merged commit 4f65885 into ggml-org:master Sep 10, 2025
52 checks passed
njsyw1997 pushed a commit to aizip/llama.cpp that referenced this pull request Sep 10, 2025
…rs (ggml-org#15909)

* Extend the support of T5 models with different encoder-decoder layers

Signed-off-by: Jie Fu <[email protected]>

* Update convert_hf_to_gguf.py

Co-authored-by: Sigbjørn Skjæret <[email protected]>

* Update gguf-py/gguf/constants.py

Co-authored-by: Sigbjørn Skjæret <[email protected]>

* Update gguf-py/gguf/gguf_writer.py

Co-authored-by: Sigbjørn Skjæret <[email protected]>

* Update src/llama-arch.cpp

Co-authored-by: Sigbjørn Skjæret <[email protected]>

* Update src/llama-arch.h

Co-authored-by: Sigbjørn Skjæret <[email protected]>

* Update src/llama-model.cpp

Co-authored-by: Sigbjørn Skjæret <[email protected]>

* Update src/llama-hparams.h

Co-authored-by: Sigbjørn Skjæret <[email protected]>

* Rename n_dec_layer --> dec_n_layer

Signed-off-by: Jie Fu <[email protected]>

* Adapt to cases when dec_n_layer > n_layer

Signed-off-by: Jie Fu <[email protected]>

---------

Signed-off-by: Jie Fu <[email protected]>
Co-authored-by: Sigbjørn Skjæret <[email protected]>
@DamonFool
Contributor Author

Thanks @CISC and @ggerganov for your help.

@DamonFool DamonFool deleted the t5-enhancement branch September 11, 2025 00:45