
T5Gemma #1

Opened by @jncraton

Support the T5Gemma architecture. Here's the basic idea from the transformers T5Gemma documentation:

> T5Gemma (aka encoder-decoder Gemma) was proposed in a research paper by Google. It is a family of encoder-decoder large language models, developed by adapting pretrained decoder-only models into encoder-decoder. T5Gemma includes pretrained and instruction-tuned variants. The architecture is based on transformer encoder-decoder design following T5, with improvements from Gemma 2: GQA, RoPE, GeGLU activation, RMSNorm, and interleaved local/global attention.

This architecture modernizes and improves upon T5 by blending the stronger performance of modern Gemma models with the efficiency of the encoder-decoder design.
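As one concrete example of those Gemma 2 improvements, the GeGLU activation replaces T5's original ReLU feed-forward block with a gated GELU MLP. Here is a minimal PyTorch sketch of that block (the class name and dimensions are illustrative, not taken from any T5Gemma checkpoint):

```python
import torch
import torch.nn as nn

class GeGLUFeedForward(nn.Module):
    """Gated GELU feed-forward block in the style of Gemma models.

    Computes down_proj(GELU(x @ W_gate) * (x @ W_up)).
    hidden_size and intermediate_size are placeholder values.
    """

    def __init__(self, hidden_size: int = 512, intermediate_size: int = 2048):
        super().__init__()
        self.gate_proj = nn.Linear(hidden_size, intermediate_size, bias=False)
        self.up_proj = nn.Linear(hidden_size, intermediate_size, bias=False)
        self.down_proj = nn.Linear(intermediate_size, hidden_size, bias=False)
        # Gemma models use the tanh approximation of GELU
        self.act = nn.GELU(approximate="tanh")

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Gate branch is passed through GELU, then multiplied elementwise
        # with the ungated "up" branch before the down-projection.
        return self.down_proj(self.act(self.gate_proj(x)) * self.up_proj(x))
```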

For reference, here is the PR that merged model support for this architecture into transformers:

It might also be valuable to see how recent models were added to this package:

Once this is implemented, a user should be able to use the converter to convert a model and run inference. The following model should be openly available for testing purposes:

https://huggingface.co/harshaljanjani/tiny-t5gemma-test
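For a sanity check before any conversion work, the tiny test model should be loadable directly with a recent transformers release that includes T5Gemma. A sketch, assuming the checkpoint is registered with the Auto classes (the prompt and generation settings are placeholders):

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "harshaljanjani/tiny-t5gemma-test"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# Encoder-decoder generation: encode a prompt, decode new tokens.
inputs = tokenizer("Translate to French: Hello, world!", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Output from this tiny randomly-initialized test checkpoint will be meaningless; the point is only to confirm the architecture loads and runs, which gives a reference path to compare converted-model inference against.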
