Commit 1ddc4fa

Update `d_kv`'s annotation in mt5's configuration (#27585)
* Update `d_kv`'s annotation in mt5's configuration
1 parent 8aca43b commit 1ddc4fa

1 file changed: +2 -2 lines changed


src/transformers/models/mt5/configuration_mt5.py

Lines changed: 2 additions & 2 deletions
@@ -40,8 +40,8 @@ class MT5Config(PretrainedConfig):
         d_model (`int`, *optional*, defaults to 512):
             Size of the encoder layers and the pooler layer.
         d_kv (`int`, *optional*, defaults to 64):
-            Size of the key, query, value projections per attention head. `d_kv` has to be equal to `d_model //
-            num_heads`.
+            Size of the key, query, value projections per attention head. In the conventional context, it is typically expected that `d_kv` has to be equal to `d_model // num_heads`.
+            But in the architecture of mt5-small, `d_kv` is not equal to `d_model //num_heads`. The `inner_dim` of the projection layer will be defined as `num_heads * d_kv`.
         d_ff (`int`, *optional*, defaults to 1024):
             Size of the intermediate feed forward layer in each `T5Block`.
         num_layers (`int`, *optional*, defaults to 8):
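
The relationship stated in the updated docstring can be checked with a little arithmetic. The sketch below is illustrative only and does not use the `transformers` library: `AttentionDims` is a hypothetical helper, and the example values (`d_model=512`, `num_heads=6`, `d_kv=64` for an mt5-small-like setup, `num_heads=8` for a conventional one) are assumptions chosen to match the docstring defaults and the situation the commit describes. The projection shapes shown assume plain linear maps from `d_model` to `inner_dim` and back.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AttentionDims:
    """Hypothetical helper (not part of transformers) holding attention sizes."""

    d_model: int    # width of the hidden states
    num_heads: int  # number of attention heads
    d_kv: int       # per-head size of the key/query/value projections

    @property
    def inner_dim(self) -> int:
        # As the docstring states: inner_dim = num_heads * d_kv.
        return self.num_heads * self.d_kv

    @property
    def follows_convention(self) -> bool:
        # The "conventional" expectation: d_kv == d_model // num_heads.
        return self.d_kv == self.d_model // self.num_heads

    def projection_shapes(self) -> dict:
        # Assumed layout: q/k/v map d_model -> inner_dim, o maps back.
        return {
            "q": (self.d_model, self.inner_dim),
            "k": (self.d_model, self.inner_dim),
            "v": (self.d_model, self.inner_dim),
            "o": (self.inner_dim, self.d_model),
        }


# Assumed example values; see the lead-in above.
mt5_small_like = AttentionDims(d_model=512, num_heads=6, d_kv=64)
conventional = AttentionDims(d_model=512, num_heads=8, d_kv=64)

for name, dims in (("mt5-small-like", mt5_small_like), ("conventional", conventional)):
    print(
        f"{name}: inner_dim={dims.inner_dim}, "
        f"d_model // num_heads={dims.d_model // dims.num_heads}, "
        f"follows_convention={dims.follows_convention}"
    )
```

With these assumed values, the mt5-small-like case gives `inner_dim = 6 * 64 = 384`, which differs from `d_model = 512`, so the query, key, and value projections are not square. In the conventional case `inner_dim` equals `d_model`.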

0 commit comments
