Commit b58b56e
fix GemmaBackbone.get_layout_map + test (#1669)
* fix to default GemmaBackbone.get_layout_map() to use fixed regexes as per keras-team/keras#19496 (comment)
* Also fixing forgotten ffw_gating_2 in GemmaBackbone.get_layout_map. The sharding spec ("batch", "model") is the one that provides the best training performance. ("model", "batch") and (None, None) are slower (the first one by 40%, the second by 2%).
Fixing test too, including typo ffw_linearl => ffw_linear
* changed test_architecture_characteristics test to follow the 4->8 heads change necessary for the test to work on TPUs.
Also fixed formatting.
* Update gemma_backbone_test.py
Better test messages
---------
Co-authored-by: Matt Watson <[email protected]>
1 parent e0efbc8 commit b58b56e
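The commit message describes a layout map that is keyed by regexes over variable paths, and a missing ffw_gating_2 entry. The sketch below illustrates that failure mode in plain Python. It does not call Keras: the `lookup` and `unmatched` helpers, the variable paths, the patterns, and the spec given for ffw_linear are all hypothetical stand-ins chosen for illustration, not the regexes from this commit.

```python
import re

# Hypothetical variable paths, for illustration only.
VARIABLE_PATHS = [
    "decoder_block_0/ffw_gating/kernel",
    "decoder_block_0/ffw_gating_2/kernel",
    "decoder_block_0/ffw_linear/kernel",
]


def lookup(layout_map, path):
    """Return the spec of the first pattern found in `path`, else None."""
    for pattern, spec in layout_map.items():
        if re.search(pattern, path):
            return spec
    return None


def unmatched(layout_map, paths):
    """List the paths that no pattern in the map covers."""
    return [p for p in paths if lookup(layout_map, p) is None]


# A map with no ffw_gating_2 entry: that kernel gets no sharding spec.
incomplete = {
    r"decoder_block.*ffw_gating/kernel": ("batch", "model"),
    r"decoder_block.*ffw_linear/kernel": ("model", "batch"),  # assumed spec
}

# Adding the forgotten entry covers every path.
complete = dict(incomplete)
complete[r"decoder_block.*ffw_gating_2/kernel"] = ("batch", "model")

# A misspelled pattern (ffw_linearl) silently matches nothing.
typo = {r"decoder_block.*ffw_linearl/kernel": ("model", "batch")}

if __name__ == "__main__":
    print("missing in incomplete map:", unmatched(incomplete, VARIABLE_PATHS))
    print("missing in complete map:", unmatched(complete, VARIABLE_PATHS))
    print("typo lookup:", lookup(typo, "decoder_block_0/ffw_linear/kernel"))
```

A check like `unmatched` is one way a test could catch both a forgotten entry and a misspelled pattern, since neither raises an error on its own.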
File tree
2 files changed: +42 -8 lines changed
- keras_nlp/src/models/gemma
Diff (line contents not captured)
First changed file: lines 258 and 263 changed; lines 268-269 replaced by lines 268-270.
Second changed file: lines 27-28, 85, and 135 changed; lines 139-171 added.