Commit 71f8631
committed
Clarifies mask/bias shapes; updates gradient docs
Corrects tensor shape examples to use 1-sized broadcastable dims for batch, heads, query, and key, improving clarity and avoiding invalid 0-length notation.
Removes “dbias” jargon from the gradient description for clearer wording.
Syncs English and Chinese documentation on shape semantics.1 parent 755e0a4 commit 71f8631
2 files changed
+5
-5
lines changed| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
18 | 18 | | |
19 | 19 | | |
20 | 20 | | |
21 | | - | |
| 21 | + | |
22 | 22 | | |
23 | 23 | | |
24 | 24 | | |
| |||
236 | 236 | | |
237 | 237 | | |
238 | 238 | | |
239 | | - | |
| 239 | + | |
240 | 240 | | |
241 | | - | |
| 241 | + | |
242 | 242 | | |
243 | 243 | | |
244 | 244 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
18 | 18 | | |
19 | 19 | | |
20 | 20 | | |
21 | | - | |
| 21 | + | |
22 | 22 | | |
23 | 23 | | |
24 | 24 | | |
| |||
236 | 236 | | |
237 | 237 | | |
238 | 238 | | |
239 | | - | |
| 239 | + | |
240 | 240 | | |
241 | 241 | | |
242 | 242 | | |
| |||
0 commit comments