Open
Description
This line here appears strange:
Line 1329 in e22a34b
x = (resid_lambdas[0] + x0_lambdas[0]) * x + bigram_lambdas[0] * x0_bigram
Does layer 0 really need two lambdas for the input x? Both scalars multiply the same x, so only their sum affects the output.
What about using just one lambda?
x = x0_lambdas[0] * x + bigram_lambdas[0] * x0_bigram
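To make the question concrete, here is a minimal sketch comparing the two forms on toy data. The names `resid_lambdas`, `x0_lambdas`, `bigram_lambdas`, `x`, and `x0_bigram` come from the line quoted above; the numeric values, the plain-list stand-ins for tensors, and the helpers `two_lambdas`, `one_lambda`, and `merged` are assumptions made up for illustration. It only checks the forward value, not how an optimizer would update one parameter versus two.

```python
# Hypothetical scalar values; in the real script these are learned parameters.
resid_lambdas = [1.1]
x0_lambdas = [0.3]
bigram_lambdas = [0.05]

# Plain lists standing in for the layer-0 input and the bigram embedding.
x = [0.5, -1.0, 2.0]
x0_bigram = [0.1, 0.2, -0.3]


def two_lambdas(x, x0_bigram):
    """Current form: two scalars are summed, then applied to the same x."""
    scale = resid_lambdas[0] + x0_lambdas[0]
    return [scale * xi + bigram_lambdas[0] * bi for xi, bi in zip(x, x0_bigram)]


def one_lambda(x, x0_bigram, merged):
    """Proposed form: a single scalar applied to x."""
    return [merged * xi + bigram_lambdas[0] * bi for xi, bi in zip(x, x0_bigram)]


# A single lambda equal to the sum reproduces the same forward output.
merged = resid_lambdas[0] + x0_lambdas[0]
out_two = two_lambdas(x, x0_bigram)
out_one = one_lambda(x, x0_bigram, merged)
print(out_two == out_one)
```

So for the forward pass the two forms are interchangeable whenever the single lambda equals the sum. Whether training behaves the same is a separate question: with two parameters, both receive the same gradient from this line, so the effective step on their sum may differ from the single-parameter case.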