Replies: 1 comment
-
Ahh, it seems to work on GPU, so it could be that…
-
Hi there,
I have a custom model that uses a TransformerDecoder module and a boolean causal mask, as follows:
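Roughly, the setup looks like the minimal sketch below (the class name, hyperparameters, and variable names are illustrative, not the original snippet), assuming a standard nn.TransformerDecoder and a boolean causal mask built with torch.triu:

```python
import torch
import torch.nn as nn

class MyDecoderModel(nn.Module):
    # Illustrative decoder model with a boolean causal mask (True = blocked position).
    def __init__(self, vocab_size=10000, d_model=512, nhead=8, num_layers=6):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        layer = nn.TransformerDecoderLayer(d_model, nhead, batch_first=True)
        self.decoder = nn.TransformerDecoder(layer, num_layers)
        self.out = nn.Linear(d_model, vocab_size)

    def forward(self, tgt, memory):
        seq_len = tgt.size(1)
        # Boolean causal mask: True marks future positions that must not be attended to.
        causal_mask = torch.triu(
            torch.ones(seq_len, seq_len, dtype=torch.bool, device=tgt.device),
            diagonal=1,
        )
        h = self.decoder(self.embed(tgt), memory, tgt_mask=causal_mask)
        return self.out(h)
```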
This works fine on both CPU and GPU, however, I'm trying to use mixed precision for training, as follows:
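A training step along these lines (again a sketch; the optimizer, loss, and dataloader are placeholders), assuming torch.autocast with dtype=torch.bfloat16 on CUDA:

```python
import torch

model = MyDecoderModel().cuda()
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
criterion = torch.nn.CrossEntropyLoss()

for tgt, memory, labels in dataloader:  # hypothetical dataloader
    optimizer.zero_grad()
    # Run the forward pass under bfloat16 autocast; bf16 does not need a GradScaler.
    with torch.autocast(device_type="cuda", dtype=torch.bfloat16):
        logits = model(tgt.cuda(), memory.cuda())
        loss = criterion(logits.view(-1, logits.size(-1)), labels.cuda().view(-1))
    loss.backward()
    optimizer.step()
```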
and I'm running into the following error:
RuntimeError: Expected attn_mask dtype to be bool or to match query dtype, but got attn_mask.dtype: float and query.dtype: c10::BFloat16 instead.
As can be seen in the code above, I've tried casting the mask to bool everywhere, as I think that's compatible with bfloat16 (I may be mistaken). I've also tried using:

but I'm getting the same error.
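For reference, one commonly suggested way to guard against this kind of mismatch (a sketch only, not the code that was actually tried here) is to normalize every mask passed to the decoder to bool, or to cast a float mask to the autocast dtype, right before the call:

```python
import torch

def to_bool_mask(mask: torch.Tensor) -> torch.Tensor:
    # Hypothetical helper: return a boolean mask where True marks blocked positions.
    if mask.dtype == torch.bool:
        return mask
    # Standard additive float masks use -inf for blocked positions and 0.0 elsewhere.
    return mask == float("-inf")

# e.g. inside the model's forward, before calling the decoder:
# tgt_mask = to_bool_mask(tgt_mask)
# or, keeping a float mask, match the bf16 query dtype under autocast:
# tgt_mask = tgt_mask.to(torch.bfloat16)
```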
Any ideas appreciated, thanks!