Commit 5021d94

[Fix] Move attention mask to the model device type (#180)
The attention mask needs to be on the same device as the model and the other inputs; otherwise there will be a device mismatch error.
1 parent: 3725600

File tree

1 file changed (+1 −1 lines)


training/generate.py

Lines changed: 1 addition & 1 deletion
@@ -136,7 +136,7 @@ def _forward(self, model_inputs, **generate_kwargs):
 
         generated_sequence = self.model.generate(
             input_ids=input_ids.to(self.model.device),
-            attention_mask=attention_mask,
+            attention_mask=attention_mask.to(self.model.device),
             pad_token_id=self.tokenizer.pad_token_id,
             **generate_kwargs,
         )
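
For context, a minimal self-contained sketch of the failure mode and the fix. The model name and setup here are hypothetical (not the repository's actual pipeline): tokenizer outputs start on the CPU, so when the model lives on a GPU, both input_ids and attention_mask must be moved to model.device before calling generate(), or PyTorch raises a device-mismatch RuntimeError.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical setup for illustration only.
tokenizer = AutoTokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token  # gpt2 has no pad token by default
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.to("cuda" if torch.cuda.is_available() else "cpu")

# Tokenizer output tensors start on the CPU.
inputs = tokenizer("Hello, world", return_tensors="pt")

generated_sequence = model.generate(
    input_ids=inputs["input_ids"].to(model.device),
    # The fix from this commit: move the attention mask to the model's
    # device as well. Leaving it on the CPU while the model is on a GPU
    # raises a device-mismatch RuntimeError.
    attention_mask=inputs["attention_mask"].to(model.device),
    pad_token_id=tokenizer.pad_token_id,
)
print(tokenizer.decode(generated_sequence[0]))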
