Commit 88e3a3b

xgwang, xgw, and NathanHB authored
change tokenizer to pad to 'longest' sequence, instead of 'max_length' (#669)
Otherwise, the response length is always 1, which is unexpected.

Co-authored-by: xgw <[email protected]>
Co-authored-by: Nathan Habib <[email protected]>
1 parent: 5120c58 · commit: 88e3a3b

File tree

1 file changed: 1 addition, 1 deletion


src/lighteval/models/transformers/transformers_model.py

Lines changed: 1 addition & 1 deletion
@@ -578,7 +578,7 @@ def greedy_until(
         tokenized = self.tokenizer(
             context,
             truncation="longest_first",  # we truncate to the model max length if needed
-            padding="max_length",  # we pad to the longest sequence
+            padding="longest",  # we pad to the longest sequence
             return_tensors="pt",
             max_length=max_context_continuation_size_allowed,  # we always allow minimum one token of generation
             add_special_tokens=self.add_special_tokens,
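
Why this matters: max_length here is set to max_context_continuation_size_allowed, so padding="max_length" pads every prompt out to the full allowed window. The context then occupies all of it, leaving only the single guaranteed generation token. Below is a minimal sketch of the difference between the two padding modes, assuming a standard Hugging Face tokenizer; the gpt2 checkpoint, the prompts, and the window size of 32 are illustrative stand-ins, not taken from the commit.

# Sketch (not part of the commit): contrasts padding="max_length" with
# padding="longest" for a batched tokenizer call.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token  # gpt2 ships without a pad token

context = ["short prompt", "a somewhat longer prompt in the same batch"]
max_context_continuation_size_allowed = 32  # stand-in for the real value

# padding="max_length" pads every row out to max_length, so the context
# alone fills the whole allowed window.
to_max = tokenizer(
    context,
    truncation="longest_first",
    padding="max_length",
    max_length=max_context_continuation_size_allowed,
    return_tensors="pt",
)
print(to_max["input_ids"].shape)      # torch.Size([2, 32])

# padding="longest" pads only up to the longest row in the batch, leaving
# the rest of the window free for generated tokens.
to_longest = tokenizer(
    context,
    truncation="longest_first",
    padding="longest",
    max_length=max_context_continuation_size_allowed,
    return_tensors="pt",
)
print(to_longest["input_ids"].shape)  # e.g. torch.Size([2, 9])

With padding="longest", generation can extend the batch up to the model's context size instead of being squeezed down to the one guaranteed new token, which is the behavior the commit message describes.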
