Commit 0d4ef2f

Update phishing_email_detection_gpt2.py
Workaround to fix mismatch between gp2_tokenizer.tokenizer.vocab_size (which is misconfigured) and len(gp2_tokenizer.tokenizer)...
1 parent 1e6c409 commit 0d4ef2f

1 file changed

+1
-1
lines changed

phishing_email_detection_gpt2.py

Lines changed: 1 addition & 1 deletion
@@ -408,7 +408,7 @@ def from_config(cls, config):

 inp = tf.keras.layers.Input(shape=(), dtype=tf.string)
 gp2_tokenizer = NewTokenizerLayer(max_seq_length=max_seq_length,tokenizer_checkpoint=tokenizer_checkpoint)
-VOCABULARY_SIZE = gp2_tokenizer.tokenizer.vocab_size
+VOCABULARY_SIZE = len(gp2_tokenizer.tokenizer)
 tokens = gp2_tokenizer(inp)

 # On larger hardware, this could probably be increased considerably and
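
The commit message does not say why the two values disagree. One common cause with Hugging Face tokenizers is that `vocab_size` reports only the base vocabulary, while `len(tokenizer)` also counts tokens added afterwards (for example a padding token). The sketch below illustrates that effect with a hypothetical `ToyTokenizer` class written for this example; it is not the real library and not the code in `phishing_email_detection_gpt2.py`. It assumes the size is later used to bound token ids, for instance as the row count of a lookup table.

```python
class ToyTokenizer:
    """Hypothetical stand-in: a base vocabulary plus tokens added later."""

    def __init__(self, base_vocab):
        self._base = {tok: i for i, tok in enumerate(base_vocab)}
        self._added = {}

    @property
    def vocab_size(self):
        # Counts only the base vocabulary, ignoring added tokens.
        return len(self._base)

    def __len__(self):
        # Counts the base vocabulary and every added token.
        return len(self._base) + len(self._added)

    def add_token(self, tok):
        # New tokens receive the next free id after the current full vocabulary.
        if tok not in self._base and tok not in self._added:
            self._added[tok] = len(self)

    def encode(self, text):
        # Whitespace split; raises KeyError on unknown tokens.
        return [self._base[t] if t in self._base else self._added[t]
                for t in text.split()]


tok = ToyTokenizer(["hello", "world"])
tok.add_token("<pad>")
ids = tok.encode("hello world <pad>")

# Ids that would fall outside a table sized by each candidate value.
out_of_range_old = [i for i in ids if i >= tok.vocab_size]
out_of_range_new = [i for i in ids if i >= len(tok)]

print(tok.vocab_size, len(tok), out_of_range_old, out_of_range_new)
```

In this toy setup, sizing by `vocab_size` leaves the added token's id out of range, while sizing by `len(tok)` covers every id the tokenizer can emit, which is the property the one-line change relies on.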