Commit 51dc82e

Update generative-proof-of-concept-CPU-preprocessing-in-memory.py
Fine details with tf.data.Dataset.
1 parent 408b915 commit 51dc82e

1 file changed: +5 -2 lines changed

generative-proof-of-concept-CPU-preprocessing-in-memory.py

Lines changed: 5 additions & 2 deletions
@@ -1372,11 +1372,14 @@ def create_dataset(raw_text_samples, tokenizer, sample_expansion_batch_size=10)
             tf.TensorSpec(shape=(1, VOCABULARY_SIZE), dtype=tf.float32) # Nested one-hot label
         )
     )
+    # Set dataset to allow multiple epochs:
+    dataset = dataset.repeat()
+    # Batch it
+    dataset = dataset.batch(batch_size)
     return dataset
 
 phase_i_b_dataset = create_dataset(raw_text_samples=phase_i_b_samples, tokenizer=tokenizer, sample_expansion_batch_size=10)
-dataset = dataset.repeat()
-dataset = dataset.batch(batch_size)
+
 
 
 phase_i_b_history =\
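
The change above applies repeat() before batch() inside create_dataset. The sketch below is a plain-Python analogy of that ordering, not tf.data itself: the helpers `repeat` and `batch`, the sample list, and the values of `batch_size` and `steps_per_epoch` are all hypothetical stand-ins chosen for illustration. It shows two consequences that are easy to miss: the repeated stream never ends, so the consumer has to bound it, and batches can straddle the boundary between passes over the data.

```python
from itertools import cycle, islice


def repeat(samples):
    """Yield the samples forever, like an argument-free dataset.repeat()."""
    return cycle(samples)


def batch(stream, batch_size):
    """Group consecutive items into lists of batch_size, like dataset.batch()."""
    iterator = iter(stream)
    while True:
        chunk = list(islice(iterator, batch_size))
        if not chunk:
            return
        yield chunk


samples = [0, 1, 2, 3, 4]  # hypothetical stand-in for the expanded samples
batch_size = 2             # hypothetical value

# Same order as in the commit: repeat first, then batch.
repeat_then_batch = batch(repeat(samples), batch_size)

# The stream is endless, so the consumer must stop after a fixed step count.
steps_per_epoch = 3
first_epoch = list(islice(repeat_then_batch, steps_per_epoch))
print(first_epoch)  # the last batch mixes the end of pass 1 with the start of pass 2

# Batching a single pass instead leaves a short final batch.
single_pass = list(batch(samples, batch_size))
print(single_pass)
```

With Keras, an endlessly repeated dataset generally needs steps_per_epoch passed to model.fit, since the end of an epoch can no longer be inferred from the data. Also note that batch_size does not appear among the create_dataset parameters shown in the hunk header, so the new lines presumably rely on a module-level variable of that name.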
