This repository was archived by the owner on Jan 21, 2025. It is now read-only.

Commit ecdee99

Conchylicultor authored and Mesh TensorFlow Team committed
[tfds] Remove config name text datasets
Because the TFDS text encoding API has many bugs and performance issues and is unsupported, we are cleaning up our datasets to remove all encoder builder configs and keep only the plain text version of each dataset. To use an encoder, please use `tensorflow_text`, which is a more flexible, better supported, and more performant text encoding API. For forward compatibility, replace `tfds.load('ds/plain_text')` with `tfds.load('ds')`, which loads the default plain text dataset.

PiperOrigin-RevId: 326198682
1 parent 5a9d503 commit ecdee99

1 file changed: +1 -1 lines changed

mesh_tensorflow/transformer/gin/problems/lm1b_untok.gin

@@ -5,7 +5,7 @@
 import mesh_tensorflow.transformer.t2t_vocabulary
 
 # Dataset
-dataset_name = "lm1b/plain_text"
+dataset_name = "lm1b"
 utils.run.train_dataset_fn = @mesh_tensorflow.transformer.dataset.untokenized_tfds_dataset
 dataset.untokenized_tfds_dataset.dataset_name = %dataset_name
 dataset.untokenized_tfds_dataset.text2self = True
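
The same rename may be needed wherever a dataset name is built in code rather than in a gin file. The following is a minimal sketch of that migration, assuming names take the simple `name/config` form seen in this diff. The helper `strip_plain_text_config` and the constant `PLAIN_TEXT_SUFFIX` are hypothetical and are not part of TFDS or mesh_tensorflow.

```python
# Hypothetical migration helper; not part of TFDS or mesh_tensorflow.
# Assumes dataset names look like "name" or "name/config", as in the diff above.
PLAIN_TEXT_SUFFIX = "/plain_text"


def strip_plain_text_config(dataset_name: str) -> str:
    """Return dataset_name without a trailing '/plain_text' config.

    Names that do not end in '/plain_text' are returned unchanged.
    """
    if dataset_name.endswith(PLAIN_TEXT_SUFFIX):
        return dataset_name[: -len(PLAIN_TEXT_SUFFIX)]
    return dataset_name


if __name__ == "__main__":
    # Mirrors the change in lm1b_untok.gin.
    print(strip_plain_text_config("lm1b/plain_text"))
```

The result could then be passed to `tfds.load` or assigned to `dataset_name`. Names that carry a version suffix or any other config are left untouched by this sketch.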
