Commit 6cabb37

Merge pull request #2156 from mohantym:new_api_load_data
PiperOrigin-RevId: 491664565
2 parents: e4445ee + 1b57c12

File tree

1 file changed (+5, -6 lines)


site/en/tutorials/load_data/csv.ipynb

Lines changed: 5 additions & 6 deletions
@@ -1066,7 +1066,7 @@
 "source": [
 "There is some overhead to parsing the CSV data. For small models this can be the bottleneck in training.\n",
 "\n",
-"Depending on your use case, it may be a good idea to use `Dataset.cache` or `tf.data.experimental.snapshot`, so that the CSV data is only parsed on the first epoch.\n",
+"Depending on your use case, it may be a good idea to use `Dataset.cache` or `tf.data.Dataset.snapshot`, so that the CSV data is only parsed on the first epoch.\n",
 "\n",
 "The main difference between the `cache` and `snapshot` methods is that `cache` files can only be used by the TensorFlow process that created them, but `snapshot` files can be read by other processes.\n",
 "\n",
@@ -1120,7 +1120,7 @@
 "id": "wN7uUBjmgNZ9"
 },
 "source": [
-"Note: The `tf.data.experimental.snapshot` files are meant for *temporary* storage of a dataset while in use. This is *not* a format for long term storage. The file format is considered an internal detail, and not guaranteed between TensorFlow versions."
+"Note: The `tf.data.Dataset.snapshot` files are meant for *temporary* storage of a dataset while in use. This is *not* a format for long term storage. The file format is considered an internal detail, and not guaranteed between TensorFlow versions."
 ]
 },
 {
@@ -1132,8 +1132,7 @@
 "outputs": [],
 "source": [
 "%%time\n",
-"snapshot = tf.data.experimental.snapshot('titanic.tfsnap')\n",
-"snapshotting = traffic_volume_csv_gz_ds.apply(snapshot).shuffle(1000)\n",
+"snapshotting = traffic_volume_csv_gz_ds.snapshot('titanic.tfsnap').shuffle(1000)\n",
 "\n",
 "for i, (batch, label) in enumerate(snapshotting.shuffle(1000).repeat(20)):\n",
 "  if i % 40 == 0:\n",
@@ -1147,7 +1146,7 @@
 "id": "fUSSegnMCGRz"
 },
 "source": [
-"If your data loading is slowed by loading CSV files, and `Dataset.cache` and `tf.data.experimental.snapshot` are insufficient for your use case, consider re-encoding your data into a more streamlined format."
+"If your data loading is slowed by loading CSV files, and `Dataset.cache` and `tf.data.Dataset.snapshot` are insufficient for your use case, consider re-encoding your data into a more streamlined format."
 ]
 },
 {
@@ -1862,7 +1861,7 @@
 "source": [
 "For another example of increasing CSV performance by using large batches, refer to the [Overfit and underfit tutorial](../keras/overfit_and_underfit.ipynb).\n",
 "\n",
-"This sort of approach may work, but consider other options like `Dataset.cache` and `tf.data.experimental.snapshot`, or re-encoding your data into a more streamlined format."
+"This sort of approach may work, but consider other options like `Dataset.cache` and `tf.data.Dataset.snapshot`, or re-encoding your data into a more streamlined format."
 ]
 }
 ],
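The change above migrates the tutorial from the old transformation-style API, dataset.apply(tf.data.experimental.snapshot(path)), to the tf.data.Dataset.snapshot method. A minimal sketch of the two options the tutorial text compares, assuming TensorFlow 2.6 or later (where Dataset.snapshot is no longer experimental) and using a toy Dataset.range pipeline plus a hypothetical 'example.tfsnap' path in place of the tutorial's parsed-CSV dataset:

import tensorflow as tf

# Stand-in for the tutorial's parsed-CSV pipeline.
ds = tf.data.Dataset.range(10)

# Dataset.cache: elements are stored during the first pass and reused on
# later epochs; cache files are only usable by the process that wrote them.
cached = ds.cache()

# Dataset.snapshot: writes elements to disk on the first pass; the files can
# be read by other TensorFlow processes, but the on-disk format is an
# internal detail and not guaranteed between TensorFlow versions.
snapped = ds.snapshot('example.tfsnap')  # hypothetical path, for illustration

for element in snapped.take(3):
    print(element.numpy())  # 0, 1, 2

Either transformation avoids re-parsing the CSV on every epoch; per the tutorial text, snapshot is the option to reach for when the materialized data must be readable by other processes.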
