40 changes: 20 additions & 20 deletions docs/cli.ipynb
Original file line number Diff line number Diff line change
@@ -26,20 +26,20 @@
"id": "grQeV-PZroqn"
},
"source": [
"\u003ctable class=\"tfo-notebook-buttons\" align=\"left\"\u003e\n",
" \u003ctd\u003e\n",
" \u003ca target=\"_blank\" href=\"https://www.tensorflow.org/datasets/cli\"\u003e\u003cimg src=\"https://www.tensorflow.org/images/tf_logo_32px.png\" /\u003eView on TensorFlow.org\u003c/a\u003e\n",
" \u003c/td\u003e\n",
" \u003ctd\u003e\n",
" \u003ca target=\"_blank\" href=\"https://colab.research.google.com/github/tensorflow/datasets/blob/master/docs/cli.ipynb\"\u003e\u003cimg src=\"https://www.tensorflow.org/images/colab_logo_32px.png\" /\u003eRun in Google Colab\u003c/a\u003e\n",
" \u003c/td\u003e\n",
" \u003ctd\u003e\n",
" \u003ca target=\"_blank\" href=\"https://github.com/tensorflow/datasets/blob/master/docs/cli.ipynb\"\u003e\u003cimg src=\"https://www.tensorflow.org/images/GitHub-Mark-32px.png\" /\u003eView source on GitHub\u003c/a\u003e\n",
" \u003c/td\u003e\n",
" \u003ctd\u003e\n",
" \u003ca href=\"https://storage.googleapis.com/tensorflow_docs/datasets/docs/cli.ipynb\"\u003e\u003cimg src=\"https://www.tensorflow.org/images/download_logo_32px.png\" /\u003eDownload notebook\u003c/a\u003e\n",
" \u003c/td\u003e\n",
"\u003c/table\u003e"
"<table class=\"tfo-notebook-buttons\" align=\"left\">\n",
" <td>\n",
" <a target=\"_blank\" href=\"https://www.tensorflow.org/datasets/cli\"><img src=\"https://www.tensorflow.org/images/tf_logo_32px.png\" />View on TensorFlow.org</a>\n",
" </td>\n",
" <td>\n",
" <a target=\"_blank\" href=\"https://colab.research.google.com/github/tensorflow/datasets/blob/master/docs/cli.ipynb\"><img src=\"https://www.tensorflow.org/images/colab_logo_32px.png\" />Run in Google Colab</a>\n",
" </td>\n",
" <td>\n",
" <a target=\"_blank\" href=\"https://github.com/tensorflow/datasets/blob/master/docs/cli.ipynb\"><img src=\"https://www.tensorflow.org/images/GitHub-Mark-32px.png\" />View source on GitHub</a>\n",
" </td>\n",
" <td>\n",
" <a href=\"https://storage.googleapis.com/tensorflow_docs/datasets/docs/cli.ipynb\"><img src=\"https://www.tensorflow.org/images/download_logo_32px.png\" />Download notebook</a>\n",
" </td>\n",
"</table>"
]
},
{
@@ -115,7 +115,7 @@
"## `tfds new`: Implementing a new Dataset\n",
"\n",
"This command will help you kickstart writing your new Python dataset by creating\n",
"a `\u003cdataset_name\u003e/` directory containing default implementation files.\n",
"a `<dataset_name>/` directory containing default implementation files.\n",
"\n",
"Usage:"
]
@@ -185,20 +185,20 @@
"source": [
"## `tfds build`: Download and prepare a dataset\n",
"\n",
"Use `tfds build \u003cmy_dataset\u003e` to generate a new dataset. `\u003cmy_dataset\u003e` can be:\n",
"Use `tfds build <my_dataset>` to generate a new dataset. `<my_dataset>` can be:\n",
"\n",
"* A path to `dataset/` folder or `dataset.py` file (empty for current directory):\n",
" * `tfds build datasets/my_dataset/`\n",
" * `cd datasets/my_dataset/ \u0026\u0026 tfds build`\n",
" * `cd datasets/my_dataset/ \u0026\u0026 tfds build my_dataset`\n",
" * `cd datasets/my_dataset/ \u0026\u0026 tfds build my_dataset.py`\n",
" * `cd datasets/my_dataset/ && tfds build`\n",
" * `cd datasets/my_dataset/ && tfds build my_dataset`\n",
" * `cd datasets/my_dataset/ && tfds build my_dataset.py`\n",
"\n",
"* A registered dataset:\n",
"\n",
" * `tfds build mnist`\n",
" * `tfds build my_dataset --imports my_project.datasets`\n",
"\n",
"Note: `tfds build` has useful flags to help prototyping and debuging. See the `Debug \u0026 tests:` section bellow.\n",
"Note: `tfds build` has useful flags to help prototyping and debuging. See the `Debug & tests:` section bellow.\n",
"\n",
"Available options:"
]
28 changes: 14 additions & 14 deletions docs/dataset_collections.ipynb
@@ -46,20 +46,20 @@
"id": "LpO0um1nez_q"
},
"source": [
"\u003ctable class=\"tfo-notebook-buttons\" align=\"left\"\u003e\n",
" \u003ctd\u003e\n",
" \u003ca target=\"_blank\" href=\"https://www.tensorflow.org/datasets/dataset_collections\"\u003e\u003cimg src=\"https://www.tensorflow.org/images/tf_logo_32px.png\" /\u003eView on TensorFlow.org\u003c/a\u003e\n",
" \u003c/td\u003e\n",
" \u003ctd\u003e\n",
" \u003ca target=\"_blank\" href=\"https://colab.research.google.com/github/tensorflow/datasets/blob/master/docs/dataset_collections.ipynb\"\u003e\u003cimg src=\"https://www.tensorflow.org/images/colab_logo_32px.png\" /\u003eRun in Google Colab\u003c/a\u003e\n",
" \u003c/td\u003e\n",
" \u003ctd\u003e\n",
" \u003ca target=\"_blank\" href=\"https://github.com/tensorflow/datasets/blob/master/docs/dataset_collections.ipynb\"\u003e\u003cimg src=\"https://www.tensorflow.org/images/GitHub-Mark-32px.png\" /\u003eView on GitHub\u003c/a\u003e\n",
" \u003c/td\u003e\n",
" \u003ctd\u003e\n",
" \u003ca href=\"https://storage.googleapis.com/tensorflow_docs/datasets/docs/dataset_collections.ipynb\"\u003e\u003cimg src=\"https://www.tensorflow.org/images/download_logo_32px.png\" /\u003eDownload notebook\u003c/a\u003e\n",
" \u003c/td\u003e\n",
"\u003c/table\u003e"
"<table class=\"tfo-notebook-buttons\" align=\"left\">\n",
" <td>\n",
" <a target=\"_blank\" href=\"https://www.tensorflow.org/datasets/dataset_collections\"><img src=\"https://www.tensorflow.org/images/tf_logo_32px.png\" />View on TensorFlow.org</a>\n",
" </td>\n",
" <td>\n",
" <a target=\"_blank\" href=\"https://colab.research.google.com/github/tensorflow/datasets/blob/master/docs/dataset_collections.ipynb\"><img src=\"https://www.tensorflow.org/images/colab_logo_32px.png\" />Run in Google Colab</a>\n",
" </td>\n",
" <td>\n",
" <a target=\"_blank\" href=\"https://github.com/tensorflow/datasets/blob/master/docs/dataset_collections.ipynb\"><img src=\"https://www.tensorflow.org/images/GitHub-Mark-32px.png\" />View on GitHub</a>\n",
" </td>\n",
" <td>\n",
" <a href=\"https://storage.googleapis.com/tensorflow_docs/datasets/docs/dataset_collections.ipynb\"><img src=\"https://www.tensorflow.org/images/download_logo_32px.png\" />Download notebook</a>\n",
" </td>\n",
"</table>"
]
},
{
34 changes: 17 additions & 17 deletions docs/determinism.ipynb
@@ -46,20 +46,20 @@
"id": "gLgkbSCbTHGT"
},
"source": [
"\u003ctable class=\"tfo-notebook-buttons\" align=\"left\"\u003e\n",
" \u003ctd\u003e\n",
" \u003ca target=\"_blank\" href=\"https://www.tensorflow.org/datasets/determinism\"\u003e\u003cimg src=\"https://www.tensorflow.org/images/tf_logo_32px.png\" /\u003eView on TensorFlow.org\u003c/a\u003e\n",
" \u003c/td\u003e\n",
" \u003ctd\u003e\n",
" \u003ca target=\"_blank\" href=\"https://colab.research.google.com/github/tensorflow/datasets/blob/master/docs/determinism.ipynb\"\u003e\u003cimg src=\"https://www.tensorflow.org/images/colab_logo_32px.png\" /\u003eRun in Google Colab\u003c/a\u003e\n",
" \u003c/td\u003e\n",
" \u003ctd\u003e\n",
" \u003ca target=\"_blank\" href=\"https://github.com/tensorflow/datasets/blob/master/docs/determinism.ipynb\"\u003e\u003cimg src=\"https://www.tensorflow.org/images/GitHub-Mark-32px.png\" /\u003eView on GitHub\u003c/a\u003e\n",
" \u003c/td\u003e\n",
" \u003ctd\u003e\n",
" \u003ca href=\"https://storage.googleapis.com/tensorflow_docs/datasets/docs/determinism.ipynb\"\u003e\u003cimg src=\"https://www.tensorflow.org/images/download_logo_32px.png\" /\u003eDownload notebook\u003c/a\u003e\n",
" \u003c/td\u003e\n",
"\u003c/table\u003e"
"<table class=\"tfo-notebook-buttons\" align=\"left\">\n",
" <td>\n",
" <a target=\"_blank\" href=\"https://www.tensorflow.org/datasets/determinism\"><img src=\"https://www.tensorflow.org/images/tf_logo_32px.png\" />View on TensorFlow.org</a>\n",
" </td>\n",
" <td>\n",
" <a target=\"_blank\" href=\"https://colab.research.google.com/github/tensorflow/datasets/blob/master/docs/determinism.ipynb\"><img src=\"https://www.tensorflow.org/images/colab_logo_32px.png\" />Run in Google Colab</a>\n",
" </td>\n",
" <td>\n",
" <a target=\"_blank\" href=\"https://github.com/tensorflow/datasets/blob/master/docs/determinism.ipynb\"><img src=\"https://www.tensorflow.org/images/GitHub-Mark-32px.png\" />View on GitHub</a>\n",
" </td>\n",
" <td>\n",
" <a href=\"https://storage.googleapis.com/tensorflow_docs/datasets/docs/determinism.ipynb\"><img src=\"https://www.tensorflow.org/images/download_logo_32px.png\" />Download notebook</a>\n",
" </td>\n",
"</table>"
]
},
{
@@ -171,7 +171,7 @@
" take: int,\n",
" skip: int = None,\n",
" **as_dataset_kwargs,\n",
") -\u003e None:\n",
") -> None:\n",
" \"\"\"Print the example ids from the given dataset split.\"\"\"\n",
" ds = load_dataset(builder, **as_dataset_kwargs)\n",
" if skip:\n",
@@ -181,7 +181,7 @@
" exs = [id_to_int(tfds_id, builder=builder) for tfds_id in exs]\n",
" print(exs)\n",
"\n",
"def id_to_int(tfds_id: str, builder) -\u003e str:\n",
"def id_to_int(tfds_id: str, builder) -> str:\n",
" \"\"\"Format the tfds_id in a more human-readable.\"\"\"\n",
" match = re.match(r'\\w+-(\\w+).\\w+-(\\d+)-of-\\d+__(\\d+)', tfds_id)\n",
" split_name, shard_id, ex_id = match.groups()\n",
@@ -319,7 +319,7 @@
"id": "gAJTLLsuFeuP"
},
"source": [
"Note: Setting `shuffle_files=True` also [disable](https://github.com/tensorflow/datasets/tree/master/tensorflow_datasets/core/dataset_builder.py?l=676\u0026rcl=354322021) `deterministic` in [`tf.data.Options`](https://www.tensorflow.org/api_docs/python/tf/data/Options) to give some performance boost. So even small datasets which only have a single shard (like mnist), become non-deterministic.\n",
"Note: Setting `shuffle_files=True` also [disable](https://github.com/tensorflow/datasets/tree/master/tensorflow_datasets/core/dataset_builder.py?l=676&rcl=354322021) `deterministic` in [`tf.data.Options`](https://www.tensorflow.org/api_docs/python/tf/data/Options) to give some performance boost. So even small datasets which only have a single shard (like mnist), become non-deterministic.\n",
"\n",
"See recipe below to get deterministic file shuffling."
]
34 changes: 17 additions & 17 deletions docs/keras_example.ipynb
@@ -26,20 +26,20 @@
"id": "OGw9EgE0tC0C"
},
"source": [
"\u003ctable class=\"tfo-notebook-buttons\" align=\"left\"\u003e\n",
" \u003ctd\u003e\n",
" \u003ca target=\"_blank\" href=\"https://www.tensorflow.org/datasets/keras_example\"\u003e\u003cimg src=\"https://www.tensorflow.org/images/tf_logo_32px.png\" /\u003eView on TensorFlow.org\u003c/a\u003e\n",
" \u003c/td\u003e\n",
" \u003ctd\u003e\n",
" \u003ca target=\"_blank\" href=\"https://colab.research.google.com/github/tensorflow/datasets/blob/master/docs/keras_example.ipynb\"\u003e\u003cimg src=\"https://www.tensorflow.org/images/colab_logo_32px.png\" /\u003eRun in Google Colab\u003c/a\u003e\n",
" \u003c/td\u003e\n",
" \u003ctd\u003e\n",
" \u003ca target=\"_blank\" href=\"https://github.com/tensorflow/datasets/blob/master/docs/keras_example.ipynb\"\u003e\u003cimg src=\"https://www.tensorflow.org/images/GitHub-Mark-32px.png\" /\u003eView source on GitHub\u003c/a\u003e\n",
" \u003c/td\u003e\n",
" \u003ctd\u003e\n",
" \u003ca href=\"https://storage.googleapis.com/tensorflow_docs/datasets/docs/keras_example.ipynb\"\u003e\u003cimg src=\"https://www.tensorflow.org/images/download_logo_32px.png\" /\u003eDownload notebook\u003c/a\u003e\n",
" \u003c/td\u003e\n",
"\u003c/table\u003e"
"<table class=\"tfo-notebook-buttons\" align=\"left\">\n",
" <td>\n",
" <a target=\"_blank\" href=\"https://www.tensorflow.org/datasets/keras_example\"><img src=\"https://www.tensorflow.org/images/tf_logo_32px.png\" />View on TensorFlow.org</a>\n",
" </td>\n",
" <td>\n",
" <a target=\"_blank\" href=\"https://colab.research.google.com/github/tensorflow/datasets/blob/master/docs/keras_example.ipynb\"><img src=\"https://www.tensorflow.org/images/colab_logo_32px.png\" />Run in Google Colab</a>\n",
" </td>\n",
" <td>\n",
" <a target=\"_blank\" href=\"https://github.com/tensorflow/datasets/blob/master/docs/keras_example.ipynb\"><img src=\"https://www.tensorflow.org/images/GitHub-Mark-32px.png\" />View source on GitHub</a>\n",
" </td>\n",
" <td>\n",
" <a href=\"https://storage.googleapis.com/tensorflow_docs/datasets/docs/keras_example.ipynb\"><img src=\"https://www.tensorflow.org/images/download_logo_32px.png\" />Download notebook</a>\n",
" </td>\n",
"</table>"
]
},
{
@@ -109,9 +109,9 @@
"Apply the following transformations:\n",
"\n",
"* `tf.data.Dataset.map`: TFDS provide images of type `tf.uint8`, while the model expects `tf.float32`. Therefore, you need to normalize images.\n",
"* `tf.data.Dataset.cache` As you fit the dataset in memory, cache it before shuffling for a better performance.\u003cbr/\u003e\n",
"* `tf.data.Dataset.cache` As you fit the dataset in memory, cache it before shuffling for a better performance.<br/>\n",
"__Note:__ Random transformations should be applied after caching.\n",
"* `tf.data.Dataset.shuffle`: For true randomness, set the shuffle buffer to the full dataset size.\u003cbr/\u003e\n",
"* `tf.data.Dataset.shuffle`: For true randomness, set the shuffle buffer to the full dataset size.<br/>\n",
"__Note:__ For large datasets that can't fit in memory, use `buffer_size=1000` if your system allows it.\n",
"* `tf.data.Dataset.batch`: Batch elements of the dataset after shuffling to get unique batches at each epoch.\n",
"* `tf.data.Dataset.prefetch`: It is good practice to end the pipeline by prefetching [for performance](https://www.tensorflow.org/guide/data_performance#prefetching)."
@@ -126,7 +126,7 @@
"outputs": [],
"source": [
"def normalize_img(image, label):\n",
" \"\"\"Normalizes images: `uint8` -\u003e `float32`.\"\"\"\n",
" \"\"\"Normalizes images: `uint8` -> `float32`.\"\"\"\n",
" return tf.cast(image, tf.float32) / 255., label\n",
"\n",
"ds_train = ds_train.map(\n",
38 changes: 19 additions & 19 deletions docs/overview.ipynb
@@ -30,20 +30,20 @@
"id": "OGw9EgE0tC0C"
},
"source": [
"\u003ctable class=\"tfo-notebook-buttons\" align=\"left\"\u003e\n",
" \u003ctd\u003e\n",
" \u003ca target=\"_blank\" href=\"https://www.tensorflow.org/datasets/overview\"\u003e\u003cimg src=\"https://www.tensorflow.org/images/tf_logo_32px.png\" /\u003eView on TensorFlow.org\u003c/a\u003e\n",
" \u003c/td\u003e\n",
" \u003ctd\u003e\n",
" \u003ca target=\"_blank\" href=\"https://colab.research.google.com/github/tensorflow/datasets/blob/master/docs/overview.ipynb\"\u003e\u003cimg src=\"https://www.tensorflow.org/images/colab_logo_32px.png\" /\u003eRun in Google Colab\u003c/a\u003e\n",
" \u003c/td\u003e\n",
" \u003ctd\u003e\n",
" \u003ca target=\"_blank\" href=\"https://github.com/tensorflow/datasets/blob/master/docs/overview.ipynb\"\u003e\u003cimg src=\"https://www.tensorflow.org/images/GitHub-Mark-32px.png\" /\u003eView source on GitHub\u003c/a\u003e\n",
" \u003c/td\u003e\n",
" \u003ctd\u003e\n",
" \u003ca href=\"https://storage.googleapis.com/tensorflow_docs/datasets/docs/overview.ipynb\"\u003e\u003cimg src=\"https://www.tensorflow.org/images/download_logo_32px.png\" /\u003eDownload notebook\u003c/a\u003e\n",
" \u003c/td\u003e\n",
"\u003c/table\u003e"
"<table class=\"tfo-notebook-buttons\" align=\"left\">\n",
" <td>\n",
" <a target=\"_blank\" href=\"https://www.tensorflow.org/datasets/overview\"><img src=\"https://www.tensorflow.org/images/tf_logo_32px.png\" />View on TensorFlow.org</a>\n",
" </td>\n",
" <td>\n",
" <a target=\"_blank\" href=\"https://colab.research.google.com/github/tensorflow/datasets/blob/master/docs/overview.ipynb\"><img src=\"https://www.tensorflow.org/images/colab_logo_32px.png\" />Run in Google Colab</a>\n",
" </td>\n",
" <td>\n",
" <a target=\"_blank\" href=\"https://github.com/tensorflow/datasets/blob/master/docs/overview.ipynb\"><img src=\"https://www.tensorflow.org/images/GitHub-Mark-32px.png\" />View source on GitHub</a>\n",
" </td>\n",
" <td>\n",
" <a href=\"https://storage.googleapis.com/tensorflow_docs/datasets/docs/overview.ipynb\"><img src=\"https://www.tensorflow.org/images/download_logo_32px.png\" />Download notebook</a>\n",
" </td>\n",
"</table>"
]
},
{
@@ -276,8 +276,8 @@
"\n",
"Uses `tfds.as_numpy` to convert:\n",
"\n",
"* `tf.Tensor` -\u003e `np.array`\n",
"* `tf.data.Dataset` -\u003e `Iterator[Tree[np.array]]` (`Tree` can be arbitrary nested `Dict`, `Tuple`)\n",
"* `tf.Tensor` -> `np.array`\n",
"* `tf.data.Dataset` -> `Iterator[Tree[np.array]]` (`Tree` can be arbitrary nested `Dict`, `Tuple`)\n",
"\n"
]
},
@@ -549,7 +549,7 @@
"source": [
"print(info.features[\"label\"].num_classes)\n",
"print(info.features[\"label\"].names)\n",
"print(info.features[\"label\"].int2str(7)) # Human readable version (8 -\u003e 'cat')\n",
"print(info.features[\"label\"].int2str(7)) # Human readable version (8 -> 'cat')\n",
"print(info.features[\"label\"].str2int('7'))"
]
},
@@ -675,10 +675,10 @@
"\n",
"To find out which urls to download, look into:\n",
"\n",
" * For new datasets (implemented as folder): [`tensorflow_datasets/`](https://github.com/tensorflow/datasets/tree/master/tensorflow_datasets/)`\u003ctype\u003e/\u003cdataset_name\u003e/checksums.tsv`. For example: [`tensorflow_datasets/datasets/bool_q/checksums.tsv`](https://github.com/tensorflow/datasets/blob/master/tensorflow_datasets/datasets/bool_q/checksums.tsv).\n",
" * For new datasets (implemented as folder): [`tensorflow_datasets/`](https://github.com/tensorflow/datasets/tree/master/tensorflow_datasets/)`<type>/<dataset_name>/checksums.tsv`. For example: [`tensorflow_datasets/datasets/bool_q/checksums.tsv`](https://github.com/tensorflow/datasets/blob/master/tensorflow_datasets/datasets/bool_q/checksums.tsv).\n",
"\n",
" You can find the dataset source location in [our catalog](https://www.tensorflow.org/datasets/catalog/overview).\n",
" * For old datasets: [`tensorflow_datasets/url_checksums/\u003cdataset_name\u003e.txt`](https://github.com/tensorflow/datasets/tree/master/tensorflow_datasets/url_checksums)\n",
" * For old datasets: [`tensorflow_datasets/url_checksums/<dataset_name>.txt`](https://github.com/tensorflow/datasets/tree/master/tensorflow_datasets/url_checksums)\n",
"\n",
"### Fixing `NonMatchingChecksumError`\n",
"\n",
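Every hunk in this diff swaps a JSON Unicode escape (`\u003c`, `\u003e`, `\u0026`) for the literal character (`<`, `>`, `&`). The following is a minimal sketch, assuming the notebooks are read with a standard JSON parser, which suggests that both spellings decode to the same string, so the rendered notebooks should be unaffected. The sample lines are copied from the hunks above; the helper name `same_after_decoding` is hypothetical and not part of the repository.

```python
import json

# (escaped, literal) spellings of source lines taken from the hunks in this diff.
# Raw strings keep the backslashes so each value is a complete JSON string token.
PAIRS = [
    (r'"\u003ctable class=\"tfo-notebook-buttons\" align=\"left\"\u003e\n"',
     r'"<table class=\"tfo-notebook-buttons\" align=\"left\">\n"'),
    (r'") -\u003e None:\n"',
     r'") -> None:\n"'),
    (r'"  * `cd datasets/my_dataset/ \u0026\u0026 tfds build`\n"',
     r'"  * `cd datasets/my_dataset/ && tfds build`\n"'),
]


def same_after_decoding(escaped: str, literal: str) -> bool:
    """Return True if both JSON string tokens decode to the same Python string."""
    return json.loads(escaped) == json.loads(literal)


for escaped, literal in PAIRS:
    # Print the comparison result next to the decoded text (trailing newline stripped).
    print(same_after_decoding(escaped, literal), json.loads(literal).rstrip())
```

If a check like this ever printed `False` for a pair, the change would be altering cell content rather than only its on-disk encoding.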