Commit 4a480ea

marcenacp authored and The TensorFlow Datasets Authors committed
Update documentation about data source signatures.
PiperOrigin-RevId: 628361784
1 parent cae03e1 commit 4a480ea

1 file changed: +4 -7 lines changed
docs/data_source.ipynb

Lines changed: 4 additions & 7 deletions
@@ -172,21 +172,18 @@
 "following protocol:\n",
 "\n",
 "```python\n",
+"from typing import SupportsIndex\n",
+"\n",
 "class RandomAccessDataSource(Protocol):\n",
 "  \"\"\"Interface for datasources where storage supports efficient random access.\"\"\"\n",
 "\n",
 "  def __len__(self) -> int:\n",
 "    \"\"\"Number of records in the dataset.\"\"\"\n",
 "\n",
-"  def __getitem__(self, record_key: int) -> Sequence[Any]:\n",
-"    \"\"\"Retrieves records for the given record_keys.\"\"\"\n",
+"  def __getitem__(self, key: SupportsIndex) -> Any:\n",
+"    \"\"\"Retrieves the record for the given key.\"\"\"\n",
 "```\n",
 "\n",
-"**Warning**: the API is still under active development. Notably, at this point,\n",
-"`__getitem__` must support both `int` and `list[int]` in input. In the future,\n",
-"it will probably only support `int` as per\n",
-"[the standard](https://docs.python.org/3/reference/datamodel.html#object.__getitem__).\n",
-"\n",
 "The underlying file format needs to support efficient random access. At the\n",
 "moment, TFDS relies on [`array_record`](https://github.com/google/array_record).\n",
 "\n",

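For reviewers who want to see what the updated signature implies for implementers, here is a minimal sketch of a class that satisfies the new protocol. `InMemoryDataSource` is a hypothetical name used only for illustration and is not part of TFDS; the protocol is redeclared locally (with `@runtime_checkable` added as an assumption) so the snippet runs on its own.

```python
import operator
from typing import Any, Protocol, Sequence, SupportsIndex, runtime_checkable


@runtime_checkable
class RandomAccessDataSource(Protocol):
  """Local copy of the protocol from the diff, for illustration only."""

  def __len__(self) -> int:
    ...

  def __getitem__(self, key: SupportsIndex) -> Any:
    ...


class InMemoryDataSource:
  """Hypothetical data source backed by a plain Python list."""

  def __init__(self, records: Sequence[Any]):
    self._records = list(records)

  def __len__(self) -> int:
    return len(self._records)

  def __getitem__(self, key: SupportsIndex) -> Any:
    # operator.index accepts any object implementing __index__ and
    # raises TypeError for anything else, such as a list of keys.
    return self._records[operator.index(key)]


source = InMemoryDataSource([{"id": 0}, {"id": 1}, {"id": 2}])
first = source[0]
last = source[len(source) - 1]
```

With this signature a single key returns a single record, and passing a list of keys (accepted under the removed warning) raises `TypeError` in this sketch.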